Multi-Tenancy Guide

ThreatWeaver uses a schema-per-tenant isolation model in PostgreSQL. Each tenant gets their own database schema (e.g., tenant_blucypher, tenant_acme) while sharing the same PostgreSQL instance and connection pool. This guide covers how it works from a developer perspective.

Architecture Overview

Key Components

| File | Purpose |
| --- | --- |
| `backend/src/multi-tenant/schema-manager.ts` | Creates/releases tenant-scoped QueryRunners with `SET search_path` |
| `backend/src/multi-tenant/tenant-local-storage.ts` | AsyncLocalStorage-based context propagation for per-request tenant scope |
| `backend/src/multi-tenant/tenant-context.ts` | TypeScript interface for `TenantContext` |
| `backend/src/middleware/resolve-tenant.ts` | Resolves tenant metadata from JWT `tenantId` |
| `backend/src/middleware/set-tenant-schema.ts` | Creates request-scoped QueryRunner and wraps `next()` in AsyncLocalStorage |
| `backend/src/middleware/enforce-tenancy.ts` | Validates tenant schema against license |

How Schema-Per-Tenant Works

1. Request Arrives

When an HTTP request arrives, the middleware chain in backend/src/index.ts runs in this order:

authenticate -> resolveTenant -> setTenantSchema -> enforceTenancy

2. Tenant Resolution

The authenticate middleware decodes the JWT and extracts tenantId. The resolveTenant middleware uses this to look up the tenant's metadata (schema name, plan, limits) from the TLM (Tenant Lifecycle Manager) or a local cache.

The result is attached to the request as req.tenantContext:

// From backend/src/multi-tenant/tenant-context.ts
export interface TenantContext {
    tenantId: string         // UUID
    tenantSlug: string       // e.g., 'blucypher'
    tenantPlan: string       // 'starter' | 'pro' | 'enterprise'
    schemaName: string       // e.g., 'tenant_blucypher'
    licenseJti: string       // License JWT ID
    planLimits: PlanLimits   // Rate limits, max users, etc.
    status: string           // 'active' | 'suspended'
    allowedModules?: string[]
    maxUsers?: number
}

3. Schema Switching

The setTenantSchema middleware (from backend/src/middleware/set-tenant-schema.ts) does the critical work:

export function setTenantSchema(schemaManager: SchemaManager) {
    return async (req: Request, res: Response, next: NextFunction) => {
        if (!req.tenantContext) return next()
        if (req.tenantContext.schemaName === 'public') return next()

        // Create a dedicated QueryRunner with search_path set to tenant schema
        const queryRunner = await schemaManager.createTenantQueryRunner(
            req.tenantContext.schemaName
        )
        req.tenantQueryRunner = queryRunner

        // Set RLS session variable
        await queryRunner.query(
            'SELECT set_config($1, $2, false)',
            ['app.current_tenant_id', req.tenantContext.tenantId]
        )

        // Auto-release QueryRunner when response finishes
        let released = false
        const cleanup = async () => {
            if (released) return
            released = true
            await schemaManager.releaseTenantQueryRunner(queryRunner)
        }
        res.on('finish', cleanup)
        res.on('close', cleanup)

        // Wrap next() in AsyncLocalStorage
        tenantStorage.run(
            { entityManager: queryRunner.manager, tenantContext: req.tenantContext },
            () => next()
        )
    }
}

4. What `createTenantQueryRunner` Does

From backend/src/multi-tenant/schema-manager.ts:

async createTenantQueryRunner(schemaName: string): Promise<QueryRunner> {
    // Validate schema name (prevent SQL injection)
    if (!/^[a-zA-Z_][a-zA-Z0-9_-]{0,62}$/.test(schemaName)) {
        throw new Error(`Invalid schema name: ${schemaName}`)
    }

    const queryRunner = this.dataSource.createQueryRunner()
    await queryRunner.connect()

    // SET search_path ensures all unqualified table references
    // resolve to the tenant schema. 'public' fallback allows
    // access to PostgreSQL extensions (uuid-ossp, pgcrypto).
    await queryRunner.query(
        `SET search_path TO "${schemaName}", public`
    )

    return queryRunner
}

5. AsyncLocalStorage Propagation

The tenantStorage (an AsyncLocalStorage instance in backend/src/multi-tenant/tenant-local-storage.ts) makes the tenant's EntityManager available to all async code within the request without explicit parameter passing:

export const tenantStorage = new AsyncLocalStorage<TenantStore>()

export function getTenantEntityManager(): EntityManager | null {
    return tenantStorage.getStore()?.entityManager ?? null
}

This means any service called during the request can reach the correct tenant's data by calling getTenantAwareRepository(), with no tenant argument passed down the call stack.
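To make the propagation concrete, here is a minimal sketch in which two overlapping "requests" each read their own store after an await. The names DemoStore, demoStorage, currentSchema, and handleRequest are hypothetical stand-ins, and the store is reduced to a schema name; the real TenantStore carries an EntityManager and a TenantContext.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Hypothetical, simplified store: only the schema name is kept here.
interface DemoStore {
    schemaName: string
}

const demoStorage = new AsyncLocalStorage<DemoStore>()

// A "service" that never receives the tenant as a parameter.
async function currentSchema(): Promise<string> {
    // Yield to the event loop to show that the store survives an await.
    await new Promise<void>((resolve) => setImmediate(resolve))
    return demoStorage.getStore()?.schemaName ?? 'public'
}

// Simulates one request handled inside its own storage scope.
function handleRequest(schemaName: string): Promise<string> {
    return demoStorage.run({ schemaName }, () => currentSchema())
}

// Two overlapping requests; each resolves to the schema it was started with.
function demo(): Promise<string[]> {
    return Promise.all([
        handleRequest('tenant_blucypher'),
        handleRequest('tenant_acme'),
    ])
}
```

Calling currentSchema() outside of demoStorage.run() resolves to 'public' in this sketch, mirroring the fallback behavior described for code that runs without a tenant scope.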

How to Write Tenant-Aware Queries

The Golden Rule

Always use getTenantAwareRepository() instead of AppDataSource.getRepository().

// CORRECT -- tenant-aware
import { getTenantAwareRepository } from '../multi-tenant/tenant-local-storage.js'

async function getUsers(): Promise<User[]> {
    const repo = getTenantAwareRepository(User)
    return repo.find()
}

// WRONG -- queries the public schema, bypasses tenant isolation
import { AppDataSource } from '../config/database.js'

async function getUsers(): Promise<User[]> {
    const repo = AppDataSource.getRepository(User)  // BUG: cross-tenant leak
    return repo.find()
}

Repository vs. Raw Query

// Repository approach (preferred for most operations)
const repo = getTenantAwareRepository(Vulnerability)
const vulns = await repo.find({
    where: { severity: 'critical' },
    order: { createdAt: 'DESC' },
    take: 100,
})

// Raw query approach (for complex SQL)
const query = getTenantAwareQuery()
const stats = await query(
    `SELECT severity, COUNT(*) as count
     FROM vulnerabilities
     WHERE status = $1
     GROUP BY severity`,
    ['open']
)

Both getTenantAwareRepository() and getTenantAwareQuery() follow the same pattern:

Inside a tenant request: uses the tenant-scoped EntityManager (search_path set)
Outside a request (background jobs, startup): falls back to AppDataSource (public schema)
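The fallback pattern described above might look roughly like the following. This is a sketch under stated assumptions, not the contents of tenant-local-storage.ts: ManagerLike, StoreLike, appDataSource, and getTenantAwareRepositorySketch are hypothetical names, and repositories are represented as plain strings so the example runs without TypeORM.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Minimal stand-in for the part of EntityManager/DataSource used here (assumption).
interface ManagerLike {
    getRepository(entity: string): string
}

interface StoreLike {
    entityManager: ManagerLike
}

const storage = new AsyncLocalStorage<StoreLike>()

// Stand-in for AppDataSource: resolves everything against the public schema.
const appDataSource: ManagerLike = {
    getRepository: (entity) => `public.${entity}`,
}

// Prefer the request-scoped manager; fall back to the shared data source.
function getTenantAwareRepositorySketch(entity: string): string {
    const manager = storage.getStore()?.entityManager ?? appDataSource
    return manager.getRepository(entity)
}

// A fake tenant-scoped manager, as the middleware would place in the store.
const tenantManager: ManagerLike = {
    getRepository: (entity) => `tenant_acme.${entity}`,
}

const insideRequest = storage.run({ entityManager: tenantManager }, () =>
    getTenantAwareRepositorySketch('users'),
)
const outsideRequest = getTenantAwareRepositorySketch('users')
```

The silent fallback is convenient for startup code, but it also means a background job that forgets to set up a tenant scope reads the public schema without any error.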

Creating Records

async function createAsset(data: Partial<Asset>): Promise<Asset> {
    const repo = getTenantAwareRepository(Asset)
    const asset = repo.create(data)
    return repo.save(asset)  // Saved to the tenant's schema automatically
}

QueryBuilder

const repo = getTenantAwareRepository(Vulnerability)
const results = await repo.createQueryBuilder('v')
    .select(['v.id', 'v.severity', 'v.pluginName'])
    .where('v.severity IN (:...severities)', { severities: ['critical', 'high'] })
    .andWhere('v.state = :state', { state: 'OPEN' })
    .orderBy('v.createdAt', 'DESC')
    .take(50)
    .getMany()

Common Pitfalls

Pitfall 1: Forgetting to Release QueryRunner

The setTenantSchema middleware handles this automatically via res.on('finish') and res.on('close') cleanup hooks. However, if you create your own QueryRunner manually (e.g., for background jobs), you must release it:

// If you create a QueryRunner manually, ALWAYS release it
const queryRunner = await schemaManager.createTenantQueryRunner('tenant_blucypher')
try {
    // ... do work ...
} finally {
    await schemaManager.releaseTenantQueryRunner(queryRunner)
}

The releaseTenantQueryRunner method resets both the tenant GUC and the search_path before returning the connection to the pool:

async releaseTenantQueryRunner(queryRunner: QueryRunner): Promise<void> {
    try {
        if (queryRunner && !queryRunner.isReleased) {
            await queryRunner.query(`SELECT set_config('app.current_tenant_id', '', false)`)
            await queryRunner.query(`SET search_path TO public`)
            await queryRunner.release()
        }
    } catch (err) {
        console.error('Error releasing tenant queryRunner:', err)
    }
}

If a QueryRunner is never released, its connection never returns to the pool. The pool is eventually exhausted and requests fail with "Cannot acquire connection from pool" errors.

Pitfall 2: Cross-Tenant Data Leaks

Using AppDataSource.getRepository() directly bypasses tenant isolation. All queries go to the public schema, potentially exposing or mixing data between tenants.

Detection: Search your code for AppDataSource.getRepository and replace with getTenantAwareRepository. The only valid uses of AppDataSource.getRepository are:

Startup/initialization code (no tenant context exists yet)
Internal/admin routes that operate across all tenants
Background jobs that explicitly manage their own schema context

Pitfall 3: Background Jobs Without Tenant Context

If a background job (cron, queue worker) needs to access tenant data, it must manually set the search_path:

import { SchemaManager } from '../multi-tenant/schema-manager.js'
import { AppDataSource } from '../config/database.js'

const schemaManager = new SchemaManager(AppDataSource)

async function backgroundJob(tenantSlug: string) {
    const schemaName = `tenant_${tenantSlug}`
    const queryRunner = await schemaManager.createTenantQueryRunner(schemaName)

    try {
        // All queries via queryRunner.manager go to the tenant schema
        const users = await queryRunner.manager.getRepository(User).find()
        // ... process users ...
    } finally {
        await schemaManager.releaseTenantQueryRunner(queryRunner)
    }
}
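Because every manual use repeats the same create/try/finally shape, one option is to wrap it in a helper so the release cannot be forgotten. The sketch below is an assumption, not part of the codebase: withTenantQueryRunner, QueryRunnerLike, and SchemaManagerLike are hypothetical names, and the interfaces only describe the two SchemaManager methods shown in this guide.

```typescript
// Structural stand-ins for the members used here (assumptions, not TypeORM types).
interface QueryRunnerLike {
    manager: unknown
}

interface SchemaManagerLike<Q extends QueryRunnerLike> {
    createTenantQueryRunner(schemaName: string): Promise<Q>
    releaseTenantQueryRunner(queryRunner: Q): Promise<void>
}

// Runs `work` with a tenant-scoped QueryRunner and always releases it,
// whether `work` resolves or throws.
async function withTenantQueryRunner<Q extends QueryRunnerLike, T>(
    schemaManager: SchemaManagerLike<Q>,
    schemaName: string,
    work: (queryRunner: Q) => Promise<T>,
): Promise<T> {
    const queryRunner = await schemaManager.createTenantQueryRunner(schemaName)
    try {
        return await work(queryRunner)
    } finally {
        await schemaManager.releaseTenantQueryRunner(queryRunner)
    }
}
```

With the real SchemaManager, a background job could then pass a callback that uses queryRunner.manager for its queries and leave the cleanup to the helper.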

Pitfall 4: Stale Context in Async Callbacks

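AsyncLocalStorage propagates through async/await, Promise chains, and timers scheduled inside the AsyncLocalStorage.run() callback. The store is not available in:

Event emitter handlers whose events are emitted outside the run() scope, which is typical for handlers registered before the AsyncLocalStorage.run() call
Worker threads
Code scheduled outside the request, such as cron jobs and queue workers

A timer started during a request still sees the store, but the request-scoped QueryRunner is released when the response finishes, so the EntityManager in the store is stale by the time a delayed callback runs. If you need tenant data in a timer or event handler, capture the tenant context explicitly and open your own QueryRunner:

// Capture context before the async boundary
const tenantContext = tenantStorage.getStore()?.tenantContext

setTimeout(() => {
    // tenantContext is available here via closure
    // Do NOT rely on getTenantAwareRepository() here: the request-scoped
    // QueryRunner is released once the response has finished
    // Use the captured context to create (and release) a QueryRunner manually
}, 5000)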
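One way to make the capture explicit is to bind a callback to the store that was active when the callback was created, then re-enter that store whenever the callback runs. The sketch below assumes a store reduced to the tenant context; bindTenantStore, TenantContextLike, and TenantStoreLike are hypothetical names. It deliberately leaves the EntityManager out, since database work in deferred code should open and release its own QueryRunner.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Hypothetical, reduced shapes: no EntityManager is carried across the boundary.
interface TenantContextLike {
    tenantId: string
    schemaName: string
}

interface TenantStoreLike {
    tenantContext: TenantContextLike
}

const tenantStorage = new AsyncLocalStorage<TenantStoreLike>()

// Returns a wrapper that always runs `fn` with the store captured at wrap time.
// If no store is active when wrapping, `fn` is returned unchanged.
function bindTenantStore<A extends unknown[], R>(fn: (...args: A) => R): (...args: A) => R {
    const captured = tenantStorage.getStore()
    if (!captured) return fn
    return (...args: A): R => tenantStorage.run(captured, () => fn(...args))
}

// Usage: bind inside the request scope, invoke later from anywhere.
const readSchema = (): string | undefined => tenantStorage.getStore()?.tenantContext.schemaName

const boundReadSchema = tenantStorage.run(
    { tenantContext: { tenantId: 'hypothetical-id', schemaName: 'tenant_acme' } },
    () => bindTenantStore(readSchema),
)
```

Here boundReadSchema() still reports the captured tenant when called outside any run() scope, while the unbound readSchema() returns undefined there.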

Testing Multi-Tenant Code Locally

Option 1: Single-Tenant Mode (Default)

For most development, you do not need multi-tenancy. Leave DEPLOYMENT_MODE unset in backend/.env. Everything runs on the public schema. getTenantAwareRepository() falls back to AppDataSource.getRepository().

Option 2: Multi-Tenant Mode Locally

To test multi-tenancy:

Set environment variables in backend/.env:

DEPLOYMENT_MODE=dedicated
DEDICATED_TENANT_SLUG=localdev
DEDICATED_TENANT_PLAN=enterprise

The dedicated mode creates a virtual tenant context using the public schema -- no separate schema creation needed. This exercises the middleware chain without requiring TLM setup.

For full SaaS mode testing (multiple schemas):

DEPLOYMENT_MODE=saas
TLM_BASE_URL=http://localhost:4010
TLM_VENDOR_API_KEY=your-test-key

Then provision tenant schemas:

-- Create a tenant schema
CREATE SCHEMA IF NOT EXISTS "tenant_testco";

-- Copy table structure from public schema
-- (or run migrations targeting the schema)
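If you go the "copy table structure" route, the statements can be generated rather than typed by hand. The following is a sketch under assumptions: buildProvisioningSql is a hypothetical helper, the table list is whatever you choose to copy, and it relies on PostgreSQL's CREATE TABLE ... (LIKE ... INCLUDING ALL), which copies columns, defaults, and indexes but not foreign keys. Defaults that reference sequences in public keep pointing at those sequences, so running migrations against the schema is usually the more faithful option.

```typescript
// Same identifier rule the guide uses for schema names, reused here for table names.
const SCHEMA_NAME_PATTERN = /^[a-zA-Z_][a-zA-Z0-9_-]{0,62}$/

// Builds the SQL needed to create a tenant schema and clone table structure from public.
function buildProvisioningSql(schemaName: string, tables: string[]): string[] {
    for (const name of [schemaName, ...tables]) {
        if (!SCHEMA_NAME_PATTERN.test(name)) {
            throw new Error(`Invalid identifier: ${name}`)
        }
    }
    return [
        `CREATE SCHEMA IF NOT EXISTS "${schemaName}"`,
        ...tables.map(
            (table) =>
                `CREATE TABLE IF NOT EXISTS "${schemaName}"."${table}" (LIKE public."${table}" INCLUDING ALL)`,
        ),
    ]
}

const statements = buildProvisioningSql('tenant_testco', ['users', 'vulnerabilities'])
```

Each returned string can be executed in order against the local database, for example from psql or a one-off script.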

Debugging: "Your Query Is Hitting the Wrong Schema"

Symptoms

You see data from another tenant (or no data when data should exist)
Queries return unexpected results
relation "some_table" does not exist errors in a multi-tenant environment

Diagnostic Steps

Step 1: Check which schema your connection is using

-- Run this from your application (via getTenantAwareQuery)
SHOW search_path;

-- Expected for tenant_blucypher:
-- "tenant_blucypher", public
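When this check is automated (for example in a test or a temporary debug endpoint), the returned string has to be parsed before it can be compared. A minimal sketch, assuming the raw value has already been read from the query result; parseSearchPath and isScopedToTenant are hypothetical helpers:

```typescript
// Splits a search_path string such as `"tenant_blucypher", public`
// into schema names, stripping surrounding double quotes.
function parseSearchPath(raw: string): string[] {
    return raw
        .split(',')
        .map((part) => part.trim().replace(/^"(.*)"$/, '$1'))
        .filter((part) => part.length > 0)
}

// True when the first schema on the path is the expected tenant schema.
function isScopedToTenant(raw: string, expectedSchema: string): boolean {
    return parseSearchPath(raw)[0] === expectedSchema
}
```

Only the first entry is compared because unqualified table names resolve to the first schema on the path that contains a matching table.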

Step 2: Verify the middleware chain ran

Add temporary logging to setTenantSchema:

console.log('[TENANT] Setting schema to:', req.tenantContext?.schemaName)

If this does not log, the middleware chain was skipped -- check that your route is not excluded in the skip lists in backend/src/index.ts.

Step 3: Check if you are using the correct repository function

Search your file for AppDataSource.getRepository -- this bypasses tenant isolation:

grep -n "AppDataSource.getRepository" backend/src/services/myService.ts

Replace with getTenantAwareRepository.

Step 4: Check the JWT

Decode the JWT token and verify it contains the correct tenantId:

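# Decode a JWT payload (base64url without padding, so restore the padding before decoding)
python3 -c 'import base64, json, sys; p = sys.argv[1].split(".")[1]; p += "=" * (-len(p) % 4); print(json.dumps(json.loads(base64.urlsafe_b64decode(p)), indent=4))' "<jwt-token>"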

Look for:

{
    "userId": "...",
    "tenantId": "expected-tenant-uuid",
    "tenantSlug": "blucypher"
}

If tenantId is missing, the user logged in before multi-tenancy was enabled. They need to log in again to get a tenant-bound token.
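The same inspection can be done from a Node script or a debugger session. This is a sketch for debugging only, assuming a Node runtime whose Buffer supports the 'base64url' encoding; decodeJwtPayload is a hypothetical helper and it does not verify the signature.

```typescript
// Decodes the payload segment of a JWT WITHOUT verifying the signature.
// Suitable for inspecting claims while debugging, never for authorization.
function decodeJwtPayload(token: string): Record<string, unknown> {
    const payload = token.split('.')[1]
    if (!payload) {
        throw new Error('Token has no payload segment')
    }
    const json = Buffer.from(payload, 'base64url').toString('utf8')
    return JSON.parse(json) as Record<string, unknown>
}

// True when the token carries a non-empty string tenantId claim.
function hasTenantBinding(token: string): boolean {
    const tenantId = decodeJwtPayload(token)['tenantId']
    return typeof tenantId === 'string' && tenantId.length > 0
}
```

A token for which hasTenantBinding returns false matches the "logged in before multi-tenancy was enabled" case described above.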

Step 5: Check the database directly

-- List all schemas
SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'tenant_%';

-- Check tables in a specific schema
SELECT table_name FROM information_schema.tables WHERE table_schema = 'tenant_blucypher';

-- Query a specific schema directly
SELECT count(*) FROM "tenant_blucypher".users;

Step 6: Check for connection pool contamination

If a QueryRunner is not released properly, a pooled connection may retain the previous tenant's search_path. Look for:

Missing finally blocks that release QueryRunners
Manual QueryRunner creation without corresponding cleanup
Errors in the releaseTenantQueryRunner cleanup (check logs for "Error releasing tenant queryRunner")

The releaseTenantQueryRunner method resets both app.current_tenant_id and search_path to prevent context bleed:

await queryRunner.query(`SELECT set_config('app.current_tenant_id', '', false)`)
await queryRunner.query(`SET search_path TO public`)

Schema Safety Features

Input Validation

Schema names are validated with a strict regex before any SQL execution:

if (!/^[a-zA-Z_][a-zA-Z0-9_-]{0,62}$/.test(schemaName)) {
    throw new Error(`Invalid schema name: ${schemaName}`)
}

Drop Protection

The dropTenantSchema method refuses to drop schemas that do not start with tenant_:

async dropTenantSchema(schemaName: string): Promise<void> {
    if (!schemaName.startsWith('tenant_')) {
        throw new Error('Refusing to drop non-tenant schema: ' + schemaName)
    }
    await this.dataSource.query(`DROP SCHEMA IF EXISTS "${schemaName}" CASCADE`)
}
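The excerpt above checks only the tenant_ prefix before interpolating the name into SQL. A stricter guard could apply the Input Validation regex as well, so a name containing a double quote can never reach the DROP statement. The sketch below illustrates that combination; isValidSchemaName and assertDroppableSchema are hypothetical helpers, not methods of the SchemaManager.

```typescript
// Same pattern as the Input Validation check: a letter or underscore,
// followed by up to 62 letters, digits, underscores, or hyphens.
const SCHEMA_NAME_PATTERN = /^[a-zA-Z_][a-zA-Z0-9_-]{0,62}$/

function isValidSchemaName(name: string): boolean {
    return SCHEMA_NAME_PATTERN.test(name)
}

// Throws unless the name is both syntactically safe and a tenant schema.
function assertDroppableSchema(name: string): void {
    if (!isValidSchemaName(name)) {
        throw new Error(`Invalid schema name: ${name}`)
    }
    if (!name.startsWith('tenant_')) {
        throw new Error(`Refusing to drop non-tenant schema: ${name}`)
    }
}
```

Under this sketch, tenant_blucypher passes both checks, public fails the prefix check, and a name such as tenant_x"; DROP SCHEMA public; -- fails the regex before any SQL is built.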

RLS (Row-Level Security)

As defense-in-depth, the middleware sets app.current_tenant_id at the PostgreSQL session level. RLS policies can use this GUC to restrict row access even if a search_path bug occurs:

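-- Example RLS policy (applied during schema creation)
-- current_setting() returns text; if tenant_id is a uuid column,
-- cast the setting: current_setting('app.current_tenant_id')::uuid
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON users
    USING (tenant_id = current_setting('app.current_tenant_id'));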

Plan Limits

Each tenant plan has built-in rate and resource limits defined in tenant-context.ts:

export const DEFAULT_PLAN_LIMITS: Record<string, PlanLimits> = {
    starter:    { requestsPerMinute: 100,  requestsPerDay: 10000,  maxUsers: 10,  maxAgents: 5   },
    pro:        { requestsPerMinute: 500,  requestsPerDay: 50000,  maxUsers: 50,  maxAgents: 20  },
    enterprise: { requestsPerMinute: 2000, requestsPerDay: 200000, maxUsers: 500, maxAgents: 100 },
}

These limits are enforced by the tenantRateLimiter and usageMeter middleware.
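To illustrate how a requestsPerMinute limit can be applied per tenant, here is a minimal fixed-window counter. It is a sketch of one common technique, not the actual tenantRateLimiter: FixedWindowLimiter and PlanLimitsLike are hypothetical names, state is kept in memory for a single process, and the clock is passed in so the behavior is easy to test.

```typescript
// Only the field this sketch needs from PlanLimits (assumption).
interface PlanLimitsLike {
    requestsPerMinute: number
}

interface WindowState {
    start: number
    count: number
}

// In-memory fixed-window limiter keyed by tenant ID.
class FixedWindowLimiter {
    private readonly windows = new Map<string, WindowState>()
    private readonly windowMs: number

    constructor(windowMs: number = 60_000) {
        this.windowMs = windowMs
    }

    // Returns true if the request fits in the tenant's current window.
    allow(tenantId: string, limits: PlanLimitsLike, now: number = Date.now()): boolean {
        let state = this.windows.get(tenantId)
        if (!state || now - state.start >= this.windowMs) {
            // Start a fresh window for this tenant.
            state = { start: now, count: 0 }
            this.windows.set(tenantId, state)
        }
        if (state.count >= limits.requestsPerMinute) return false
        state.count += 1
        return true
    }
}
```

A fixed window is simple but allows short bursts around window boundaries; a multi-instance deployment would also need shared state rather than a per-process Map.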
