Multi-Tenancy Guide

ThreatWeaver uses a schema-per-tenant isolation model in PostgreSQL. Each tenant gets their own database schema (e.g., tenant_blucypher, tenant_acme) while sharing the same PostgreSQL instance and connection pool. This guide covers how it works from a developer perspective.


Architecture Overview​

Key Components​

| File | Purpose |
| --- | --- |
| backend/src/multi-tenant/schema-manager.ts | Creates/releases tenant-scoped QueryRunners with SET search_path |
| backend/src/multi-tenant/tenant-local-storage.ts | AsyncLocalStorage-based context propagation for per-request tenant scope |
| backend/src/multi-tenant/tenant-context.ts | TypeScript interface for TenantContext |
| backend/src/middleware/resolve-tenant.ts | Resolves tenant metadata from JWT tenantId |
| backend/src/middleware/set-tenant-schema.ts | Creates request-scoped QueryRunner and wraps next() in AsyncLocalStorage |
| backend/src/middleware/enforce-tenancy.ts | Validates tenant schema against license |

How Schema-Per-Tenant Works​

1. Request Arrives​

When an HTTP request arrives, the middleware chain in backend/src/index.ts runs in this order:

authenticate -> resolveTenant -> setTenantSchema -> enforceTenancy

2. Tenant Resolution​

The authenticate middleware decodes the JWT and extracts tenantId. The resolveTenant middleware uses this to look up the tenant's metadata (schema name, plan, limits) from the TLM (Tenant Lifecycle Manager) or a local cache.

The result is attached to the request as req.tenantContext:

// From backend/src/multi-tenant/tenant-context.ts
export interface TenantContext {
tenantId: string // UUID
tenantSlug: string // e.g., 'blucypher'
tenantPlan: string // 'starter' | 'pro' | 'enterprise'
schemaName: string // e.g., 'tenant_blucypher'
licenseJti: string // License JWT ID
planLimits: PlanLimits // Rate limits, max users, etc.
status: string // 'active' | 'suspended'
allowedModules?: string[]
maxUsers?: number
}

3. Schema Switching​

The setTenantSchema middleware (from backend/src/middleware/set-tenant-schema.ts) does the critical work:

export function setTenantSchema(schemaManager: SchemaManager) {
return async (req: Request, res: Response, next: NextFunction) => {
if (!req.tenantContext) return next()
if (req.tenantContext.schemaName === 'public') return next()

// Create a dedicated QueryRunner with search_path set to tenant schema
const queryRunner = await schemaManager.createTenantQueryRunner(
req.tenantContext.schemaName
)
req.tenantQueryRunner = queryRunner

// Set RLS session variable
await queryRunner.query(
'SELECT set_config($1, $2, false)',
['app.current_tenant_id', req.tenantContext.tenantId]
)

// Auto-release QueryRunner when response finishes
let released = false
const cleanup = async () => {
if (released) return
released = true
await schemaManager.releaseTenantQueryRunner(queryRunner)
}
res.on('finish', cleanup)
res.on('close', cleanup)

// Wrap next() in AsyncLocalStorage
tenantStorage.run(
{ entityManager: queryRunner.manager, tenantContext: req.tenantContext },
() => next()
)
}
}

4. What createTenantQueryRunner Does​

From backend/src/multi-tenant/schema-manager.ts:

async createTenantQueryRunner(schemaName: string): Promise<QueryRunner> {
// Validate schema name (prevent SQL injection)
if (!/^[a-zA-Z_][a-zA-Z0-9_-]{0,62}$/.test(schemaName)) {
throw new Error(`Invalid schema name: ${schemaName}`)
}

const queryRunner = this.dataSource.createQueryRunner()
await queryRunner.connect()

// SET search_path ensures all unqualified table references
// resolve to the tenant schema. 'public' fallback allows
// access to PostgreSQL extensions (uuid-ossp, pgcrypto).
await queryRunner.query(
`SET search_path TO "${schemaName}", public`
)

return queryRunner
}

5. AsyncLocalStorage Propagation​

The tenantStorage (an AsyncLocalStorage instance in backend/src/multi-tenant/tenant-local-storage.ts) makes the tenant's EntityManager available to all async code within the request without explicit parameter passing:

export const tenantStorage = new AsyncLocalStorage<TenantStore>()

export function getTenantEntityManager(): EntityManager | null {
return tenantStorage.getStore()?.entityManager ?? null
}

This means any service called during the request can reach the correct tenant's data by calling getTenantAwareRepository(), which uses the tenant-scoped EntityManager held in this store.
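To make the propagation concrete, here is a minimal, self-contained sketch that uses only Node's built-in AsyncLocalStorage. DemoStore, demoStorage, currentSchema, deepServiceCall and simulateRequests are hypothetical names for illustration, not part of the ThreatWeaver codebase, and the store holds only a schema name rather than a real EntityManager.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Hypothetical stand-in for the real TenantStore; only the schema name is kept.
interface DemoStore {
  schemaName: string
}

const demoStorage = new AsyncLocalStorage<DemoStore>()

// Reads the active store without receiving it as a parameter.
function currentSchema(): string | null {
  return demoStorage.getStore()?.schemaName ?? null
}

// A "service" that awaits before reading still sees the store set by run().
async function deepServiceCall(): Promise<string | null> {
  await Promise.resolve()
  return currentSchema()
}

// Simulates two overlapping tenant requests plus one call with no tenant scope.
async function simulateRequests(): Promise<(string | null)[]> {
  return Promise.all([
    demoStorage.run({ schemaName: 'tenant_blucypher' }, deepServiceCall),
    demoStorage.run({ schemaName: 'tenant_acme' }, deepServiceCall),
    deepServiceCall(), // no run() wrapper, so no store is active
  ])
}
```

Each run() call gets its own store, so overlapping requests do not see each other's tenant, and code that runs outside any run() call sees no store at all.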


How to Write Tenant-Aware Queries​

The Golden Rule​

Always use getTenantAwareRepository() instead of AppDataSource.getRepository().

// CORRECT -- tenant-aware
import { getTenantAwareRepository } from '../multi-tenant/tenant-local-storage.js'

async function getUsers(): Promise<User[]> {
const repo = getTenantAwareRepository(User)
return repo.find()
}

// WRONG -- queries the public schema, bypasses tenant isolation
import { AppDataSource } from '../config/database.js'

async function getUsers(): Promise<User[]> {
const repo = AppDataSource.getRepository(User) // BUG: cross-tenant leak
return repo.find()
}

Repository vs. Raw Query​

// Repository approach (preferred for most operations)
const repo = getTenantAwareRepository(Vulnerability)
const vulns = await repo.find({
where: { severity: 'critical' },
order: { createdAt: 'DESC' },
take: 100,
})

// Raw query approach (for complex SQL)
const query = getTenantAwareQuery()
const stats = await query(
`SELECT severity, COUNT(*) as count
FROM vulnerabilities
WHERE status = $1
GROUP BY severity`,
['open']
)

Both getTenantAwareRepository() and getTenantAwareQuery() follow the same pattern:

  • Inside a tenant request: uses the tenant-scoped EntityManager (search_path set)
  • Outside a request (background jobs, startup): falls back to AppDataSource (public schema)
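The fallback described in the two bullets above can be sketched as follows. This is an illustration of the pattern only, not the actual implementation in tenant-local-storage.ts: ManagerLike, fallbackStorage, defaultManager and resolveManager are hypothetical names, and the "manager" is a plain labelled object standing in for a TypeORM EntityManager.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Hypothetical minimal shape; the real store holds a TypeORM EntityManager.
interface ManagerLike {
  label: string
}

const fallbackStorage = new AsyncLocalStorage<{ entityManager: ManagerLike }>()

// Stand-in for the AppDataSource manager, which targets the public schema.
const defaultManager: ManagerLike = { label: 'public' }

// Prefer the request-scoped manager; fall back when no store is active.
function resolveManager(): ManagerLike {
  return fallbackStorage.getStore()?.entityManager ?? defaultManager
}

// Inside a run() scope the tenant manager is returned.
const insideRequest = fallbackStorage.run(
  { entityManager: { label: 'tenant_blucypher' } },
  () => resolveManager().label
)

// Outside any scope the default (public) manager is returned.
const outsideRequest = resolveManager().label
```

One consequence of this design is that the fallback is silent: code that runs without a tenant scope reads the public schema instead of failing.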

Creating Records​

async function createAsset(data: Partial<Asset>): Promise<Asset> {
const repo = getTenantAwareRepository(Asset)
const asset = repo.create(data)
return repo.save(asset) // Saved to the tenant's schema automatically
}

QueryBuilder​

const repo = getTenantAwareRepository(Vulnerability)
const results = await repo.createQueryBuilder('v')
.select(['v.id', 'v.severity', 'v.pluginName'])
.where('v.severity IN (:...severities)', { severities: ['critical', 'high'] })
.andWhere('v.state = :state', { state: 'OPEN' })
.orderBy('v.createdAt', 'DESC')
.take(50)
.getMany()

Common Pitfalls​

Pitfall 1: Forgetting to Release QueryRunner​

The setTenantSchema middleware handles this automatically via res.on('finish') and res.on('close') cleanup hooks. However, if you create your own QueryRunner manually (e.g., for background jobs), you must release it:

// If you create a QueryRunner manually, ALWAYS release it
const queryRunner = await schemaManager.createTenantQueryRunner('tenant_blucypher')
try {
// ... do work ...
} finally {
await schemaManager.releaseTenantQueryRunner(queryRunner)
}

The releaseTenantQueryRunner method resets both the tenant GUC and the search_path before returning the connection to the pool:

async releaseTenantQueryRunner(queryRunner: QueryRunner): Promise<void> {
try {
if (queryRunner && !queryRunner.isReleased) {
await queryRunner.query(`SELECT set_config('app.current_tenant_id', '', false)`)
await queryRunner.query(`SET search_path TO public`)
await queryRunner.release()
}
} catch (err) {
console.error('Error releasing tenant queryRunner:', err)
}
}

If you do not release, the connection pool will eventually be exhausted, causing Cannot acquire connection from pool errors.
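One way to make the release hard to forget is to wrap the try/finally in a helper. The sketch below is an assumption, not an existing ThreatWeaver utility: withTenantQueryRunner and the RunnerManager interface are hypothetical, and the interface only mirrors the two SchemaManager methods shown above so the example stays self-contained.

```typescript
// Hypothetical minimal interface covering the two SchemaManager methods used above.
interface RunnerManager<R> {
  createTenantQueryRunner(schemaName: string): Promise<R>
  releaseTenantQueryRunner(runner: R): Promise<void>
}

// Acquires a tenant-scoped runner, runs the work, and releases the runner
// whether the work resolves or throws.
async function withTenantQueryRunner<R, T>(
  manager: RunnerManager<R>,
  schemaName: string,
  work: (runner: R) => Promise<T>
): Promise<T> {
  const runner = await manager.createTenantQueryRunner(schemaName)
  try {
    return await work(runner)
  } finally {
    await manager.releaseTenantQueryRunner(runner)
  }
}
```

With a real SchemaManager, a call might look like withTenantQueryRunner(schemaManager, 'tenant_blucypher', (qr) => qr.manager.getRepository(User).find()), assuming SchemaManager exposes the two methods with these signatures.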

Pitfall 2: Cross-Tenant Data Leaks​

Using AppDataSource.getRepository() directly bypasses tenant isolation. All queries go to the public schema, potentially exposing or mixing data between tenants.

Detection: Search your code for AppDataSource.getRepository and replace with getTenantAwareRepository. The only valid uses of AppDataSource.getRepository are:

  • Startup/initialization code (no tenant context exists yet)
  • Internal/admin routes that operate across all tenants
  • Background jobs that explicitly manage their own schema context

Pitfall 3: Background Jobs Without Tenant Context​

If a background job (cron, queue worker) needs to access tenant data, it must manually set the search_path:

import { SchemaManager } from '../multi-tenant/schema-manager.js'
import { AppDataSource } from '../config/database.js'

const schemaManager = new SchemaManager(AppDataSource)

async function backgroundJob(tenantSlug: string) {
const schemaName = `tenant_${tenantSlug}`
const queryRunner = await schemaManager.createTenantQueryRunner(schemaName)

try {
// All queries via queryRunner.manager go to the tenant schema
const users = await queryRunner.manager.getRepository(User).find()
// ... process users ...
} finally {
await schemaManager.releaseTenantQueryRunner(queryRunner)
}
}

Pitfall 4: Stale Context in Async Callbacks​

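AsyncLocalStorage propagates through async/await, Promise chains, and timers (setTimeout / setInterval) scheduled inside the run() callback. The store can still be missing or stale in these cases:

  • Work that outlives the response (timers, fire-and-forget promises): the store is still visible, but its EntityManager belongs to the request's QueryRunner, which the cleanup hooks release on finish/close
  • Event emitter handlers fired by an emit() call made outside the request's run() scope: handlers run in the context of the code that emits, not the code that registered them
  • Worker threads: the store is not shared across threads

If you need tenant data in a timer or event handler, capture the tenant context explicitly:

// Capture plain tenant data while the request is still active
const tenantContext = tenantStorage.getStore()?.tenantContext

setTimeout(() => {
// tenantContext is available here via closure
// Do NOT rely on getTenantAwareRepository() here: the request's QueryRunner
// is released once the response finishes
// Use the captured context to create (and release) a QueryRunner manually
}, 5000)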
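The manual approach mentioned in the snippet above can be wrapped in a small helper that opens a fresh scope for deferred work. This is a sketch under assumptions: DeferredStore, deferredStorage and runForTenant are hypothetical names, and the connection object is a stand-in for a QueryRunner obtained from createTenantQueryRunner().

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Hypothetical stand-ins: the real store holds a TenantContext and an EntityManager.
interface DeferredStore {
  tenantId: string
  connection: { released: boolean }
}

const deferredStorage = new AsyncLocalStorage<DeferredStore>()

// Runs `work` in a fresh scope built from a captured tenant id, with its own
// connection that is released once the work settles.
async function runForTenant<T>(tenantId: string, work: () => Promise<T>): Promise<T> {
  const connection = { released: false } // stands in for createTenantQueryRunner()
  try {
    return await deferredStorage.run({ tenantId, connection }, work)
  } finally {
    connection.released = true // stands in for releaseTenantQueryRunner()
  }
}
```

A timer or event handler would capture only the tenant id (or schema name) during the request and later call runForTenant(capturedTenantId, async () => { ... }), so the deferred work never touches the request's own connection.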

Testing Multi-Tenant Code Locally​

Option 1: Single-Tenant Mode (Default)​

For most development, you do not need multi-tenancy. Leave DEPLOYMENT_MODE unset in backend/.env. Everything runs on the public schema. getTenantAwareRepository() falls back to AppDataSource.getRepository().

Option 2: Multi-Tenant Mode Locally​

To test multi-tenancy:

  1. Set environment variables in backend/.env:
DEPLOYMENT_MODE=dedicated
DEDICATED_TENANT_SLUG=localdev
DEDICATED_TENANT_PLAN=enterprise
     The dedicated mode creates a virtual tenant context using the public schema -- no separate schema creation needed. This exercises the middleware chain without requiring TLM setup.

  2. For full SaaS mode testing (multiple schemas):

DEPLOYMENT_MODE=saas
TLM_BASE_URL=http://localhost:4010
TLM_VENDOR_API_KEY=your-test-key

Then provision tenant schemas:

-- Create a tenant schema
CREATE SCHEMA IF NOT EXISTS "tenant_testco";

-- Copy table structure from public schema
-- (or run migrations targeting the schema)

Debugging: "Your Query Is Hitting the Wrong Schema"​

Symptoms​

  • You see data from another tenant (or no data when data should exist)
  • Queries return unexpected results
  • relation "some_table" does not exist errors in a multi-tenant environment

Diagnostic Steps​

Step 1: Check which schema your connection is using

-- Run this from your application (via getTenantAwareQuery)
SHOW search_path;

-- Expected for tenant_blucypher:
-- "tenant_blucypher", public

Step 2: Verify the middleware chain ran

Add temporary logging to setTenantSchema:

console.log('[TENANT] Setting schema to:', req.tenantContext?.schemaName)

If this does not log, the middleware chain was skipped -- check that your route is not excluded in the skip lists in backend/src/index.ts.

Step 3: Check if you are using the correct repository function

Search your file for AppDataSource.getRepository -- this bypasses tenant isolation:

grep -n "AppDataSource.getRepository" backend/src/services/myService.ts

Replace with getTenantAwareRepository.

Step 4: Check the JWT

Decode the JWT token and verify it contains the correct tenantId:

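# Decode a JWT payload. The payload is base64url without padding, so if this
# prints nothing, map "-_" to "+/" and append "=" padding before decoding.
echo "<jwt-token>" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool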

Look for:

{
"userId": "...",
"tenantId": "expected-tenant-uuid",
"tenantSlug": "blucypher"
}

If tenantId is missing, the user logged in before multi-tenancy was enabled. They need to log in again to get a tenant-bound token.
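The same check can be done from Node, which decodes base64url directly. This is a debugging sketch only: decodeJwtPayload is a hypothetical helper, and it does not verify the signature, so claims read this way must never be trusted for authorization.

```typescript
// Decodes the payload segment of a JWT without verifying the signature.
// For debugging only: never trust claims read this way.
function decodeJwtPayload(token: string): Record<string, unknown> {
  const segments = token.split('.')
  const payload = segments[1]
  if (segments.length !== 3 || payload === undefined) {
    throw new Error('Expected a token of the form header.payload.signature')
  }
  // JWT segments are base64url-encoded without padding; Node's 'base64url'
  // encoding accepts that form as-is.
  const json = Buffer.from(payload, 'base64url').toString('utf8')
  return JSON.parse(json)
}
```

Calling decodeJwtPayload(token).tenantId and comparing it with the tenant you expect is usually enough to tell whether the token predates multi-tenancy.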

Step 5: Check the database directly

-- List all schemas
SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'tenant_%';

-- Check tables in a specific schema
SELECT table_name FROM information_schema.tables WHERE table_schema = 'tenant_blucypher';

-- Query a specific schema directly
SELECT count(*) FROM "tenant_blucypher".users;

Step 6: Check for connection pool contamination

If a QueryRunner is not released properly, a pooled connection may retain the previous tenant's search_path. Look for:

  • Missing finally blocks that release QueryRunners
  • Manual QueryRunner creation without corresponding cleanup
  • Errors in the releaseTenantQueryRunner cleanup (check logs for "Error releasing tenant queryRunner")

The releaseTenantQueryRunner method resets both app.current_tenant_id and search_path to prevent context bleed:

await queryRunner.query(`SELECT set_config('app.current_tenant_id', '', false)`)
await queryRunner.query(`SET search_path TO public`)

Schema Safety Features​

Input Validation​

Schema names are validated with a strict regex before any SQL execution:

if (!/^[a-zA-Z_][a-zA-Z0-9_-]{0,62}$/.test(schemaName)) {
throw new Error(`Invalid schema name: ${schemaName}`)
}
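To see what the pattern accepts and rejects, the check can be exercised on its own. In this sketch, isValidSchemaName and the sample inputs are illustrative names and values, not part of the SchemaManager API; only the regex is taken from the code above.

```typescript
// The regex from createTenantQueryRunner, extracted for experimentation.
const SCHEMA_NAME_PATTERN = /^[a-zA-Z_][a-zA-Z0-9_-]{0,62}$/

function isValidSchemaName(name: string): boolean {
  return SCHEMA_NAME_PATTERN.test(name)
}

const samples = [
  'tenant_blucypher', // letters and underscore
  'tenant-acme', // hyphen is allowed after the first character
  '1tenant', // must not start with a digit
  'tenant"; DROP SCHEMA public; --', // quotes, spaces and semicolons are rejected
  'a'.repeat(64), // longer than 63 characters
]

// One boolean per sample, in the same order.
const verdicts = samples.map(isValidSchemaName)
```

Because the pattern admits hyphens, the schema name still has to be double-quoted when interpolated into SQL, as the SET search_path statement does. The 63-character cap matches PostgreSQL's default identifier length limit.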

Drop Protection​

The dropTenantSchema method refuses to drop schemas that do not start with tenant_:

async dropTenantSchema(schemaName: string): Promise<void> {
if (!schemaName.startsWith('tenant_')) {
throw new Error('Refusing to drop non-tenant schema: ' + schemaName)
}
await this.dataSource.query(`DROP SCHEMA IF EXISTS "${schemaName}" CASCADE`)
}

RLS (Row-Level Security)​

As defense-in-depth, the middleware sets app.current_tenant_id at the PostgreSQL session level. RLS policies can use this GUC to restrict row access even if a search_path bug occurs:

-- Example RLS policy (applied during schema creation)
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON users
USING (tenant_id = current_setting('app.current_tenant_id'));

Plan Limits​

Each tenant plan has built-in rate and resource limits defined in tenant-context.ts:

export const DEFAULT_PLAN_LIMITS: Record<string, PlanLimits> = {
starter: { requestsPerMinute: 100, requestsPerDay: 10000, maxUsers: 10, maxAgents: 5 },
pro: { requestsPerMinute: 500, requestsPerDay: 50000, maxUsers: 50, maxAgents: 20 },
enterprise: { requestsPerMinute: 2000, requestsPerDay: 200000, maxUsers: 500, maxAgents: 100 },
}

These limits are enforced by the tenantRateLimiter and usageMeter middleware.
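As an illustration of how a per-minute limit could be checked, here is a minimal in-memory fixed-window counter. It is a sketch under assumptions and says nothing about how tenantRateLimiter is actually implemented: FixedWindowLimiter and MinuteLimit are hypothetical names, and a production limiter would likely need shared storage across processes.

```typescript
// Hypothetical subset of PlanLimits used by this sketch.
interface MinuteLimit {
  requestsPerMinute: number
}

// Fixed-window counter keyed by tenant id. In-memory only.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>()

  // Returns true if the request fits in the tenant's current 60-second window.
  allow(tenantId: string, limits: MinuteLimit, nowMs: number = Date.now()): boolean {
    let window = this.windows.get(tenantId)
    if (!window || nowMs - window.start >= 60_000) {
      // Start a new window when none exists or the old one has expired.
      window = { start: nowMs, count: 0 }
      this.windows.set(tenantId, window)
    }
    if (window.count >= limits.requestsPerMinute) return false
    window.count += 1
    return true
  }
}
```

A middleware would call allow(req.tenantContext.tenantId, req.tenantContext.planLimits) and respond with HTTP 429 when it returns false. A fixed window admits short bursts at window boundaries, which a sliding window or token bucket would smooth out.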