This guide covers scaling Enclave for high-throughput workloads, from single-server optimization to distributed multi-pod deployments.
Scaling Dimensions
| Dimension | Single Server | Distributed |
|---|---|---|
| Concurrency | Worker pool | Multiple runtime pods |
| Sessions | In-memory | Redis-backed |
| Tool execution | Local | Distributed via broker |
| State | Process memory | Redis/external store |
Single-Server Scaling
Worker Pool
For CPU-bound workloads, use the worker pool adapter:
import { Enclave, createWorkerPoolAdapter } from '@enclave-vm/core';

const enclave = new Enclave({
  adapter: createWorkerPoolAdapter({
    // Number of worker processes
    poolSize: 4,
    // Memory limit per worker
    maxWorkerMemory: 128 * 1024 * 1024, // 128MB
    // Restart workers after N executions (memory leak prevention)
    maxExecutionsPerWorker: 1000,
    // Worker idle timeout
    idleTimeout: 60000,
  }),
  toolHandler: async (name, args) => {
    return executeTool(name, args);
  },
});
Pool Sizing Guidelines
| Workload Type | Recommended Pool Size |
|---|---|
| CPU-heavy scripts | CPU cores |
| I/O-heavy (tool calls) | CPU cores × 2 |
| Mixed | CPU cores × 1.5 |
import os from 'os';
const cpuCount = os.cpus().length;
const poolSize = Math.max(2, Math.floor(cpuCount * 1.5));
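The sizing table can be captured in a small helper so the multiplier stays in one place (the workload names and function name are illustrative, not part of the Enclave API):

```typescript
import os from 'os';

type Workload = 'cpu' | 'io' | 'mixed';

// Multipliers from the sizing table above.
const MULTIPLIERS: Record<Workload, number> = {
  cpu: 1,
  io: 2,
  mixed: 1.5,
};

// Compute a pool size for the given workload, never dropping below 2 workers.
export function recommendedPoolSize(
  workload: Workload,
  cores: number = os.cpus().length
): number {
  return Math.max(2, Math.floor(cores * MULTIPLIERS[workload]));
}
```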
Enclave Pooling
Reuse Enclave instances to avoid initialization overhead:
import { Enclave, EnclaveConfig, ExecutionResult } from '@enclave-vm/core';

class EnclavePool {
  private pool: Enclave[] = [];
  private available: Set<Enclave> = new Set();
  private waiters: ((enclave: Enclave) => void)[] = [];

  constructor(
    private size: number,
    private config: EnclaveConfig
  ) {
    for (let i = 0; i < size; i++) {
      const enclave = new Enclave(config);
      this.pool.push(enclave);
      this.available.add(enclave);
    }
  }

  async acquire(): Promise<Enclave> {
    // Try to get an available instance
    for (const enclave of this.available) {
      this.available.delete(enclave);
      return enclave;
    }
    // Otherwise wait for one to become available
    return new Promise((resolve) => {
      this.waiters.push(resolve);
    });
  }

  release(enclave: Enclave) {
    // Hand the instance directly to the oldest waiter, if any
    if (this.waiters.length > 0) {
      const waiter = this.waiters.shift()!;
      waiter(enclave);
    } else {
      this.available.add(enclave);
    }
  }

  async execute(code: string): Promise<ExecutionResult> {
    const enclave = await this.acquire();
    try {
      return await enclave.run(code);
    } finally {
      this.release(enclave);
    }
  }

  dispose() {
    this.pool.forEach(e => e.dispose());
    this.pool = [];
    this.available.clear();
  }
}
Distributed Scaling
Architecture
Clients connect to a central broker, which persists session state in Redis and dispatches executions to a pool of runtime pods, each running its own worker pool.
Broker Configuration
// broker/server.ts
import { EnclaveBroker } from '@enclave-vm/broker';
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

const broker = new EnclaveBroker({
  // Redis for session state
  redis,
  // Tool registry
  tools: toolRegistry,
  // Session limits
  maxConcurrentSessions: 1000,
  sessionTimeout: 300000, // 5 minutes
  // Runtime pool management
  runtimePool: {
    minSize: 2,
    maxSize: 10,
    scaleUpThreshold: 0.8, // Scale up at 80% utilization
    scaleDownThreshold: 0.3, // Scale down at 30% utilization
    scaleInterval: 30000, // Check every 30 seconds
  },
});

broker.listen(3001);
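The threshold settings above imply a simple control loop: step the pool up when utilization crosses the upper threshold and down when it falls below the lower one. A sketch of that decision (the function name and one-step policy are illustrative, not the broker's documented algorithm):

```typescript
interface PoolBounds {
  minSize: number;
  maxSize: number;
}

// Decide the next runtime-pool size from current utilization
// (e.g. activeSessions / capacity), using the broker's thresholds.
export function nextPoolSize(
  current: number,
  utilization: number,
  { minSize, maxSize }: PoolBounds,
  scaleUpThreshold = 0.8,
  scaleDownThreshold = 0.3
): number {
  if (utilization >= scaleUpThreshold) return Math.min(maxSize, current + 1);
  if (utilization <= scaleDownThreshold) return Math.max(minSize, current - 1);
  return current;
}
```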
Runtime Configuration
// runtime/server.ts
import { EnclaveRuntime } from '@enclave-vm/runtime';

const runtime = new EnclaveRuntime({
  // Connect to broker
  brokerUrl: process.env.BROKER_URL,
  // Worker configuration
  workerPool: {
    size: 4,
    maxMemory: 128 * 1024 * 1024,
  },
  // Heartbeat
  heartbeatInterval: 5000,
  // Labels for routing
  labels: {
    region: process.env.REGION,
    tier: 'standard',
  },
});

runtime.connect();
Client Configuration
// api/client.ts
import { EnclaveClient } from '@enclave-vm/client';

const client = new EnclaveClient({
  serverUrl: process.env.BROKER_URL,
  // Reconnection
  reconnection: {
    enabled: true,
    maxAttempts: 5,
    backoffMultiplier: 2,
  },
  // Request timeout
  timeout: 60000,
});
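With a `backoffMultiplier` of 2, the delay between reconnection attempts doubles each time. A sketch of the resulting schedule (the 1-second base delay is an assumption, not a documented default):

```typescript
// Delay in milliseconds before reconnect attempt `attempt` (1-based),
// or null once maxAttempts is exhausted.
export function reconnectDelay(
  attempt: number,
  baseDelayMs = 1000, // assumed base delay
  backoffMultiplier = 2,
  maxAttempts = 5
): number | null {
  if (attempt > maxAttempts) return null;
  return baseDelayMs * Math.pow(backoffMultiplier, attempt - 1);
}
```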
Redis Configuration
Session State
// Redis key structure
// session:{sessionId}        - Session metadata
// session:{sessionId}:events - Event stream
// session:{sessionId}:tools  - Pending tool calls
import Redis from 'ioredis';

const redis = new Redis({
  host: process.env.REDIS_HOST,
  port: parseInt(process.env.REDIS_PORT || '6379', 10),
  password: process.env.REDIS_PASSWORD,
  // Connection behavior
  maxRetriesPerRequest: 3,
  enableReadyCheck: true,
  lazyConnect: true,
  // For Redis Cluster, use new Redis.Cluster instead (see below)
});
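The key structure in the comments is easy to drift out of sync when producers and consumers build keys by hand; small shared helpers keep it in one place (these helpers are illustrative, not part of the broker package):

```typescript
// Centralized builders for the session key structure above.
export const sessionKey = (sessionId: string) => `session:${sessionId}`;
export const sessionEventsKey = (sessionId: string) =>
  `${sessionKey(sessionId)}:events`;
export const sessionToolsKey = (sessionId: string) =>
  `${sessionKey(sessionId)}:tools`;
```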
Redis Cluster
For high availability:
import Redis from 'ioredis';

const redis = new Redis.Cluster([
  { host: 'redis-node-1', port: 6379 },
  { host: 'redis-node-2', port: 6379 },
  { host: 'redis-node-3', port: 6379 },
], {
  redisOptions: {
    password: process.env.REDIS_PASSWORD,
  },
  scaleReads: 'slave', // Read from replicas
});
Kubernetes Scaling
Horizontal Pod Autoscaler
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: enclave-runtime-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: enclave-runtime
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods
      pods:
        metric:
          name: enclave_active_executions
        target:
          type: AverageValue
          averageValue: 10
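For the Pods metric, the Kubernetes HPA computes the desired replica count as `ceil(currentReplicas × currentAverage / targetAverage)`, clamped to the replica bounds. A worked sketch of that formula with the values from the manifest above:

```typescript
// Kubernetes HPA scaling formula for an average-value Pods metric.
export function desiredReplicas(
  currentReplicas: number,
  averageMetricValue: number, // e.g. enclave_active_executions per pod
  targetAverageValue: number, // 10 in the manifest above
  minReplicas = 2,
  maxReplicas = 20
): number {
  const desired = Math.ceil(
    currentReplicas * (averageMetricValue / targetAverageValue)
  );
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}
```

For example, 4 pods each averaging 15 active executions against a target of 10 yields 6 desired replicas.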
Pod Disruption Budget
# pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: enclave-runtime-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: enclave-runtime
Resource Quotas
# quota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: enclave-quota
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "50"
Load Balancing
Sticky Sessions
For WebSocket connections:
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: enclave-broker
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "enclave-session"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
Runtime Routing
Route to specific runtimes based on workload:
const broker = new EnclaveBroker({
  // Route by labels
  routingStrategy: 'label-match',
  // Routing rules
  routingRules: [
    {
      match: { priority: 'high' },
      target: { tier: 'premium' },
    },
    {
      match: { region: 'us-east' },
      target: { region: 'us-east' },
    },
  ],
});
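A label-match strategy like the one above reduces to checking each rule's `match` labels against the request and taking the first rule's `target`. A minimal sketch (the rule shape mirrors the config above, but the first-match semantics are an assumption about the broker's behavior):

```typescript
type Labels = Record<string, string>;

interface RoutingRule {
  match: Labels;
  target: Labels;
}

// Return the target labels of the first rule whose `match` labels are all
// present (with equal values) on the request; null if no rule matches.
export function resolveTarget(
  rules: RoutingRule[],
  requestLabels: Labels
): Labels | null {
  for (const rule of rules) {
    const matches = Object.entries(rule.match).every(
      ([key, value]) => requestLabels[key] === value
    );
    if (matches) return rule.target;
  }
  return null;
}
```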
Benchmarks
Single Server (8 cores, 16GB RAM)
| Metric | Value |
|---|---|
| Simple scripts | 500 req/s |
| With tool calls | 200 req/s |
| P50 latency | 15ms |
| P99 latency | 85ms |
Distributed (3 runtime pods)
| Metric | Value |
|---|---|
| Simple scripts | 1200 req/s |
| With tool calls | 500 req/s |
| P50 latency | 25ms |
| P99 latency | 120ms |
Monitoring at Scale
Key Metrics
// Prometheus metrics
import { Counter, Histogram, Gauge } from 'prom-client';

const metrics = {
  // Throughput (derive executions/sec with rate() in PromQL)
  executionsTotal: new Counter({
    name: 'enclave_executions_total',
    help: 'Total number of executions',
  }),
  // Latency
  executionLatency: new Histogram({
    name: 'enclave_execution_latency_seconds',
    help: 'Execution latency in seconds',
    buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10],
  }),
  // Queue depth
  queueDepth: new Gauge({
    name: 'enclave_queue_depth',
    help: 'Number of pending executions',
    labelNames: ['runtime'],
  }),
  // Runtime health
  runtimeHealth: new Gauge({
    name: 'enclave_runtime_health',
    help: 'Runtime health status (1=healthy, 0=unhealthy)',
    labelNames: ['runtime_id'],
  }),
  // Memory usage
  workerMemory: new Gauge({
    name: 'enclave_worker_memory_bytes',
    help: 'Worker memory usage in bytes',
    labelNames: ['worker_id'],
  }),
};
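Feeding the latency histogram can be as simple as timing each execution. A sketch of a wrapper that works against a minimal observer interface, so it does not depend on a specific metrics library (a prom-client histogram's `observe()` satisfies it; the wrapper name is illustrative):

```typescript
// Minimal observer interface; prom-client's Histogram#observe matches it.
interface LatencyObserver {
  observe(seconds: number): void;
}

// Run an async execution and record its wall-clock latency in seconds,
// whether it resolves or rejects.
export async function timed<T>(
  histogram: LatencyObserver,
  fn: () => Promise<T>
): Promise<T> {
  const start = process.hrtime.bigint();
  try {
    return await fn();
  } finally {
    const elapsedNs = process.hrtime.bigint() - start;
    histogram.observe(Number(elapsedNs) / 1e9);
  }
}
```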
Alerting Rules
# prometheus-rules.yaml
groups:
  - name: enclave
    rules:
      - alert: HighExecutionLatency
        expr: histogram_quantile(0.99, rate(enclave_execution_latency_seconds_bucket[5m])) > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: High execution latency detected
      - alert: QueueBacklog
        expr: enclave_queue_depth > 100
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: Execution queue backlog
      - alert: RuntimeUnhealthy
        expr: enclave_runtime_health == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Runtime pod unhealthy
Best Practices
- Start simple - Use worker pool before going distributed
- Monitor queue depth - Scale based on pending executions
- Set memory limits - Prevent runaway scripts
- Use connection pooling - Reuse Redis/DB connections
- Implement backpressure - Reject requests when overloaded
- Regional deployment - Deploy runtimes close to users
- Graceful degradation - Fallback when components fail
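The backpressure point can be made concrete with an admission check in front of the execution queue, so overload surfaces as a fast error instead of unbounded latency (the limit and error shape here are illustrative):

```typescript
// Reject new work once the pending queue reaches a fixed limit.
export function admit(queueDepth: number, maxQueueDepth = 100): void {
  if (queueDepth >= maxQueueDepth) {
    throw new Error(`overloaded: ${queueDepth} executions pending`);
  }
}
```

Callers would invoke this before enqueueing an execution and translate the error into an HTTP 429 or equivalent, letting clients back off and retry.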