This guide covers everything you need to deploy CodeCall in production: performance optimization, monitoring, multi-instance deployment, and operational best practices.

Performance Characteristics

Latency Breakdown

| Stage | Typical Time | Notes |
| --- | --- | --- |
| AST Parsing | 1-5ms | Scales with code size |
| AST Validation | 2-10ms | Depends on rule count |
| Code Transformation | 1-3ms | One-time per script |
| AI Scoring Gate | ~1ms | Rule-based (cached) |
| VM Execution | Variable | Depends on script complexity |
| Tool Calls | Variable | Network/database bound |
| Output Sanitization | 1-5ms | Scales with output size |

Total overhead (excluding tool calls): ~10-25ms for typical scripts.
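
The total follows from adding up the fixed-cost stages in the table. The sketch below only reproduces that arithmetic; the `StageRange` type and `sumOverhead` helper are illustrative names and not part of CodeCall.

```typescript
// Illustrative only: stage ranges copied from the latency table above.
interface StageRange {
  stage: string;
  minMs: number;
  maxMs: number;
}

const fixedStages: StageRange[] = [
  { stage: 'AST Parsing', minMs: 1, maxMs: 5 },
  { stage: 'AST Validation', minMs: 2, maxMs: 10 },
  { stage: 'Code Transformation', minMs: 1, maxMs: 3 },
  { stage: 'AI Scoring Gate', minMs: 1, maxMs: 1 },
  { stage: 'Output Sanitization', minMs: 1, maxMs: 5 },
];

// Sum the best-case and worst-case cost of every fixed stage.
function sumOverhead(stages: StageRange[]): { minMs: number; maxMs: number } {
  return stages.reduce(
    (acc, s) => ({ minMs: acc.minMs + s.minMs, maxMs: acc.maxMs + s.maxMs }),
    { minMs: 0, maxMs: 0 },
  );
}

const overhead = sumOverhead(fixedStages);
// overhead is { minMs: 6, maxMs: 24 }
```

The summed range of 6-24ms is in line with the ~10-25ms typical figure. VM execution and tool calls come on top of it.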

Worker Pool Mode

When using the Worker Pool adapter for OS-level isolation, latency changes slightly:

| Metric | Standard VM | Worker Pool (4 workers) |
| --- | --- | --- |
| Cold start | ~5ms | ~50ms (pool warm-up) |
| Warm execution | ~1ms | ~3ms (message passing) |
| Concurrent capacity | 1 | 4 (parallel isolation) |
| Memory per execution | Shared | Isolated per worker |

Worker Pool adds ~2-3ms latency per execution due to message passing overhead, but provides OS-level isolation and hard halt capability.

Throughput

| Configuration | Requests/sec | Notes |
| --- | --- | --- |
| Single instance, TF-IDF | ~500 | Bottleneck: VM isolation |
| Single instance, Embeddings | ~200 | Bottleneck: Model inference |
| Single instance, Worker Pool (4 workers) | ~800 | Parallel isolation |
| Multi-instance (4 pods) | ~1,500+ | Near-linear scaling |
| Multi-instance + Worker Pool | ~3,000+ | Best isolation + throughput |

Throughput depends heavily on script complexity and tool call latency. These numbers assume simple scripts with 1-3 tool calls.
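
For rough capacity planning, the table's figures can be turned into an instance estimate. This is a minimal sketch under stated assumptions: `estimateInstances` is a hypothetical helper, and the scaling efficiency is something you should measure for your own workload rather than take from the table.

```typescript
// Hypothetical capacity-planning helper; not a CodeCall API.
// perInstanceRps: measured throughput of one instance (e.g. ~500 with TF-IDF).
// scalingEfficiency: fraction of linear scaling you expect to keep, in (0, 1].
function estimateInstances(
  targetRps: number,
  perInstanceRps: number,
  scalingEfficiency: number,
): number {
  if (targetRps <= 0) return 0;
  if (perInstanceRps <= 0 || scalingEfficiency <= 0 || scalingEfficiency > 1) {
    throw new RangeError('perInstanceRps must be > 0 and scalingEfficiency in (0, 1]');
  }
  return Math.ceil(targetRps / (perInstanceRps * scalingEfficiency));
}

// The 4-pod row (~1,500 req/s) against 4 x ~500 req/s implies at least ~75% efficiency.
const impliedEfficiency = 1500 / (4 * 500);
const podsFor1500 = estimateInstances(1500, 500, impliedEfficiency);
```

Since the table lists "1,500+" for 4 pods, 75% is a conservative lower bound. Re-run the estimate with numbers from your own load tests.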

Worker Pool Scaling Guidelines

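| Workload | minWorkers | maxWorkers | memoryLimit | Use Case |
| --- | --- | --- | --- | --- |
| Low volume | 1 | 2 | 128MB | <10 req/min |
| Medium volume | 2 | 4 | 256MB | 10-100 req/min |
| High volume | 4 | 8 | 256MB | 100-500 req/min |
| Burst traffic | 2 | 16 | 128MB | Spiky workloads |
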
// Worker Pool for production
CodeCallPlugin.init({
  vm: {
    adapter: 'worker_threads',
    preset: 'secure',
    workerPoolConfig: {
      minWorkers: 2,
      maxWorkers: 8,
      memoryLimitPerWorker: 256 * 1024 * 1024,  // 256MB
      maxExecutionsPerWorker: 1000,              // Recycle after 1000 executions
    },
  },
});
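
The scaling guidelines can also be expressed as a lookup. The sketch below is a hypothetical helper (not part of CodeCall) that maps a request rate to the table's rows; the field names mirror `workerPoolConfig` in the example above, and the handling of the 100 req/min boundary is an assumption because the table's ranges overlap there.

```typescript
// Field names mirror workerPoolConfig above; the helper itself is hypothetical.
interface WorkerPoolSizing {
  minWorkers: number;
  maxWorkers: number;
  memoryLimitPerWorker: number; // bytes
}

const MB = 1024 * 1024;

function recommendWorkerPool(reqPerMin: number, bursty = false): WorkerPoolSizing {
  // Spiky workloads: small warm pool, generous ceiling.
  if (bursty) return { minWorkers: 2, maxWorkers: 16, memoryLimitPerWorker: 128 * MB };
  // Low volume: fewer than 10 req/min.
  if (reqPerMin < 10) return { minWorkers: 1, maxWorkers: 2, memoryLimitPerWorker: 128 * MB };
  // Medium volume: 10-100 req/min (exactly 100 is treated as medium here).
  if (reqPerMin <= 100) return { minWorkers: 2, maxWorkers: 4, memoryLimitPerWorker: 256 * MB };
  // High volume. The table stops at 500 req/min; beyond that, consider more instances.
  return { minWorkers: 4, maxWorkers: 8, memoryLimitPerWorker: 256 * MB };
}

const mediumSizing = recommendWorkerPool(50);
```

The result can be spread into `workerPoolConfig` alongside settings such as `maxExecutionsPerWorker`.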

Performance Optimization

1. Use TF-IDF for Most Cases

Unless you have 100+ tools with similar descriptions, TF-IDF provides excellent relevance with minimal overhead:
CodeCallPlugin.init({
  embedding: {
    strategy: 'tfidf',  // 10x faster than embedding
    similarityThreshold: 0.25,
  },
});

2. Enable HNSW for Large Toolsets

For 1000+ tools with the embedding strategy, enable HNSW indexing:
CodeCallPlugin.init({
  embedding: {
    strategy: 'embedding',
    useHNSW: true,
    hnsw: {
      M: 16,              // Connections per node
      efConstruction: 200, // Build-time accuracy
      efSearch: 64,        // Query-time accuracy
    },
  },
});
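
The two rules of thumb above (TF-IDF unless you have 100+ tools with similar descriptions, HNSW from 1000+ tools when using embeddings) can be combined into one decision. This is a sketch with a hypothetical `chooseSearchConfig` helper; only the `strategy` and `useHNSW` field names come from the examples above.

```typescript
// Hypothetical decision helper based on the thresholds stated in this guide.
interface SearchConfig {
  strategy: 'tfidf' | 'embedding';
  useHNSW?: boolean;
}

function chooseSearchConfig(toolCount: number, similarDescriptions: boolean): SearchConfig {
  // TF-IDF is the default unless the toolset is large AND descriptions overlap.
  if (toolCount < 100 || !similarDescriptions) {
    return { strategy: 'tfidf' };
  }
  // With embeddings, HNSW indexing is suggested from 1000 tools upward.
  return { strategy: 'embedding', useHNSW: toolCount >= 1000 };
}

const largeToolset = chooseSearchConfig(2000, true);
```

Treat the thresholds as starting points and confirm them against search quality on your own tool descriptions.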

3. Warm the Search Index on Startup

Pre-index tools during server initialization:
import { CodeCallPlugin } from '@frontmcp/plugins';

@App({
  plugins: [
    CodeCallPlugin.init({
      // ... config
    }),
  ],
})
export default class MyApp {
  async onReady() {
    // Index is built automatically, but you can force warmup
    const searchService = this.scope.get(ToolSearchService);
    await searchService.initialize();
    console.log(`Indexed ${searchService.getTotalCount()} tools`);
  }
}

4. Use Direct Invoke for Simple Calls

Bypass VM overhead for single-tool operations:
// Instead of
{
  "tool": "codecall:execute",
  "input": {
    "script": "return await callTool('users:getById', { id: '123' });"
  }
}

// Use
{
  "tool": "codecall:invoke",
  "input": {
    "tool": "users:getById",
    "input": { "id": "123" }
  }
}
Savings: ~15-20ms per call.
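
A client can make this choice automatically. The sketch below is an assumption-laden illustration: `buildRequest` and the `ToolCall` type are hypothetical, and the generated script simply fans out independent calls with `Promise.all`. Only the two payload shapes come from the example above.

```typescript
// Hypothetical client-side helper; payload shapes follow the JSON examples above.
interface ToolCall {
  tool: string;
  input: Record<string, unknown>;
}

type CodeCallRequest =
  | { tool: 'codecall:invoke'; input: { tool: string; input: Record<string, unknown> } }
  | { tool: 'codecall:execute'; input: { script: string } };

function buildRequest(calls: ToolCall[]): CodeCallRequest {
  const first = calls[0];
  if (first === undefined) throw new Error('at least one tool call is required');

  // A single call needs no script, so skip the VM entirely.
  if (calls.length === 1) {
    return { tool: 'codecall:invoke', input: { tool: first.tool, input: first.input } };
  }

  // Several independent calls: generate a script that runs them in parallel.
  // JSON.stringify yields valid JavaScript literals for names and arguments.
  const body = calls
    .map((c) => `callTool(${JSON.stringify(c.tool)}, ${JSON.stringify(c.input)})`)
    .join(', ');
  return { tool: 'codecall:execute', input: { script: `return await Promise.all([${body}]);` } };
}

const single = buildRequest([{ tool: 'users:getById', input: { id: '123' } }]);
```

Calls that depend on each other's results still need a hand-written script; this helper only covers the independent case.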

5. Cache Describe Results

Tool schemas rarely change. Enable caching:
import { CachePlugin } from '@frontmcp/plugins';

@App({
  plugins: [
    CachePlugin.init({
      tools: {
        'codecall:describe': { ttl: 300 },  // 5 minutes
        'codecall:search': { ttl: 60 },     // 1 minute
      },
    }),
    CodeCallPlugin.init({ ... }),
  ],
})
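
To make the `ttl` semantics concrete, here is a minimal in-memory TTL cache. It is a sketch of the general technique, not CachePlugin's implementation; the `TtlCache` class and the injectable clock are assumptions made for illustration and testing.

```typescript
// Generic TTL cache sketch; CachePlugin's internals may differ.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    // Expired entries are evicted lazily on read.
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

const describeCache = new TtlCache<string>();
describeCache.set('codecall:describe:users:getById', '{"id":"string"}', 300);
```

A 300-second TTL means a schema change can take up to five minutes to become visible, which is the trade-off behind the shorter TTL on search results.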

Multi-Instance Deployment

CodeCall is stateless and scales horizontally.

Architecture

Shared Cache (Redis)

For consistent search results across instances:
import { RedisStorageAdapter } from 'vectoriadb';

CodeCallPlugin.init({
  embedding: {
    strategy: 'embedding',
    storageAdapter: new RedisStorageAdapter({
      client: redisClient,
      namespace: 'codecall:tools',
      ttl: 3600,  // 1 hour
    }),
  },
});

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: mcp-server
          image: your-mcp-server:latest
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          env:
            - name: CODECALL_VM_PRESET
              value: "secure"
            - name: CODECALL_EMBEDDING_STRATEGY
              value: "tfidf"
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30

Resource Recommendations

| Workload | CPU | Memory | Instances |
| --- | --- | --- | --- |
| Light (<100 req/min) | 0.5 core | 512MB | 1-2 |
| Medium (100-500 req/min) | 1 core | 1GB | 2-4 |
| Heavy (500+ req/min) | 2 cores | 2GB | 4+ |

Embedding strategy requires additional memory (~200MB) for the transformer model. Account for this in resource limits.
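
A small sketch of that adjustment: start from the table's per-workload memory figure and add the model overhead when embeddings are enabled. The `recommendedMemoryMb` helper is hypothetical, and it assumes 1GB = 1024MB and that the ~200MB figure is added on top of the table values.

```typescript
// Hypothetical sizing helper; base values come from the table above.
type Workload = 'light' | 'medium' | 'heavy';

const BASE_MEMORY_MB: Record<Workload, number> = {
  light: 512,
  medium: 1024,
  heavy: 2048,
};

// Approximate extra memory for the transformer model (from the note above).
const EMBEDDING_MODEL_MB = 200;

function recommendedMemoryMb(workload: Workload, strategy: 'tfidf' | 'embedding'): number {
  const extra = strategy === 'embedding' ? EMBEDDING_MODEL_MB : 0;
  return BASE_MEMORY_MB[workload] + extra;
}

const lightWithEmbeddings = recommendedMemoryMb('light', 'embedding');
```

For a light workload with embeddings this gives 712MB, which already exceeds the 512Mi request in the Kubernetes example above.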

Monitoring

Metrics to Track

  • Execution Latency: Track p50, p95, p99 of codecall:execute duration
  • Error Rate: Monitor validation errors, timeouts, and tool failures
  • Tool Call Count: Average tool calls per script execution
  • Search Latency: Track search response times for index health
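
If your metrics backend does not compute percentiles for you, they can be derived from raw durations. The sketch below uses the nearest-rank method; `percentile` is an illustrative helper, not a CodeCall API, and the sample durations are made up.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new RangeError('no samples');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing so common inputs stay free of rounding error.
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1] as number;
}

// Made-up codecall:execute durations in milliseconds.
const durationsMs = [120, 80, 200, 95, 1500, 110, 90, 105, 100, 130];
const p50 = percentile(durationsMs, 50);
const p99 = percentile(durationsMs, 99);
```

With only ten samples the single 1500ms outlier determines both p95 and p99, which is why tail percentiles are only meaningful over a large enough window.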

Logging

CodeCall emits structured logs for observability:
// Enable structured logging
import { LoggingPlugin } from '@frontmcp/plugins';

@App({
  plugins: [
    LoggingPlugin.init({
      level: 'info',
      format: 'json',
      includeToolCalls: true,
    }),
    CodeCallPlugin.init({ ... }),
  ],
})
Log events:
// Script execution start
{ "event": "codecall:execute:start", "executionId": "abc123", "scriptSize": 245 }

// Tool call
{ "event": "codecall:tool:call", "executionId": "abc123", "tool": "users:list", "duration": 45 }

// Script execution complete
{ "event": "codecall:execute:complete", "executionId": "abc123", "status": "ok", "duration": 234, "toolCalls": 3 }

// Security event
{ "event": "codecall:security:blocked", "reason": "self_reference", "tool": "codecall:execute" }
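
Because every event carries an `executionId`, log lines can be correlated after the fact. The following sketch computes how much of one execution's duration was spent in tool calls. `toolTimeShare` is a hypothetical helper, and it assumes both `duration` fields use the same unit.

```typescript
// Subset of the log event fields shown above.
interface LogEvent {
  event: string;
  executionId?: string;
  duration?: number;
}

// Returns summed tool call duration divided by total execution duration,
// or undefined if the execution has not completed (or reported 0 duration).
function toolTimeShare(lines: string[], executionId: string): number | undefined {
  let toolTime = 0;
  let total: number | undefined;
  for (const line of lines) {
    const e = JSON.parse(line) as LogEvent;
    if (e.executionId !== executionId) continue;
    if (e.event === 'codecall:tool:call') toolTime += e.duration ?? 0;
    if (e.event === 'codecall:execute:complete') total = e.duration;
  }
  return total === undefined || total === 0 ? undefined : toolTime / total;
}

// Sample lines modeled on the events above (values are made up).
const sampleLines = [
  '{ "event": "codecall:execute:start", "executionId": "abc123", "scriptSize": 245 }',
  '{ "event": "codecall:tool:call", "executionId": "abc123", "tool": "users:list", "duration": 45 }',
  '{ "event": "codecall:tool:call", "executionId": "abc123", "tool": "orders:list", "duration": 60 }',
  '{ "event": "codecall:tool:call", "executionId": "abc123", "tool": "users:getById", "duration": 12 }',
  '{ "event": "codecall:execute:complete", "executionId": "abc123", "status": "ok", "duration": 234, "toolCalls": 3 }',
];
const share = toolTimeShare(sampleLines, 'abc123');
```

When tool calls run in parallel, the summed tool time can exceed the wall-clock duration, so a share above 1 is possible.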

Health Checks

Expose CodeCall health via your health endpoint:
@Tool({
  name: 'health:codecall',
  description: 'Check CodeCall plugin health',
})
class CodeCallHealthTool extends ToolContext {
  async execute() {
    const searchService = this.scope.get(ToolSearchService);
    const config = this.scope.get(CodeCallConfig);

    return {
      status: 'healthy',
      indexedTools: searchService.getTotalCount(),
      embeddingStrategy: config.get('embedding.strategy'),
      vmPreset: config.get('vm.preset'),
    };
  }
}

Alerting Recommendations

| Metric | Warning | Critical |
| --- | --- | --- |
| Execute p99 latency | > 2s | > 5s |
| Error rate | > 5% | > 15% |
| Timeout rate | > 1% | > 5% |
| Security blocks | Any | High volume |
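
The numeric thresholds translate directly into a classifier. This is a sketch with hypothetical names (`classify`, `thresholds`); the security-blocks row is left out because "High volume" has no numeric definition in the table.

```typescript
type Severity = 'ok' | 'warning' | 'critical';

interface Threshold {
  warning: number;
  critical: number;
}

// Values copied from the alerting table above.
const thresholds = {
  executeP99Seconds: { warning: 2, critical: 5 },
  errorRatePercent: { warning: 5, critical: 15 },
  timeoutRatePercent: { warning: 1, critical: 5 },
};

// Both thresholds are strict ("> 2s"), so a value equal to a threshold
// stays at the lower severity.
function classify(value: number, t: Threshold): Severity {
  if (value > t.critical) return 'critical';
  if (value > t.warning) return 'warning';
  return 'ok';
}

const p99Severity = classify(3.2, thresholds.executeP99Seconds);
```

In practice these rules usually live in the alerting system itself; the function is mainly useful for dashboards or tests that need the same classification.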

Cost Optimization

Token Savings

CodeCall dramatically reduces token usage:
| Scenario | Without CodeCall | With CodeCall | Savings |
| --- | --- | --- | --- |
| 100 tools in context | ~25,000 tokens | ~3,000 tokens | 88% |
| Multi-tool workflow (5 calls) | ~50,000 tokens | ~5,000 tokens | 90% |
| Complex filtering | ~100,000 tokens | ~8,000 tokens | 92% |
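
The savings column is the relative reduction in tokens. A short sketch of the calculation, using a hypothetical `tokenSavingsPercent` helper, so you can apply it to token counts measured in your own deployment:

```typescript
// Relative token reduction, rounded to a whole percent.
function tokenSavingsPercent(withoutCodeCall: number, withCodeCall: number): number {
  if (withoutCodeCall <= 0) throw new RangeError('withoutCodeCall must be > 0');
  return Math.round((1 - withCodeCall / withoutCodeCall) * 100);
}

const toolsInContext = tokenSavingsPercent(25_000, 3_000); // 88
const multiToolWorkflow = tokenSavingsPercent(50_000, 5_000); // 90
const complexFiltering = tokenSavingsPercent(100_000, 8_000); // 92
```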

Compute Costs

| Factor | Impact | Optimization |
| --- | --- | --- |
| VM isolation | ~10ms overhead | Use codecall:invoke for simple calls |
| Embedding inference | ~50ms/query | Use TF-IDF for fewer than 100 tools |
| Tool calls | Dominant cost | Optimize underlying tools |

Cost vs. Performance Tradeoffs

  • Use TF-IDF search
  • Enable caching for describe/search
  • Use direct invoke for simple calls
  • Increase VM timeout for complex scripts
  • Use locked_down preset (shorter timeouts)
  • Limit maxToolCalls aggressively
  • Cache aggressively
  • Use fewer instances with more resources
  • Use codecall_only mode
  • Hide all tools from list_tools
  • Return minimal data from tools
  • Let scripts filter server-side

Security in Production

Checklist

  1. Use secure or locked_down preset. Never use experimental in production.
  2. Enable audit logging. Log all script executions and security events.
  3. Configure rate limiting. Prevent abuse via aggressive rate limits.
  4. Monitor security events. Alert on validation failures and self-reference attempts.
  5. Regular security reviews. Review tool allowlists and filter rules quarterly.

Rate Limiting

import { RateLimitPlugin } from '@frontmcp/plugins';

@App({
  plugins: [
    RateLimitPlugin.init({
      rules: [
        // Strict limits on execute
        { tool: 'codecall:execute', limit: 10, window: '1m', per: 'user' },
        { tool: 'codecall:execute', limit: 100, window: '1m', per: 'global' },

        // More generous for search/describe
        { tool: 'codecall:search', limit: 100, window: '1m', per: 'user' },
        { tool: 'codecall:describe', limit: 50, window: '1m', per: 'user' },
      ],
    }),
    CodeCallPlugin.init({ ... }),
  ],
})
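
To show what a rule such as `limit: 10, window: '1m', per: 'user'` means in practice, here is a fixed-window counter. It illustrates the general technique only; RateLimitPlugin's actual algorithm is not described in this guide, and `FixedWindowLimiter` is a hypothetical class.

```typescript
// Fixed-window rate limiter sketch (not RateLimitPlugin's implementation).
class FixedWindowLimiter {
  private readonly counts = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    if (limit < 1 || windowMs <= 0) throw new RangeError('limit must be >= 1 and windowMs > 0');
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // key is the scope of the rule, e.g. a user id for per: 'user'
  // or a constant string for per: 'global'.
  tryAcquire(key: string): boolean {
    const t = this.now();
    const entry = this.counts.get(key);
    // Start a new window on first use or once the previous one has elapsed.
    if (entry === undefined || t - entry.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: t, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// Equivalent of { limit: 10, window: '1m', per: 'user' }.
const executeLimiter = new FixedWindowLimiter(10, 60_000);
const firstCallAllowed = executeLimiter.tryAcquire('user-123');
```

Fixed windows allow short bursts at window boundaries. If that matters for `codecall:execute`, a sliding-window or token-bucket variant is the usual alternative.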

Audit Logging

CodeCall provides comprehensive audit logging for compliance and security monitoring.

What Gets Logged

| Event | Data Captured | Purpose |
| --- | --- | --- |
| Script execution start | executionId, scriptHash, timestamp | Track execution lifecycle |
| Tool calls | toolName, args (sanitized), duration, result | Audit trail of actions |
| Security events | blocked construct, rule triggered | Security monitoring |
| Scoring gate results | riskLevel, signals, score | Risk assessment audit |
| Script completion | status, duration, toolCallCount | Performance tracking |

Enabling Audit Logging

import { AuditLogPlugin } from '@frontmcp/plugins';

CodeCallPlugin.init({
  audit: {
    enabled: true,

    // What to log
    logScripts: true,          // Log script content (hashed by default)
    logToolArgs: true,         // Log tool call arguments
    logResults: false,         // Don't log result data (PII concerns)

    // Redaction
    redactFields: ['password', 'token', 'apiKey', 'secret'],

    // External integration
    sink: async (event) => {
      await auditService.log(event);
    },
  },
});
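
The effect of `redactFields` can be pictured as a recursive walk over the arguments. The sketch below is an assumption about the behaviour, not CodeCall's implementation: it matches keys exactly (case-sensitive), descends into nested objects and arrays, and uses a `[REDACTED]` placeholder chosen for illustration.

```typescript
// Illustrative redaction; CodeCall's matching rules and placeholder may differ.
function redact(value: unknown, fields: ReadonlySet<string>): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item, fields));
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      // Replace the whole value of a sensitive key; recurse into everything else.
      out[key] = fields.has(key) ? '[REDACTED]' : redact(inner, fields);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}

const sensitive = new Set(['password', 'token', 'apiKey', 'secret']);
const safeArgs = redact(
  { user: 'alice', password: 'hunter2', nested: { token: 't', items: [{ apiKey: 'k', id: 1 }] } },
  sensitive,
);
```

The input object is not mutated, so the tool still receives the original arguments while only the logged copy is scrubbed.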

Audit Event Schema

interface AuditEvent {
  eventType: 'execution:start' | 'tool:call' | 'security:block' | 'execution:complete';
  timestamp: string;          // ISO 8601
  executionId: string;        // Unique execution identifier
  tenantId?: string;          // From codecallContext
  userId?: string;            // From codecallContext

  // For tool:call events
  tool?: {
    name: string;
    argsHash: string;         // SHA-256 of arguments (not raw args)
    durationMs: number;
    success: boolean;
  };

  // For security:block events
  security?: {
    rule: string;
    construct: string;
    severity: 'low' | 'medium' | 'high' | 'critical';
  };

  // For execution:complete events
  execution?: {
    status: 'ok' | 'error' | 'timeout';
    durationMs: number;
    toolCallCount: number;
    scoringResult?: {
      riskLevel: string;
      score: number;
    };
  };
}

Integration Examples

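// Datadog (DogStatsD); assumes a configured `dogstatsd` client
audit: {
  sink: async (event) => {
    // Only completion events carry execution status and duration
    if (event.eventType !== 'execution:complete' || !event.execution) return;
    dogstatsd.increment('codecall.execution', 1, {
      status: event.execution.status,
      tenant: event.tenantId || 'default',
    });
    dogstatsd.histogram('codecall.duration', event.execution.durationMs);
  },
}

// AWS CloudWatch Logs; assumes a configured `cloudwatch` client
import { CloudWatchLogsClient, PutLogEventsCommand } from '@aws-sdk/client-cloudwatch-logs';

audit: {
  sink: async (event) => {
    await cloudwatch.send(new PutLogEventsCommand({
      logGroupName: 'codecall-audit',
      logStreamName: event.tenantId || 'default',
      logEvents: [{ timestamp: Date.now(), message: JSON.stringify(event) }],
    }));
  },
}

// Custom database; assumes a `db` handle with an auditEvents collection
audit: {
  sink: async (event) => {
    await db.auditEvents.insert({
      ...event,
      createdAt: new Date(),
    });
  },
}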

Multi-Tenancy Patterns

CodeCall supports multiple isolation strategies for multi-tenant deployments.

Tenant Context

Pass tenant information via codecallContext:
{
  "tool": "codecall:execute",
  "input": {
    "script": "...",
    "context": {
      "tenantId": "acme-corp",
      "userId": "user-123",
      "permissions": ["read", "write"]
    }
  }
}

Per-Tenant Tool Filtering

Restrict tools based on tenant:
CodeCallPlugin.init({
  includeTools: (tool, context) => {
    // Filter by tenant subscription
    const tenantPlan = getTenantPlan(context.tenantId);

    if (tenantPlan === 'free' && tool.metadata?.premium) {
      return false;
    }

    // Filter by app ownership
    if (tool.metadata?.codecall?.appId) {
      return context.tenantApps?.includes(tool.metadata.codecall.appId) ?? false;
    }

    return true;
  },
});

Per-Tenant Configuration

Different security levels per tenant:
CodeCallPlugin.init({
  vm: {
    // Dynamic preset based on tenant
    preset: (context) => {
      const trust = getTenantTrustLevel(context.tenantId);
      return trust === 'enterprise' ? 'balanced' : 'secure';
    },

    // Dynamic timeout based on subscription
    timeoutMs: (context) => {
      return context.subscription === 'premium' ? 10000 : 3500;
    },
  },
});

Isolation Strategies

| Strategy | Isolation Level | Cost | Use Case |
| --- | --- | --- | --- |
| Shared pool | Low | $ | Dev/staging |
| Tenant-specific limits | Medium | $$ | SaaS standard |
| Separate worker pools | High | $$$ | Enterprise |
| Dedicated instances | Maximum | $$$$ | Compliance-heavy |

Per-Tenant Resource Quotas

CodeCallPlugin.init({
  quotas: {
    enabled: true,

    // Per-tenant limits
    maxExecutionsPerMinute: (context) => {
      const quotas = { free: 10, pro: 100, enterprise: 1000 };
      return quotas[context.subscription] || 10;
    },

    maxToolCallsPerExecution: (context) => {
      return context.subscription === 'enterprise' ? 500 : 100;
    },

    // Quota exceeded handler
    onQuotaExceeded: (tenant, quotaType) => {
      notifyTenant(tenant, `CodeCall quota exceeded: ${quotaType}`);
      metrics.increment('codecall.quota.exceeded', { tenant, quotaType });
    },
  },
});

Audit Trail Separation

Separate audit logs by tenant:
audit: {
  sink: async (event) => {
    // Route to tenant-specific log stream
    const streamName = `codecall/${event.tenantId || 'default'}`;

    await auditService.log(streamName, event);
  },
}

Troubleshooting

Common Issues

Symptoms: Frequent TIMEOUT errors
Causes:
  • Script too complex
  • Tool calls too slow
  • Timeout too aggressive
Solutions:
  1. Profile tool call latency
  2. Increase vm.timeoutMs if tools are slow
  3. Break complex scripts into smaller pieces
  4. Use Promise.all() for independent tool calls

Symptoms: Low relevance scores, wrong tools returned
Causes:
  • Poor tool descriptions
  • Threshold too low
  • TF-IDF limitations
Solutions:
  1. Improve tool descriptions
  2. Increase similarityThreshold
  3. Switch to embedding strategy for semantic matching
  4. Add more specific keywords to descriptions

Symptoms: OOM errors, pod restarts
Causes:
  • Embedding model loaded
  • Large tool index
  • Scripts returning large data
Solutions:
  1. Use TF-IDF instead of embeddings
  2. Increase memory limits
  3. Configure output sanitization limits
  4. Enable HNSW for large indexes

Symptoms: Scripts rejected that should work
Causes:
  • Using blocked constructs
  • Reserved prefix collision
  • Unicode issues
Solutions:
  1. Check for eval, Function, etc.
  2. Avoid __ag_ and __safe_ prefixes
  3. Use ASCII identifiers
  4. Review AST Guard rules

Migration & Rollback

Gradual Rollout

  1. Phase 1: Deploy with mode: 'metadata_driven'
    • All tools visible normally
    • Mark select tools for CodeCall
    • Monitor for issues
  2. Phase 2: Switch to mode: 'codecall_opt_in'
    • Tools opt into CodeCall
    • Both access methods work
    • Measure token savings
  3. Phase 3: Move to mode: 'codecall_only'
    • Hide tools from list_tools
    • Full CodeCall experience
    • Maximum token savings

Rollback Plan

// Emergency rollback: disable CodeCall
@App({
  plugins: process.env.CODECALL_ENABLED === 'false'
    ? []
    : [CodeCallPlugin.init({ ... })],
})
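Put CodeCall behind a feature flag so it can be disabled without a code change. Because the flag is read when the app module loads, restart the process after changing CODECALL_ENABLED.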
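
Flag parsing is easy to get subtly wrong during an incident (for example `False` or a trailing space). The sketch below is a more tolerant variant of the check above; `isCodeCallEnabled` is a hypothetical helper, and unlike the strict `=== 'false'` comparison it ignores case and surrounding whitespace.

```typescript
// Hypothetical helper: CodeCall stays enabled unless the flag is explicitly "false".
function isCodeCallEnabled(env: Record<string, string | undefined>): boolean {
  const raw = env['CODECALL_ENABLED'];
  if (raw === undefined) return true; // unset means enabled, as in the example above
  return raw.trim().toLowerCase() !== 'false';
}

const enabledByDefault = isCodeCallEnabled({});
const disabledForRollback = isCodeCallEnabled({ CODECALL_ENABLED: 'false' });
```

In the plugin list, the condition would become `isCodeCallEnabled(process.env) ? [CodeCallPlugin.init({ ... })] : []`.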