CodeCall: a meta-API for large toolsets

Here’s a scenario you might recognize: You built an MCP server. It started with 10 tools. Clean. Simple. Your AI agent found the right tool instantly and everything worked. Then you connected your REST API. 47 endpoints became 47 tools. You added billing, user management, and analytics. Now you’re at 150+ tools. And suddenly:
  • Your context window fills with schemas before the first query
  • Token costs explode—you’re paying to list tools the agent won’t use
  • The model picks the wrong tool or misses the right one entirely
  • Multi-step workflows require endless round-trips through the LLM
You’re not alone. This is the dirty secret of scaling MCP servers, and it’s why Anthropic published “Code execution with MCP” to solve it. But here’s what they didn’t tell you: their approach only works with Claude, requires their infrastructure, and is still in beta. CodeCall gives you the same pattern today: self-hosted, and compatible with any LLM.

The Real Cost of Tool Explosion

Let’s do the math on a real scenario. You have 100 tools. Each tool definition averages 200 tokens (name, description, input schema). That’s 20,000 tokens just to list your tools—before the agent does anything. Now the agent needs to:
  1. Find users whose first name starts with “me”
  2. Who logged in within the past 10 days
  3. Return the top 10
Without filtering on the server, here’s what happens:
| Step | Action | Tokens |
|---|---|---|
| 1 | List 100 tools | 20,000 |
| 2 | Call users:list (returns 1000 users) | 50,000+ |
| 3 | Model filters in context | 10,000 |
| 4 | Model returns top 10 | 500 |
| | Total | ~80,000 |
At $0.015 per 1K input tokens (GPT-4 pricing), that’s $1.20 for one query. But worse than cost: the model starts losing context. It drops earlier instructions. It forgets what it was doing. It hallucinates. Sound familiar?

“Just Build Better REST Endpoints”

The obvious solution: add filtering parameters to your API.
GET /users?firstName_startsWith=me&lastLogin_gt=10d&limit=10
But now you need:
  • Query parameter validation
  • Database query building
  • Pagination handling
  • Error handling for invalid filters
  • Documentation updates
  • Tests for every combination
For every possible filter. You’re not building an AI tool anymore. You’re building a query language. And tomorrow the model will need a filter you didn’t anticipate.

The Code Execution Pattern

Anthropic’s insight was simple: let the model write code. Instead of exposing 100 tools, expose a meta-API:
  • Search for relevant tools
  • Describe their schemas
  • Execute JavaScript that orchestrates them
The model writes:
const { users } = await callTool('users:list', { limit: 1000 });
const tenDaysAgo = new Date();
tenDaysAgo.setDate(tenDaysAgo.getDate() - 10);

const filtered = users.filter(u =>
  u.firstName.toLowerCase().startsWith('me') &&
  new Date(u.lastLogin) > tenDaysAgo
);

return filtered.slice(0, 10);
The filtering happens on your server, not in the LLM context. The model only sees 10 results.
| Step | Action | Tokens |
|---|---|---|
| 1 | List 4 meta-tools | 2,000 |
| 2 | Search for “users” | 500 |
| 3 | Describe users:list | 500 |
| 4 | Execute script (returns 10 users) | 1,000 |
| | Total | ~4,000 |
That’s 95% fewer tokens. For the same result.

Anthropic Gets It—Their Advanced Tool Use Proves the Pattern

Anthropic recently published “Advanced Tool Use” introducing three beta features that validate exactly what CodeCall does:
| Anthropic Feature | What It Does | CodeCall Equivalent |
|---|---|---|
| Tool Search Tool | Dynamic discovery instead of loading all tools | codecall:search with VectoriaDB |
| Programmatic Tool Calling | Execute code that orchestrates tools in a sandbox | codecall:execute with Enclave |
| Tool Use Examples | Provide concrete usage patterns | Schema + examples via codecall:describe |
Their results? 85% reduction in token usage with Tool Search. 37% fewer tokens with Programmatic Tool Calling. Accuracy improvements from 72% to 90%. CodeCall delivers the same patterns—but there’s a catch with Anthropic’s approach:

Claude Only

Tied to Anthropic’s infrastructure. Using GPT-4, Llama, or Gemini? You can’t use it.

Not Self-Hosted

Your code runs on their servers. Your data leaves your VPC.

Closed Source

You can’t audit the sandbox, customize security rules, or extend it.

Beta Only

These features are in beta—availability and pricing may change.
If you’re building production systems, you need:
  • Self-hosted infrastructure
  • Any LLM compatibility
  • Auditable security
  • Full control
That’s CodeCall.

CodeCall: Self-Hosted Code Execution for Any LLM

CodeCall is a FrontMCP plugin that brings code execution to your MCP server:

Any LLM

Works with Claude, GPT-4, Gemini, Llama, or any MCP-compatible client

Self-Hosted

Runs entirely on your infrastructure. Data never leaves your VPC.

Open Source

Audit the code. Customize the security. Extend the functionality.

Available Now

npm install @frontmcp/plugins. Deploy today.

How CodeCall Works

CodeCall collapses your entire toolset into 4 meta-tools:
| Meta-Tool | Purpose |
|---|---|
| codecall:search | Find tools by natural language query |
| codecall:describe | Get schemas for selected tools |
| codecall:execute | Run JavaScript that orchestrates tools |
| codecall:invoke | Direct single-tool calls (optional) |
[Figure: an LLM calling the CodeCall meta-API]

A typical flow: one round-trip. The LLM never sees raw tool lists. It never processes bulk data. It gets exactly what it asked for.
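Here’s what that round-trip looks like from the model’s side, step by step. The payloads and scores below are illustrative, not actual CodeCall wire formats:

// 1. codecall:search: find candidate tools by intent
//    { query: "list users filtered by last login" }
//    → [{ name: "users:list", score: 0.91 }]

// 2. codecall:describe: fetch schemas for the matches only
//    { tools: ["users:list"] }
//    → { "users:list": { input: { status?: string, limit?: number } } }

// 3. codecall:execute: run the orchestration script shown earlier
//    { script: "const { users } = await callTool('users:list', ...); ..." }
//    → the 10 filtered users; nothing else enters the context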

Bank-Grade Security (Not an Afterthought)

“But running LLM-generated code sounds terrifying.” It should. That’s why CodeCall implements defense-in-depth with battle-tested security libraries:

Layer 0: Pre-Scanner

Before the JavaScript parser even runs, AST Guard’s Pre-Scanner catches attacks that could DoS the parser itself:
  • /(a+)+$/ - Nested quantifiers ❌
  • /(a|a)+$/ - Overlapping alternation ❌
  • Catastrophic backtracking patterns ❌
  • Unicode BiDi override characters ❌
  • Right-to-Left text direction attacks ❌
  • Code that appears different than it executes ❌
  • Deep nesting attacks ((((((x)))))) ❌
  • 100MB+ input payloads ❌
  • Null byte injection ❌

Layer 1: AST Guard (Static Analysis)

After pre-scanning, AST Guard parses the JavaScript into an Abstract Syntax Tree and validates every node:
  • eval('malicious code') ❌
  • new Function('return process')() ❌
  • setTimeout, setInterval ❌
  • process.env.SECRET ❌
  • require('fs') ❌
  • global, globalThis ❌
  • obj.__proto__ = {} ❌
  • Object.prototype.hack = true ❌
  • while (true) {} ❌
  • Recursive functions ❌
  • Unbounded loops ❌
100+ attack vectors blocked before the code even runs.
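To make the idea concrete, here’s a minimal static check in the spirit of AST Guard, written against the acorn parser. It’s a sketch, not AST Guard’s actual rule set:

import { parse } from 'acorn';
import { simple } from 'acorn-walk';

// Identifiers that must never appear in a sandboxed script.
const BANNED = new Set(['eval', 'Function', 'require', 'process', 'global', 'globalThis']);

function assertSafe(source) {
  // Parse to an AST; allow top-level await since scripts get wrapped later.
  const ast = parse(source, { ecmaVersion: 2022, allowAwaitOutsideFunction: true });
  simple(ast, {
    Identifier(node) {
      if (BANNED.has(node.name)) {
        throw new Error(`Blocked identifier: ${node.name}`);
      }
    },
  });
}

assertSafe("eval('malicious code')"); // throws: Blocked identifier: eval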

Layer 2: Code Transformation

Code is rewritten for safe execution:
  • Wraps in async function __ag_main() for top-level await
  • Transforms callTool → __safe_callTool for tracking
  • Adds iteration limits to all loops
  • Whitelists only safe globals
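As an illustration, a model script like the earlier one would come out of this stage looking roughly like the following (the exact rewritten form is internal to CodeCall):

// Before: model-written script with top-level await and a bare loop.
// After (illustrative shape):
async function __ag_main() {
  let __iterations = 0; // injected loop counter
  const { users } = await __safe_callTool('users:list', { limit: 1000 }); // tracked call
  for (const u of users) {
    if (++__iterations > 5000) throw new Error('Iteration limit exceeded'); // injected limit
    // ... original loop body ...
  }
}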

Layer 3: AI Scoring Gate (NEW)

The AI Scoring Gate detects semantic attack patterns that syntactic validation can’t catch:
Detects fetch→send patterns: scripts that list sensitive data then send it externally
// ❌ BLOCKED - Exfiltration pattern detected
const users = await callTool('users:list', { fields: ['password', 'apiKey'] });
await callTool('webhooks:send', { data: users });
// Score: 100 (SENSITIVE_FIELD + EXFIL_PATTERN)
Flags excessive limits and bulk operation patterns
  • limit: 100000 → +25 risk score
  • bulkDelete, batchProcess → +20 risk score
Detects tool calls inside loops that could multiply requests
// ⚠️ WARNING - Fan-out pattern
for (const user of users) {
  await callTool('emails:send', { to: user.email });
}
8 detection rules with configurable block/warn thresholds (default: block at 70, warn at 40).
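Thresholds are tunable per deployment. Here’s a hedged sketch of what that configuration might look like; the option names (scoring, blockAt, warnAt) are illustrative, so check the plugin docs for the real keys:

CodeCallPlugin.init({
  scoring: {        // hypothetical option name for the AI Scoring Gate
    blockAt: 70,    // reject scripts scoring 70 or above
    warnAt: 40,     // attach a warning between 40 and 69
  },
});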

Layer 4: Runtime Sandbox

Enclave executes validated code in an isolated Node.js vm context:
  • Fresh context per execution (no state leakage)
  • Whitelist-only globals (Math, JSON, Array—nothing dangerous)
  • Configurable timeouts (default 3.5s)
  • Iteration limits (default 5,000)
  • Tool call caps (default 100)
  • I/O flood protection (console rate limiting)
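The underlying technique looks roughly like this minimal sketch using Node’s built-in vm module (not Enclave’s actual code; vm alone is not a full security boundary, which is exactly why the other layers exist):

import vm from 'node:vm';

// Whitelist-only globals: the script sees Math, JSON, and Array.
// No process, no require, no network.
const context = vm.createContext({ Math, JSON, Array });

// Fresh context per execution, with a hard wall-clock timeout.
const result = vm.runInContext(
  'JSON.stringify({ max: Math.max(1, 2) })',
  context,
  { timeout: 3500 }, // milliseconds, mirroring the 3.5s default above
);
console.log(result); // {"max":2}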

Layer 5: Self-Reference Guard

Scripts cannot call CodeCall tools from within scripts. This prevents:
  • Recursive execution attacks
  • Sandbox escape via nested calls
  • Resource multiplication
// Inside codecall:execute script
await callTool('codecall:execute', { script: '...' }); // ❌ BLOCKED

Layer 6: Output Sanitization

All results are cleaned before returning:
  • Stack traces removed
  • File paths scrubbed
  • Circular references handled
  • Oversized outputs truncated
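In spirit, the cleanup step is a serializer like this sketch (illustrative, not CodeCall’s actual code):

// Illustrative output sanitizer.
function sanitizeOutput(value, maxChars = 64_000) {
  const seen = new WeakSet();
  const json = JSON.stringify(value, (_key, v) => {
    if (typeof v === 'object' && v !== null) {
      if (seen.has(v)) return '[Circular]'; // circular references handled
      seen.add(v);
    }
    if (typeof v === 'string') {
      return v
        .replace(/^\s*at .*$/gm, '')        // stack-trace frames removed
        .replace(/\/[\w@./-]+/g, '[path]'); // file paths scrubbed
    }
    return v;
  });
  return json.length > maxChars
    ? json.slice(0, maxChars) + '...[truncated]' // oversized outputs truncated
    : json;
}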
Seven layers of security ensure that even if one layer is bypassed, others catch the attack.

Semantic Search with Local Embeddings

When you have 150 tools, keyword search isn’t enough. “Get user billing” should find invoices:listForUser even though the words don’t match. CodeCall uses VectoriaDB for semantic tool search:

Local Embeddings

No external API calls. No data leaves your server.

Fast

Sub-millisecond queries. Works offline.

TF-IDF Fallback

For simpler deployments, use TF-IDF with zero model loading.

HNSW for Scale

Enable HNSW indexing for 1000+ tools.
CodeCallPlugin.init({
  embedding: {
    strategy: 'tfidf',           // Fast, no model needed
    // Or: strategy: 'embedding', // Semantic search
    similarityThreshold: 0.3,
  },
});
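For example, a query with zero keyword overlap can still land on the right tool. The result shape and score below are illustrative:

// codecall:search with a natural-language query
//   { query: 'get user billing' }
//   → [{ name: 'invoices:listForUser', score: 0.84 }]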

5 Minutes to Production

npm install @frontmcp/plugins
import { App, Tool, ToolContext } from '@frontmcp/sdk';
import { CodeCallPlugin } from '@frontmcp/plugins';

@Tool({
  name: 'users:list',
  description: 'List users with optional filtering',
})
class ListUsersTool extends ToolContext {
  async execute(input: { status?: string; limit?: number }) {
    // Your existing implementation
    return { users: await db.users.find(input) };
  }
}

@App({
  id: 'my-api',
  name: 'My API',
  tools: [ListUsersTool, /* ...47 more tools */],
  plugins: [
    CodeCallPlugin.init({
      mode: 'codecall_only',      // Hide tools from list_tools
      vm: { preset: 'secure' },   // Bank-grade security
      embedding: { strategy: 'tfidf' },
    }),
  ],
})
export default class MyApp {}
Your MCP client now sees 4 tools instead of 50. Your agent uses code to orchestrate them. Your token costs drop by 90%.

The Plugin Ecosystem Advantage

CodeCall is just one plugin. FrontMCP’s plugin system lets you stack capabilities:
@App({
  plugins: [
    CodeCallPlugin.init({ mode: 'codecall_only' }),
    RateLimitPlugin.init({
      rules: [{ tool: 'codecall:execute', limit: 10, window: '1m' }]
    }),
    CachePlugin.init({
      tools: { 'codecall:describe': { ttl: 300 } }
    }),
    LoggingPlugin.init({
      format: 'json',
      includeToolCalls: true
    }),
  ],
})
Every plugin works with CodeCall. Rate limiting applies to script execution. Caching speeds up tool discovery. Logging captures the full execution trace. This composability is why FrontMCP exists: build production MCP servers without reinventing infrastructure.

Real-World Impact

88% Token Reduction

100 tools: 25K tokens → 3K tokens in context

90% Workflow Savings

Multi-tool workflows: 50K tokens → 5K tokens

~15ms Overhead

AST validation + sandbox execution
Before CodeCall:
  • Endless REST endpoint development
  • Context window overflow
  • Model confusion with large tool lists
  • High token costs for simple operations
After CodeCall:
  • Let the model write the query logic
  • Minimal context usage
  • Clean 4-tool interface
  • Pay only for results, not intermediate data

When to Use CodeCall

Use CodeCall When

  • You have 20+ tools (or anticipate growth)
  • OpenAPI adapters generate dozens of endpoints
  • Workflows require multi-tool orchestration
  • You need in-tool filtering without building REST queries
  • You want any-LLM compatibility
  • Security and compliance require audit trails

Skip CodeCall When

  • You have < 10 simple tools
  • Workflows are single-tool operations
  • You’re building a quick prototype
  • Tools already have comprehensive filtering APIs

The Future of MCP Is Code-First

Anthropic was right: the future isn’t bigger context windows or smarter models picking from longer tool lists. It’s models that write code to orchestrate tools. But that future shouldn’t be locked to one provider, one model, one cloud. CodeCall makes code execution for MCP:
  • Open source and auditable
  • Self-hosted in your VPC
  • Compatible with any LLM
  • Production-ready today
Your MCP server has 100 tools. They don’t have to break your agent anymore.

Get Started


CodeCall is part of FrontMCP, the open-source framework for building production MCP servers. Star us on GitHub to follow development.