Offline embeddings
Embeddings run locally via transformers.js, so your users’ data never leaves the server and you avoid API quotas.
Type-safe metadata
Strong generics ensure every document you index keeps the same shape as your tool metadata.
Operational guardrails
Built-in rate limits, batch validation, HNSW indexing, and storage adapters keep the index production-ready.
What you’ll build
- A typed document shape for every tool, app, or resource you want to search
- An indexing routine that stays in sync with `toolRegistry.getTools(true)`
- Semantic queries with metadata filters, score thresholds, and pagination controls
- Persistent caches (file or Redis) so restarts do not require re-embedding everything
- Tunable HNSW search for large inventories
The default `Xenova/all-MiniLM-L6-v2` model is ~22 MB. The first initialization downloads and caches it under `cacheDir`; subsequent boots reuse the local copy.

Prerequisites
- Node.js 22 or later (Node 24 is recommended and is what FrontMCP tests against)
- An existing FrontMCP server with at least one app and tool registry
- Ability to install npm packages in the server workspace
- Optional: writable disk or Redis if you plan to persist embeddings between restarts
Step 1: Install & initialize VectoriaDB
Install the package alongside your server (for example, `npm install vectoriadb@^2`). FrontMCP 0.5 depends on `vectoriadb@^2.0.1`; use the `^2` range to stay compatible with the SDK and automatically pick up 2.x optimizations and fixes.

`initialize()` must run before `add`, `search`, or `update`. Calling it twice is safe because VectoriaDB short-circuits if it is already ready.
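A minimal sketch of the initialization step, assuming the package exports a `VectoriaDB` class from `vectoriadb` and that its constructor accepts the options documented under Complete Configuration Options below; the exact shape may differ:

```typescript
import { VectoriaDB } from 'vectoriadb'; // import path assumed

const toolIndex = new VectoriaDB({
  modelName: 'Xenova/all-MiniLM-L6-v2', // default model, ~22 MB download
  cacheDir: './.cache/transformers',    // where the model files are cached
});

// Must run before add/search/update; calling it twice is safe because
// VectoriaDB short-circuits if it is already ready.
await toolIndex.initialize();
```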
Step 2: Index your tools
Collect metadata from the tool registry (apps, plugins, adapters, or scopes) and write it into the database. Each document needs a unique `id`, the natural-language text you want to vectorize, and metadata that extends `DocumentMetadata`.

`addMany` validates every document, enforces `maxBatchSize`, and prevents duplicates. Use it after deployments or whenever your tool inventory changes.
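A sketch of the indexing routine, assuming documents take the `{ id, text, metadata }` shape described above; `ToolMetadata` and the fields read from each tool are illustrative, not part of the SDK:

```typescript
import type { DocumentMetadata } from 'vectoriadb'; // type export assumed

// Hypothetical metadata shape; extending DocumentMetadata lets the generics
// keep every indexed document in the same shape as your tool metadata.
interface ToolMetadata extends DocumentMetadata {
  appId: string;
  scopes: string[];
}

const tools = toolRegistry.getTools(true);

await toolIndex.addMany(
  tools.map((tool) => ({
    id: tool.name, // must be unique per document
    text: `${tool.name}: ${tool.description}`, // natural-language text to vectorize
    metadata: { appId: tool.appId, scopes: tool.scopes } satisfies ToolMetadata,
  })),
);
```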
Step 3: Run semantic search
Query the index anywhere you can run async code (for example, inside a custom MCP tool that recommends next actions). `search` returns the best matches sorted by cosine similarity. Use `filter` to enforce authorization, `includeVector` to inspect raw vectors, and `threshold` to drop low-confidence hits.
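A sketch of a query using the options named above; whether `filter` takes a predicate or a match object, and the exact result field names, are assumptions:

```typescript
const results = await toolIndex.search('create a support ticket', {
  topK: 5,        // overrides defaultTopK for this query
  threshold: 0.5, // drop low-confidence hits
  filter: (meta: ToolMetadata) => meta.scopes.includes('support'), // authorization
  includeVector: false, // set true to inspect raw vectors
});

// Results arrive sorted by cosine similarity, best match first.
for (const hit of results) {
  console.log(hit.id, hit.score); // field names assumed
}
```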
Persist embeddings between restarts
Avoid re-indexing on every boot by swapping the default in-memory adapter for the provided file or Redis adapters plus a deterministic tools hash. The `toolsHash` automatically invalidates the cache when your tool list or descriptions change. Call `saveToStorage()` after indexing; `initialize()` transparently loads the cache on the next boot.
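A sketch of the persistence flow. `FileStorageAdapter`, its options, and how the tools hash is wired in are assumptions; the documented pieces are `saveToStorage()`, cache loading inside `initialize()`, and hash-based invalidation:

```typescript
import { createHash } from 'node:crypto';
import { VectoriaDB, FileStorageAdapter } from 'vectoriadb'; // adapter name assumed

// Deterministic hash over names and descriptions, so any change to the
// tool inventory produces a new cache key and invalidates the old one.
const toolsHash = createHash('sha256')
  .update(JSON.stringify(tools.map((t) => [t.name, t.description])))
  .digest('hex');

const toolIndex = new VectoriaDB({
  storage: new FileStorageAdapter({
    path: `./.cache/vectoria-${toolsHash}.json`, // option name assumed
  }),
});

await toolIndex.initialize(); // transparently loads the cache when present
// ...index as in Step 2, then persist:
await toolIndex.saveToStorage();
```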
Need a shared cache across pods? Swap in `RedisStorageAdapter` with your preferred Redis client and namespace. TTLs and key prefixes are configurable per adapter (see the Redis Adapter sketch under Storage Adapters below).

Scale & tune search
- Enable `useHNSW` for datasets above roughly ten thousand documents. HNSW provides sub-millisecond queries with more than 95% recall.
- Adjust `threshold` and `topK` per query to trade recall for precision.
- Guard resource usage with `maxDocuments`, `maxDocumentSize`, and `maxBatchSize` (VectoriaDB enforces these automatically).
- Set a custom `cacheDir` if your runtime has strict filesystem policies.
Complete Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| modelName | string | 'Xenova/all-MiniLM-L6-v2' | Embedding model to use |
| cacheDir | string | './.cache/transformers' | Model cache directory |
| dimensions | number | Auto-detected | Vector dimensions |
| defaultSimilarityThreshold | number | 0.3 | Minimum similarity score |
| defaultTopK | number | 10 | Default results limit |
| useHNSW | boolean | false | Enable HNSW index |
| maxDocuments | number | 100000 | Max documents (DoS protection) |
| maxDocumentSize | number | 1000000 | Max document size in chars |
| maxBatchSize | number | 1000 | Max batch operation size |
| verboseErrors | boolean | true | Enable detailed errors |
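As a sketch, the full options object with the documented defaults spelled out (`dimensions` is omitted because it is auto-detected; the constructor shape is assumed):

```typescript
const db = new VectoriaDB({
  modelName: 'Xenova/all-MiniLM-L6-v2',
  cacheDir: './.cache/transformers',
  defaultSimilarityThreshold: 0.3, // minimum similarity score
  defaultTopK: 10,                 // default results limit
  useHNSW: false,                  // enable for large datasets
  maxDocuments: 100_000,           // DoS protection
  maxDocumentSize: 1_000_000,      // max document size in chars
  maxBatchSize: 1_000,             // max batch operation size
  verboseErrors: true,             // detailed error messages
});
```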
HNSW Configuration
| Option | Default | Description |
|---|---|---|
| M | 16 | Connections per node in layers > 0 (higher = better recall) |
| M0 | 32 | Connections for layer 0 (typically M * 2) |
| efConstruction | 200 | Candidate list size during construction |
| efSearch | 50 | Candidate list size during search |
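A sketch of HNSW tuning; whether these parameters nest under an `hnsw` key or sit at the top level of the config is an assumption:

```typescript
const db = new VectoriaDB({
  useHNSW: true,
  hnsw: {
    M: 16,               // connections per node in layers > 0 (higher = better recall)
    M0: 32,              // connections for layer 0, typically M * 2
    efConstruction: 200, // candidate list size while building the index
    efSearch: 50,        // candidate list size at query time
  },
});
```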
TF-IDF Variant (Zero Dependencies)
For scenarios where ML model downloads aren’t acceptable, use the TF-IDF variant; a sketch follows the comparison table below.

When to Use TF-IDF vs ML Embeddings
| Feature | TFIDFVectoria | VectoriaDB |
|---|---|---|
| Dependencies | Zero | transformers.js (~22MB model) |
| Initialization | Synchronous | Async (model download) |
| Semantic understanding | Keyword-based | Full semantic |
| Best for | Small corpora (under 10K docs) | Any size |
| Reindex required | Yes, after changes | No |
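A minimal TF-IDF sketch; `TFIDFVectoria` is the documented class, while the synchronous method signatures are assumptions based on the synchronous-initialization row above:

```typescript
import { TFIDFVectoria } from 'vectoriadb'; // import path assumed

const index = new TFIDFVectoria(); // synchronous: no model download, no initialize()

index.addMany([
  { id: 'billing.refund', text: 'Issue a refund for a customer order' },
  { id: 'auth.reset', text: 'Reset a user password' },
]); // keyword-based; reindex after any document changes

const hits = index.search('refund an order', { topK: 3 });
```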
Storage Adapters
File Adapter
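A sketch, assuming a `FileStorageAdapter` export whose name mirrors `RedisStorageAdapter`:

```typescript
import { VectoriaDB, FileStorageAdapter } from 'vectoriadb'; // adapter name assumed

const db = new VectoriaDB({
  storage: new FileStorageAdapter({ path: './.cache/vectoria-index.json' }),
});

await db.initialize();    // loads any cache already on disk
await db.saveToStorage(); // writes current embeddings back to the file
```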
Redis Adapter
For multi-pod environments, use Redis to share embeddings:
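A sketch using the node-redis client; the adapter's option names (`client`, `keyPrefix`, `ttl`) are assumptions, while configurable TTLs and key prefixes are documented above:

```typescript
import { createClient } from 'redis';
import { VectoriaDB, RedisStorageAdapter } from 'vectoriadb';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

const db = new VectoriaDB({
  storage: new RedisStorageAdapter({
    client: redis,
    keyPrefix: 'vectoria:tools:', // namespace shared across pods
    ttl: 86_400,                  // seconds; option names assumed
  }),
});

await db.initialize();
```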
Memory Adapter (Default)

No persistence: embeddings are lost on restart.

Handle errors and monitor health
All errors extend `VectoriaError` and ship with machine-readable `code` values so you can branch on them.
Error Types
| Error | Code | Description |
|---|---|---|
| VectoriaNotInitializedError | NOT_INITIALIZED | Call initialize() first |
| DocumentValidationError | DOCUMENT_VALIDATION_ERROR | Invalid document data |
| DocumentNotFoundError | DOCUMENT_NOT_FOUND | Document ID doesn’t exist |
| DocumentExistsError | DOCUMENT_EXISTS | Document ID already exists |
| DuplicateDocumentError | DUPLICATE_DOCUMENT | Duplicate in batch |
| QueryValidationError | QUERY_VALIDATION_ERROR | Invalid search query |
| EmbeddingError | EMBEDDING_ERROR | Model embedding failed |
| StorageError | STORAGE_ERROR | Storage operation failed |
| ConfigurationError | CONFIGURATION_ERROR | Invalid config |
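A sketch of branching on the documented `code` values; only `VectoriaError` and the codes in the table above are taken from the docs:

```typescript
import { VectoriaError } from 'vectoriadb';

try {
  await toolIndex.search('summarize open incidents');
} catch (err) {
  if (!(err instanceof VectoriaError)) throw err;

  switch (err.code) {
    case 'NOT_INITIALIZED':
      await toolIndex.initialize(); // recover by initializing lazily
      break;
    case 'QUERY_VALIDATION_ERROR':
      console.warn('Rejected query:', err.message);
      break;
    default:
      throw err; // surface storage/embedding/config failures
  }
}
```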
Use `toolIndex.getStats()` to feed dashboards or health endpoints:
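A sketch of a health payload; `getStats()` and `size()` are documented, while the shape of the returned stats object is an assumption:

```typescript
export function vectorIndexHealth() {
  return {
    documents: toolIndex.size(), // current document count
    stats: toolIndex.getStats(), // embedding/index statistics for dashboards
  };
}
```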
Pair stats with `toolIndex.size()`, `toolIndex.clear()`, and `toolIndex.clearStorage()` to expose maintenance commands or admin tooling.
