Learn the fundamentals of document indexing in VectoriaDB.
VectoriaDB uses embedding vectors to enable semantic search. Each document’s text is converted to a vector representation that captures its meaning.

How Indexing Works

  1. Text Input: You provide a document with text and metadata
  2. Embedding Generation: VectoriaDB generates a vector embedding from the text
  3. Storage: The embedding is stored in memory (and optionally persisted)
  4. Searchable: The document becomes searchable via semantic queries
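
The four steps above can be sketched with a toy in-memory index. This is not VectoriaDB's implementation: `toyEmbed` is a hypothetical stand-in that hashes words into a small count vector instead of running an embedding model, and `ToyIndex` is an illustrative class, not part of the library. It only shows how text becomes a vector, is stored, and is then ranked by cosine similarity.

```typescript
// Toy illustration only: a hashed bag-of-words "embedding", not a real model.
type ToyMetadata = { id: string };

interface IndexEntry<M extends ToyMetadata> {
  id: string;
  vector: number[];
  metadata: M;
}

const DIMENSIONS = 64;

// Step 2 stand-in: hash each word into a bucket, count hits, then L2-normalize.
function toyEmbed(text: string): number[] {
  const vector = new Array<number>(DIMENSIONS).fill(0);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let hash = 0;
    for (const ch of word) hash = (hash * 31 + ch.charCodeAt(0)) % DIMENSIONS;
    vector[hash] += 1;
  }
  const norm = Math.hypot(...vector) || 1;
  return vector.map((v) => v / norm);
}

class ToyIndex<M extends ToyMetadata> {
  private entries = new Map<string, IndexEntry<M>>();

  // Steps 1-3: accept text and metadata, embed the text, keep the vector in memory.
  add(id: string, text: string, metadata: M): void {
    this.entries.set(id, { id, vector: toyEmbed(text), metadata });
  }

  // Step 4: rank stored documents by cosine similarity to the query vector.
  // Vectors are already unit length, so the dot product is the cosine.
  search(query: string, topK = 3): { id: string; score: number }[] {
    const q = toyEmbed(query);
    return Array.from(this.entries.values())
      .map((e) => ({ id: e.id, score: e.vector.reduce((sum, v, i) => sum + v * q[i], 0) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  }
}

const index = new ToyIndex<ToyMetadata>();
index.add('users:list', 'List all users in the system', { id: 'users:list' });
index.add('billing:refund', 'Refund a customer payment', { id: 'billing:refund' });
console.log(index.search('list users', 1)[0].id); // prints "users:list"
```

A real embedding model captures meaning rather than word overlap, so it would also match queries that share no words with the document; the toy version only matches on shared (or colliding) word hashes.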

Document Structure

Each document requires three pieces: a unique ID, the text to embed, and a metadata object.
src/document-structure.ts
await db.add(
  'document-id',          // Unique identifier
  'Document text here',   // Text to embed
  {                       // Type-safe metadata
    id: 'document-id',
    // ... your custom fields
  }
);

ID Requirements

  • Must be unique within the database
  • Used to retrieve, update, or remove documents
  • Must match metadata.id

Text Guidelines

  • Descriptive, natural language text works best
  • Include relevant keywords and context
  • Maximum size, in characters, is set by the maxDocumentSize config option
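
One way to follow these guidelines is to compose the text from structured fields before indexing. The sketch below is illustrative only: `ToolInfo` and `buildEmbeddingText` are hypothetical names, and the length check simply compares `text.length` against a limit you pass in.

```typescript
// Hypothetical shape for the thing being indexed; adjust to your own data.
interface ToolInfo {
  toolName: string;
  description: string;
  tags: string[];
}

// Compose one natural-language passage so the embedding sees both a
// description and the relevant keywords.
function buildEmbeddingText(tool: ToolInfo, maxChars: number): string {
  const parts = [`${tool.toolName}: ${tool.description.trim()}`];
  if (tool.tags.length > 0) {
    parts.push(`Related topics: ${tool.tags.join(', ')}.`);
  }
  const text = parts.join(' ');
  if (text.length > maxChars) {
    throw new RangeError(`Text is ${text.length} characters; limit is ${maxChars}`);
  }
  return text;
}

const embeddingText = buildEmbeddingText(
  {
    toolName: 'users:list',
    description: 'List all users, with pagination and role filters.',
    tags: ['user', 'admin'],
  },
  1_000_000, // assumed limit; use whatever maxDocumentSize you configure
);
console.log(embeddingText);
```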

Metadata

  • Must extend the DocumentMetadata interface
  • The id field is required and must match the document ID
  • Add any custom fields you need for filtering and display
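
Because the document ID appears twice (as the first argument and as metadata.id), the two can drift apart. The helpers below are a hypothetical sketch of how you might guard against that in your own code before calling the database; `MetadataWithId`, `withId`, `assertMatchingId`, and `findDuplicateIds` are illustrative names, not library APIs.

```typescript
// Minimal stand-in for the library's DocumentMetadata; only `id` is assumed here.
interface MetadataWithId {
  id: string;
}

// Derive metadata.id from the document ID so the two cannot disagree.
function withId<T extends object>(id: string, fields: T): T & MetadataWithId {
  return { ...fields, id };
}

// Fail fast when a hand-written metadata object carries the wrong id.
function assertMatchingId<M extends MetadataWithId>(id: string, metadata: M): M {
  if (metadata.id !== id) {
    throw new Error(`metadata.id "${metadata.id}" does not match document ID "${id}"`);
  }
  return metadata;
}

// Report IDs that occur more than once in a list about to be indexed.
function findDuplicateIds(ids: string[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const id of ids) {
    if (seen.has(id)) duplicates.add(id);
    seen.add(id);
  }
  return Array.from(duplicates);
}

const metadata = withId('users:list', { owner: 'system', tags: ['user'] });
assertMatchingId('users:list', metadata); // passes
console.log(findDuplicateIds(['a', 'b', 'a'])); // prints [ 'a' ]
```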

Type-Safe Metadata

Define your metadata interface for compile-time safety:
src/types.ts
import { VectoriaDB, DocumentMetadata } from 'vectoriadb';

interface ToolDocument extends DocumentMetadata {
  toolName: string;
  owner: string;
  tags: string[];
  risk: 'safe' | 'destructive';
  deprecated?: boolean;
}

const db = new VectoriaDB<ToolDocument>();

// TypeScript ensures metadata matches interface
await db.add('id', 'text', {
  id: 'id',
  toolName: 'test',
  owner: 'system',
  tags: [],
  risk: 'safe',
  // A missing, mistyped, or unknown field is a compile-time error
});

Embedding Generation

Embeddings are generated automatically whenever you add or update a document:
  1. The text is tokenized using the configured model
  2. The embedding is computed, at roughly 100-200 documents per second
  3. The embedding is stored in memory (and optionally persisted)
For large imports, use addMany and choose a maxBatchSize that keeps each batch small enough to avoid memory spikes.
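
A large import can be split into batches before it reaches the database. The sketch below is a generic chunking helper, not library code: `chunk` and `importInBatches` are hypothetical names, and `addBatch` is a callback you supply (for example, one that wraps db.addMany; its exact argument shape is not shown on this page, so it is left to the caller).

```typescript
// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hand the chunks to a caller-supplied function one at a time, so only
// one batch is in flight at any moment.
async function importInBatches<T>(
  items: readonly T[],
  batchSize: number,
  addBatch: (batch: T[]) => Promise<void>,
): Promise<number> {
  let imported = 0;
  for (const batch of chunk(items, batchSize)) {
    await addBatch(batch);
    imported += batch.length;
  }
  return imported;
}

const ids = Array.from({ length: 2500 }, (_, i) => `doc-${i}`);
console.log(chunk(ids, 1000).map((c) => c.length)); // prints [ 1000, 1000, 500 ]
```

At the throughput quoted above, a 50,000-document import would take roughly 250 to 500 seconds (50,000 / 200 and 50,000 / 100), so reporting progress between batches may be worthwhile.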

Document Limits

VectoriaDB enforces configurable limits to guard against resource exhaustion and denial-of-service (DoS) attacks:
src/config-limits.ts
const db = new VectoriaDB({
  maxDocuments: 100000,     // Maximum documents in index
  maxDocumentSize: 1000000, // Maximum text size in characters
  maxBatchSize: 1000,       // Maximum documents per batch operation
});
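
If you want to reject oversized input before it reaches the database, the same three limits can be checked up front. This is a hypothetical pre-flight check, not how VectoriaDB enforces or reports its limits: `Limits`, `PendingDocument`, and `checkBatch` are illustrative names, every document in the batch is assumed to be new, and "characters" is assumed to mean `text.length`.

```typescript
// Mirrors the three options from the config example above; not the library's own type.
interface Limits {
  maxDocuments: number;
  maxDocumentSize: number;
  maxBatchSize: number;
}

interface PendingDocument {
  id: string;
  text: string;
}

// Return human-readable violations; an empty list means the batch fits.
function checkBatch(batch: PendingDocument[], currentCount: number, limits: Limits): string[] {
  const problems: string[] = [];
  if (batch.length > limits.maxBatchSize) {
    problems.push(`batch has ${batch.length} documents; maxBatchSize is ${limits.maxBatchSize}`);
  }
  if (currentCount + batch.length > limits.maxDocuments) {
    problems.push(`index would hold ${currentCount + batch.length} documents; maxDocuments is ${limits.maxDocuments}`);
  }
  for (const doc of batch) {
    if (doc.text.length > limits.maxDocumentSize) {
      problems.push(`document "${doc.id}" has ${doc.text.length} characters; maxDocumentSize is ${limits.maxDocumentSize}`);
    }
  }
  return problems;
}

const limits: Limits = { maxDocuments: 100000, maxDocumentSize: 1000000, maxBatchSize: 1000 };
console.log(checkBatch([{ id: 'a', text: 'hello' }], 0, limits)); // prints []
```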

Adding Documents

Add single and batch documents

Updating Documents

Update metadata and text

Search

Query the index