Vector Search MCP - Documentation

A comprehensive Model Context Protocol (MCP) server for vector similarity search operations with pluggable backend support.

🔍 Overview

This package provides a production-ready MCP server that exposes semantic search through a unified interface. It is designed to support multiple vector database backends (Qdrant today, Cosmos DB planned) while maintaining type safety and comprehensive test coverage.

Key Features

  • 🔌 Pluggable Backends: Abstract engine interface for easy backend integration
  • 🛡️ Type Safety: Full generic typing with Rust-like associated types pattern
  • Performance: Caching and async/await throughout
  • 🧪 Well Tested: 62 tests with 100% critical path coverage
  • 📚 Comprehensive Docs: Detailed docstrings and examples

Supported Backends

  • Qdrant: Fully implemented with async client
  • Cosmos DB: 🚧 Planned (interface ready)

🏗️ Architecture

Core Components

graph TB
    A[MCP Server] --> B[BaseEngine Abstract Class]
    B --> C[QdrantEngine]
    B --> D[CosmosEngine - Future]
    C --> E[Qdrant AsyncClient]
    F[Factory with Overloads] --> B
    G[Generic Type System] --> B

Design Patterns

1. Abstract Factory with Overloaded Types

# Type checker knows exact return type for literals
engine = get_engine(Backend.QDRANT)  # Returns: QdrantEngine

# Generic typing for variables
backend: Backend = some_variable
engine = get_engine(backend)  # Returns: BaseEngine

2. Generic Interface (Rust-like Associated Types)

class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    # ResponseType: Backend-specific raw response (e.g., list[ScoredPoint])
    # ConditionType: Backend-specific filter type (e.g., models.Filter)
    ...

class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    # Concrete implementation with Qdrant types
    ...

3. Template Method Pattern

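async def semantic_search(self, embedding, collection, limit=10,
                          conditions=None, threshold=None):
    """Public interface that orchestrates the workflow (simplified)."""
    backend_conditions = self.transform_conditions(conditions)  # Abstract
    response = await self.run_similarity_query(                 # Abstract
        embedding, collection, limit, backend_conditions, threshold
    )
    return self.transform_response(response)                    # Abstract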
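
The three patterns above can be combined in a small, self-contained sketch. It is not the package's implementation: `InMemoryEngine`, its dict-based response format, the string filter, and the precomputed scores are hypothetical stand-ins, and `SearchRow` is reduced to a plain dataclass. The point is only to show how a generic base class can fix the workflow while a subclass supplies the backend-specific steps.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, Optional, TypeVar

ResponseType = TypeVar("ResponseType")
ConditionType = TypeVar("ConditionType")


@dataclass
class SearchRow:
    """Simplified stand-in for the package's SearchRow model."""
    chunk_id: str
    score: float
    payload: dict[str, Any]


class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    async def semantic_search(self, embedding, collection, limit=10, conditions=None):
        # Template method: the workflow is fixed, the three steps are abstract.
        backend_conditions = self.transform_conditions(conditions)
        response = await self.run_similarity_query(
            embedding, collection, limit, backend_conditions
        )
        return self.transform_response(response)

    @abstractmethod
    def transform_conditions(self, conditions) -> Optional[ConditionType]: ...

    @abstractmethod
    def transform_response(self, response: ResponseType) -> list[SearchRow]: ...

    @abstractmethod
    async def run_similarity_query(
        self, embedding, collection, limit, conditions
    ) -> ResponseType: ...


class InMemoryEngine(BaseEngine[list[dict], str]):
    """Hypothetical backend: raw hits are dicts, the filter is a category name."""

    def __init__(self, hits: list[dict]):
        self._hits = hits

    def transform_conditions(self, conditions):
        # In this toy backend a "condition" is just a required category.
        return conditions[0] if conditions else None

    def transform_response(self, response):
        return [SearchRow(h["id"], h["score"], h["payload"]) for h in response]

    async def run_similarity_query(self, embedding, collection, limit, conditions):
        # Scores are precomputed here, so the embedding is not used.
        hits = [
            h for h in self._hits
            if conditions is None or h["payload"]["category"] == conditions
        ]
        return sorted(hits, key=lambda h: h["score"], reverse=True)[:limit]


engine = InMemoryEngine([
    {"id": "a", "score": 0.90, "payload": {"category": "tech"}},
    {"id": "b", "score": 0.95, "payload": {"category": "news"}},
    {"id": "c", "score": 0.50, "payload": {"category": "tech"}},
])
rows = asyncio.run(
    engine.semantic_search([0.1, 0.2], "docs", limit=1, conditions=["tech"])
)
```

Callers only ever touch `semantic_search()`; swapping the backend means swapping the subclass, not the calling code.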

📖 API Documentation

Main Entry Points

run(transport: Transport = "sse")

Starts the MCP server with specified transport protocol.

Parameters:

  • transport: Either "sse" (Server-Sent Events) or "stdio"

Example:

from vector_search_mcp import run
run("sse")  # Start server

get_engine(backend: Backend) -> BaseEngine

Factory function creating cached engine instances.

Parameters:

  • backend: Backend enum value (Backend.QDRANT, Backend.COSMOS)

Returns:

  • Typed engine instance (QdrantEngine for QDRANT)

Example:

import asyncio

from vector_search_mcp.engine import get_engine, Backend


async def main():
    engine = get_engine(Backend.QDRANT)
    return await engine.semantic_search(
        embedding=[0.1, 0.2, 0.3],
        collection="documents",
        limit=10
    )


results = asyncio.run(main())

Core Classes

BaseEngine[ResponseType, ConditionType]

Abstract base class defining the engine interface.

Generic Parameters:

  • ResponseType: Backend's native response format
  • ConditionType: Backend's native filter format

Key Methods:

  • semantic_search(): Main public interface
  • transform_conditions(): Convert generic to backend conditions
  • transform_response(): Convert backend to generic results
  • run_similarity_query(): Execute backend-specific search

QdrantEngine(BaseEngine[list[ScoredPoint], Filter])

Concrete Qdrant implementation.

Features:

  • Async Qdrant client with connection pooling
  • Automatic payload filtering (excludes null payloads)
  • Support for Match, MatchAny, MatchExclude conditions
  • Named vector support

Data Models

SearchRow

Standardized search result format.

SearchRow(
    chunk_id="doc_123",           # Document identifier
    score=0.95,                   # Similarity score (0.0-1.0)
    payload={"text": "...", ...}  # Metadata dictionary
)

Condition Types

Match - Exact field matching

Match(key="category", value="technology")

MatchAny - Match any of provided values

MatchAny(key="tags", any=["python", "rust", "go"])

MatchExclude - Exclude specified values

MatchExclude(key="status", exclude=["draft", "deleted"])
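
To make the semantics of the three condition types concrete, here is a minimal sketch that evaluates them against a payload dictionary in plain Python. The dataclasses are simplified stand-ins for the package's models, `payload_matches` is a hypothetical helper, and two things are assumptions rather than documented behavior: that multiple conditions combine with logical AND, and that payload values are scalars. In the real package the filtering is delegated to the backend.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Match:
    key: str
    value: Any


@dataclass(frozen=True)
class MatchAny:
    key: str
    any: list


@dataclass(frozen=True)
class MatchExclude:
    key: str
    exclude: list


Condition = Union[Match, MatchAny, MatchExclude]


def payload_matches(payload: dict, conditions: list) -> bool:
    """Return True when the payload satisfies every condition (assumed AND)."""
    for cond in conditions:
        value = payload.get(cond.key)
        if isinstance(cond, Match) and value != cond.value:
            return False
        if isinstance(cond, MatchAny) and value not in cond.any:
            return False
        if isinstance(cond, MatchExclude) and value in cond.exclude:
            return False
    return True


conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "deleted"]),
]
doc = {"category": "technology", "language": "rust", "status": "published"}
is_hit = payload_matches(doc, conditions)
```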

🛡️ Type Safety

Generic Type System

The package uses generic types so that a static type checker (for example mypy) can verify backend-specific types before the code runs, while the engine interface itself stays flexible:

# Engine implementations specify their exact types
class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    def transform_response(self, response: list[models.ScoredPoint]) -> list[SearchRow]:
        # Type checker validates response parameter type
        ...

    async def run_similarity_query(self, embedding, collection, limit=10,
                                   conditions=None, threshold=None) -> list[models.ScoredPoint]:
        # Type checker validates return type matches generic parameter
        ...

Factory Type Overloads

@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...

@overload
def get_engine(backend: Backend) -> BaseEngine: ...

# Usage provides different type information:
engine1 = get_engine(Backend.QDRANT)      # Type: QdrantEngine
engine2 = get_engine(some_variable)       # Type: BaseEngine
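
A runnable sketch of how such an overloaded, cached factory could be put together is shown below. The class bodies are empty stand-ins, and the private `_create_engine` helper with `functools.cache` is an assumption about how caching might be done, not the package's actual code. Keeping the cache on a helper leaves the overloaded signature of `get_engine` undecorated.

```python
from enum import Enum
from functools import cache
from typing import Literal, overload


class Backend(Enum):
    QDRANT = "qdrant"
    COSMOS = "cosmos"


class BaseEngine:
    """Stand-in for the abstract engine interface."""


class QdrantEngine(BaseEngine):
    """Stand-in for the Qdrant implementation."""


@cache
def _create_engine(backend: Backend) -> BaseEngine:
    # Hypothetical helper: one engine instance per backend value.
    if backend is Backend.QDRANT:
        return QdrantEngine()
    raise NotImplementedError(f"{backend.name} backend is not implemented yet")


@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...
@overload
def get_engine(backend: Backend) -> BaseEngine: ...


def get_engine(backend: Backend) -> BaseEngine:
    # The overloads only affect static typing; this body runs for every call.
    return _create_engine(backend)


first = get_engine(Backend.QDRANT)
second = get_engine(Backend.QDRANT)
```

Both calls return the same object because the helper caches by enum member; a failed construction raises and is not cached.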

🧪 Testing

Test Coverage

  • 62 Tests Total across 4 test modules
  • 100% Critical Path Coverage for search workflows
  • Integration Testing with full mock environments
  • Type Safety Validation with runtime checks

Test Structure

tests/test_engine/
├── test_base_engine.py      # Abstract interface tests (12 tests)
├── test_qdrant_engine.py    # Qdrant implementation (20 tests)
├── test_factory.py          # Factory and typing tests (17 tests)
├── test_integration.py      # End-to-end workflows (13 tests)
├── conftest.py              # Shared fixtures and mocks
└── README.md                # Testing documentation

Running Tests

# Run all engine tests
uv run pytest tests/test_engine/ -v

# Run with coverage
uv run pytest tests/test_engine/ --cov=src/vector_search_mcp/engine --cov-report=html

# Run specific test categories
uv run pytest tests/test_engine/test_integration.py -v

Key Testing Features

  • Cache Management: Auto-clearing fixtures prevent test interference
  • Mock Isolation: Comprehensive mocking prevents real network calls
  • Async Testing: Full async/await support with proper event loops
  • Type Validation: Runtime checks for generic type correctness
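
The cache-clearing and mocking ideas above can be sketched with the standard library alone. Everything here is illustrative: `Engine`, the zero-argument `get_engine`, and `run_isolated` are hypothetical, and in the real suite the cache reset would more likely live in an autouse pytest fixture in `conftest.py`.

```python
import asyncio
from functools import cache
from unittest.mock import AsyncMock, MagicMock


class Engine:
    """Hypothetical engine that delegates to an injected async client."""

    def __init__(self, client=None):
        self.client = client

    async def semantic_search(self, embedding, collection, limit=10):
        return await self.client.search(collection, embedding, limit)


@cache
def get_engine() -> Engine:
    # Real client construction is omitted; tests inject a mock instead.
    return Engine()


def run_isolated(test):
    """Clear the factory cache before and after a test, like an autouse fixture."""
    get_engine.cache_clear()
    try:
        return test()
    finally:
        get_engine.cache_clear()


def test_search_uses_mocked_client():
    engine = get_engine()
    # Replace the client so no network call can happen.
    engine.client = MagicMock()
    engine.client.search = AsyncMock(return_value=[{"id": "doc_123", "score": 0.95}])

    result = asyncio.run(engine.semantic_search([0.1, 0.2], "documents", limit=5))

    engine.client.search.assert_awaited_once_with("documents", [0.1, 0.2], 5)
    return result


result = run_isolated(test_search_uses_mocked_client)
```

Because the cache is cleared around each test, a mock attached to the engine in one test cannot leak into the next.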

🛠️ Development

Prerequisites

# Install with uv
uv sync

# Or with pip
pip install -e .

Code Quality

The package maintains high code quality standards:

# Linting and formatting
uv run ruff check          # Check for issues
uv run ruff check --fix    # Auto-fix issues
uv run ruff format         # Format code

# Type checking
uv run mypy src/

# Run tests
uv run pytest

Adding New Backends

  1. Define Types: Determine ResponseType and ConditionType for your backend
  2. Implement Engine: Create class extending BaseEngine[ResponseType, ConditionType]
  3. Add to Factory: Update Backend enum and get_engine() function
  4. Write Tests: Follow existing test patterns
  5. Update Documentation: Add examples and API docs

Example template:

class MyEngine(BaseEngine[MyResponseType, MyConditionType]):
    def transform_conditions(self, conditions: list[Condition] | None) -> MyConditionType | None:
        # Convert generic conditions to backend format
        ...

    def transform_response(self, response: MyResponseType) -> list[SearchRow]:
        # Convert backend response to SearchRow objects
        ...

    async def run_similarity_query(self, embedding, collection, limit=10,
                                   conditions=None, threshold=None) -> MyResponseType:
        # Execute backend-specific search
        ...

💡 Examples

Basic Usage

import asyncio

from vector_search_mcp.engine import get_engine, Backend
# Condition types are used in the Advanced Filtering example
from vector_search_mcp.models import Match, MatchAny, MatchExclude


async def main():
    # Create engine
    engine = get_engine(Backend.QDRANT)

    # Simple search
    results = await engine.semantic_search(
        embedding=[0.1, 0.2, 0.3, 0.4, 0.5],
        collection="documents",
        limit=10
    )

    for result in results:
        print(f"Score: {result.score:.3f} - {result.payload['text'][:50]}...")


asyncio.run(main())

Advanced Filtering

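# Complex conditions. Run this inside an async function, with `engine` created
# as in Basic Usage and `query_vector` holding your query embedding.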
conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "archived"])
]

results = await engine.semantic_search(
    embedding=query_vector,
    collection="tech_docs",
    limit=20,
    conditions=conditions,
    threshold=0.75  # Minimum similarity score
)
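
How `limit` and `threshold` interact can be illustrated with a small in-memory ranking. This is a sketch under assumptions: the real scoring happens inside the backend, the metric depends on how the collection is configured (cosine similarity is assumed here), and treating the threshold as inclusive is also an assumption. `cosine_similarity` and `top_matches` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query, vectors, limit, threshold=None):
    """Score every vector, drop those below threshold, return the best `limit`."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    if threshold is not None:
        scored = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


vectors = {
    "a": [1.0, 0.0],  # same direction as the query
    "b": [0.6, 0.8],  # partially aligned with the query
    "c": [0.0, 1.0],  # orthogonal to the query
}
query = [1.0, 0.0]

loose = [doc_id for doc_id, _ in top_matches(query, vectors, limit=10, threshold=0.5)]
strict = [doc_id for doc_id, _ in top_matches(query, vectors, limit=10, threshold=0.75)]
```

With a high threshold the search can return fewer rows than `limit`, so callers should not assume a fixed result count.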

Custom Backend Implementation

from vector_search_mcp.engine.base_engine import BaseEngine
from vector_search_mcp.models import Condition, Match, SearchRow

class CustomEngine(BaseEngine[dict, str]):
    """Example custom backend implementation."""

    def transform_conditions(self, conditions: list[Condition] | None) -> str | None:
        if not conditions:
            return None
        # Convert to a custom query-string format. Only Match conditions carry
        # a `value`; MatchAny and MatchExclude (which expose `any` and `exclude`)
        # are skipped here and would need their own syntax.
        return " AND ".join(
            f"{c.key}:{c.value}" for c in conditions if isinstance(c, Match)
        ) or None

    def transform_response(self, response: dict) -> list[SearchRow]:
        # Convert custom response to SearchRow objects
        return [
            SearchRow(
                chunk_id=str(item['id']),
                score=item['similarity'],
                payload=item['metadata']
            )
            for item in response.get('results', [])
        ]

    async def run_similarity_query(self, embedding, collection, limit=10,
                                 conditions=None, threshold=None) -> dict:
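        # Custom backend API call. `self.custom_client` is a hypothetical async
        # client that the engine would create elsewhere (for example in __init__).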
        return await self.custom_client.search(
            vector=embedding,
            index=collection,
            limit=limit,
            filter=conditions,
            min_score=threshold
        )

MCP Server Integration

# Start the MCP server
from vector_search_mcp import run

# With Server-Sent Events (web-based clients)
run("sse")

# With stdio (terminal/CLI clients)
run("stdio")

📚 Additional Resources

  • Source Code: Fully documented with comprehensive docstrings
  • Test Suite: Located in tests/test_engine/ with detailed README
  • Type Definitions: All public APIs have complete type annotations
  • Examples: See examples/ directory (if available) for more use cases

This documentation covers the current state of the Vector Search MCP package. The architecture is designed for extensibility, type safety, and production use.