
Vector Search MCP - Documentation

A comprehensive Model Context Protocol (MCP) server for vector similarity search operations with pluggable backend support.

🔍 Overview

This package provides a production-ready MCP server that enables semantic search capabilities through a unified interface. It supports multiple vector database backends while maintaining type safety and comprehensive test coverage.

Key Features

  • 🔌 Pluggable Backends: Abstract engine interface for easy backend integration
  • 🛡️ Type Safety: Full generic typing with Rust-like associated types pattern
  • Performance: Cached engine instances and async/await throughout
  • 🧪 Well Tested: 62 tests with 100% critical path coverage
  • 📚 Comprehensive Docs: Detailed docstrings and examples

Supported Backends

  • Qdrant: Fully implemented with async client
  • Cosmos DB: 🚧 Planned (interface ready)

🏗️ Architecture

Core Components

graph TB
    A[MCP Server] --> B[BaseEngine Abstract Class]
    B --> C[QdrantEngine]
    B --> D[CosmosEngine - Future]
    C --> E[Qdrant AsyncClient]
    F[Factory with Overloads] --> B
    G[Generic Type System] --> B

Design Patterns

1. Abstract Factory with Overloaded Types

# Type checker knows exact return type for literals
engine = get_engine(Backend.QDRANT)  # Returns: QdrantEngine

# Generic typing for variables
backend: Backend = some_variable
engine = get_engine(backend)  # Returns: BaseEngine

2. Generic Interface (Rust-like Associated Types)

class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    # ResponseType: Backend-specific raw response (e.g., list[ScoredPoint])
    # ConditionType: Backend-specific filter type (e.g., models.Filter)
    ...

class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    # Concrete implementation with Qdrant types
    ...

3. Template Method Pattern

async def semantic_search(self, ...):
    """Public interface orchestrates the workflow"""
    conditions = self.transform_conditions(...)  # Abstract
    response = await self.run_similarity_query(...)  # Abstract
    return self.transform_response(response)  # Abstract
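
The generic interface and the template method fit together as in the following minimal sketch. It is not the package's source: the parameter lists are assumed, and `InMemoryEngine` is a hypothetical toy backend that returns canned data so the workflow can run without a vector database.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, TypeVar

ResponseType = TypeVar("ResponseType")
ConditionType = TypeVar("ConditionType")


@dataclass
class SearchRow:
    chunk_id: str
    score: float
    payload: dict[str, Any]


class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    async def semantic_search(self, embedding, collection, limit=10, conditions=None):
        # Template method: the order of the steps is fixed here,
        # while each step is supplied by the concrete subclass.
        native = self.transform_conditions(conditions)
        response = await self.run_similarity_query(embedding, collection, limit, native)
        return self.transform_response(response)

    @abstractmethod
    def transform_conditions(self, conditions) -> ConditionType | None: ...

    @abstractmethod
    async def run_similarity_query(self, embedding, collection, limit, conditions) -> ResponseType: ...

    @abstractmethod
    def transform_response(self, response: ResponseType) -> list[SearchRow]: ...


class InMemoryEngine(BaseEngine[list[tuple[str, float]], str]):
    """Hypothetical backend: responses are (id, score) pairs, filters are strings."""

    def transform_conditions(self, conditions):
        return ",".join(conditions) if conditions else None

    async def run_similarity_query(self, embedding, collection, limit, conditions):
        hits = [("a", 0.9), ("b", 0.8), ("c", 0.7)]  # canned data, no real search
        return hits[:limit]

    def transform_response(self, response):
        return [SearchRow(chunk_id=cid, score=score, payload={}) for cid, score in response]


rows = asyncio.run(InMemoryEngine().semantic_search([0.1, 0.2], "docs", limit=2))
```

Callers only ever see `list[SearchRow]`; the backend-specific response and filter types stay inside the subclass.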

📖 API Documentation

Main Entry Points

run(transport: Transport = "sse")

Starts the MCP server with specified transport protocol.

Parameters:

  • transport: Either "sse" (Server-Sent Events) or "stdio"

Example:

from vector_search_mcp import run
run("sse")  # Start server

get_engine(backend: Backend) -> BaseEngine

Factory function creating cached engine instances.

Parameters:

  • backend: Backend enum value (Backend.QDRANT, Backend.COSMOS)

Returns:

  • Typed engine instance (QdrantEngine for QDRANT)

Example:

from vector_search_mcp.engine import get_engine, Backend

engine = get_engine(Backend.QDRANT)
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3],
    collection="documents",
    limit=10
)

Core Classes

BaseEngine[ResponseType, ConditionType]

Abstract base class defining the engine interface.

Generic Parameters:

  • ResponseType: Backend's native response format
  • ConditionType: Backend's native filter format

Key Methods:

  • semantic_search(): Main public interface
  • transform_conditions(): Convert generic to backend conditions
  • transform_response(): Convert backend to generic results
  • run_similarity_query(): Execute backend-specific search

QdrantEngine(BaseEngine[list[ScoredPoint], Filter])

Concrete Qdrant implementation.

Features:

  • Async Qdrant client with connection pooling
  • Automatic payload filtering (excludes null payloads)
  • Support for Match, MatchAny, MatchExclude conditions
  • Named vector support
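
To illustrate the condition support listed above, here is a sketch that maps the three generic conditions onto the JSON form of a Qdrant filter (`must` clauses with `value`, `any`, and `except` matchers). The dataclasses are stand-ins that borrow field names from this document's examples, `to_qdrant_filter` is a hypothetical helper, and the choice to map `MatchExclude` to `except` is an assumption. The real `QdrantEngine` works with `models.Filter` objects and may differ.

```python
from dataclasses import dataclass
from typing import Any


# Stand-ins for the package's condition models (assumed shapes).
@dataclass
class Match:
    key: str
    value: Any


@dataclass
class MatchAny:
    key: str
    any: list


@dataclass
class MatchExclude:
    key: str
    exclude: list


def to_qdrant_filter(conditions):
    """Build a Qdrant-style filter dict in which every condition must hold."""
    if not conditions:
        return None
    must = []
    for c in conditions:
        if isinstance(c, Match):
            match = {"value": c.value}
        elif isinstance(c, MatchAny):
            match = {"any": list(c.any)}
        elif isinstance(c, MatchExclude):
            match = {"except": list(c.exclude)}
        else:
            raise TypeError(f"unsupported condition: {c!r}")
        must.append({"key": c.key, "match": match})
    return {"must": must}


qdrant_filter = to_qdrant_filter([
    Match(key="category", value="technology"),
    MatchAny(key="tags", any=["python", "go"]),
    MatchExclude(key="status", exclude=["draft"]),
])
```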

Data Models

SearchRow

Standardized search result format.

SearchRow(
    chunk_id="doc_123",           # Document identifier
    score=0.95,                   # Similarity score (0.0-1.0)
    payload={"text": "...", ...}  # Metadata dictionary
)

Condition Types

Match - Exact field matching

Match(key="category", value="technology")

MatchAny - Match any of provided values

MatchAny(key="tags", any=["python", "rust", "go"])

MatchExclude - Exclude specified values

MatchExclude(key="status", exclude=["draft", "deleted"])
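
The intended semantics of the three condition types can be sketched as a plain-Python predicate over a payload dictionary. This is an illustration only: the dataclasses mirror the constructors shown above, `matches` and `matches_all` are hypothetical helpers, and combining conditions with a logical AND is an assumption. The actual filtering happens inside the backend.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Match:
    key: str
    value: Any


@dataclass(frozen=True)
class MatchAny:
    key: str
    any: tuple


@dataclass(frozen=True)
class MatchExclude:
    key: str
    exclude: tuple


def matches(payload: dict, condition) -> bool:
    """Return True if a single condition holds for the payload."""
    field = payload.get(condition.key)
    if isinstance(condition, Match):
        return field == condition.value
    if isinstance(condition, MatchAny):
        return field in condition.any
    if isinstance(condition, MatchExclude):
        return field not in condition.exclude
    raise TypeError(f"unsupported condition: {condition!r}")


def matches_all(payload: dict, conditions) -> bool:
    # Assumption: all conditions must hold (logical AND).
    return all(matches(payload, c) for c in conditions)


conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=("python", "rust", "go")),
    MatchExclude(key="status", exclude=("draft", "deleted")),
]
published = {"category": "technology", "language": "rust", "status": "published"}
draft = {"category": "technology", "language": "rust", "status": "draft"}
```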

🛡️ Type Safety

Generic Type System

The package uses a generic type system that lets static type checkers such as mypy catch type errors before the code runs, while keeping the engine interface flexible:

# Engine implementations specify their exact types
class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    def transform_response(self, response: list[models.ScoredPoint]) -> list[SearchRow]:
        # Type checker validates response parameter type
        ...

    async def run_similarity_query(...) -> list[models.ScoredPoint]:
        # Type checker validates return type matches generic parameter
        ...

Factory Type Overloads

@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...

@overload
def get_engine(backend: Backend) -> BaseEngine: ...

# Usage provides different type information:
engine1 = get_engine(Backend.QDRANT)      # Type: QdrantEngine
engine2 = get_engine(some_variable)       # Type: BaseEngine
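
One way the overloads and the instance caching could be combined is sketched below. The class bodies are empty placeholders, `_build` is a hypothetical private helper, and raising `NotImplementedError` for the planned Cosmos DB backend is an assumption rather than documented behaviour.

```python
from __future__ import annotations

from enum import Enum
from functools import cache
from typing import Literal, overload


class Backend(Enum):
    QDRANT = "qdrant"
    COSMOS = "cosmos"


class BaseEngine: ...


class QdrantEngine(BaseEngine): ...


@cache
def _build(backend: Backend) -> BaseEngine:
    # Runs once per backend; later calls return the cached instance.
    if backend is Backend.QDRANT:
        return QdrantEngine()
    raise NotImplementedError(f"{backend.name} backend is not implemented yet")


@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...
@overload
def get_engine(backend: Backend) -> BaseEngine: ...


def get_engine(backend: Backend) -> BaseEngine:
    # Single runtime implementation; the overloads above exist only for type checkers.
    return _build(backend)


engine = get_engine(Backend.QDRANT)
```

Keeping the cache on a private helper leaves the overloaded signatures undecorated, so a type checker sees exactly the two declared overloads.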

🧪 Testing

Test Coverage

  • 62 Tests Total across 4 test modules
  • 100% Critical Path Coverage for search workflows
  • Integration Testing with full mock environments
  • Type Safety Validation with runtime checks

Test Structure

tests/test_engine/
├── test_base_engine.py      # Abstract interface tests (12 tests)
├── test_qdrant_engine.py    # Qdrant implementation (20 tests)
├── test_factory.py          # Factory and typing tests (17 tests)
├── test_integration.py      # End-to-end workflows (13 tests)
├── conftest.py              # Shared fixtures and mocks
└── README.md                # Testing documentation

Running Tests

# Run all engine tests
uv run pytest tests/test_engine/ -v

# Run with coverage
uv run pytest tests/test_engine/ --cov=src/vector_search_mcp/engine --cov-report=html

# Run specific test categories
uv run pytest tests/test_engine/test_integration.py -v

Key Testing Features

  • Cache Management: Auto-clearing fixtures prevent test interference
  • Mock Isolation: Comprehensive mocking prevents real network calls
  • Async Testing: Full async/await support with proper event loops
  • Type Validation: Runtime checks for generic type correctness
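
The cache-management and mock-isolation ideas above can be sketched with the standard library alone. `Engine`, `get_engine`, and `run_isolated` are hypothetical stand-ins; in the real suite the cache reset would more likely live in a pytest fixture declared with `autouse=True` in `conftest.py`.

```python
import asyncio
from functools import cache
from unittest.mock import AsyncMock


class Engine:
    def __init__(self, client):
        self.client = client

    async def search(self, vector, limit=10):
        return await self.client.search(vector=vector, limit=limit)


@cache
def get_engine():
    # AsyncMock stands in for a network client, so no real calls are made.
    return Engine(client=AsyncMock())


def run_isolated(test):
    """Mimic an auto-clearing fixture: reset the factory cache around a test."""
    get_engine.cache_clear()
    try:
        return test()
    finally:
        get_engine.cache_clear()


def check_search_uses_mock():
    engine = get_engine()
    engine.client.search.return_value = [{"id": "doc_1", "score": 0.9}]
    result = asyncio.run(engine.search([0.1, 0.2], limit=1))
    engine.client.search.assert_awaited_once_with(vector=[0.1, 0.2], limit=1)
    return engine, result


first_engine, first_result = run_isolated(check_search_uses_mock)
second_engine, _ = run_isolated(check_search_uses_mock)
```

Because the cache is cleared between runs, the second run gets a fresh engine and a fresh mock, so call counts from the first run cannot leak into it.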

🛠️ Development

Prerequisites

# Install dependencies with uv
uv sync

# Or with pip
pip install -e .

Code Quality

The package maintains high code quality standards:

# Linting and formatting
uv run ruff check          # Check for issues
uv run ruff check --fix    # Auto-fix issues
uv run ruff format         # Format code

# Type checking
uv run mypy src/

# Run tests
uv run pytest

Adding New Backends

  1. Define Types: Determine ResponseType and ConditionType for your backend
  2. Implement Engine: Create class extending BaseEngine[ResponseType, ConditionType]
  3. Add to Factory: Update Backend enum and get_engine() function
  4. Write Tests: Follow existing test patterns
  5. Update Documentation: Add examples and API docs

Example template:

class MyEngine(BaseEngine[MyResponseType, MyConditionType]):
    def transform_conditions(self, conditions: list[Condition] | None) -> MyConditionType | None:
        # Convert generic conditions to backend format
        ...

    def transform_response(self, response: MyResponseType) -> list[SearchRow]:
        # Convert backend response to SearchRow objects
        ...

    async def run_similarity_query(...) -> MyResponseType:
        # Execute backend-specific search
        ...

💡 Examples

Basic Usage

from vector_search_mcp.engine import get_engine, Backend
from vector_search_mcp.models import Match, MatchAny

# Create engine
engine = get_engine(Backend.QDRANT)

# Simple search (await requires an async function or an async REPL)
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3, 0.4, 0.5],
    collection="documents",
    limit=10
)

for result in results:
    print(f"Score: {result.score:.3f} - {result.payload['text'][:50]}...")

Advanced Filtering

from vector_search_mcp.models import Match, MatchAny, MatchExclude

# Complex conditions
conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "archived"])
]

results = await engine.semantic_search(
    embedding=query_vector,
    collection="tech_docs",
    limit=20,
    conditions=conditions,
    threshold=0.75  # Minimum similarity score
)
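
How `limit` and `threshold` interact can be shown with a brute-force search over a few in-memory vectors. This sketch assumes cosine similarity and treats `threshold` as an inclusive lower bound on the score; the metric and exact cutoff behaviour of a real backend depend on how the collection is configured. `cosine` and `search` are hypothetical helpers.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, vectors, limit=10, threshold=None):
    scored = [(chunk_id, cosine(query, vec)) for chunk_id, vec in vectors.items()]
    if threshold is not None:
        # Drop weak matches before truncating to the requested size.
        scored = [(cid, score) for cid, score in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


vectors = {"a": [1.0, 0.0], "b": [0.6, 0.8], "c": [0.0, 1.0]}
hits = search([1.0, 0.0], vectors, limit=2, threshold=0.5)
```

With a threshold set, a search can return fewer than `limit` results, or none at all, when too few vectors score high enough.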

Custom Backend Implementation

from vector_search_mcp.engine.base_engine import BaseEngine
from vector_search_mcp.models import SearchRow, Condition

class CustomEngine(BaseEngine[dict, str]):
    """Example custom backend implementation."""

    def transform_conditions(self, conditions: list[Condition] | None) -> str | None:
        if not conditions:
            return None
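        # Convert to custom query string format.
        # Only Match exposes `value`; MatchAny and MatchExclude carry `any` and
        # `exclude` instead and would need their own branches here.
        return " AND ".join(f"{c.key}:{c.value}" for c in conditions)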

    def transform_response(self, response: dict) -> list[SearchRow]:
        # Convert custom response to SearchRow objects
        return [
            SearchRow(
                chunk_id=str(item['id']),
                score=item['similarity'],
                payload=item['metadata']
            )
            for item in response.get('results', [])
        ]

    async def run_similarity_query(self, embedding, collection, limit=10,
                                   conditions=None, threshold=None) -> dict:
        # Custom backend API call; self.custom_client is a placeholder for
        # whatever client object your backend provides
        return await self.custom_client.search(
            vector=embedding,
            index=collection,
            limit=limit,
            filter=conditions,
            min_score=threshold
        )

MCP Server Integration

# Start the MCP server with one transport per process
from vector_search_mcp import run

# With Server-Sent Events (web-based clients)
run("sse")

# Or with stdio (terminal/CLI clients)
# run("stdio")

📚 Additional Resources

  • Source Code: Fully documented with comprehensive docstrings
  • Test Suite: Located in tests/test_engine/ with detailed README
  • Type Definitions: All public APIs have complete type annotations
  • Examples: See examples/ directory (if available) for more use cases

This documentation covers the current state of the Vector Search MCP package. The architecture is designed for extensibility, type safety, and production use.
