
# Go LLM Gateway

## Overview

A lightweight LLM proxy gateway written in Go that provides a unified API interface for multiple LLM providers. Similar to LiteLLM, but built natively in Go using each provider's official SDK.

## Purpose

Simplify LLM integration by exposing a single, consistent API that routes requests to different providers:
- **Google Generative AI** (Gemini)
- **Anthropic** (Claude)
- **OpenAI** (GPT models)

Instead of managing multiple SDK integrations in your application, call one endpoint and let the gateway handle provider-specific implementations.

## Architecture

```
Client Request
    ↓
Go LLM Gateway (unified API)
    ↓
├─→ Google Gen AI SDK
├─→ Anthropic SDK
└─→ OpenAI SDK
```
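The diagram above can be read as a small dispatch layer. The following is a minimal sketch of that idea, not the gateway's actual code: the `Provider` interface, the `Registry` type, and the `echoProvider` stand-in are hypothetical names introduced for illustration, and a real provider would call the official SDK instead of echoing.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Provider is a hypothetical abstraction over one upstream LLM SDK.
type Provider interface {
	Name() string
	Generate(ctx context.Context, model, prompt string) (string, error)
}

// echoProvider is a stand-in used only in this sketch; a real
// implementation would call the provider's official Go SDK here.
type echoProvider struct{ name string }

func (p echoProvider) Name() string { return p.name }

func (p echoProvider) Generate(_ context.Context, model, prompt string) (string, error) {
	return fmt.Sprintf("[%s/%s] %s", p.name, model, prompt), nil
}

// Registry maps provider names to implementations.
type Registry map[string]Provider

// ErrUnknownProvider is returned when no provider is registered under a name.
var ErrUnknownProvider = errors.New("unknown provider")

// Route forwards the request to the named provider.
func (r Registry) Route(ctx context.Context, provider, model, prompt string) (string, error) {
	p, ok := r[provider]
	if !ok {
		return "", fmt.Errorf("%w: %q", ErrUnknownProvider, provider)
	}
	return p.Generate(ctx, model, prompt)
}

func main() {
	reg := Registry{
		"openai":    echoProvider{name: "openai"},
		"anthropic": echoProvider{name: "anthropic"},
		"google":    echoProvider{name: "google"},
	}
	out, err := reg.Route(context.Background(), "anthropic", "claude-3-5-sonnet-20241022", "Hello!")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Keeping the interface this narrow means client-facing handlers depend only on `Provider`, so adding a backend is a matter of registering one more implementation.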

## Key Features

- **Single API interface** for multiple LLM providers
- **Native Go SDKs** for optimal performance and type safety
- **Provider abstraction** - switch providers without changing client code
- **Lightweight** - minimal overhead, fast routing
- **Easy configuration** - manage API keys and provider settings centrally

## Use Cases

- Applications that need multi-provider LLM support
- Cost optimization (route to cheapest provider for specific tasks)
- Failover and redundancy (fallback to alternative providers)
- A/B testing across different models
- Centralized LLM access for microservices
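The failover use case can be illustrated with a short sketch. This is one possible shape under stated assumptions, not the gateway's implementation: `callFunc`, `withFallback`, `alwaysFail`, and `echo` are hypothetical names, and each `callFunc` stands for "send this prompt to one provider". It uses `errors.Join`, which requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// callFunc is a hypothetical signature for sending a prompt to one provider.
type callFunc func(prompt string) (string, error)

// withFallback tries each provider in order and returns the first success.
// If every call fails, the individual errors are joined into one.
func withFallback(prompt string, calls ...callFunc) (string, error) {
	if len(calls) == 0 {
		return "", errors.New("no providers configured")
	}
	var errs []error
	for _, call := range calls {
		out, err := call(prompt)
		if err == nil {
			return out, nil
		}
		errs = append(errs, err)
	}
	return "", errors.Join(errs...)
}

// alwaysFail simulates a provider outage.
func alwaysFail(string) (string, error) {
	return "", errors.New("primary provider unavailable")
}

// echo simulates a healthy provider.
func echo(prompt string) (string, error) {
	return "echo: " + prompt, nil
}

func main() {
	out, err := withFallback("Hello!", alwaysFail, echo)
	if err != nil {
		fmt.Println("all providers failed:", err)
		return
	}
	fmt.Println(out)
}
```

The same ordering hook could serve cost-based routing by sorting the candidates by price before calling `withFallback`.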

## 🎉 Status: **WORKING!**

✅ **All three providers integrated with official Go SDKs:**
- OpenAI → `github.com/openai/openai-go`
- Anthropic → `github.com/anthropics/anthropic-sdk-go`
- Google → `google.golang.org/genai`

- ✅ **Compiles successfully** (36 MB binary)
- ✅ **Provider auto-selection** (gpt→OpenAI, claude→Anthropic, gemini→Google)
- ✅ **Configuration system** (YAML with env var support)
- ✅ **Streaming support** (Server-Sent Events for all providers)
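For readers unfamiliar with Server-Sent Events, here is a minimal sketch of how text chunks might be written as an SSE stream using only the standard library. It is an illustration, not the gateway's handler: `writeSSE` and `collectSSE` are hypothetical names, the events are plain `data:` lines rather than the typed events an Open Responses stream would carry, and each chunk is assumed to be a single line (for example, one JSON-encoded object).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeSSE streams each chunk as one Server-Sent Event and flushes it
// immediately so the client sees output as soon as it is produced.
func writeSSE(w http.ResponseWriter, chunks <-chan string) error {
	flusher, ok := w.(http.Flusher)
	if !ok {
		return fmt.Errorf("streaming unsupported by this ResponseWriter")
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	for chunk := range chunks {
		// An SSE event is one or more "data:" lines ended by a blank line.
		if _, err := fmt.Fprintf(w, "data: %s\n\n", chunk); err != nil {
			return err
		}
		flusher.Flush()
	}
	return nil
}

// collectSSE is a demo helper: it runs writeSSE against an in-memory
// recorder and returns the raw bytes a client would receive.
func collectSSE(chunks ...string) (string, error) {
	ch := make(chan string, len(chunks))
	for _, c := range chunks {
		ch <- c
	}
	close(ch)
	rec := httptest.NewRecorder()
	if err := writeSSE(rec, ch); err != nil {
		return "", err
	}
	return rec.Body.String(), nil
}

func main() {
	body, err := collectSSE("Hello", "world")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(body)
}
```

On the client side, `curl -N` disables output buffering so each event is shown as it arrives.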

## Quick Start

```bash
# 1. Set API keys
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export GOOGLE_API_KEY="your-key"

# 2. Build
cd go-llm-gateway
go build -o gateway ./cmd/gateway

# 3. Run
./gateway

# 4. Test (non-streaming)
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Hello!"}]
      }
    ]
  }'

# 5. Test streaming
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "stream": true,
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Write a haiku about Go"}]
      }
    ]
  }'
```

## API Standard

This gateway implements the **[Open Responses](https://www.openresponses.org)** specification, an open-source, multi-provider API standard for LLM interfaces based on OpenAI's Responses API.

**Why Open Responses:**
- **Multi-provider by default** - one schema that maps cleanly across providers
- **Agentic workflow support** - consistent streaming events, tool invocation patterns, and "items" as atomic units
- **Extensible** - stable core with room for provider-specific features

By following the Open Responses spec, this gateway ensures:
- Interoperability across different LLM providers
- Standard request/response formats (messages, tool calls, streaming)
- Compatibility with existing Open Responses tooling and ecosystem

For full specification details, see: **https://www.openresponses.org**
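As a rough illustration of the request format, the JSON bodies used in this README's `curl` examples can be modeled with a few Go structs. This sketch covers only the fields that appear in those examples (`model`, `provider`, `stream`, `input`), not the full specification. The type names and the `parseRequest` helper are assumptions made for the example, not the gateway's actual `internal/api` types.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ContentPart is one piece of message content, e.g. {"type": "input_text"}.
type ContentPart struct {
	Type string `json:"type"`
	Text string `json:"text"`
}

// InputItem is one message in the "input" array.
type InputItem struct {
	Role    string        `json:"role"`
	Content []ContentPart `json:"content"`
}

// ResponseRequest models only the fields shown in this README's examples.
type ResponseRequest struct {
	Model    string      `json:"model"`
	Provider string      `json:"provider,omitempty"`
	Stream   bool        `json:"stream,omitempty"`
	Input    []InputItem `json:"input"`
}

// parseRequest decodes a request body and applies basic validation.
func parseRequest(body []byte) (ResponseRequest, error) {
	var req ResponseRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, fmt.Errorf("decode request: %w", err)
	}
	if req.Model == "" {
		return req, errors.New("model is required")
	}
	if len(req.Input) == 0 {
		return req, errors.New("input must contain at least one item")
	}
	return req, nil
}

func main() {
	body := []byte(`{
		"model": "gpt-4o-mini",
		"input": [
			{"role": "user", "content": [{"type": "input_text", "text": "Hello!"}]}
		]
	}`)
	req, err := parseRequest(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Model, req.Input[0].Role, req.Input[0].Content[0].Text)
}
```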

## Tech Stack

- **Language:** Go
- **API Specification:** [Open Responses](https://www.openresponses.org)
- **SDKs:**
  - `google.golang.org/genai` (Google Generative AI)
  - `github.com/anthropics/anthropic-sdk-go` (Anthropic)
  - `github.com/openai/openai-go` (OpenAI)
- **Transport:** RESTful HTTP (potentially gRPC in the future)

## Status

🚧 **In Development** - The three providers are integrated, but routing policies, structured logging, and metrics are not implemented yet.

## Getting Started

1. **Copy the example config** and fill in provider API keys:

   ```bash
   cp config.example.yaml config.yaml
   ```

   You can also override API keys via environment variables (`GOOGLE_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`).

2. **Run the gateway**, passing the path to your config file:

   ```bash
   go run ./cmd/gateway --config config.yaml
   ```

   The server listens on the address configured under `server.address` (defaults to `:8080`).

3. **Call the Open Responses endpoint**:

   ```bash
   curl -X POST http://localhost:8080/v1/responses \
     -H 'Content-Type: application/json' \
     -d '{
           "model": "gpt-4o-mini",
           "input": [
             {"role": "user", "content": [{"type": "input_text", "text": "Hello!"}]}
           ]
         }'
   ```

   Include `"provider": "anthropic"` (or `google`, `openai`) to pin a provider; otherwise the gateway infers it from the model name.
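The provider selection described in step 3 could look roughly like the sketch below. It assumes simple prefix matching on the model name (gpt→OpenAI, claude→Anthropic, gemini→Google) and that an explicit `provider` value always wins. The function names `inferProvider` and `resolveProvider` are hypothetical, and the gateway's real matching rules may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// knownProviders lists the provider names this README mentions.
var knownProviders = map[string]bool{"openai": true, "anthropic": true, "google": true}

// inferProvider guesses the provider from the model name prefix.
func inferProvider(model string) (string, bool) {
	m := strings.ToLower(strings.TrimSpace(model))
	switch {
	case strings.HasPrefix(m, "gpt"):
		return "openai", true
	case strings.HasPrefix(m, "claude"):
		return "anthropic", true
	case strings.HasPrefix(m, "gemini"):
		return "google", true
	}
	return "", false
}

// resolveProvider prefers an explicitly pinned provider and falls back
// to inference from the model name.
func resolveProvider(explicit, model string) (string, error) {
	if explicit != "" {
		name := strings.ToLower(explicit)
		if !knownProviders[name] {
			return "", fmt.Errorf("unknown provider %q", explicit)
		}
		return name, nil
	}
	if name, ok := inferProvider(model); ok {
		return name, nil
	}
	return "", fmt.Errorf("cannot infer provider for model %q; set \"provider\" explicitly", model)
}

func main() {
	for _, model := range []string{"gpt-4o-mini", "claude-3-5-sonnet-20241022", "gemini-1.5-pro", "llama-3"} {
		name, err := resolveProvider("", model)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", model, name)
	}
}
```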

## Project Structure

- `cmd/gateway`: Entry point that loads configuration, wires providers, and starts the HTTP server.
- `internal/config`: YAML configuration loader with environment overrides for API keys.
- `internal/api`: Open Responses request/response types and validation helpers.
- `internal/server`: HTTP handlers that expose `/v1/responses`.
- `internal/providers`: Provider abstractions plus provider-specific scaffolding in `google`, `anthropic`, and `openai` subpackages.
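To make the `internal/config` behavior more concrete, here is a hedged sketch of the "defaults plus environment overrides" step, leaving out YAML parsing to stay within the standard library. The `Config` and `ProviderConfig` shapes and the `finalize` function are assumptions for illustration. Only the `:8080` default and the three environment variable names come from this README.

```go
package main

import (
	"fmt"
	"os"
)

// ProviderConfig holds per-provider settings (hypothetical shape).
type ProviderConfig struct {
	APIKey string
}

// Config is a simplified stand-in for the values loaded from config.yaml.
type Config struct {
	ServerAddress string
	Providers     map[string]ProviderConfig
}

// envKeys maps provider names to the environment variables named in this README.
var envKeys = map[string]string{
	"google":    "GOOGLE_API_KEY",
	"anthropic": "ANTHROPIC_API_KEY",
	"openai":    "OPENAI_API_KEY",
}

// finalize applies the default listen address and lets environment
// variables override any API keys read from the file. getenv is injected
// so the function can be exercised without touching the real environment.
func finalize(cfg Config, getenv func(string) string) Config {
	if cfg.ServerAddress == "" {
		cfg.ServerAddress = ":8080"
	}
	merged := make(map[string]ProviderConfig, len(envKeys))
	for name, pc := range cfg.Providers {
		merged[name] = pc
	}
	for name, env := range envKeys {
		if v := getenv(env); v != "" {
			pc := merged[name]
			pc.APIKey = v
			merged[name] = pc
		}
	}
	cfg.Providers = merged
	return cfg
}

func main() {
	cfg := finalize(Config{}, os.Getenv)
	fmt.Println("listening on", cfg.ServerAddress)
	for name := range cfg.Providers {
		// Print only the provider name; never log the key itself.
		fmt.Println("API key configured for", name)
	}
}
```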

## Next Steps

- Implement the actual SDK calls inside each provider using the official Go clients.
- Support streaming responses and tool invocation per the broader Open Responses spec.
- Add structured logging, tracing, and request-level metrics.
- Expand configuration to support routing policies (cost, latency, failover, etc.).