# Go LLM Gateway

## Overview
A lightweight LLM proxy gateway written in Go that provides a unified API interface for multiple LLM providers. Similar to LiteLLM, but built natively in Go using each provider's official SDK.
## Purpose
Simplify LLM integration by exposing a single, consistent API that routes requests to different providers:
- OpenAI (GPT models)
- Azure OpenAI (Azure-deployed models)
- Anthropic (Claude)
- Google Generative AI (Gemini)
Instead of managing multiple SDK integrations in your application, call one endpoint and let the gateway handle provider-specific implementations.
## Architecture

```
Client Request
      ↓
Go LLM Gateway (unified API)
      ↓
  ├─→ OpenAI SDK
  ├─→ Azure OpenAI (OpenAI SDK + Azure auth)
  ├─→ Anthropic SDK
  └─→ Google Gen AI SDK
```
## Key Features
- Single API interface for multiple LLM providers
- Native Go SDKs for optimal performance and type safety
- Provider abstraction - switch providers without changing client code
- Lightweight - minimal overhead, fast routing
- Easy configuration - manage API keys and provider settings centrally
## Use Cases
- Applications that need multi-provider LLM support
- Cost optimization (route to cheapest provider for specific tasks)
- Failover and redundancy (fallback to alternative providers)
- A/B testing across different models
- Centralized LLM access for microservices
## 🎉 Status: WORKING!
✅ All four providers integrated with official Go SDKs:

- OpenAI → `github.com/openai/openai-go`
- Azure OpenAI → `github.com/openai/openai-go` (with Azure auth)
- Anthropic → `github.com/anthropics/anthropic-sdk-go`
- Google → `google.golang.org/genai`

- ✅ Compiles successfully (36 MB binary)
- ✅ Provider auto-selection (gpt → Azure/OpenAI, claude → Anthropic, gemini → Google)
- ✅ Configuration system (YAML with env var support)
- ✅ Streaming support (Server-Sent Events for all providers)
- ✅ OAuth2/OIDC authentication (Google, Auth0, any OIDC provider)
- ✅ Terminal chat client (Python with Rich UI, PEP 723)
- ✅ Conversation tracking (`previous_response_id` for efficient context)
## Quick Start

```bash
# 1. Set API keys
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export GOOGLE_API_KEY="your-key"

# 2. Build
cd go-llm-gateway
go build -o gateway ./cmd/gateway

# 3. Run
./gateway

# 4. Test (non-streaming)
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Hello!"}]
      }
    ]
  }'

# 5. Test streaming
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "stream": true,
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Write a haiku about Go"}]
      }
    ]
  }'
```
## API Standard
This gateway implements the Open Responses specification — an open-source, multi-provider API standard for LLM interfaces based on OpenAI's Responses API.
Why Open Responses:
- Multi-provider by default - one schema that maps cleanly across providers
- Agentic workflow support - consistent streaming events, tool invocation patterns, and "items" as atomic units
- Extensible - stable core with room for provider-specific features
By following the Open Responses spec, this gateway ensures:
- Interoperability across different LLM providers
- Standard request/response formats (messages, tool calls, streaming)
- Compatibility with existing Open Responses tooling and ecosystem
For full specification details, see: https://www.openresponses.org
## Tech Stack
- Language: Go
- API Specification: Open Responses
- SDKs:
  - `google.golang.org/genai` (Google Generative AI)
  - `github.com/anthropics/anthropic-sdk-go` (Anthropic)
  - `github.com/openai/openai-go` (OpenAI)
- Transport: RESTful HTTP (potentially gRPC in the future)
## Getting Started
1. Copy the example config and fill in provider API keys:

   ```bash
   cp config.example.yaml config.yaml
   ```

   You can also override API keys via environment variables (`GOOGLE_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`).

2. Run the gateway using the default configuration path:

   ```bash
   go run ./cmd/gateway --config config.yaml
   ```

   The server listens on the address configured under `server.address` (defaults to `:8080`).

3. Call the Open Responses endpoint:

   ```bash
   curl -X POST http://localhost:8080/v1/responses \
     -H 'Content-Type: application/json' \
     -d '{
       "model": "gpt-4o-mini",
       "input": [
         {"role": "user", "content": [{"type": "input_text", "text": "Hello!"}]}
       ]
     }'
   ```

   Include `"provider": "anthropic"` (or `google`, `openai`) to pin a provider; otherwise the gateway infers it from the model name.
## Project Structure

- `cmd/gateway`: Entry point that loads configuration, wires providers, and starts the HTTP server.
- `internal/config`: YAML configuration loader with environment overrides for API keys.
- `internal/api`: Open Responses request/response types and validation helpers.
- `internal/server`: HTTP handlers that expose `/v1/responses`.
- `internal/providers`: Provider abstractions plus provider-specific scaffolding in `google`, `anthropic`, and `openai` subpackages.
## Chat Client
Interactive terminal chat interface with beautiful Rich UI:
```bash
# Basic usage
uv run chat.py

# With authentication
uv run chat.py --token "$(gcloud auth print-identity-token)"
```

Switch models on the fly from inside the chat:

```
You> /model claude
You> /models   # List all available models
```
The chat client automatically uses `previous_response_id` to reduce token usage by sending only new messages instead of the full conversation history.
See CHAT_CLIENT.md for full documentation.
## Conversation Management

The gateway implements conversation tracking using `previous_response_id` from the Open Responses spec:
- 📉 Reduced token usage - Only send new messages
- ⚡ Smaller requests - Less bandwidth
- 🧠 Server-side context - Gateway maintains history
- ⏰ Auto-expire - Conversations expire after 1 hour
See CONVERSATIONS.md for details.
## Azure OpenAI
The gateway supports Azure OpenAI with the same interface as standard OpenAI:
```yaml
providers:
  azureopenai:
    api_key: "${AZURE_OPENAI_API_KEY}"
    endpoint: "https://your-resource.openai.azure.com"
    deployment_id: "your-deployment-name"
```

Or via environment variables:

```bash
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"
export AZURE_OPENAI_DEPLOYMENT_ID="gpt-4o"
./gateway
```
The gateway prefers Azure OpenAI for `gpt-*` models when it is configured. See AZURE_OPENAI.md for the complete setup guide.
## Authentication
The gateway supports OAuth2/OIDC authentication. See AUTH.md for setup instructions.
Quick example with Google OAuth:
```yaml
auth:
  enabled: true
  issuer: "https://accounts.google.com"
  audience: "YOUR-CLIENT-ID.apps.googleusercontent.com"
```

```bash
# Get token
TOKEN=$(gcloud auth print-identity-token)

# Make authenticated request
curl -X POST http://localhost:8080/v1/responses \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-2.0-flash-exp", ...}'
```
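The first step such a middleware performs is extracting the bearer token from the Authorization header; here is a sketch of that step alone (actual verification of issuer, audience, and signature against the configured OIDC provider is omitted):

```go
// Bearer-token extraction sketch. A full OIDC middleware would pass the
// extracted token to a JWT/OIDC verifier before accepting the request.
package main

import (
	"fmt"
	"strings"
)

// bearerToken extracts the token from an "Authorization: Bearer <token>"
// header value, returning false if the header is absent or malformed.
func bearerToken(header string) (string, bool) {
	const prefix = "Bearer "
	if !strings.HasPrefix(header, prefix) {
		return "", false
	}
	token := strings.TrimSpace(strings.TrimPrefix(header, prefix))
	return token, token != ""
}

func main() {
	tok, ok := bearerToken("Bearer eyJhbGciOi...")
	fmt.Println(ok, tok)
	_, ok = bearerToken("Basic abc")
	fmt.Println(ok)
}
```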
## Next Steps

- ✅ Implement streaming responses
- ✅ Add OAuth2/OIDC authentication
- ✅ Implement conversation tracking with `previous_response_id`
- ⬜ Add structured logging, tracing, and request-level metrics
- ⬜ Support tool/function calling
- ⬜ Persistent conversation storage (Redis/database)
- ⬜ Expand configuration to support routing policies (cost, latency, failover)