# Go LLM Gateway

## Overview
A lightweight LLM proxy gateway written in Go that provides a unified API interface for multiple LLM providers. Similar to LiteLLM, but built natively in Go using each provider's official SDK.
## Purpose
Simplify LLM integration by exposing a single, consistent API that routes requests to different providers:
- OpenAI (GPT models)
- Azure OpenAI (Azure-deployed models)
- Anthropic (Claude)
- Google Generative AI (Gemini)
Instead of managing multiple SDK integrations in your application, call one endpoint and let the gateway handle provider-specific implementations.
## Architecture

```
Client Request
      ↓
Go LLM Gateway (unified API)
      ↓
  ├─→ OpenAI SDK
  ├─→ Azure OpenAI (OpenAI SDK + Azure auth)
  ├─→ Anthropic SDK
  └─→ Google Gen AI SDK
```
## Key Features
- Single API interface for multiple LLM providers
- Native Go SDKs for optimal performance and type safety
- Provider abstraction - switch providers without changing client code
- Lightweight - minimal overhead, fast routing
- Easy configuration - manage API keys and provider settings centrally
## Use Cases
- Applications that need multi-provider LLM support
- Cost optimization (route to cheapest provider for specific tasks)
- Failover and redundancy (fallback to alternative providers)
- A/B testing across different models
- Centralized LLM access for microservices
## 🎉 Status: WORKING!
✅ All four providers integrated with official Go SDKs:

- OpenAI → `github.com/openai/openai-go`
- Azure OpenAI → `github.com/openai/openai-go` (with Azure auth)
- Anthropic → `github.com/anthropics/anthropic-sdk-go`
- Google → `google.golang.org/genai`

- ✅ Compiles successfully (36 MB binary)
- ✅ Provider auto-selection (gpt → Azure/OpenAI, claude → Anthropic, gemini → Google)
- ✅ Configuration system (YAML with env var support)
- ✅ Streaming support (Server-Sent Events for all providers)
- ✅ OAuth2/OIDC authentication (Google, Auth0, any OIDC provider)
- ✅ Terminal chat client (Python with Rich UI, PEP 723)
- ✅ Conversation tracking (`previous_response_id` for efficient context)
## Quick Start

```bash
# 1. Set API keys
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export GOOGLE_API_KEY="your-key"

# 2. Build
cd go-llm-gateway
go build -o gateway ./cmd/gateway

# 3. Run
./gateway

# 4. Test (non-streaming)
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Hello!"}]
      }
    ]
  }'

# 5. Test streaming
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "stream": true,
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Write a haiku about Go"}]
      }
    ]
  }'
```
## API Standard
This gateway implements the Open Responses specification — an open-source, multi-provider API standard for LLM interfaces based on OpenAI's Responses API.
Why Open Responses:
- Multi-provider by default - one schema that maps cleanly across providers
- Agentic workflow support - consistent streaming events, tool invocation patterns, and "items" as atomic units
- Extensible - stable core with room for provider-specific features
By following the Open Responses spec, this gateway ensures:
- Interoperability across different LLM providers
- Standard request/response formats (messages, tool calls, streaming)
- Compatibility with existing Open Responses tooling and ecosystem
For full specification details, see: https://www.openresponses.org
## Tech Stack
- Language: Go
- API Specification: Open Responses
- SDKs:
  - `google.golang.org/genai` (Google Generative AI)
  - `github.com/anthropics/anthropic-sdk-go` (Anthropic)
  - `github.com/openai/openai-go` (OpenAI)
- Transport: RESTful HTTP (potentially gRPC in the future)
## Getting Started
1. Copy the example config and fill in provider API keys:

   ```bash
   cp config.example.yaml config.yaml
   ```

   You can also override API keys via environment variables (`GOOGLE_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`).

2. Run the gateway using the default configuration path:

   ```bash
   go run ./cmd/gateway --config config.yaml
   ```

   The server listens on the address configured under `server.address` (defaults to `:8080`).

3. Call the Open Responses endpoint:

   ```bash
   curl -X POST http://localhost:8080/v1/responses \
     -H 'Content-Type: application/json' \
     -d '{
       "model": "gpt-4o-mini",
       "input": [
         {"role": "user", "content": [{"type": "input_text", "text": "Hello!"}]}
       ]
     }'
   ```

   Include `"provider": "anthropic"` (or `google`, `openai`) to pin a provider; otherwise the gateway infers it from the model name.
## Project Structure

- `cmd/gateway`: Entry point that loads configuration, wires providers, and starts the HTTP server.
- `internal/config`: YAML configuration loader with environment overrides for API keys.
- `internal/api`: Open Responses request/response types and validation helpers.
- `internal/server`: HTTP handlers that expose `/v1/responses`.
- `internal/providers`: Provider abstractions plus provider-specific scaffolding in `google`, `anthropic`, and `openai` subpackages.
## Chat Client
Interactive terminal chat interface with beautiful Rich UI:
```bash
# Basic usage
uv run chat.py

# With authentication
uv run chat.py --token "$(gcloud auth print-identity-token)"
```

Switch models on the fly from inside the chat:

```
You> /model claude
You> /models   # List all available models
```
The chat client automatically uses `previous_response_id` to reduce token usage by sending only new messages instead of the full conversation history.
See CHAT_CLIENT.md for full documentation.
## Conversation Management

The gateway implements conversation tracking using `previous_response_id` from the Open Responses spec:
- 📉 Reduced token usage - Only send new messages
- ⚡ Smaller requests - Less bandwidth
- 🧠 Server-side context - Gateway maintains history
- ⏰ Auto-expire - Conversations expire after 1 hour
See CONVERSATIONS.md for details.
## Azure OpenAI
The gateway supports Azure OpenAI with the same interface as standard OpenAI:
```yaml
providers:
  azureopenai:
    api_key: "${AZURE_OPENAI_API_KEY}"
    endpoint: "https://your-resource.openai.azure.com"
    deployment_id: "your-deployment-name"
```

Or via environment variables:

```bash
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"
export AZURE_OPENAI_DEPLOYMENT_ID="gpt-4o"
./gateway
```
The gateway prefers Azure OpenAI for `gpt-*` models when it is configured. See AZURE_OPENAI.md for the complete setup guide.
## Authentication
The gateway supports OAuth2/OIDC authentication. See AUTH.md for setup instructions.
Quick example with Google OAuth:
```yaml
auth:
  enabled: true
  issuer: "https://accounts.google.com"
  audience: "YOUR-CLIENT-ID.apps.googleusercontent.com"
```

```bash
# Get token
TOKEN=$(gcloud auth print-identity-token)

# Make authenticated request
curl -X POST http://localhost:8080/v1/responses \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-2.0-flash-exp", ...}'
```
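The first step such a middleware performs is extracting the bearer token from the Authorization header; here is a sketch of that step alone (actual verification of issuer, audience, and signature against the configured OIDC provider is omitted):

```go
// Bearer-token extraction sketch. A full OIDC middleware would pass the
// extracted token to a JWT/OIDC verifier before accepting the request.
package main

import (
	"fmt"
	"strings"
)

// bearerToken extracts the token from an "Authorization: Bearer <token>"
// header value, returning false if the header is absent or malformed.
func bearerToken(header string) (string, bool) {
	const prefix = "Bearer "
	if !strings.HasPrefix(header, prefix) {
		return "", false
	}
	token := strings.TrimSpace(strings.TrimPrefix(header, prefix))
	return token, token != ""
}

func main() {
	tok, ok := bearerToken("Bearer eyJhbGciOi...")
	fmt.Println(ok, tok)
	_, ok = bearerToken("Basic abc")
	fmt.Println(ok)
}
```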
## Next Steps

- ✅ Implement streaming responses
- ✅ Add OAuth2/OIDC authentication
- ✅ Implement conversation tracking with `previous_response_id`
- ⬜ Add structured logging, tracing, and request-level metrics
- ⬜ Support tool/function calling
- ⬜ Persistent conversation storage (Redis/database)
- ⬜ Expand configuration to support routing policies (cost, latency, failover)