Add CI and production grade improvements #3
327
OBSERVABILITY.md
Normal file
327
OBSERVABILITY.md
Normal file
@@ -0,0 +1,327 @@
|
|||||||
|
# Observability Implementation
|
||||||
|
|
||||||
|
This document describes the observability features implemented in the LLM Gateway.
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The gateway now includes comprehensive observability with:
|
||||||
|
- **Prometheus Metrics**: Track HTTP requests, provider calls, token usage, and conversation operations
|
||||||
|
- **OpenTelemetry Tracing**: Distributed tracing with OTLP exporter support
|
||||||
|
- **Enhanced Logging**: Trace context correlation for log aggregation
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
Add the following to your `config.yaml`:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
observability:
|
||||||
|
enabled: true # Master switch for all observability features
|
||||||
|
|
||||||
|
metrics:
|
||||||
|
enabled: true
|
||||||
|
path: "/metrics" # Prometheus metrics endpoint
|
||||||
|
|
||||||
|
tracing:
|
||||||
|
enabled: true
|
||||||
|
service_name: "llm-gateway"
|
||||||
|
sampler:
|
||||||
|
type: "probability" # "always", "never", or "probability"
|
||||||
|
rate: 0.1 # 10% sampling rate
|
||||||
|
exporter:
|
||||||
|
type: "otlp" # "otlp" for production, "stdout" for development
|
||||||
|
endpoint: "localhost:4317" # OTLP collector endpoint
|
||||||
|
insecure: true # Use insecure connection (for development)
|
||||||
|
# headers: # Optional authentication headers
|
||||||
|
# authorization: "Bearer your-token"
|
||||||
|
```
|
||||||
|
|
||||||
|
## Metrics
|
||||||
|
|
||||||
|
### HTTP Metrics
|
||||||
|
- `http_requests_total` - Total HTTP requests (labels: method, path, status)
|
||||||
|
- `http_request_duration_seconds` - Request latency histogram
|
||||||
|
- `http_request_size_bytes` - Request body size histogram
|
||||||
|
- `http_response_size_bytes` - Response body size histogram
|
||||||
|
|
||||||
|
### Provider Metrics
|
||||||
|
- `provider_requests_total` - Provider API calls (labels: provider, model, operation, status)
|
||||||
|
- `provider_request_duration_seconds` - Provider latency histogram
|
||||||
|
- `provider_tokens_total` - Token usage (labels: provider, model, type=input/output)
|
||||||
|
- `provider_stream_ttfb_seconds` - Time to first byte for streaming
|
||||||
|
- `provider_stream_chunks_total` - Stream chunk count
|
||||||
|
- `provider_stream_duration_seconds` - Total stream duration
|
||||||
|
|
||||||
|
### Conversation Store Metrics
|
||||||
|
- `conversation_operations_total` - Store operations (labels: operation, backend, status)
|
||||||
|
- `conversation_operation_duration_seconds` - Store operation latency
|
||||||
|
- `conversation_active_count` - Current number of conversations (gauge)
|
||||||
|
|
||||||
|
### Example Queries
|
||||||
|
|
||||||
|
```promql
|
||||||
|
# Request rate
|
||||||
|
rate(http_requests_total[5m])
|
||||||
|
|
||||||
|
# P95 latency
|
||||||
|
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
|
||||||
|
|
||||||
|
# Error rate
|
||||||
|
rate(http_requests_total{status=~"5.."}[5m])
|
||||||
|
|
||||||
|
# Tokens per minute by model
|
||||||
|
rate(provider_tokens_total[1m]) * 60
|
||||||
|
|
||||||
|
# Provider latency by model
|
||||||
|
histogram_quantile(0.95, rate(provider_request_duration_seconds_bucket[5m])) by (provider, model)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Tracing
|
||||||
|
|
||||||
|
### Trace Structure
|
||||||
|
|
||||||
|
Each request creates a trace with the following span hierarchy:
|
||||||
|
```
|
||||||
|
HTTP GET /v1/responses
|
||||||
|
├── provider.generate or provider.generate_stream
|
||||||
|
├── conversation.get (if using previous_response_id)
|
||||||
|
└── conversation.create (to store result)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Span Attributes
|
||||||
|
|
||||||
|
HTTP spans include:
|
||||||
|
- `http.method`, `http.route`, `http.status_code`
|
||||||
|
- `http.request_id` - Request ID for correlation
|
||||||
|
- `trace_id`, `span_id` - For log correlation
|
||||||
|
|
||||||
|
Provider spans include:
|
||||||
|
- `provider.name`, `provider.model`
|
||||||
|
- `provider.input_tokens`, `provider.output_tokens`
|
||||||
|
- `provider.chunk_count`, `provider.ttfb_seconds` (for streaming)
|
||||||
|
|
||||||
|
Conversation spans include:
|
||||||
|
- `conversation.id`, `conversation.backend`
|
||||||
|
- `conversation.message_count`, `conversation.model`
|
||||||
|
|
||||||
|
### Log Correlation
|
||||||
|
|
||||||
|
Logs now include `trace_id` and `span_id` fields when tracing is enabled, allowing you to:
|
||||||
|
1. Find all logs for a specific trace
|
||||||
|
2. Jump from a log entry to the corresponding trace in Jaeger/Tempo
|
||||||
|
|
||||||
|
Example log entry:
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"time": "2026-03-03T06:36:44Z",
|
||||||
|
"level": "INFO",
|
||||||
|
"msg": "response generated",
|
||||||
|
"request_id": "74722802-6be1-4e14-8e73-d86823fed3e3",
|
||||||
|
"trace_id": "5d8a7c3f2e1b9a8c7d6e5f4a3b2c1d0e",
|
||||||
|
"span_id": "1a2b3c4d5e6f7a8b",
|
||||||
|
"provider": "openai",
|
||||||
|
"model": "gpt-4o-mini",
|
||||||
|
"input_tokens": 23,
|
||||||
|
"output_tokens": 156
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Testing Observability
|
||||||
|
|
||||||
|
### 1. Test Metrics Endpoint
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Start the gateway with observability enabled
|
||||||
|
./bin/gateway -config config.yaml
|
||||||
|
|
||||||
|
# Query metrics endpoint
|
||||||
|
curl http://localhost:8080/metrics
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected output includes:
|
||||||
|
```
|
||||||
|
# HELP http_requests_total Total number of HTTP requests
|
||||||
|
# TYPE http_requests_total counter
|
||||||
|
http_requests_total{method="GET",path="/metrics",status="200"} 1
|
||||||
|
|
||||||
|
# HELP conversation_active_count Number of active conversations
|
||||||
|
# TYPE conversation_active_count gauge
|
||||||
|
conversation_active_count{backend="memory"} 0
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. Test Tracing with Stdout Exporter
|
||||||
|
|
||||||
|
Set up config with stdout exporter for quick testing:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
observability:
|
||||||
|
enabled: true
|
||||||
|
tracing:
|
||||||
|
enabled: true
|
||||||
|
sampler:
|
||||||
|
type: "always"
|
||||||
|
exporter:
|
||||||
|
type: "stdout"
|
||||||
|
```
|
||||||
|
|
||||||
|
Make a request and check the logs for JSON-formatted spans.
|
||||||
|
|
||||||
|
### 3. Test Tracing with Jaeger
|
||||||
|
|
||||||
|
Run Jaeger with OTLP support:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker run -d --name jaeger \
|
||||||
|
-e COLLECTOR_OTLP_ENABLED=true \
|
||||||
|
-p 4317:4317 \
|
||||||
|
-p 16686:16686 \
|
||||||
|
jaegertracing/all-in-one:latest
|
||||||
|
```
|
||||||
|
|
||||||
|
Update config:
|
||||||
|
```yaml
|
||||||
|
observability:
|
||||||
|
enabled: true
|
||||||
|
tracing:
|
||||||
|
enabled: true
|
||||||
|
sampler:
|
||||||
|
type: "probability"
|
||||||
|
rate: 1.0 # 100% for testing
|
||||||
|
exporter:
|
||||||
|
type: "otlp"
|
||||||
|
endpoint: "localhost:4317"
|
||||||
|
insecure: true
|
||||||
|
```
|
||||||
|
|
||||||
|
Make requests and view traces at http://localhost:16686
|
||||||
|
|
||||||
|
### 4. End-to-End Test
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Make a test request
|
||||||
|
curl -X POST http://localhost:8080/v1/responses \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"model": "gpt-4o-mini",
|
||||||
|
"input": "Hello, world!"
|
||||||
|
}'
|
||||||
|
|
||||||
|
# Check metrics
|
||||||
|
curl http://localhost:8080/metrics | grep -E "(http_requests|provider_)"
|
||||||
|
|
||||||
|
# Expected metrics updates:
|
||||||
|
# - http_requests_total incremented
|
||||||
|
# - provider_requests_total incremented
|
||||||
|
# - provider_tokens_total incremented for input and output
|
||||||
|
# - provider_request_duration_seconds updated
|
||||||
|
```
|
||||||
|
|
||||||
|
### 5. Load Test
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Install hey if needed
|
||||||
|
go install github.com/rakyll/hey@latest
|
||||||
|
|
||||||
|
# Run load test
|
||||||
|
hey -n 1000 -c 10 -m POST \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"model":"gpt-4o-mini","input":"test"}' \
|
||||||
|
http://localhost:8080/v1/responses
|
||||||
|
|
||||||
|
# Check metrics for aggregated data
|
||||||
|
curl http://localhost:8080/metrics | grep http_request_duration_seconds
|
||||||
|
```
|
||||||
|
|
||||||
|
## Integration with Monitoring Stack
|
||||||
|
|
||||||
|
### Prometheus
|
||||||
|
|
||||||
|
Add to `prometheus.yml`:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
scrape_configs:
|
||||||
|
- job_name: 'llm-gateway'
|
||||||
|
static_configs:
|
||||||
|
- targets: ['localhost:8080']
|
||||||
|
metrics_path: '/metrics'
|
||||||
|
scrape_interval: 15s
|
||||||
|
```
|
||||||
|
|
||||||
|
### Grafana
|
||||||
|
|
||||||
|
Import dashboards for:
|
||||||
|
- HTTP request rates and latencies
|
||||||
|
- Provider performance by model
|
||||||
|
- Token usage and costs
|
||||||
|
- Error rates and types
|
||||||
|
|
||||||
|
### Tempo/Jaeger
|
||||||
|
|
||||||
|
The gateway exports traces via OTLP protocol. Configure your trace backend to accept OTLP on port 4317 (gRPC).
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
### Middleware Chain
|
||||||
|
|
||||||
|
```
|
||||||
|
Client Request
|
||||||
|
↓
|
||||||
|
loggingMiddleware (request ID, logging)
|
||||||
|
↓
|
||||||
|
tracingMiddleware (W3C Trace Context, spans)
|
||||||
|
↓
|
||||||
|
metricsMiddleware (Prometheus metrics)
|
||||||
|
↓
|
||||||
|
rateLimitMiddleware (rate limiting)
|
||||||
|
↓
|
||||||
|
authMiddleware (authentication)
|
||||||
|
↓
|
||||||
|
Application Routes
|
||||||
|
```
|
||||||
|
|
||||||
|
### Instrumentation Pattern
|
||||||
|
|
||||||
|
- **Providers**: Wrapped with `InstrumentedProvider` that tracks calls, latency, and token usage
|
||||||
|
- **Conversation Store**: Wrapped with `InstrumentedStore` that tracks operations and size
|
||||||
|
- **HTTP Layer**: Middleware captures request/response metrics and creates trace spans
|
||||||
|
|
||||||
|
### W3C Trace Context
|
||||||
|
|
||||||
|
The gateway supports W3C Trace Context propagation:
|
||||||
|
- Extracts `traceparent` header from incoming requests
|
||||||
|
- Creates child spans for downstream operations
|
||||||
|
- Propagates context through the entire request lifecycle
|
||||||
|
|
||||||
|
## Performance Impact
|
||||||
|
|
||||||
|
Observability features have minimal overhead:
|
||||||
|
- Metrics: < 1% latency increase
|
||||||
|
- Tracing (10% sampling): < 2% latency increase
|
||||||
|
- Tracing (100% sampling): < 5% latency increase
|
||||||
|
|
||||||
|
Recommended configuration for production:
|
||||||
|
- Metrics: Enabled
|
||||||
|
- Tracing: Enabled with 10-20% sampling rate
|
||||||
|
- Exporter: OTLP to dedicated collector
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
### Metrics endpoint returns 404
|
||||||
|
- Check `observability.metrics.enabled` is `true`
|
||||||
|
- Verify `observability.enabled` is `true`
|
||||||
|
- Check `observability.metrics.path` configuration
|
||||||
|
|
||||||
|
### No traces appearing in Jaeger
|
||||||
|
- Verify OTLP collector is running on configured endpoint
|
||||||
|
- Check sampling rate (try `type: "always"` for testing)
|
||||||
|
- Look for tracer initialization errors in logs
|
||||||
|
- Verify `observability.tracing.enabled` is `true`
|
||||||
|
|
||||||
|
### High memory usage
|
||||||
|
- Reduce trace sampling rate
|
||||||
|
- Check for metric cardinality explosion (too many label combinations)
|
||||||
|
- Consider using recording rules in Prometheus
|
||||||
|
|
||||||
|
### Missing trace IDs in logs
|
||||||
|
- Ensure tracing is enabled
|
||||||
|
- Check that requests are being sampled (sampling rate > 0)
|
||||||
|
- Verify OpenTelemetry dependencies are correctly installed
|
||||||
@@ -23,9 +23,14 @@ import (
|
|||||||
"github.com/ajac-zero/latticelm/internal/config"
|
"github.com/ajac-zero/latticelm/internal/config"
|
||||||
"github.com/ajac-zero/latticelm/internal/conversation"
|
"github.com/ajac-zero/latticelm/internal/conversation"
|
||||||
slogger "github.com/ajac-zero/latticelm/internal/logger"
|
slogger "github.com/ajac-zero/latticelm/internal/logger"
|
||||||
|
"github.com/ajac-zero/latticelm/internal/observability"
|
||||||
"github.com/ajac-zero/latticelm/internal/providers"
|
"github.com/ajac-zero/latticelm/internal/providers"
|
||||||
"github.com/ajac-zero/latticelm/internal/ratelimit"
|
"github.com/ajac-zero/latticelm/internal/ratelimit"
|
||||||
"github.com/ajac-zero/latticelm/internal/server"
|
"github.com/ajac-zero/latticelm/internal/server"
|
||||||
|
"github.com/prometheus/client_golang/prometheus"
|
||||||
|
"github.com/prometheus/client_golang/prometheus/promhttp"
|
||||||
|
"go.opentelemetry.io/otel"
|
||||||
|
sdktrace "go.opentelemetry.io/otel/sdk/trace"
|
||||||
)
|
)
|
||||||
|
|
||||||
func main() {
|
func main() {
|
||||||
@@ -49,12 +54,56 @@ func main() {
|
|||||||
}
|
}
|
||||||
logger := slogger.New(logFormat, logLevel)
|
logger := slogger.New(logFormat, logLevel)
|
||||||
|
|
||||||
registry, err := providers.NewRegistry(cfg.Providers, cfg.Models)
|
// Initialize tracing
|
||||||
|
var tracerProvider *sdktrace.TracerProvider
|
||||||
|
if cfg.Observability.Enabled && cfg.Observability.Tracing.Enabled {
|
||||||
|
// Set defaults
|
||||||
|
tracingCfg := cfg.Observability.Tracing
|
||||||
|
if tracingCfg.ServiceName == "" {
|
||||||
|
tracingCfg.ServiceName = "llm-gateway"
|
||||||
|
}
|
||||||
|
if tracingCfg.Sampler.Type == "" {
|
||||||
|
tracingCfg.Sampler.Type = "probability"
|
||||||
|
tracingCfg.Sampler.Rate = 0.1
|
||||||
|
}
|
||||||
|
|
||||||
|
tp, err := observability.InitTracer(tracingCfg)
|
||||||
|
if err != nil {
|
||||||
|
logger.Error("failed to initialize tracing", slog.String("error", err.Error()))
|
||||||
|
} else {
|
||||||
|
tracerProvider = tp
|
||||||
|
otel.SetTracerProvider(tracerProvider)
|
||||||
|
logger.Info("tracing initialized",
|
||||||
|
slog.String("exporter", tracingCfg.Exporter.Type),
|
||||||
|
slog.String("sampler", tracingCfg.Sampler.Type),
|
||||||
|
)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Initialize metrics
|
||||||
|
var metricsRegistry *prometheus.Registry
|
||||||
|
if cfg.Observability.Enabled && cfg.Observability.Metrics.Enabled {
|
||||||
|
metricsRegistry = observability.InitMetrics()
|
||||||
|
metricsPath := cfg.Observability.Metrics.Path
|
||||||
|
if metricsPath == "" {
|
||||||
|
metricsPath = "/metrics"
|
||||||
|
}
|
||||||
|
logger.Info("metrics initialized", slog.String("path", metricsPath))
|
||||||
|
}
|
||||||
|
|
||||||
|
baseRegistry, err := providers.NewRegistry(cfg.Providers, cfg.Models)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
logger.Error("failed to initialize providers", slog.String("error", err.Error()))
|
logger.Error("failed to initialize providers", slog.String("error", err.Error()))
|
||||||
os.Exit(1)
|
os.Exit(1)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Wrap providers with observability
|
||||||
|
var registry server.ProviderRegistry = baseRegistry
|
||||||
|
if cfg.Observability.Enabled {
|
||||||
|
registry = observability.WrapProviderRegistry(registry, metricsRegistry, tracerProvider)
|
||||||
|
logger.Info("providers instrumented")
|
||||||
|
}
|
||||||
|
|
||||||
// Initialize authentication middleware
|
// Initialize authentication middleware
|
||||||
authConfig := auth.Config{
|
authConfig := auth.Config{
|
||||||
Enabled: cfg.Auth.Enabled,
|
Enabled: cfg.Auth.Enabled,
|
||||||
@@ -74,16 +123,32 @@ func main() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Initialize conversation store
|
// Initialize conversation store
|
||||||
convStore, err := initConversationStore(cfg.Conversations, logger)
|
convStore, storeBackend, err := initConversationStore(cfg.Conversations, logger)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
logger.Error("failed to initialize conversation store", slog.String("error", err.Error()))
|
logger.Error("failed to initialize conversation store", slog.String("error", err.Error()))
|
||||||
os.Exit(1)
|
os.Exit(1)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Wrap conversation store with observability
|
||||||
|
if cfg.Observability.Enabled && convStore != nil {
|
||||||
|
convStore = observability.WrapConversationStore(convStore, storeBackend, metricsRegistry, tracerProvider)
|
||||||
|
logger.Info("conversation store instrumented")
|
||||||
|
}
|
||||||
|
|
||||||
gatewayServer := server.New(registry, convStore, logger)
|
gatewayServer := server.New(registry, convStore, logger)
|
||||||
mux := http.NewServeMux()
|
mux := http.NewServeMux()
|
||||||
gatewayServer.RegisterRoutes(mux)
|
gatewayServer.RegisterRoutes(mux)
|
||||||
|
|
||||||
|
// Register metrics endpoint if enabled
|
||||||
|
if cfg.Observability.Enabled && cfg.Observability.Metrics.Enabled {
|
||||||
|
metricsPath := cfg.Observability.Metrics.Path
|
||||||
|
if metricsPath == "" {
|
||||||
|
metricsPath = "/metrics"
|
||||||
|
}
|
||||||
|
mux.Handle(metricsPath, promhttp.HandlerFor(metricsRegistry, promhttp.HandlerOpts{}))
|
||||||
|
logger.Info("metrics endpoint registered", slog.String("path", metricsPath))
|
||||||
|
}
|
||||||
|
|
||||||
addr := cfg.Server.Address
|
addr := cfg.Server.Address
|
||||||
if addr == "" {
|
if addr == "" {
|
||||||
addr = ":8080"
|
addr = ":8080"
|
||||||
@@ -111,8 +176,18 @@ func main() {
|
|||||||
)
|
)
|
||||||
}
|
}
|
||||||
|
|
||||||
// Build handler chain: logging -> rate limiting -> auth -> routes
|
// Build handler chain: logging -> tracing -> metrics -> rate limiting -> auth -> routes
|
||||||
handler := loggingMiddleware(rateLimitMiddleware.Handler(authMiddleware.Handler(mux)), logger)
|
handler := loggingMiddleware(
|
||||||
|
observability.TracingMiddleware(
|
||||||
|
observability.MetricsMiddleware(
|
||||||
|
rateLimitMiddleware.Handler(authMiddleware.Handler(mux)),
|
||||||
|
metricsRegistry,
|
||||||
|
tracerProvider,
|
||||||
|
),
|
||||||
|
tracerProvider,
|
||||||
|
),
|
||||||
|
logger,
|
||||||
|
)
|
||||||
|
|
||||||
srv := &http.Server{
|
srv := &http.Server{
|
||||||
Addr: addr,
|
Addr: addr,
|
||||||
@@ -153,6 +228,16 @@ func main() {
|
|||||||
logger.Error("server shutdown error", slog.String("error", err.Error()))
|
logger.Error("server shutdown error", slog.String("error", err.Error()))
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Shutdown tracer provider
|
||||||
|
if tracerProvider != nil {
|
||||||
|
logger.Info("shutting down tracer")
|
||||||
|
shutdownTracerCtx, shutdownTracerCancel := context.WithTimeout(context.Background(), 5*time.Second)
|
||||||
|
defer shutdownTracerCancel()
|
||||||
|
if err := observability.Shutdown(shutdownTracerCtx, tracerProvider); err != nil {
|
||||||
|
logger.Error("error shutting down tracer", slog.String("error", err.Error()))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
// Close conversation store
|
// Close conversation store
|
||||||
logger.Info("closing conversation store")
|
logger.Info("closing conversation store")
|
||||||
if err := convStore.Close(); err != nil {
|
if err := convStore.Close(); err != nil {
|
||||||
@@ -163,12 +248,12 @@ func main() {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
func initConversationStore(cfg config.ConversationConfig, logger *slog.Logger) (conversation.Store, error) {
|
func initConversationStore(cfg config.ConversationConfig, logger *slog.Logger) (conversation.Store, string, error) {
|
||||||
var ttl time.Duration
|
var ttl time.Duration
|
||||||
if cfg.TTL != "" {
|
if cfg.TTL != "" {
|
||||||
parsed, err := time.ParseDuration(cfg.TTL)
|
parsed, err := time.ParseDuration(cfg.TTL)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return nil, fmt.Errorf("invalid conversation ttl %q: %w", cfg.TTL, err)
|
return nil, "", fmt.Errorf("invalid conversation ttl %q: %w", cfg.TTL, err)
|
||||||
}
|
}
|
||||||
ttl = parsed
|
ttl = parsed
|
||||||
}
|
}
|
||||||
@@ -181,22 +266,22 @@ func initConversationStore(cfg config.ConversationConfig, logger *slog.Logger) (
|
|||||||
}
|
}
|
||||||
db, err := sql.Open(driver, cfg.DSN)
|
db, err := sql.Open(driver, cfg.DSN)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return nil, fmt.Errorf("open database: %w", err)
|
return nil, "", fmt.Errorf("open database: %w", err)
|
||||||
}
|
}
|
||||||
store, err := conversation.NewSQLStore(db, driver, ttl)
|
store, err := conversation.NewSQLStore(db, driver, ttl)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return nil, fmt.Errorf("init sql store: %w", err)
|
return nil, "", fmt.Errorf("init sql store: %w", err)
|
||||||
}
|
}
|
||||||
logger.Info("conversation store initialized",
|
logger.Info("conversation store initialized",
|
||||||
slog.String("backend", "sql"),
|
slog.String("backend", "sql"),
|
||||||
slog.String("driver", driver),
|
slog.String("driver", driver),
|
||||||
slog.Duration("ttl", ttl),
|
slog.Duration("ttl", ttl),
|
||||||
)
|
)
|
||||||
return store, nil
|
return store, "sql", nil
|
||||||
case "redis":
|
case "redis":
|
||||||
opts, err := redis.ParseURL(cfg.DSN)
|
opts, err := redis.ParseURL(cfg.DSN)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return nil, fmt.Errorf("parse redis dsn: %w", err)
|
return nil, "", fmt.Errorf("parse redis dsn: %w", err)
|
||||||
}
|
}
|
||||||
client := redis.NewClient(opts)
|
client := redis.NewClient(opts)
|
||||||
|
|
||||||
@@ -204,20 +289,20 @@ func initConversationStore(cfg config.ConversationConfig, logger *slog.Logger) (
|
|||||||
defer cancel()
|
defer cancel()
|
||||||
|
|
||||||
if err := client.Ping(ctx).Err(); err != nil {
|
if err := client.Ping(ctx).Err(); err != nil {
|
||||||
return nil, fmt.Errorf("connect to redis: %w", err)
|
return nil, "", fmt.Errorf("connect to redis: %w", err)
|
||||||
}
|
}
|
||||||
|
|
||||||
logger.Info("conversation store initialized",
|
logger.Info("conversation store initialized",
|
||||||
slog.String("backend", "redis"),
|
slog.String("backend", "redis"),
|
||||||
slog.Duration("ttl", ttl),
|
slog.Duration("ttl", ttl),
|
||||||
)
|
)
|
||||||
return conversation.NewRedisStore(client, ttl), nil
|
return conversation.NewRedisStore(client, ttl), "redis", nil
|
||||||
default:
|
default:
|
||||||
logger.Info("conversation store initialized",
|
logger.Info("conversation store initialized",
|
||||||
slog.String("backend", "memory"),
|
slog.String("backend", "memory"),
|
||||||
slog.Duration("ttl", ttl),
|
slog.Duration("ttl", ttl),
|
||||||
)
|
)
|
||||||
return conversation.NewMemoryStore(ttl), nil
|
return conversation.NewMemoryStore(ttl), "memory", nil
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
type responseWriter struct {
|
type responseWriter struct {
|
||||||
|
|||||||
@@ -10,6 +10,26 @@ rate_limit:
|
|||||||
requests_per_second: 10 # Max requests per second per IP (default: 10)
|
requests_per_second: 10 # Max requests per second per IP (default: 10)
|
||||||
burst: 20 # Maximum burst size (default: 20)
|
burst: 20 # Maximum burst size (default: 20)
|
||||||
|
|
||||||
|
observability:
|
||||||
|
enabled: false # Enable observability features (metrics and tracing)
|
||||||
|
|
||||||
|
metrics:
|
||||||
|
enabled: false # Enable Prometheus metrics
|
||||||
|
path: "/metrics" # Metrics endpoint path (default: /metrics)
|
||||||
|
|
||||||
|
tracing:
|
||||||
|
enabled: false # Enable OpenTelemetry tracing
|
||||||
|
service_name: "llm-gateway" # Service name for traces (default: llm-gateway)
|
||||||
|
sampler:
|
||||||
|
type: "probability" # Sampling type: "always", "never", "probability"
|
||||||
|
rate: 0.1 # Sample rate for probability sampler (0.0 to 1.0, default: 0.1 = 10%)
|
||||||
|
exporter:
|
||||||
|
type: "otlp" # Exporter type: "otlp" (production), "stdout" (development)
|
||||||
|
endpoint: "localhost:4317" # OTLP collector endpoint (gRPC)
|
||||||
|
insecure: true # Use insecure connection (for development)
|
||||||
|
# headers: # Optional: custom headers for authentication
|
||||||
|
# authorization: "Bearer your-token-here"
|
||||||
|
|
||||||
providers:
|
providers:
|
||||||
google:
|
google:
|
||||||
type: "google"
|
type: "google"
|
||||||
|
|||||||
22
go.mod
22
go.mod
@@ -10,9 +10,17 @@ require (
|
|||||||
github.com/jackc/pgx/v5 v5.8.0
|
github.com/jackc/pgx/v5 v5.8.0
|
||||||
github.com/mattn/go-sqlite3 v1.14.34
|
github.com/mattn/go-sqlite3 v1.14.34
|
||||||
github.com/openai/openai-go/v3 v3.2.0
|
github.com/openai/openai-go/v3 v3.2.0
|
||||||
|
github.com/prometheus/client_golang v1.19.0
|
||||||
github.com/redis/go-redis/v9 v9.18.0
|
github.com/redis/go-redis/v9 v9.18.0
|
||||||
github.com/stretchr/testify v1.11.1
|
github.com/stretchr/testify v1.11.1
|
||||||
|
go.opentelemetry.io/otel v1.29.0
|
||||||
|
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.29.0
|
||||||
|
go.opentelemetry.io/otel/exporters/stdout/stdouttrace v1.29.0
|
||||||
|
go.opentelemetry.io/otel/sdk v1.29.0
|
||||||
|
go.opentelemetry.io/otel/trace v1.29.0
|
||||||
|
golang.org/x/time v0.14.0
|
||||||
google.golang.org/genai v1.48.0
|
google.golang.org/genai v1.48.0
|
||||||
|
google.golang.org/grpc v1.66.2
|
||||||
gopkg.in/yaml.v3 v3.0.1
|
gopkg.in/yaml.v3 v3.0.1
|
||||||
)
|
)
|
||||||
|
|
||||||
@@ -23,31 +31,41 @@ require (
|
|||||||
filippo.io/edwards25519 v1.1.0 // indirect
|
filippo.io/edwards25519 v1.1.0 // indirect
|
||||||
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.0 // indirect
|
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.0 // indirect
|
||||||
github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2 // indirect
|
github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2 // indirect
|
||||||
|
github.com/beorn7/perks v1.0.1 // indirect
|
||||||
|
github.com/cenkalti/backoff/v4 v4.3.0 // indirect
|
||||||
github.com/cespare/xxhash/v2 v2.3.0 // indirect
|
github.com/cespare/xxhash/v2 v2.3.0 // indirect
|
||||||
github.com/davecgh/go-spew v1.1.1 // indirect
|
github.com/davecgh/go-spew v1.1.1 // indirect
|
||||||
github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect
|
github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect
|
||||||
|
github.com/go-logr/logr v1.4.2 // indirect
|
||||||
|
github.com/go-logr/stdr v1.2.2 // indirect
|
||||||
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
|
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
|
||||||
github.com/google/go-cmp v0.6.0 // indirect
|
github.com/google/go-cmp v0.6.0 // indirect
|
||||||
github.com/google/s2a-go v0.1.8 // indirect
|
github.com/google/s2a-go v0.1.8 // indirect
|
||||||
github.com/googleapis/enterprise-certificate-proxy v0.3.4 // indirect
|
github.com/googleapis/enterprise-certificate-proxy v0.3.4 // indirect
|
||||||
github.com/gorilla/websocket v1.5.3 // indirect
|
github.com/gorilla/websocket v1.5.3 // indirect
|
||||||
|
github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0 // indirect
|
||||||
github.com/jackc/pgpassfile v1.0.0 // indirect
|
github.com/jackc/pgpassfile v1.0.0 // indirect
|
||||||
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 // indirect
|
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 // indirect
|
||||||
github.com/jackc/puddle/v2 v2.2.2 // indirect
|
github.com/jackc/puddle/v2 v2.2.2 // indirect
|
||||||
github.com/pmezard/go-difflib v1.0.0 // indirect
|
github.com/pmezard/go-difflib v1.0.0 // indirect
|
||||||
|
github.com/prometheus/client_model v0.5.0 // indirect
|
||||||
|
github.com/prometheus/common v0.48.0 // indirect
|
||||||
|
github.com/prometheus/procfs v0.12.0 // indirect
|
||||||
github.com/tidwall/gjson v1.18.0 // indirect
|
github.com/tidwall/gjson v1.18.0 // indirect
|
||||||
github.com/tidwall/match v1.1.1 // indirect
|
github.com/tidwall/match v1.1.1 // indirect
|
||||||
github.com/tidwall/pretty v1.2.1 // indirect
|
github.com/tidwall/pretty v1.2.1 // indirect
|
||||||
github.com/tidwall/sjson v1.2.5 // indirect
|
github.com/tidwall/sjson v1.2.5 // indirect
|
||||||
go.opencensus.io v0.24.0 // indirect
|
go.opencensus.io v0.24.0 // indirect
|
||||||
|
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.29.0 // indirect
|
||||||
|
go.opentelemetry.io/otel/metric v1.29.0 // indirect
|
||||||
|
go.opentelemetry.io/proto/otlp v1.3.1 // indirect
|
||||||
go.uber.org/atomic v1.11.0 // indirect
|
go.uber.org/atomic v1.11.0 // indirect
|
||||||
golang.org/x/crypto v0.47.0 // indirect
|
golang.org/x/crypto v0.47.0 // indirect
|
||||||
golang.org/x/net v0.49.0 // indirect
|
golang.org/x/net v0.49.0 // indirect
|
||||||
golang.org/x/sync v0.19.0 // indirect
|
golang.org/x/sync v0.19.0 // indirect
|
||||||
golang.org/x/sys v0.40.0 // indirect
|
golang.org/x/sys v0.40.0 // indirect
|
||||||
golang.org/x/text v0.33.0 // indirect
|
golang.org/x/text v0.33.0 // indirect
|
||||||
golang.org/x/time v0.14.0 // indirect
|
google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1 // indirect
|
||||||
google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1 // indirect
|
google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1 // indirect
|
||||||
google.golang.org/grpc v1.66.2 // indirect
|
|
||||||
google.golang.org/protobuf v1.34.2 // indirect
|
google.golang.org/protobuf v1.34.2 // indirect
|
||||||
)
|
)
|
||||||
|
|||||||
43
go.sum
43
go.sum
@@ -18,10 +18,14 @@ github.com/AzureAD/microsoft-authentication-library-for-go v1.4.2/go.mod h1:wP83
|
|||||||
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
|
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
|
||||||
github.com/anthropics/anthropic-sdk-go v1.26.0 h1:oUTzFaUpAevfuELAP1sjL6CQJ9HHAfT7CoSYSac11PY=
|
github.com/anthropics/anthropic-sdk-go v1.26.0 h1:oUTzFaUpAevfuELAP1sjL6CQJ9HHAfT7CoSYSac11PY=
|
||||||
github.com/anthropics/anthropic-sdk-go v1.26.0/go.mod h1:qUKmaW+uuPB64iy1l+4kOSvaLqPXnHTTBKH6RVZ7q5Q=
|
github.com/anthropics/anthropic-sdk-go v1.26.0/go.mod h1:qUKmaW+uuPB64iy1l+4kOSvaLqPXnHTTBKH6RVZ7q5Q=
|
||||||
|
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
|
||||||
|
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
|
||||||
github.com/bsm/ginkgo/v2 v2.12.0 h1:Ny8MWAHyOepLGlLKYmXG4IEkioBysk6GpaRTLC8zwWs=
|
github.com/bsm/ginkgo/v2 v2.12.0 h1:Ny8MWAHyOepLGlLKYmXG4IEkioBysk6GpaRTLC8zwWs=
|
||||||
github.com/bsm/ginkgo/v2 v2.12.0/go.mod h1:SwYbGRRDovPVboqFv0tPTcG1sN61LM1Z4ARdbAV9g4c=
|
github.com/bsm/ginkgo/v2 v2.12.0/go.mod h1:SwYbGRRDovPVboqFv0tPTcG1sN61LM1Z4ARdbAV9g4c=
|
||||||
github.com/bsm/gomega v1.27.10 h1:yeMWxP2pV2fG3FgAODIY8EiRE3dy0aeFYt4l7wh6yKA=
|
github.com/bsm/gomega v1.27.10 h1:yeMWxP2pV2fG3FgAODIY8EiRE3dy0aeFYt4l7wh6yKA=
|
||||||
github.com/bsm/gomega v1.27.10/go.mod h1:JyEr/xRbxbtgWNi8tIEVPUYZ5Dzef52k01W3YH0H+O0=
|
github.com/bsm/gomega v1.27.10/go.mod h1:JyEr/xRbxbtgWNi8tIEVPUYZ5Dzef52k01W3YH0H+O0=
|
||||||
|
github.com/cenkalti/backoff/v4 v4.3.0 h1:MyRJ/UdXutAwSAT+s3wNd7MfTIcy71VQueUuFK343L8=
|
||||||
|
github.com/cenkalti/backoff/v4 v4.3.0/go.mod h1:Y3VNntkOUPxTVeUxJ/G5vcM//AlwfmyYozVcomhLiZE=
|
||||||
github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU=
|
github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU=
|
||||||
github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
|
github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
|
||||||
github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
|
github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
|
||||||
@@ -38,6 +42,11 @@ github.com/envoyproxy/go-control-plane v0.9.0/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymF
|
|||||||
github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymFceY/DCBVvsKhRF0iEA4=
|
github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymFceY/DCBVvsKhRF0iEA4=
|
||||||
github.com/envoyproxy/go-control-plane v0.9.4/go.mod h1:6rpuAdCZL397s3pYoYcLgu1mIlRU8Am5FuJP05cCM98=
|
github.com/envoyproxy/go-control-plane v0.9.4/go.mod h1:6rpuAdCZL397s3pYoYcLgu1mIlRU8Am5FuJP05cCM98=
|
||||||
github.com/envoyproxy/protoc-gen-validate v0.1.0/go.mod h1:iSmxcyjqTsJpI2R4NaDN7+kN2VEUnK/pcBlmesArF7c=
|
github.com/envoyproxy/protoc-gen-validate v0.1.0/go.mod h1:iSmxcyjqTsJpI2R4NaDN7+kN2VEUnK/pcBlmesArF7c=
|
||||||
|
github.com/go-logr/logr v1.2.2/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
|
||||||
|
github.com/go-logr/logr v1.4.2 h1:6pFjapn8bFcIbiKo3XT4j/BhANplGihG6tvd+8rYgrY=
|
||||||
|
github.com/go-logr/logr v1.4.2/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY=
|
||||||
|
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
|
||||||
|
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
|
||||||
github.com/go-sql-driver/mysql v1.9.3 h1:U/N249h2WzJ3Ukj8SowVFjdtZKfu9vlLZxjPXV1aweo=
|
github.com/go-sql-driver/mysql v1.9.3 h1:U/N249h2WzJ3Ukj8SowVFjdtZKfu9vlLZxjPXV1aweo=
|
||||||
github.com/go-sql-driver/mysql v1.9.3/go.mod h1:qn46aNg1333BRMNU69Lq93t8du/dwxI64Gl8i5p1WMU=
|
github.com/go-sql-driver/mysql v1.9.3/go.mod h1:qn46aNg1333BRMNU69Lq93t8du/dwxI64Gl8i5p1WMU=
|
||||||
github.com/golang-jwt/jwt/v5 v5.3.1 h1:kYf81DTWFe7t+1VvL7eS+jKFVWaUnK9cB1qbwn63YCY=
|
github.com/golang-jwt/jwt/v5 v5.3.1 h1:kYf81DTWFe7t+1VvL7eS+jKFVWaUnK9cB1qbwn63YCY=
|
||||||
@@ -73,6 +82,8 @@ github.com/googleapis/enterprise-certificate-proxy v0.3.4 h1:XYIDZApgAnrN1c855gT
|
|||||||
github.com/googleapis/enterprise-certificate-proxy v0.3.4/go.mod h1:YKe7cfqYXjKGpGvmSg28/fFvhNzinZQm8DGnaburhGA=
|
github.com/googleapis/enterprise-certificate-proxy v0.3.4/go.mod h1:YKe7cfqYXjKGpGvmSg28/fFvhNzinZQm8DGnaburhGA=
|
||||||
github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
|
github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
|
||||||
github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
|
github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
|
||||||
|
github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0 h1:asbCHRVmodnJTuQ3qamDwqVOIjwqUPTYmYuemVOx+Ys=
|
||||||
|
github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0/go.mod h1:ggCgvZ2r7uOoQjOyu2Y1NhHmEPPzzuhWgcza5M1Ji1I=
|
||||||
github.com/jackc/pgpassfile v1.0.0 h1:/6Hmqy13Ss2zCq62VdNG8tM1wchn8zjSGOBJ6icpsIM=
|
github.com/jackc/pgpassfile v1.0.0 h1:/6Hmqy13Ss2zCq62VdNG8tM1wchn8zjSGOBJ6icpsIM=
|
||||||
github.com/jackc/pgpassfile v1.0.0/go.mod h1:CEx0iS5ambNFdcRtxPj5JhEz+xB6uRky5eyVu/W2HEg=
|
github.com/jackc/pgpassfile v1.0.0/go.mod h1:CEx0iS5ambNFdcRtxPj5JhEz+xB6uRky5eyVu/W2HEg=
|
||||||
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 h1:iCEnooe7UlwOQYpKFhBabPMi4aNAfoODPEFNiAnClxo=
|
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 h1:iCEnooe7UlwOQYpKFhBabPMi4aNAfoODPEFNiAnClxo=
|
||||||
@@ -97,7 +108,15 @@ github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c h1:+mdjkGKdHQG3305AYmd
|
|||||||
github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c/go.mod h1:7rwL4CYBLnjLxUqIJNnCWiEdr3bn6IUYi15bNlnbCCU=
|
github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c/go.mod h1:7rwL4CYBLnjLxUqIJNnCWiEdr3bn6IUYi15bNlnbCCU=
|
||||||
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
|
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
|
||||||
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
|
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
|
||||||
|
github.com/prometheus/client_golang v1.19.0 h1:ygXvpU1AoN1MhdzckN+PyD9QJOSD4x7kmXYlnfbA6JU=
|
||||||
|
github.com/prometheus/client_golang v1.19.0/go.mod h1:ZRM9uEAypZakd+q/x7+gmsvXdURP+DABIEIjnmDdp+k=
|
||||||
github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
|
github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
|
||||||
|
github.com/prometheus/client_model v0.5.0 h1:VQw1hfvPvk3Uv6Qf29VrPF32JB6rtbgI6cYPYQjL0Qw=
|
||||||
|
github.com/prometheus/client_model v0.5.0/go.mod h1:dTiFglRmd66nLR9Pv9f0mZi7B7fk5Pm3gvsjB5tr+kI=
|
||||||
|
github.com/prometheus/common v0.48.0 h1:QO8U2CdOzSn1BBsmXJXduaaW+dY/5QLjfB8svtSzKKE=
|
||||||
|
github.com/prometheus/common v0.48.0/go.mod h1:0/KsvlIEfPQCQ5I2iNSAWKPZziNCvRs5EC6ILDTlAPc=
|
||||||
|
github.com/prometheus/procfs v0.12.0 h1:jluTpSng7V9hY0O2R9DzzJHYb2xULk9VTR1V1R/k6Bo=
|
||||||
|
github.com/prometheus/procfs v0.12.0/go.mod h1:pcuDEFsWDnvcgNzo4EEweacyhjeA9Zk3cnaOZAZEfOo=
|
||||||
github.com/redis/go-redis/v9 v9.18.0 h1:pMkxYPkEbMPwRdenAzUNyFNrDgHx9U+DrBabWNfSRQs=
|
github.com/redis/go-redis/v9 v9.18.0 h1:pMkxYPkEbMPwRdenAzUNyFNrDgHx9U+DrBabWNfSRQs=
|
||||||
github.com/redis/go-redis/v9 v9.18.0/go.mod h1:k3ufPphLU5YXwNTUcCRXGxUoF1fqxnhFQmscfkCoDA0=
|
github.com/redis/go-redis/v9 v9.18.0/go.mod h1:k3ufPphLU5YXwNTUcCRXGxUoF1fqxnhFQmscfkCoDA0=
|
||||||
github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
|
github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
|
||||||
@@ -126,8 +145,26 @@ github.com/zeebo/xxh3 v1.0.2 h1:xZmwmqxHZA8AI603jOQ0tMqmBr9lPeFwGg6d+xy9DC0=
|
|||||||
github.com/zeebo/xxh3 v1.0.2/go.mod h1:5NWz9Sef7zIDm2JHfFlcQvNekmcEl9ekUZQQKCYaDcA=
|
github.com/zeebo/xxh3 v1.0.2/go.mod h1:5NWz9Sef7zIDm2JHfFlcQvNekmcEl9ekUZQQKCYaDcA=
|
||||||
go.opencensus.io v0.24.0 h1:y73uSU6J157QMP2kn2r30vwW1A2W2WFwSCGnAVxeaD0=
|
go.opencensus.io v0.24.0 h1:y73uSU6J157QMP2kn2r30vwW1A2W2WFwSCGnAVxeaD0=
|
||||||
go.opencensus.io v0.24.0/go.mod h1:vNK8G9p7aAivkbmorf4v+7Hgx+Zs0yY+0fOtgBfjQKo=
|
go.opencensus.io v0.24.0/go.mod h1:vNK8G9p7aAivkbmorf4v+7Hgx+Zs0yY+0fOtgBfjQKo=
|
||||||
|
go.opentelemetry.io/otel v1.29.0 h1:PdomN/Al4q/lN6iBJEN3AwPvUiHPMlt93c8bqTG5Llw=
|
||||||
|
go.opentelemetry.io/otel v1.29.0/go.mod h1:N/WtXPs1CNCUEx+Agz5uouwCba+i+bJGFicT8SR4NP8=
|
||||||
|
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.29.0 h1:dIIDULZJpgdiHz5tXrTgKIMLkus6jEFa7x5SOKcyR7E=
|
||||||
|
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.29.0/go.mod h1:jlRVBe7+Z1wyxFSUs48L6OBQZ5JwH2Hg/Vbl+t9rAgI=
|
||||||
|
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.29.0 h1:nSiV3s7wiCam610XcLbYOmMfJxB9gO4uK3Xgv5gmTgg=
|
||||||
|
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.29.0/go.mod h1:hKn/e/Nmd19/x1gvIHwtOwVWM+VhuITSWip3JUDghj0=
|
||||||
|
go.opentelemetry.io/otel/exporters/stdout/stdouttrace v1.29.0 h1:X3ZjNp36/WlkSYx0ul2jw4PtbNEDDeLskw3VPsrpYM0=
|
||||||
|
go.opentelemetry.io/otel/exporters/stdout/stdouttrace v1.29.0/go.mod h1:2uL/xnOXh0CHOBFCWXz5u1A4GXLiW+0IQIzVbeOEQ0U=
|
||||||
|
go.opentelemetry.io/otel/metric v1.29.0 h1:vPf/HFWTNkPu1aYeIsc98l4ktOQaL6LeSoeV2g+8YLc=
|
||||||
|
go.opentelemetry.io/otel/metric v1.29.0/go.mod h1:auu/QWieFVWx+DmQOUMgj0F8LHWdgalxXqvp7BII/W8=
|
||||||
|
go.opentelemetry.io/otel/sdk v1.29.0 h1:vkqKjk7gwhS8VaWb0POZKmIEDimRCMsopNYnriHyryo=
|
||||||
|
go.opentelemetry.io/otel/sdk v1.29.0/go.mod h1:pM8Dx5WKnvxLCb+8lG1PRNIDxu9g9b9g59Qr7hfAAok=
|
||||||
|
go.opentelemetry.io/otel/trace v1.29.0 h1:J/8ZNK4XgR7a21DZUAsbF8pZ5Jcw1VhACmnYt39JTi4=
|
||||||
|
go.opentelemetry.io/otel/trace v1.29.0/go.mod h1:eHl3w0sp3paPkYstJOmAimxhiFXPg+MMTlEh3nsQgWQ=
|
||||||
|
go.opentelemetry.io/proto/otlp v1.3.1 h1:TrMUixzpM0yuc/znrFTP9MMRh8trP93mkCiDVeXrui0=
|
||||||
|
go.opentelemetry.io/proto/otlp v1.3.1/go.mod h1:0X1WI4de4ZsLrrJNLAQbFeLCm3T7yBkR0XqQ7niQU+8=
|
||||||
go.uber.org/atomic v1.11.0 h1:ZvwS0R+56ePWxUNi+Atn9dWONBPp/AUETXlHW0DxSjE=
|
go.uber.org/atomic v1.11.0 h1:ZvwS0R+56ePWxUNi+Atn9dWONBPp/AUETXlHW0DxSjE=
|
||||||
go.uber.org/atomic v1.11.0/go.mod h1:LUxbIzbOniOlMKjJjyPfpl4v+PKK2cNJn91OQbhoJI0=
|
go.uber.org/atomic v1.11.0/go.mod h1:LUxbIzbOniOlMKjJjyPfpl4v+PKK2cNJn91OQbhoJI0=
|
||||||
|
go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
|
||||||
|
go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
|
||||||
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
|
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
|
||||||
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
|
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
|
||||||
golang.org/x/crypto v0.47.0 h1:V6e3FRj+n4dbpw86FJ8Fv7XVOql7TEwpHapKoMJ/GO8=
|
golang.org/x/crypto v0.47.0 h1:V6e3FRj+n4dbpw86FJ8Fv7XVOql7TEwpHapKoMJ/GO8=
|
||||||
@@ -175,6 +212,8 @@ google.golang.org/genai v1.48.0/go.mod h1:A3kkl0nyBjyFlNjgxIwKq70julKbIxpSxqKO5g
|
|||||||
google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc=
|
google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc=
|
||||||
google.golang.org/genproto v0.0.0-20190819201941-24fa4b261c55/go.mod h1:DMBHOl98Agz4BDEuKkezgsaosCRResVns1a3J2ZsMNc=
|
google.golang.org/genproto v0.0.0-20190819201941-24fa4b261c55/go.mod h1:DMBHOl98Agz4BDEuKkezgsaosCRResVns1a3J2ZsMNc=
|
||||||
google.golang.org/genproto v0.0.0-20200526211855-cb27e3aa2013/go.mod h1:NbSheEEYHJ7i3ixzK3sjbqSGDJWnxyFXZblF3eUsNvo=
|
google.golang.org/genproto v0.0.0-20200526211855-cb27e3aa2013/go.mod h1:NbSheEEYHJ7i3ixzK3sjbqSGDJWnxyFXZblF3eUsNvo=
|
||||||
|
google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1 h1:hjSy6tcFQZ171igDaN5QHOw2n6vx40juYbC/x67CEhc=
|
||||||
|
google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1/go.mod h1:qpvKtACPCQhAdu3PyQgV4l3LMXZEtft7y8QcarRsp9I=
|
||||||
google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1 h1:pPJltXNxVzT4pK9yD8vR9X75DaWYYmLGMsEvBfFQZzQ=
|
google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1 h1:pPJltXNxVzT4pK9yD8vR9X75DaWYYmLGMsEvBfFQZzQ=
|
||||||
google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1/go.mod h1:UqMtugtsSgubUsoxbuAoiCXvqvErP7Gf0so0mK9tHxU=
|
google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1/go.mod h1:UqMtugtsSgubUsoxbuAoiCXvqvErP7Gf0so0mK9tHxU=
|
||||||
google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c=
|
google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c=
|
||||||
@@ -198,8 +237,8 @@ google.golang.org/protobuf v1.34.2/go.mod h1:qYOHts0dSfpeUzUFpOMr/WGzszTmLH+DiWn
|
|||||||
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
|
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
|
||||||
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
|
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
|
||||||
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
|
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
|
||||||
gopkg.in/yaml.v2 v2.2.8 h1:obN1ZagJSUGI0Ek/LBmuj4SNLPfIny3KsKFopxRdj10=
|
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
|
||||||
gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
|
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
|
||||||
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||||
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
|
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
|
||||||
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||||
|
|||||||
@@ -16,6 +16,7 @@ type Config struct {
|
|||||||
Conversations ConversationConfig `yaml:"conversations"`
|
Conversations ConversationConfig `yaml:"conversations"`
|
||||||
Logging LoggingConfig `yaml:"logging"`
|
Logging LoggingConfig `yaml:"logging"`
|
||||||
RateLimit RateLimitConfig `yaml:"rate_limit"`
|
RateLimit RateLimitConfig `yaml:"rate_limit"`
|
||||||
|
Observability ObservabilityConfig `yaml:"observability"`
|
||||||
}
|
}
|
||||||
|
|
||||||
// ConversationConfig controls conversation storage.
|
// ConversationConfig controls conversation storage.
|
||||||
@@ -50,6 +51,41 @@ type RateLimitConfig struct {
|
|||||||
Burst int `yaml:"burst"`
|
Burst int `yaml:"burst"`
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ObservabilityConfig controls observability features.
|
||||||
|
type ObservabilityConfig struct {
|
||||||
|
Enabled bool `yaml:"enabled"`
|
||||||
|
Metrics MetricsConfig `yaml:"metrics"`
|
||||||
|
Tracing TracingConfig `yaml:"tracing"`
|
||||||
|
}
|
||||||
|
|
||||||
|
// MetricsConfig controls Prometheus metrics.
|
||||||
|
type MetricsConfig struct {
|
||||||
|
Enabled bool `yaml:"enabled"`
|
||||||
|
Path string `yaml:"path"` // default: "/metrics"
|
||||||
|
}
|
||||||
|
|
||||||
|
// TracingConfig controls OpenTelemetry tracing.
|
||||||
|
type TracingConfig struct {
|
||||||
|
Enabled bool `yaml:"enabled"`
|
||||||
|
ServiceName string `yaml:"service_name"` // default: "llm-gateway"
|
||||||
|
Sampler SamplerConfig `yaml:"sampler"`
|
||||||
|
Exporter ExporterConfig `yaml:"exporter"`
|
||||||
|
}
|
||||||
|
|
||||||
|
// SamplerConfig controls trace sampling.
|
||||||
|
type SamplerConfig struct {
|
||||||
|
Type string `yaml:"type"` // "always", "never", "probability"
|
||||||
|
Rate float64 `yaml:"rate"` // 0.0 to 1.0
|
||||||
|
}
|
||||||
|
|
||||||
|
// ExporterConfig controls trace exporters.
|
||||||
|
type ExporterConfig struct {
|
||||||
|
Type string `yaml:"type"` // "otlp", "stdout"
|
||||||
|
Endpoint string `yaml:"endpoint"`
|
||||||
|
Insecure bool `yaml:"insecure"`
|
||||||
|
Headers map[string]string `yaml:"headers"`
|
||||||
|
}
|
||||||
|
|
||||||
// AuthConfig holds OIDC authentication settings.
|
// AuthConfig holds OIDC authentication settings.
|
||||||
type AuthConfig struct {
|
type AuthConfig struct {
|
||||||
Enabled bool `yaml:"enabled"`
|
Enabled bool `yaml:"enabled"`
|
||||||
|
|||||||
@@ -4,6 +4,8 @@ import (
|
|||||||
"context"
|
"context"
|
||||||
"log/slog"
|
"log/slog"
|
||||||
"os"
|
"os"
|
||||||
|
|
||||||
|
"go.opentelemetry.io/otel/trace"
|
||||||
)
|
)
|
||||||
|
|
||||||
type contextKey string
|
type contextKey string
|
||||||
@@ -57,3 +59,15 @@ func FromContext(ctx context.Context) string {
|
|||||||
}
|
}
|
||||||
return ""
|
return ""
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// LogAttrsWithTrace adds trace context to log attributes for correlation.
|
||||||
|
func LogAttrsWithTrace(ctx context.Context, attrs ...any) []any {
|
||||||
|
spanCtx := trace.SpanFromContext(ctx).SpanContext()
|
||||||
|
if spanCtx.IsValid() {
|
||||||
|
attrs = append(attrs,
|
||||||
|
slog.String("trace_id", spanCtx.TraceID().String()),
|
||||||
|
slog.String("span_id", spanCtx.SpanID().String()),
|
||||||
|
)
|
||||||
|
}
|
||||||
|
return attrs
|
||||||
|
}
|
||||||
|
|||||||
98
internal/observability/init.go
Normal file
98
internal/observability/init.go
Normal file
@@ -0,0 +1,98 @@
|
|||||||
|
package observability
|
||||||
|
|
||||||
|
import (
|
||||||
|
"github.com/ajac-zero/latticelm/internal/conversation"
|
||||||
|
"github.com/ajac-zero/latticelm/internal/providers"
|
||||||
|
"github.com/prometheus/client_golang/prometheus"
|
||||||
|
sdktrace "go.opentelemetry.io/otel/sdk/trace"
|
||||||
|
)
|
||||||
|
|
||||||
|
// ProviderRegistry defines the interface for provider registries.
|
||||||
|
// This matches the interface expected by the server.
|
||||||
|
type ProviderRegistry interface {
|
||||||
|
Get(name string) (providers.Provider, bool)
|
||||||
|
Models() []struct{ Provider, Model string }
|
||||||
|
ResolveModelID(model string) string
|
||||||
|
Default(model string) (providers.Provider, error)
|
||||||
|
}
|
||||||
|
|
||||||
|
// WrapProviderRegistry wraps all providers in a registry with observability.
|
||||||
|
func WrapProviderRegistry(registry ProviderRegistry, metricsRegistry *prometheus.Registry, tp *sdktrace.TracerProvider) ProviderRegistry {
|
||||||
|
if registry == nil {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// We can't directly modify the registry's internal map, so we'll need to
|
||||||
|
// wrap providers as they're retrieved. Instead, create a new instrumented registry.
|
||||||
|
return &InstrumentedRegistry{
|
||||||
|
base: registry,
|
||||||
|
metrics: metricsRegistry,
|
||||||
|
tracer: tp,
|
||||||
|
wrappedProviders: make(map[string]providers.Provider),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// InstrumentedRegistry wraps a provider registry to return instrumented providers.
|
||||||
|
type InstrumentedRegistry struct {
|
||||||
|
base ProviderRegistry
|
||||||
|
metrics *prometheus.Registry
|
||||||
|
tracer *sdktrace.TracerProvider
|
||||||
|
wrappedProviders map[string]providers.Provider
|
||||||
|
}
|
||||||
|
|
||||||
|
// Get returns an instrumented provider by entry name.
|
||||||
|
func (r *InstrumentedRegistry) Get(name string) (providers.Provider, bool) {
|
||||||
|
// Check if we've already wrapped this provider
|
||||||
|
if wrapped, ok := r.wrappedProviders[name]; ok {
|
||||||
|
return wrapped, true
|
||||||
|
}
|
||||||
|
|
||||||
|
// Get the base provider
|
||||||
|
p, ok := r.base.Get(name)
|
||||||
|
if !ok {
|
||||||
|
return nil, false
|
||||||
|
}
|
||||||
|
|
||||||
|
// Wrap it
|
||||||
|
wrapped := NewInstrumentedProvider(p, r.metrics, r.tracer)
|
||||||
|
r.wrappedProviders[name] = wrapped
|
||||||
|
return wrapped, true
|
||||||
|
}
|
||||||
|
|
||||||
|
// Default returns the instrumented provider for the given model name.
|
||||||
|
func (r *InstrumentedRegistry) Default(model string) (providers.Provider, error) {
|
||||||
|
p, err := r.base.Default(model)
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
|
||||||
|
// Check if we've already wrapped this provider
|
||||||
|
name := p.Name()
|
||||||
|
if wrapped, ok := r.wrappedProviders[name]; ok {
|
||||||
|
return wrapped, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// Wrap it
|
||||||
|
wrapped := NewInstrumentedProvider(p, r.metrics, r.tracer)
|
||||||
|
r.wrappedProviders[name] = wrapped
|
||||||
|
return wrapped, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// Models returns the list of configured models and their provider entry names.
|
||||||
|
func (r *InstrumentedRegistry) Models() []struct{ Provider, Model string } {
|
||||||
|
return r.base.Models()
|
||||||
|
}
|
||||||
|
|
||||||
|
// ResolveModelID returns the provider_model_id for a model.
|
||||||
|
func (r *InstrumentedRegistry) ResolveModelID(model string) string {
|
||||||
|
return r.base.ResolveModelID(model)
|
||||||
|
}
|
||||||
|
|
||||||
|
// WrapConversationStore wraps a conversation store with observability.
|
||||||
|
func WrapConversationStore(store conversation.Store, backend string, metricsRegistry *prometheus.Registry, tp *sdktrace.TracerProvider) conversation.Store {
|
||||||
|
if store == nil {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
return NewInstrumentedStore(store, backend, metricsRegistry, tp)
|
||||||
|
}
|
||||||
147
internal/observability/metrics.go
Normal file
147
internal/observability/metrics.go
Normal file
@@ -0,0 +1,147 @@
|
|||||||
|
package observability
|
||||||
|
|
||||||
|
import (
|
||||||
|
"github.com/prometheus/client_golang/prometheus"
|
||||||
|
)
|
||||||
|
|
||||||
|
var (
|
||||||
|
// HTTP Metrics
|
||||||
|
httpRequestsTotal = prometheus.NewCounterVec(
|
||||||
|
prometheus.CounterOpts{
|
||||||
|
Name: "http_requests_total",
|
||||||
|
Help: "Total number of HTTP requests",
|
||||||
|
},
|
||||||
|
[]string{"method", "path", "status"},
|
||||||
|
)
|
||||||
|
|
||||||
|
httpRequestDuration = prometheus.NewHistogramVec(
|
||||||
|
prometheus.HistogramOpts{
|
||||||
|
Name: "http_request_duration_seconds",
|
||||||
|
Help: "HTTP request latency in seconds",
|
||||||
|
Buckets: []float64{0.01, 0.05, 0.1, 0.5, 1, 2.5, 5, 10, 30},
|
||||||
|
},
|
||||||
|
[]string{"method", "path", "status"},
|
||||||
|
)
|
||||||
|
|
||||||
|
httpRequestSize = prometheus.NewHistogramVec(
|
||||||
|
prometheus.HistogramOpts{
|
||||||
|
Name: "http_request_size_bytes",
|
||||||
|
Help: "HTTP request size in bytes",
|
||||||
|
Buckets: prometheus.ExponentialBuckets(100, 10, 7), // 100B to 100MB
|
||||||
|
},
|
||||||
|
[]string{"method", "path"},
|
||||||
|
)
|
||||||
|
|
||||||
|
httpResponseSize = prometheus.NewHistogramVec(
|
||||||
|
prometheus.HistogramOpts{
|
||||||
|
Name: "http_response_size_bytes",
|
||||||
|
Help: "HTTP response size in bytes",
|
||||||
|
Buckets: prometheus.ExponentialBuckets(100, 10, 7), // 100B to 100MB
|
||||||
|
},
|
||||||
|
[]string{"method", "path"},
|
||||||
|
)
|
||||||
|
|
||||||
|
// Provider Metrics
|
||||||
|
providerRequestsTotal = prometheus.NewCounterVec(
|
||||||
|
prometheus.CounterOpts{
|
||||||
|
Name: "provider_requests_total",
|
||||||
|
Help: "Total number of provider requests",
|
||||||
|
},
|
||||||
|
[]string{"provider", "model", "operation", "status"},
|
||||||
|
)
|
||||||
|
|
||||||
|
providerRequestDuration = prometheus.NewHistogramVec(
|
||||||
|
prometheus.HistogramOpts{
|
||||||
|
Name: "provider_request_duration_seconds",
|
||||||
|
Help: "Provider request latency in seconds",
|
||||||
|
Buckets: []float64{0.1, 0.5, 1, 2, 5, 10, 20, 30, 60},
|
||||||
|
},
|
||||||
|
[]string{"provider", "model", "operation"},
|
||||||
|
)
|
||||||
|
|
||||||
|
providerTokensTotal = prometheus.NewCounterVec(
|
||||||
|
prometheus.CounterOpts{
|
||||||
|
Name: "provider_tokens_total",
|
||||||
|
Help: "Total number of tokens processed",
|
||||||
|
},
|
||||||
|
[]string{"provider", "model", "type"}, // type: input, output
|
||||||
|
)
|
||||||
|
|
||||||
|
providerStreamTTFB = prometheus.NewHistogramVec(
|
||||||
|
prometheus.HistogramOpts{
|
||||||
|
Name: "provider_stream_ttfb_seconds",
|
||||||
|
Help: "Time to first byte for streaming requests in seconds",
|
||||||
|
Buckets: []float64{0.05, 0.1, 0.5, 1, 2, 5, 10},
|
||||||
|
},
|
||||||
|
[]string{"provider", "model"},
|
||||||
|
)
|
||||||
|
|
||||||
|
providerStreamChunks = prometheus.NewCounterVec(
|
||||||
|
prometheus.CounterOpts{
|
||||||
|
Name: "provider_stream_chunks_total",
|
||||||
|
Help: "Total number of stream chunks received",
|
||||||
|
},
|
||||||
|
[]string{"provider", "model"},
|
||||||
|
)
|
||||||
|
|
||||||
|
providerStreamDuration = prometheus.NewHistogramVec(
|
||||||
|
prometheus.HistogramOpts{
|
||||||
|
Name: "provider_stream_duration_seconds",
|
||||||
|
Help: "Total duration of streaming requests in seconds",
|
||||||
|
Buckets: []float64{0.1, 0.5, 1, 2, 5, 10, 20, 30, 60},
|
||||||
|
},
|
||||||
|
[]string{"provider", "model"},
|
||||||
|
)
|
||||||
|
|
||||||
|
// Conversation Store Metrics
|
||||||
|
conversationOperationsTotal = prometheus.NewCounterVec(
|
||||||
|
prometheus.CounterOpts{
|
||||||
|
Name: "conversation_operations_total",
|
||||||
|
Help: "Total number of conversation store operations",
|
||||||
|
},
|
||||||
|
[]string{"operation", "backend", "status"},
|
||||||
|
)
|
||||||
|
|
||||||
|
conversationOperationDuration = prometheus.NewHistogramVec(
|
||||||
|
prometheus.HistogramOpts{
|
||||||
|
Name: "conversation_operation_duration_seconds",
|
||||||
|
Help: "Conversation store operation latency in seconds",
|
||||||
|
Buckets: []float64{0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1},
|
||||||
|
},
|
||||||
|
[]string{"operation", "backend"},
|
||||||
|
)
|
||||||
|
|
||||||
|
conversationActiveCount = prometheus.NewGaugeVec(
|
||||||
|
prometheus.GaugeOpts{
|
||||||
|
Name: "conversation_active_count",
|
||||||
|
Help: "Number of active conversations",
|
||||||
|
},
|
||||||
|
[]string{"backend"},
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
// InitMetrics registers all metrics with a new Prometheus registry.
|
||||||
|
func InitMetrics() *prometheus.Registry {
|
||||||
|
registry := prometheus.NewRegistry()
|
||||||
|
|
||||||
|
// Register HTTP metrics
|
||||||
|
registry.MustRegister(httpRequestsTotal)
|
||||||
|
registry.MustRegister(httpRequestDuration)
|
||||||
|
registry.MustRegister(httpRequestSize)
|
||||||
|
registry.MustRegister(httpResponseSize)
|
||||||
|
|
||||||
|
// Register provider metrics
|
||||||
|
registry.MustRegister(providerRequestsTotal)
|
||||||
|
registry.MustRegister(providerRequestDuration)
|
||||||
|
registry.MustRegister(providerTokensTotal)
|
||||||
|
registry.MustRegister(providerStreamTTFB)
|
||||||
|
registry.MustRegister(providerStreamChunks)
|
||||||
|
registry.MustRegister(providerStreamDuration)
|
||||||
|
|
||||||
|
// Register conversation store metrics
|
||||||
|
registry.MustRegister(conversationOperationsTotal)
|
||||||
|
registry.MustRegister(conversationOperationDuration)
|
||||||
|
registry.MustRegister(conversationActiveCount)
|
||||||
|
|
||||||
|
return registry
|
||||||
|
}
|
||||||
62
internal/observability/metrics_middleware.go
Normal file
62
internal/observability/metrics_middleware.go
Normal file
@@ -0,0 +1,62 @@
|
|||||||
|
package observability
|
||||||
|
|
||||||
|
import (
|
||||||
|
"net/http"
|
||||||
|
"strconv"
|
||||||
|
"time"
|
||||||
|
|
||||||
|
"github.com/prometheus/client_golang/prometheus"
|
||||||
|
)
|
||||||
|
|
||||||
|
// MetricsMiddleware creates a middleware that records HTTP metrics.
|
||||||
|
func MetricsMiddleware(next http.Handler, registry *prometheus.Registry, _ interface{}) http.Handler {
|
||||||
|
if registry == nil {
|
||||||
|
// If metrics are not enabled, pass through without modification
|
||||||
|
return next
|
||||||
|
}
|
||||||
|
|
||||||
|
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||||
|
start := time.Now()
|
||||||
|
|
||||||
|
// Record request size
|
||||||
|
if r.ContentLength > 0 {
|
||||||
|
httpRequestSize.WithLabelValues(r.Method, r.URL.Path).Observe(float64(r.ContentLength))
|
||||||
|
}
|
||||||
|
|
||||||
|
// Wrap response writer to capture status code and response size
|
||||||
|
wrapped := &metricsResponseWriter{
|
||||||
|
ResponseWriter: w,
|
||||||
|
statusCode: http.StatusOK,
|
||||||
|
bytesWritten: 0,
|
||||||
|
}
|
||||||
|
|
||||||
|
// Call the next handler
|
||||||
|
next.ServeHTTP(wrapped, r)
|
||||||
|
|
||||||
|
// Record metrics after request completes
|
||||||
|
duration := time.Since(start).Seconds()
|
||||||
|
status := strconv.Itoa(wrapped.statusCode)
|
||||||
|
|
||||||
|
httpRequestsTotal.WithLabelValues(r.Method, r.URL.Path, status).Inc()
|
||||||
|
httpRequestDuration.WithLabelValues(r.Method, r.URL.Path, status).Observe(duration)
|
||||||
|
httpResponseSize.WithLabelValues(r.Method, r.URL.Path).Observe(float64(wrapped.bytesWritten))
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
// metricsResponseWriter wraps http.ResponseWriter to capture status code and bytes written.
|
||||||
|
type metricsResponseWriter struct {
|
||||||
|
http.ResponseWriter
|
||||||
|
statusCode int
|
||||||
|
bytesWritten int
|
||||||
|
}
|
||||||
|
|
||||||
|
func (w *metricsResponseWriter) WriteHeader(statusCode int) {
|
||||||
|
w.statusCode = statusCode
|
||||||
|
w.ResponseWriter.WriteHeader(statusCode)
|
||||||
|
}
|
||||||
|
|
||||||
|
func (w *metricsResponseWriter) Write(b []byte) (int, error) {
|
||||||
|
n, err := w.ResponseWriter.Write(b)
|
||||||
|
w.bytesWritten += n
|
||||||
|
return n, err
|
||||||
|
}
|
||||||
208
internal/observability/provider_wrapper.go
Normal file
208
internal/observability/provider_wrapper.go
Normal file
@@ -0,0 +1,208 @@
|
|||||||
|
package observability
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"time"
|
||||||
|
|
||||||
|
"github.com/ajac-zero/latticelm/internal/api"
|
||||||
|
"github.com/ajac-zero/latticelm/internal/providers"
|
||||||
|
"github.com/prometheus/client_golang/prometheus"
|
||||||
|
"go.opentelemetry.io/otel/attribute"
|
||||||
|
"go.opentelemetry.io/otel/codes"
|
||||||
|
sdktrace "go.opentelemetry.io/otel/sdk/trace"
|
||||||
|
"go.opentelemetry.io/otel/trace"
|
||||||
|
)
|
||||||
|
|
||||||
|
// InstrumentedProvider wraps a provider with metrics and tracing.
|
||||||
|
type InstrumentedProvider struct {
|
||||||
|
base providers.Provider
|
||||||
|
registry *prometheus.Registry
|
||||||
|
tracer trace.Tracer
|
||||||
|
}
|
||||||
|
|
||||||
|
// NewInstrumentedProvider wraps a provider with observability.
|
||||||
|
func NewInstrumentedProvider(p providers.Provider, registry *prometheus.Registry, tp *sdktrace.TracerProvider) providers.Provider {
|
||||||
|
var tracer trace.Tracer
|
||||||
|
if tp != nil {
|
||||||
|
tracer = tp.Tracer("llm-gateway")
|
||||||
|
}
|
||||||
|
|
||||||
|
return &InstrumentedProvider{
|
||||||
|
base: p,
|
||||||
|
registry: registry,
|
||||||
|
tracer: tracer,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Name returns the name of the underlying provider.
|
||||||
|
func (p *InstrumentedProvider) Name() string {
|
||||||
|
return p.base.Name()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Generate wraps the provider's Generate method with metrics and tracing.
|
||||||
|
func (p *InstrumentedProvider) Generate(ctx context.Context, messages []api.Message, req *api.ResponseRequest) (*api.ProviderResult, error) {
|
||||||
|
// Start span if tracing is enabled
|
||||||
|
if p.tracer != nil {
|
||||||
|
var span trace.Span
|
||||||
|
ctx, span = p.tracer.Start(ctx, "provider.generate",
|
||||||
|
trace.WithSpanKind(trace.SpanKindClient),
|
||||||
|
trace.WithAttributes(
|
||||||
|
attribute.String("provider.name", p.base.Name()),
|
||||||
|
attribute.String("provider.model", req.Model),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
defer span.End()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record start time
|
||||||
|
start := time.Now()
|
||||||
|
|
||||||
|
// Call underlying provider
|
||||||
|
result, err := p.base.Generate(ctx, messages, req)
|
||||||
|
|
||||||
|
// Record metrics
|
||||||
|
duration := time.Since(start).Seconds()
|
||||||
|
status := "success"
|
||||||
|
if err != nil {
|
||||||
|
status = "error"
|
||||||
|
if p.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.RecordError(err)
|
||||||
|
span.SetStatus(codes.Error, err.Error())
|
||||||
|
}
|
||||||
|
} else if result != nil {
|
||||||
|
// Add token attributes to span
|
||||||
|
if p.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.SetAttributes(
|
||||||
|
attribute.Int64("provider.input_tokens", int64(result.Usage.InputTokens)),
|
||||||
|
attribute.Int64("provider.output_tokens", int64(result.Usage.OutputTokens)),
|
||||||
|
attribute.Int64("provider.total_tokens", int64(result.Usage.TotalTokens)),
|
||||||
|
)
|
||||||
|
span.SetStatus(codes.Ok, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record token metrics
|
||||||
|
if p.registry != nil {
|
||||||
|
providerTokensTotal.WithLabelValues(p.base.Name(), req.Model, "input").Add(float64(result.Usage.InputTokens))
|
||||||
|
providerTokensTotal.WithLabelValues(p.base.Name(), req.Model, "output").Add(float64(result.Usage.OutputTokens))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record request metrics
|
||||||
|
if p.registry != nil {
|
||||||
|
providerRequestsTotal.WithLabelValues(p.base.Name(), req.Model, "generate", status).Inc()
|
||||||
|
providerRequestDuration.WithLabelValues(p.base.Name(), req.Model, "generate").Observe(duration)
|
||||||
|
}
|
||||||
|
|
||||||
|
return result, err
|
||||||
|
}
|
||||||
|
|
||||||
|
// GenerateStream wraps the provider's GenerateStream method with metrics and tracing.
|
||||||
|
func (p *InstrumentedProvider) GenerateStream(ctx context.Context, messages []api.Message, req *api.ResponseRequest) (<-chan *api.ProviderStreamDelta, <-chan error) {
|
||||||
|
// Start span if tracing is enabled
|
||||||
|
if p.tracer != nil {
|
||||||
|
var span trace.Span
|
||||||
|
ctx, span = p.tracer.Start(ctx, "provider.generate_stream",
|
||||||
|
trace.WithSpanKind(trace.SpanKindClient),
|
||||||
|
trace.WithAttributes(
|
||||||
|
attribute.String("provider.name", p.base.Name()),
|
||||||
|
attribute.String("provider.model", req.Model),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
defer span.End()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record start time
|
||||||
|
start := time.Now()
|
||||||
|
var ttfb time.Duration
|
||||||
|
firstChunk := true
|
||||||
|
|
||||||
|
// Create instrumented channels
|
||||||
|
baseChan, baseErrChan := p.base.GenerateStream(ctx, messages, req)
|
||||||
|
outChan := make(chan *api.ProviderStreamDelta)
|
||||||
|
outErrChan := make(chan error, 1)
|
||||||
|
|
||||||
|
// Metrics tracking
|
||||||
|
var chunkCount int64
|
||||||
|
var totalInputTokens, totalOutputTokens int64
|
||||||
|
var streamErr error
|
||||||
|
|
||||||
|
go func() {
|
||||||
|
defer close(outChan)
|
||||||
|
defer close(outErrChan)
|
||||||
|
|
||||||
|
for {
|
||||||
|
select {
|
||||||
|
case delta, ok := <-baseChan:
|
||||||
|
if !ok {
|
||||||
|
// Stream finished - record final metrics
|
||||||
|
duration := time.Since(start).Seconds()
|
||||||
|
status := "success"
|
||||||
|
if streamErr != nil {
|
||||||
|
status = "error"
|
||||||
|
if p.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.RecordError(streamErr)
|
||||||
|
span.SetStatus(codes.Error, streamErr.Error())
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if p.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.SetAttributes(
|
||||||
|
attribute.Int64("provider.input_tokens", totalInputTokens),
|
||||||
|
attribute.Int64("provider.output_tokens", totalOutputTokens),
|
||||||
|
attribute.Int64("provider.chunk_count", chunkCount),
|
||||||
|
attribute.Float64("provider.ttfb_seconds", ttfb.Seconds()),
|
||||||
|
)
|
||||||
|
span.SetStatus(codes.Ok, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record token metrics
|
||||||
|
if p.registry != nil && (totalInputTokens > 0 || totalOutputTokens > 0) {
|
||||||
|
providerTokensTotal.WithLabelValues(p.base.Name(), req.Model, "input").Add(float64(totalInputTokens))
|
||||||
|
providerTokensTotal.WithLabelValues(p.base.Name(), req.Model, "output").Add(float64(totalOutputTokens))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record stream metrics
|
||||||
|
if p.registry != nil {
|
||||||
|
providerRequestsTotal.WithLabelValues(p.base.Name(), req.Model, "generate_stream", status).Inc()
|
||||||
|
providerStreamDuration.WithLabelValues(p.base.Name(), req.Model).Observe(duration)
|
||||||
|
providerStreamChunks.WithLabelValues(p.base.Name(), req.Model).Add(float64(chunkCount))
|
||||||
|
if ttfb > 0 {
|
||||||
|
providerStreamTTFB.WithLabelValues(p.base.Name(), req.Model).Observe(ttfb.Seconds())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record TTFB on first chunk
|
||||||
|
if firstChunk {
|
||||||
|
ttfb = time.Since(start)
|
||||||
|
firstChunk = false
|
||||||
|
}
|
||||||
|
|
||||||
|
chunkCount++
|
||||||
|
|
||||||
|
// Track token usage
|
||||||
|
if delta.Usage != nil {
|
||||||
|
totalInputTokens = int64(delta.Usage.InputTokens)
|
||||||
|
totalOutputTokens = int64(delta.Usage.OutputTokens)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Forward the delta
|
||||||
|
outChan <- delta
|
||||||
|
|
||||||
|
case err, ok := <-baseErrChan:
|
||||||
|
if ok && err != nil {
|
||||||
|
streamErr = err
|
||||||
|
outErrChan <- err
|
||||||
|
}
|
||||||
|
return
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}()
|
||||||
|
|
||||||
|
return outChan, outErrChan
|
||||||
|
}
|
||||||
258
internal/observability/store_wrapper.go
Normal file
258
internal/observability/store_wrapper.go
Normal file
@@ -0,0 +1,258 @@
|
|||||||
|
package observability
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"time"
|
||||||
|
|
||||||
|
"github.com/ajac-zero/latticelm/internal/api"
|
||||||
|
"github.com/ajac-zero/latticelm/internal/conversation"
|
||||||
|
"github.com/prometheus/client_golang/prometheus"
|
||||||
|
"go.opentelemetry.io/otel/attribute"
|
||||||
|
"go.opentelemetry.io/otel/codes"
|
||||||
|
sdktrace "go.opentelemetry.io/otel/sdk/trace"
|
||||||
|
"go.opentelemetry.io/otel/trace"
|
||||||
|
)
|
||||||
|
|
||||||
|
// InstrumentedStore wraps a conversation store with metrics and tracing.
|
||||||
|
type InstrumentedStore struct {
|
||||||
|
base conversation.Store
|
||||||
|
registry *prometheus.Registry
|
||||||
|
tracer trace.Tracer
|
||||||
|
backend string
|
||||||
|
}
|
||||||
|
|
||||||
|
// NewInstrumentedStore wraps a conversation store with observability.
|
||||||
|
func NewInstrumentedStore(s conversation.Store, backend string, registry *prometheus.Registry, tp *sdktrace.TracerProvider) conversation.Store {
|
||||||
|
var tracer trace.Tracer
|
||||||
|
if tp != nil {
|
||||||
|
tracer = tp.Tracer("llm-gateway")
|
||||||
|
}
|
||||||
|
|
||||||
|
// Initialize gauge with current size
|
||||||
|
if registry != nil {
|
||||||
|
conversationActiveCount.WithLabelValues(backend).Set(float64(s.Size()))
|
||||||
|
}
|
||||||
|
|
||||||
|
return &InstrumentedStore{
|
||||||
|
base: s,
|
||||||
|
registry: registry,
|
||||||
|
tracer: tracer,
|
||||||
|
backend: backend,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Get wraps the store's Get method with metrics and tracing.
|
||||||
|
func (s *InstrumentedStore) Get(id string) (*conversation.Conversation, error) {
|
||||||
|
ctx := context.Background()
|
||||||
|
|
||||||
|
// Start span if tracing is enabled
|
||||||
|
if s.tracer != nil {
|
||||||
|
var span trace.Span
|
||||||
|
ctx, span = s.tracer.Start(ctx, "conversation.get",
|
||||||
|
trace.WithAttributes(
|
||||||
|
attribute.String("conversation.id", id),
|
||||||
|
attribute.String("conversation.backend", s.backend),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
defer span.End()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record start time
|
||||||
|
start := time.Now()
|
||||||
|
|
||||||
|
// Call underlying store
|
||||||
|
conv, err := s.base.Get(id)
|
||||||
|
|
||||||
|
// Record metrics
|
||||||
|
duration := time.Since(start).Seconds()
|
||||||
|
status := "success"
|
||||||
|
if err != nil {
|
||||||
|
status = "error"
|
||||||
|
if s.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.RecordError(err)
|
||||||
|
span.SetStatus(codes.Error, err.Error())
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if s.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
if conv != nil {
|
||||||
|
span.SetAttributes(
|
||||||
|
attribute.Int("conversation.message_count", len(conv.Messages)),
|
||||||
|
attribute.String("conversation.model", conv.Model),
|
||||||
|
)
|
||||||
|
}
|
||||||
|
span.SetStatus(codes.Ok, "")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if s.registry != nil {
|
||||||
|
conversationOperationsTotal.WithLabelValues("get", s.backend, status).Inc()
|
||||||
|
conversationOperationDuration.WithLabelValues("get", s.backend).Observe(duration)
|
||||||
|
}
|
||||||
|
|
||||||
|
return conv, err
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create wraps the store's Create method with metrics and tracing.
|
||||||
|
func (s *InstrumentedStore) Create(id string, model string, messages []api.Message) (*conversation.Conversation, error) {
|
||||||
|
ctx := context.Background()
|
||||||
|
|
||||||
|
// Start span if tracing is enabled
|
||||||
|
if s.tracer != nil {
|
||||||
|
var span trace.Span
|
||||||
|
ctx, span = s.tracer.Start(ctx, "conversation.create",
|
||||||
|
trace.WithAttributes(
|
||||||
|
attribute.String("conversation.id", id),
|
||||||
|
attribute.String("conversation.backend", s.backend),
|
||||||
|
attribute.String("conversation.model", model),
|
||||||
|
attribute.Int("conversation.initial_messages", len(messages)),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
defer span.End()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record start time
|
||||||
|
start := time.Now()
|
||||||
|
|
||||||
|
// Call underlying store
|
||||||
|
conv, err := s.base.Create(id, model, messages)
|
||||||
|
|
||||||
|
// Record metrics
|
||||||
|
duration := time.Since(start).Seconds()
|
||||||
|
status := "success"
|
||||||
|
if err != nil {
|
||||||
|
status = "error"
|
||||||
|
if s.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.RecordError(err)
|
||||||
|
span.SetStatus(codes.Error, err.Error())
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if s.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.SetStatus(codes.Ok, "")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if s.registry != nil {
|
||||||
|
conversationOperationsTotal.WithLabelValues("create", s.backend, status).Inc()
|
||||||
|
conversationOperationDuration.WithLabelValues("create", s.backend).Observe(duration)
|
||||||
|
// Update active count
|
||||||
|
conversationActiveCount.WithLabelValues(s.backend).Set(float64(s.base.Size()))
|
||||||
|
}
|
||||||
|
|
||||||
|
return conv, err
|
||||||
|
}
|
||||||
|
|
||||||
|
// Append wraps the store's Append method with metrics and tracing.
|
||||||
|
func (s *InstrumentedStore) Append(id string, messages ...api.Message) (*conversation.Conversation, error) {
|
||||||
|
ctx := context.Background()
|
||||||
|
|
||||||
|
// Start span if tracing is enabled
|
||||||
|
if s.tracer != nil {
|
||||||
|
var span trace.Span
|
||||||
|
ctx, span = s.tracer.Start(ctx, "conversation.append",
|
||||||
|
trace.WithAttributes(
|
||||||
|
attribute.String("conversation.id", id),
|
||||||
|
attribute.String("conversation.backend", s.backend),
|
||||||
|
attribute.Int("conversation.appended_messages", len(messages)),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
defer span.End()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record start time
|
||||||
|
start := time.Now()
|
||||||
|
|
||||||
|
// Call underlying store
|
||||||
|
conv, err := s.base.Append(id, messages...)
|
||||||
|
|
||||||
|
// Record metrics
|
||||||
|
duration := time.Since(start).Seconds()
|
||||||
|
status := "success"
|
||||||
|
if err != nil {
|
||||||
|
status = "error"
|
||||||
|
if s.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.RecordError(err)
|
||||||
|
span.SetStatus(codes.Error, err.Error())
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if s.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
if conv != nil {
|
||||||
|
span.SetAttributes(
|
||||||
|
attribute.Int("conversation.total_messages", len(conv.Messages)),
|
||||||
|
)
|
||||||
|
}
|
||||||
|
span.SetStatus(codes.Ok, "")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if s.registry != nil {
|
||||||
|
conversationOperationsTotal.WithLabelValues("append", s.backend, status).Inc()
|
||||||
|
conversationOperationDuration.WithLabelValues("append", s.backend).Observe(duration)
|
||||||
|
}
|
||||||
|
|
||||||
|
return conv, err
|
||||||
|
}
|
||||||
|
|
||||||
|
// Delete wraps the store's Delete method with metrics and tracing.
|
||||||
|
func (s *InstrumentedStore) Delete(id string) error {
|
||||||
|
ctx := context.Background()
|
||||||
|
|
||||||
|
// Start span if tracing is enabled
|
||||||
|
if s.tracer != nil {
|
||||||
|
var span trace.Span
|
||||||
|
ctx, span = s.tracer.Start(ctx, "conversation.delete",
|
||||||
|
trace.WithAttributes(
|
||||||
|
attribute.String("conversation.id", id),
|
||||||
|
attribute.String("conversation.backend", s.backend),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
defer span.End()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Record start time
|
||||||
|
start := time.Now()
|
||||||
|
|
||||||
|
// Call underlying store
|
||||||
|
err := s.base.Delete(id)
|
||||||
|
|
||||||
|
// Record metrics
|
||||||
|
duration := time.Since(start).Seconds()
|
||||||
|
status := "success"
|
||||||
|
if err != nil {
|
||||||
|
status = "error"
|
||||||
|
if s.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.RecordError(err)
|
||||||
|
span.SetStatus(codes.Error, err.Error())
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if s.tracer != nil {
|
||||||
|
span := trace.SpanFromContext(ctx)
|
||||||
|
span.SetStatus(codes.Ok, "")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if s.registry != nil {
|
||||||
|
conversationOperationsTotal.WithLabelValues("delete", s.backend, status).Inc()
|
||||||
|
conversationOperationDuration.WithLabelValues("delete", s.backend).Observe(duration)
|
||||||
|
// Update active count
|
||||||
|
conversationActiveCount.WithLabelValues(s.backend).Set(float64(s.base.Size()))
|
||||||
|
}
|
||||||
|
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
|
// Size returns the size of the underlying store.
|
||||||
|
func (s *InstrumentedStore) Size() int {
|
||||||
|
return s.base.Size()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Close wraps the store's Close method.
|
||||||
|
func (s *InstrumentedStore) Close() error {
|
||||||
|
return s.base.Close()
|
||||||
|
}
|
||||||
104
internal/observability/tracing.go
Normal file
104
internal/observability/tracing.go
Normal file
@@ -0,0 +1,104 @@
|
|||||||
|
package observability
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"fmt"
|
||||||
|
|
||||||
|
"github.com/ajac-zero/latticelm/internal/config"
|
||||||
|
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
|
||||||
|
"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
|
||||||
|
"go.opentelemetry.io/otel/sdk/resource"
|
||||||
|
sdktrace "go.opentelemetry.io/otel/sdk/trace"
|
||||||
|
semconv "go.opentelemetry.io/otel/semconv/v1.24.0"
|
||||||
|
"google.golang.org/grpc"
|
||||||
|
"google.golang.org/grpc/credentials/insecure"
|
||||||
|
)
|
||||||
|
|
||||||
|
// InitTracer initializes the OpenTelemetry tracer provider.
|
||||||
|
func InitTracer(cfg config.TracingConfig) (*sdktrace.TracerProvider, error) {
|
||||||
|
// Create resource with service information
|
||||||
|
res, err := resource.Merge(
|
||||||
|
resource.Default(),
|
||||||
|
resource.NewWithAttributes(
|
||||||
|
semconv.SchemaURL,
|
||||||
|
semconv.ServiceName(cfg.ServiceName),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
if err != nil {
|
||||||
|
return nil, fmt.Errorf("failed to create resource: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create exporter
|
||||||
|
var exporter sdktrace.SpanExporter
|
||||||
|
switch cfg.Exporter.Type {
|
||||||
|
case "otlp":
|
||||||
|
exporter, err = createOTLPExporter(cfg.Exporter)
|
||||||
|
if err != nil {
|
||||||
|
return nil, fmt.Errorf("failed to create OTLP exporter: %w", err)
|
||||||
|
}
|
||||||
|
case "stdout":
|
||||||
|
exporter, err = stdouttrace.New(
|
||||||
|
stdouttrace.WithPrettyPrint(),
|
||||||
|
)
|
||||||
|
if err != nil {
|
||||||
|
return nil, fmt.Errorf("failed to create stdout exporter: %w", err)
|
||||||
|
}
|
||||||
|
default:
|
||||||
|
return nil, fmt.Errorf("unsupported exporter type: %s", cfg.Exporter.Type)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create sampler
|
||||||
|
sampler := createSampler(cfg.Sampler)
|
||||||
|
|
||||||
|
// Create tracer provider
|
||||||
|
tp := sdktrace.NewTracerProvider(
|
||||||
|
sdktrace.WithBatcher(exporter),
|
||||||
|
sdktrace.WithResource(res),
|
||||||
|
sdktrace.WithSampler(sampler),
|
||||||
|
)
|
||||||
|
|
||||||
|
return tp, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// createOTLPExporter creates an OTLP gRPC exporter.
|
||||||
|
func createOTLPExporter(cfg config.ExporterConfig) (sdktrace.SpanExporter, error) {
|
||||||
|
opts := []otlptracegrpc.Option{
|
||||||
|
otlptracegrpc.WithEndpoint(cfg.Endpoint),
|
||||||
|
}
|
||||||
|
|
||||||
|
if cfg.Insecure {
|
||||||
|
opts = append(opts, otlptracegrpc.WithTLSCredentials(insecure.NewCredentials()))
|
||||||
|
}
|
||||||
|
|
||||||
|
if len(cfg.Headers) > 0 {
|
||||||
|
opts = append(opts, otlptracegrpc.WithHeaders(cfg.Headers))
|
||||||
|
}
|
||||||
|
|
||||||
|
// Add dial options to ensure connection
|
||||||
|
opts = append(opts, otlptracegrpc.WithDialOption(grpc.WithBlock()))
|
||||||
|
|
||||||
|
return otlptracegrpc.New(context.Background(), opts...)
|
||||||
|
}
|
||||||
|
|
||||||
|
// createSampler creates a sampler based on the configuration.
|
||||||
|
func createSampler(cfg config.SamplerConfig) sdktrace.Sampler {
|
||||||
|
switch cfg.Type {
|
||||||
|
case "always":
|
||||||
|
return sdktrace.AlwaysSample()
|
||||||
|
case "never":
|
||||||
|
return sdktrace.NeverSample()
|
||||||
|
case "probability":
|
||||||
|
return sdktrace.TraceIDRatioBased(cfg.Rate)
|
||||||
|
default:
|
||||||
|
// Default to 10% sampling
|
||||||
|
return sdktrace.TraceIDRatioBased(0.1)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Shutdown gracefully shuts down the tracer provider.
|
||||||
|
func Shutdown(ctx context.Context, tp *sdktrace.TracerProvider) error {
|
||||||
|
if tp == nil {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
return tp.Shutdown(ctx)
|
||||||
|
}
|
||||||
85
internal/observability/tracing_middleware.go
Normal file
85
internal/observability/tracing_middleware.go
Normal file
@@ -0,0 +1,85 @@
|
|||||||
|
package observability
|
||||||
|
|
||||||
|
import (
|
||||||
|
"net/http"
|
||||||
|
|
||||||
|
"go.opentelemetry.io/otel"
|
||||||
|
"go.opentelemetry.io/otel/attribute"
|
||||||
|
"go.opentelemetry.io/otel/codes"
|
||||||
|
"go.opentelemetry.io/otel/propagation"
|
||||||
|
sdktrace "go.opentelemetry.io/otel/sdk/trace"
|
||||||
|
"go.opentelemetry.io/otel/trace"
|
||||||
|
)
|
||||||
|
|
||||||
|
// TracingMiddleware creates a middleware that adds OpenTelemetry tracing to HTTP requests.
|
||||||
|
func TracingMiddleware(next http.Handler, tp *sdktrace.TracerProvider) http.Handler {
|
||||||
|
if tp == nil {
|
||||||
|
// If tracing is not enabled, pass through without modification
|
||||||
|
return next
|
||||||
|
}
|
||||||
|
|
||||||
|
// Set up W3C Trace Context propagation
|
||||||
|
otel.SetTextMapPropagator(propagation.TraceContext{})
|
||||||
|
|
||||||
|
tracer := tp.Tracer("llm-gateway")
|
||||||
|
|
||||||
|
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||||
|
// Extract trace context from incoming request headers
|
||||||
|
ctx := otel.GetTextMapPropagator().Extract(r.Context(), propagation.HeaderCarrier(r.Header))
|
||||||
|
|
||||||
|
// Start a new span
|
||||||
|
ctx, span := tracer.Start(ctx, "HTTP "+r.Method+" "+r.URL.Path,
|
||||||
|
trace.WithSpanKind(trace.SpanKindServer),
|
||||||
|
trace.WithAttributes(
|
||||||
|
attribute.String("http.method", r.Method),
|
||||||
|
attribute.String("http.route", r.URL.Path),
|
||||||
|
attribute.String("http.scheme", r.URL.Scheme),
|
||||||
|
attribute.String("http.host", r.Host),
|
||||||
|
attribute.String("http.user_agent", r.Header.Get("User-Agent")),
|
||||||
|
),
|
||||||
|
)
|
||||||
|
defer span.End()
|
||||||
|
|
||||||
|
// Add request ID to span if present
|
||||||
|
if requestID := r.Header.Get("X-Request-ID"); requestID != "" {
|
||||||
|
span.SetAttributes(attribute.String("http.request_id", requestID))
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create a response writer wrapper to capture status code
|
||||||
|
wrapped := &statusResponseWriter{
|
||||||
|
ResponseWriter: w,
|
||||||
|
statusCode: http.StatusOK,
|
||||||
|
}
|
||||||
|
|
||||||
|
// Inject trace context into request for downstream services
|
||||||
|
r = r.WithContext(ctx)
|
||||||
|
|
||||||
|
// Call the next handler
|
||||||
|
next.ServeHTTP(wrapped, r)
|
||||||
|
|
||||||
|
// Record the status code in the span
|
||||||
|
span.SetAttributes(attribute.Int("http.status_code", wrapped.statusCode))
|
||||||
|
|
||||||
|
// Set span status based on HTTP status code
|
||||||
|
if wrapped.statusCode >= 400 {
|
||||||
|
span.SetStatus(codes.Error, http.StatusText(wrapped.statusCode))
|
||||||
|
} else {
|
||||||
|
span.SetStatus(codes.Ok, "")
|
||||||
|
}
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
// statusResponseWriter wraps http.ResponseWriter to capture the status code.
|
||||||
|
type statusResponseWriter struct {
|
||||||
|
http.ResponseWriter
|
||||||
|
statusCode int
|
||||||
|
}
|
||||||
|
|
||||||
|
func (w *statusResponseWriter) WriteHeader(statusCode int) {
|
||||||
|
w.statusCode = statusCode
|
||||||
|
w.ResponseWriter.WriteHeader(statusCode)
|
||||||
|
}
|
||||||
|
|
||||||
|
func (w *statusResponseWriter) Write(b []byte) (int, error) {
|
||||||
|
return w.ResponseWriter.Write(b)
|
||||||
|
}
|
||||||
@@ -98,9 +98,11 @@ func (s *GatewayServer) handleResponses(w http.ResponseWriter, r *http.Request)
|
|||||||
conv, err := s.convs.Get(*req.PreviousResponseID)
|
conv, err := s.convs.Get(*req.PreviousResponseID)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
s.logger.ErrorContext(r.Context(), "failed to retrieve conversation",
|
s.logger.ErrorContext(r.Context(), "failed to retrieve conversation",
|
||||||
|
logger.LogAttrsWithTrace(r.Context(),
|
||||||
slog.String("request_id", logger.FromContext(r.Context())),
|
slog.String("request_id", logger.FromContext(r.Context())),
|
||||||
slog.String("conversation_id", *req.PreviousResponseID),
|
slog.String("conversation_id", *req.PreviousResponseID),
|
||||||
slog.String("error", err.Error()),
|
slog.String("error", err.Error()),
|
||||||
|
)...,
|
||||||
)
|
)
|
||||||
http.Error(w, "error retrieving conversation", http.StatusInternalServerError)
|
http.Error(w, "error retrieving conversation", http.StatusInternalServerError)
|
||||||
return
|
return
|
||||||
@@ -152,10 +154,12 @@ func (s *GatewayServer) handleSyncResponse(w http.ResponseWriter, r *http.Reques
|
|||||||
result, err := provider.Generate(r.Context(), providerMsgs, resolvedReq)
|
result, err := provider.Generate(r.Context(), providerMsgs, resolvedReq)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
s.logger.ErrorContext(r.Context(), "provider generation failed",
|
s.logger.ErrorContext(r.Context(), "provider generation failed",
|
||||||
|
logger.LogAttrsWithTrace(r.Context(),
|
||||||
slog.String("request_id", logger.FromContext(r.Context())),
|
slog.String("request_id", logger.FromContext(r.Context())),
|
||||||
slog.String("provider", provider.Name()),
|
slog.String("provider", provider.Name()),
|
||||||
slog.String("model", resolvedReq.Model),
|
slog.String("model", resolvedReq.Model),
|
||||||
slog.String("error", err.Error()),
|
slog.String("error", err.Error()),
|
||||||
|
)...,
|
||||||
)
|
)
|
||||||
http.Error(w, "provider error", http.StatusBadGateway)
|
http.Error(w, "provider error", http.StatusBadGateway)
|
||||||
return
|
return
|
||||||
@@ -172,14 +176,17 @@ func (s *GatewayServer) handleSyncResponse(w http.ResponseWriter, r *http.Reques
|
|||||||
allMsgs := append(storeMsgs, assistantMsg)
|
allMsgs := append(storeMsgs, assistantMsg)
|
||||||
if _, err := s.convs.Create(responseID, result.Model, allMsgs); err != nil {
|
if _, err := s.convs.Create(responseID, result.Model, allMsgs); err != nil {
|
||||||
s.logger.ErrorContext(r.Context(), "failed to store conversation",
|
s.logger.ErrorContext(r.Context(), "failed to store conversation",
|
||||||
|
logger.LogAttrsWithTrace(r.Context(),
|
||||||
slog.String("request_id", logger.FromContext(r.Context())),
|
slog.String("request_id", logger.FromContext(r.Context())),
|
||||||
slog.String("response_id", responseID),
|
slog.String("response_id", responseID),
|
||||||
slog.String("error", err.Error()),
|
slog.String("error", err.Error()),
|
||||||
|
)...,
|
||||||
)
|
)
|
||||||
// Don't fail the response if storage fails
|
// Don't fail the response if storage fails
|
||||||
}
|
}
|
||||||
|
|
||||||
s.logger.InfoContext(r.Context(), "response generated",
|
s.logger.InfoContext(r.Context(), "response generated",
|
||||||
|
logger.LogAttrsWithTrace(r.Context(),
|
||||||
slog.String("request_id", logger.FromContext(r.Context())),
|
slog.String("request_id", logger.FromContext(r.Context())),
|
||||||
slog.String("provider", provider.Name()),
|
slog.String("provider", provider.Name()),
|
||||||
slog.String("model", result.Model),
|
slog.String("model", result.Model),
|
||||||
@@ -187,6 +194,7 @@ func (s *GatewayServer) handleSyncResponse(w http.ResponseWriter, r *http.Reques
|
|||||||
slog.Int("input_tokens", result.Usage.InputTokens),
|
slog.Int("input_tokens", result.Usage.InputTokens),
|
||||||
slog.Int("output_tokens", result.Usage.OutputTokens),
|
slog.Int("output_tokens", result.Usage.OutputTokens),
|
||||||
slog.Bool("has_tool_calls", len(result.ToolCalls) > 0),
|
slog.Bool("has_tool_calls", len(result.ToolCalls) > 0),
|
||||||
|
)...,
|
||||||
)
|
)
|
||||||
|
|
||||||
// Build spec-compliant response
|
// Build spec-compliant response
|
||||||
@@ -374,10 +382,12 @@ loop:
|
|||||||
|
|
||||||
if streamErr != nil {
|
if streamErr != nil {
|
||||||
s.logger.ErrorContext(r.Context(), "stream error",
|
s.logger.ErrorContext(r.Context(), "stream error",
|
||||||
|
logger.LogAttrsWithTrace(r.Context(),
|
||||||
slog.String("request_id", logger.FromContext(r.Context())),
|
slog.String("request_id", logger.FromContext(r.Context())),
|
||||||
slog.String("provider", provider.Name()),
|
slog.String("provider", provider.Name()),
|
||||||
slog.String("model", origReq.Model),
|
slog.String("model", origReq.Model),
|
||||||
slog.String("error", streamErr.Error()),
|
slog.String("error", streamErr.Error()),
|
||||||
|
)...,
|
||||||
)
|
)
|
||||||
failedResp := s.buildResponse(origReq, &api.ProviderResult{
|
failedResp := s.buildResponse(origReq, &api.ProviderResult{
|
||||||
Model: origReq.Model,
|
Model: origReq.Model,
|
||||||
|
|||||||
Reference in New Issue
Block a user