latticelm/docs/admin-ui-spec.md
Anibal Angulo 7025ec746c
Add admin UI
2026-03-05 23:09:27 +00:00


Admin Web UI Specification

Project: go-llm-gateway (latticelm)
Feature: Admin Web UI
Version: 1.0
Status: Draft
Date: 2026-03-05


Table of Contents

  1. Overview
  2. Goals and Objectives
  3. Requirements
  4. Architecture
  5. API Specification
  6. UI Design
  7. Security
  8. Implementation Phases
  9. Testing Strategy
  10. Deployment
  11. Future Enhancements

Overview

The Admin Web UI provides a browser-based interface for managing and monitoring the go-llm-gateway service. It enables operators to configure providers, manage models, monitor system health, and perform administrative tasks without directly editing configuration files or using CLI tools.

Problem Statement

Today, configuring and operating go-llm-gateway has several pain points:

  • Configuration changes require manual edits to config.yaml
  • Every configuration change requires a service restart
  • Monitoring depends on external tools (Grafana, Prometheus)
  • Operational tasks require command-line access
  • There is no centralized view of system health and configuration

Solution

A web-based administration interface that provides:

  • Real-time system status and metrics visualization
  • Configuration management with validation
  • Provider and model management
  • Conversation store administration
  • Integrated monitoring and diagnostics

Goals and Objectives

Primary Goals

  1. Simplify Configuration Management

    • Reduce time to configure providers from minutes to seconds
    • Eliminate configuration syntax errors through UI validation
    • Provide immediate feedback on configuration changes
  2. Improve Operational Visibility

    • Centralized dashboard for system health
    • Real-time metrics and performance monitoring
    • Provider connection status and circuit breaker states
  3. Enhance Developer Experience

    • Intuitive interface requiring no YAML knowledge
    • Self-documenting configuration options
    • Quick testing of provider configurations

Non-Goals

  • Not a replacement for Grafana/Prometheus - Focus on operational tasks, not deep metrics analysis
  • Not a user-facing API explorer - Admin-only, not for end users of the gateway
  • Not a conversation UI - Management only, not for interactive LLM chat
  • Not a multi-tenancy admin - Single instance management only

Requirements

Functional Requirements

FR1: Dashboard and Overview

  • FR1.1: Display system status (uptime, version, build info)
  • FR1.2: Show current configuration summary
  • FR1.3: Display provider health status with circuit breaker states
  • FR1.4: Show key metrics (requests/sec, error rate, latency percentiles)
  • FR1.5: Display recent logs/events (last 100 entries)

FR2: Provider Management

  • FR2.1: List all configured providers with status indicators
  • FR2.2: Add new provider configurations (OpenAI, Azure, Anthropic, Google, Vertex AI)
  • FR2.3: Edit existing provider settings (API keys, endpoints, parameters)
  • FR2.4: Delete provider configurations with confirmation
  • FR2.5: Test provider connectivity with sample request
  • FR2.6: View provider-specific metrics (request count, error rate, latency)
  • FR2.7: Reset circuit breaker state for providers

FR3: Model Management

  • FR3.1: List all configured model mappings
  • FR3.2: Add new model mappings (name → provider + model ID)
  • FR3.3: Edit model mappings
  • FR3.4: Delete model mappings with confirmation
  • FR3.5: View model usage statistics (request count per model)
  • FR3.6: Test model availability with sample request

FR4: Configuration Management

  • FR4.1: View current configuration (all sections)
  • FR4.2: Edit server settings (address, body size limits)
  • FR4.3: Edit logging configuration (format, level)
  • FR4.4: Edit rate limiting settings (enabled, requests/sec, burst)
  • FR4.5: Edit authentication settings (OIDC issuer, audience)
  • FR4.6: Edit observability settings (metrics, tracing)
  • FR4.7: Validate configuration before applying
  • FR4.8: Export current configuration as YAML
  • FR4.9: Preview configuration diff before applying changes
  • FR4.10: Apply configuration with hot-reload or restart prompt

FR5: Conversation Store Management

  • FR5.1: View conversation store type and connection status
  • FR5.2: Browse conversations (paginated list)
  • FR5.3: Search conversations by ID or metadata
  • FR5.4: View conversation details (messages, metadata, timestamps)
  • FR5.5: Delete individual conversations
  • FR5.6: Bulk delete conversations (by age, by criteria)
  • FR5.7: View conversation statistics (total count, storage size)

FR6: Monitoring and Metrics

  • FR6.1: Display request rate (current, 1m, 5m, 15m averages)
  • FR6.2: Display error rate by provider and model
  • FR6.3: Display latency percentiles (p50, p90, p95, p99)
  • FR6.4: Display provider-specific metrics
  • FR6.5: Display circuit breaker state changes (timeline)
  • FR6.6: Export metrics in Prometheus format

FR7: Logs and Diagnostics

  • FR7.1: View recent application logs (tail -f style)
  • FR7.2: Filter logs by level (debug, info, warn, error)
  • FR7.3: Search logs by keyword
  • FR7.4: Download log exports
  • FR7.5: View OpenTelemetry trace samples (if enabled)

FR8: System Operations

  • FR8.1: View health check status (/health, /ready)
  • FR8.2: Trigger graceful restart (with countdown)
  • FR8.3: View environment variables (sanitized, no secrets)
  • FR8.4: Download diagnostic bundle (config + logs + metrics)

Non-Functional Requirements

NFR1: Performance

  • NFR1.1: Admin UI must add less than 1% CPU overhead to the gateway
  • NFR1.2: Dashboard load time < 2 seconds on modern browsers
  • NFR1.3: API endpoints respond within 500ms (p95)
  • NFR1.4: Support up to 10 concurrent admin users

NFR2: Security

  • NFR2.1: All admin endpoints require authentication
  • NFR2.2: Support OIDC/OAuth2 authentication (reuse existing auth)
  • NFR2.3: Support role-based access control (admin vs viewer roles)
  • NFR2.4: Sanitize secrets in all UI displays (mask API keys)
  • NFR2.5: Audit log for all configuration changes
  • NFR2.6: CSRF protection for state-changing operations
  • NFR2.7: Content Security Policy (CSP) headers

NFR3: Usability

  • NFR3.1: Responsive design (desktop, tablet, mobile)
  • NFR3.2: Accessible (WCAG 2.1 Level AA)
  • NFR3.3: Dark mode support
  • NFR3.4: Keyboard navigation support
  • NFR3.5: Inline help text and tooltips

NFR4: Reliability

  • NFR4.1: Admin UI failures must not crash the gateway
  • NFR4.2: Configuration validation prevents invalid states
  • NFR4.3: Rollback capability for configuration changes
  • NFR4.4: Graceful degradation if metrics unavailable

NFR5: Maintainability

  • NFR5.1: Minimal external dependencies (prefer stdlib)
  • NFR5.2: Embedded assets (single binary deployment)
  • NFR5.3: API versioning for future compatibility
  • NFR5.4: Comprehensive error messages

Architecture

High-Level Architecture

┌─────────────────────────────────────────────────────────────┐
│                        Browser Client                        │
│  ┌────────────┐  ┌──────────────┐  ┌──────────────────┐    │
│  │ Dashboard  │  │  Providers   │  │  Configuration   │    │
│  └────────────┘  └──────────────┘  └──────────────────┘    │
│  ┌────────────┐  ┌──────────────┐  ┌──────────────────┐    │
│  │   Models   │  │Conversations │  │      Logs        │    │
│  └────────────┘  └──────────────┘  └──────────────────┘    │
└─────────────────────────────────────────────────────────────┘
                            │
                            │ HTTPS
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                    go-llm-gateway Server                     │
│                                                              │
│  ┌────────────────────────────────────────────────────┐    │
│  │              Middleware Stack                       │    │
│  │  Auth → Rate Limit → Logging → CORS → Router      │    │
│  └────────────────────────────────────────────────────┘    │
│                                                              │
│  ┌──────────────────┐  ┌──────────────────────────────┐   │
│  │  Gateway API     │  │      Admin API               │   │
│  │  /v1/*           │  │      /admin/api/*            │   │
│  ├──────────────────┤  ├──────────────────────────────┤   │
│  │ • /responses     │  │ • /config                    │   │
│  │ • /models        │  │ • /providers                 │   │
│  │ • /health        │  │ • /models                    │   │
│  │ • /ready         │  │ • /conversations             │   │
│  │ • /metrics       │  │ • /metrics                   │   │
│  └──────────────────┘  │ • /logs                      │   │
│                        │ • /system                    │   │
│  ┌──────────────────┐  └──────────────────────────────┘   │
│  │  Static Assets   │                                      │
│  │  /admin/*        │                                      │
│  │  (embedded)      │                                      │
│  └──────────────────┘                                      │
│                                                              │
│  ┌────────────────────────────────────────────────────┐    │
│  │              Core Components                        │    │
│  │  • Provider Registry                               │    │
│  │  • Conversation Store                              │    │
│  │  • Config Manager (new)                            │    │
│  │  • Metrics Collector                               │    │
│  │  • Log Buffer (new)                                │    │
│  └────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘

Component Breakdown

Frontend Components

Technology Stack Options:

  1. Vue 3 + Vite (Recommended)

    • Lightweight (~50KB gzipped)
    • Reactive data binding
    • Component-based architecture
    • Excellent TypeScript support
  2. Svelte + Vite (Alternative)

    • Even lighter (~20KB)
    • Compile-time optimization
    • Simpler learning curve
  3. htmx + Alpine.js (Minimal)

    • No build step
    • Server-rendered hypermedia
    • ~40KB total

Recommended Choice: Vue 3 + Vite + TypeScript

  • Balance of features and bundle size
  • Strong ecosystem and tooling
  • Widely adopted, so many developers already know it

Frontend Structure:

frontend/
├── src/
│   ├── main.ts                 # App entry point
│   ├── App.vue                 # Root component
│   ├── router.ts               # Vue Router config
│   ├── api/                    # API client
│   │   ├── client.ts           # Axios/fetch wrapper
│   │   ├── config.ts           # Config API
│   │   ├── providers.ts        # Provider API
│   │   ├── models.ts           # Model API
│   │   ├── conversations.ts    # Conversation API
│   │   ├── metrics.ts          # Metrics API
│   │   └── system.ts           # System API
│   ├── components/             # Reusable components
│   │   ├── Layout.vue          # App layout
│   │   ├── Sidebar.vue         # Navigation
│   │   ├── Header.vue          # Top bar
│   │   ├── StatusBadge.vue     # Provider status
│   │   ├── MetricCard.vue      # Metric display
│   │   ├── ProviderForm.vue    # Provider editor
│   │   ├── ModelForm.vue       # Model editor
│   │   └── ConfigEditor.vue    # YAML/JSON editor
│   ├── views/                  # Page components
│   │   ├── Dashboard.vue       # Overview dashboard
│   │   ├── Providers.vue       # Provider management
│   │   ├── ProviderDetail.vue  # Single provider view
│   │   ├── Models.vue          # Model management
│   │   ├── Configuration.vue   # Config editor
│   │   ├── Conversations.vue   # Conversation browser
│   │   ├── Metrics.vue         # Metrics dashboard
│   │   ├── Logs.vue            # Log viewer
│   │   └── System.vue          # System info
│   ├── stores/                 # Pinia state management
│   │   ├── auth.ts             # Auth state
│   │   ├── config.ts           # Config state
│   │   ├── providers.ts        # Provider state
│   │   └── metrics.ts          # Metrics state
│   ├── types/                  # TypeScript types
│   │   └── api.ts              # API response types
│   └── utils/                  # Utilities
│       ├── formatting.ts       # Format helpers
│       └── validation.ts       # Form validation
├── public/
│   └── favicon.ico
├── index.html
├── package.json
├── tsconfig.json
├── vite.config.ts
└── README.md

Backend Components

New Go Packages:

internal/
├── admin/                      # Admin API package (NEW)
│   ├── handler.go              # HTTP handlers
│   ├── config_handler.go       # Config management
│   ├── provider_handler.go     # Provider management
│   ├── model_handler.go        # Model management
│   ├── conversation_handler.go # Conversation management
│   ├── metrics_handler.go      # Metrics aggregation
│   ├── logs_handler.go         # Log streaming
│   ├── system_handler.go       # System operations
│   └── middleware.go           # Admin-specific middleware
├── configmanager/              # Config management (NEW)
│   ├── manager.go              # Config CRUD operations
│   ├── validator.go            # Config validation
│   ├── diff.go                 # Config diff generation
│   └── reload.go               # Hot-reload logic
├── logbuffer/                  # Log buffering (NEW)
│   ├── buffer.go               # Circular log buffer
│   └── writer.go               # slog.Handler wrapper
└── auditlog/                   # Audit logging (NEW)
    ├── logger.go               # Audit event logger
    └── types.go                # Audit event types
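The logbuffer package is described only as a circular log buffer. A minimal sketch of that idea follows, assuming log entries are kept as plain strings and that the oldest entry is overwritten once capacity is reached. The names `LogBuffer`, `NewLogBuffer`, `Add`, and `Snapshot` are hypothetical and not taken from the codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// LogBuffer is a hypothetical fixed-capacity circular buffer of log lines.
type LogBuffer struct {
	mu      sync.Mutex
	entries []string
	next    int  // slot the next entry is written to
	full    bool // true once the buffer has wrapped at least once
}

// NewLogBuffer allocates a buffer; capacities below 1 are raised to 1.
func NewLogBuffer(capacity int) *LogBuffer {
	if capacity < 1 {
		capacity = 1
	}
	return &LogBuffer{entries: make([]string, capacity)}
}

// Add stores an entry, overwriting the oldest one when the buffer is full.
func (b *LogBuffer) Add(entry string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.entries[b.next] = entry
	b.next = (b.next + 1) % len(b.entries)
	if b.next == 0 {
		b.full = true
	}
}

// Snapshot returns a copy of the buffered entries, oldest first.
func (b *LogBuffer) Snapshot() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	if !b.full {
		return append([]string(nil), b.entries[:b.next]...)
	}
	out := make([]string, 0, len(b.entries))
	out = append(out, b.entries[b.next:]...)
	return append(out, b.entries[:b.next]...)
}

func main() {
	buf := NewLogBuffer(3)
	for _, line := range []string{"a", "b", "c", "d"} {
		buf.Add(line)
	}
	fmt.Println(buf.Snapshot()) // prints [b c d]
}
```

A mutex is used because log writes would typically arrive from many goroutines. A real writer.go would presumably wrap this behind an slog.Handler, which is not shown here.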

Data Flow

Configuration Update Flow

User clicks "Save Config" in UI
    ↓
Frontend validates form input
    ↓
POST /admin/api/v1/config/diff with new config
    ↓
Backend validates config structure
    ↓
Generate diff (old vs new)
    ↓
Return diff to frontend for confirmation
    ↓
User confirms change
    ↓
PUT /admin/api/v1/config
    ↓
Write to config file (or temp file)
    ↓
Reload config (hot-reload or restart)
    ↓
Update audit log
    ↓
Return success/failure
    ↓
Frontend refreshes dashboard

Metrics Data Flow

Prometheus metrics continuously collected
    ↓
GET /admin/api/v1/metrics/summary
    ↓
Backend queries Prometheus registry
    ↓
Aggregate by provider, model, status
    ↓
Calculate percentiles and rates
    ↓
Return JSON response
    ↓
Frontend updates charts (auto-refresh every 5s)
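The "calculate percentiles and rates" step can be illustrated with a small sketch. It assumes raw latency samples are available in memory and uses the nearest-rank method. A backend reading Prometheus histograms would instead estimate quantiles from bucket counts, so treat this as a simplification. The function names `percentile` and `errorRate` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100)
// of the samples, or 0 when there are no samples.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...) // leave the input untouched
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// errorRate returns errors/total, guarding against division by zero.
func errorRate(errors, total int) float64 {
	if total == 0 {
		return 0
	}
	return float64(errors) / float64(total)
}

func main() {
	latencies := []float64{120, 340, 95, 876, 234, 567, 410, 150, 1234, 300}
	fmt.Println(percentile(latencies, 50), percentile(latencies, 95)) // prints 300 1234
	fmt.Printf("%.4f\n", errorRate(12, 1523))                         // prints 0.0079
}
```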

API Specification

Base Path

All admin API endpoints are served under /admin/api/v1.

Authentication

All endpoints require authentication. Clients send an OIDC-issued JWT in the Authorization: Bearer <token> header.
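As an illustration of the header handling only, the sketch below extracts the bearer token and rejects requests that lack one. It deliberately does not verify the JWT (signature, issuer, audience); that is assumed to be done by the gateway's existing OIDC middleware. `bearerToken` and `requireBearer` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerToken extracts the token from an "Authorization: Bearer <token>"
// header value. The scheme name is matched case-insensitively.
func bearerToken(header string) (string, bool) {
	const prefix = "Bearer "
	if len(header) < len(prefix) || !strings.EqualFold(header[:len(prefix)], prefix) {
		return "", false
	}
	token := strings.TrimSpace(header[len(prefix):])
	return token, token != ""
}

// requireBearer responds with 401 when no bearer token is present.
// Token verification is assumed to happen elsewhere.
func requireBearer(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, ok := bearerToken(r.Header.Get("Authorization")); !ok {
			w.Header().Set("WWW-Authenticate", "Bearer")
			http.Error(w, "missing bearer token", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	h := requireBearer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	req := httptest.NewRequest(http.MethodGet, "/admin/api/v1/system/info", nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	fmt.Println(rec.Code) // prints 401, since no Authorization header was set
}
```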

Common Response Format

Success Response:

{
  "success": true,
  "data": { /* endpoint-specific data */ },
  "timestamp": "2026-03-05T10:30:00Z"
}

Error Response:

{
  "success": false,
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid provider configuration",
    "details": {
      "field": "api_key",
      "reason": "API key is required"
    }
  },
  "timestamp": "2026-03-05T10:30:00Z"
}
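On the Go side, the common response format could be modelled roughly as follows. This is a sketch under assumptions: the type names `Envelope` and `APIError` and the helpers `success` and `failure` are hypothetical, and `details` is assumed to be a flat string map as in the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// APIError mirrors the "error" object of the error response.
type APIError struct {
	Code    string            `json:"code"`
	Message string            `json:"message"`
	Details map[string]string `json:"details,omitempty"`
}

// Envelope is a hypothetical wrapper shared by every admin endpoint.
type Envelope struct {
	Success   bool      `json:"success"`
	Data      any       `json:"data,omitempty"`
	Error     *APIError `json:"error,omitempty"`
	Timestamp string    `json:"timestamp"`
}

// success builds a success envelope with an RFC 3339 UTC timestamp.
func success(data any, now time.Time) Envelope {
	return Envelope{Success: true, Data: data, Timestamp: now.UTC().Format(time.RFC3339)}
}

// failure builds an error envelope with an RFC 3339 UTC timestamp.
func failure(code, message string, details map[string]string, now time.Time) Envelope {
	return Envelope{
		Success:   false,
		Error:     &APIError{Code: code, Message: message, Details: details},
		Timestamp: now.UTC().Format(time.RFC3339),
	}
}

// exampleError encodes the validation error shown in the spec.
func exampleError() string {
	now := time.Date(2026, 3, 5, 10, 30, 0, 0, time.UTC)
	env := failure("VALIDATION_ERROR", "Invalid provider configuration",
		map[string]string{"field": "api_key", "reason": "API key is required"}, now)
	b, err := json.Marshal(env)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(exampleError())
	b, _ := json.Marshal(success(map[string]bool{"valid": true}, time.Now()))
	fmt.Println(string(b))
}
```

Using pointer and omitempty fields keeps `data` out of error responses and `error` out of success responses, which matches the two shapes above.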

Endpoints

System Information

GET /admin/api/v1/system/info

Get system information and status.

Response:

{
  "success": true,
  "data": {
    "version": "1.2.0",
    "build_time": "2026-03-01T08:00:00Z",
    "git_commit": "59ded10",
    "go_version": "1.25.7",
    "platform": "linux/amd64",
    "uptime_seconds": 86400,
    "config_file": "/app/config.yaml",
    "config_last_modified": "2026-03-05T09:00:00Z"
  }
}

GET /admin/api/v1/system/health

Get detailed health status.

Response:

{
  "success": true,
  "data": {
    "status": "healthy",
    "checks": {
      "server": { "status": "pass", "message": "Server running" },
      "providers": { "status": "pass", "message": "3/3 providers healthy" },
      "conversation_store": { "status": "pass", "message": "Connected to Redis" },
      "metrics": { "status": "pass", "message": "Prometheus collecting" }
    }
  }
}

POST /admin/api/v1/system/restart

Trigger graceful restart.

Request:

{
  "countdown_seconds": 5,
  "reason": "Configuration update"
}

Response:

{
  "success": true,
  "data": {
    "message": "Restart scheduled in 5 seconds",
    "restart_at": "2026-03-05T10:30:05Z"
  }
}

Configuration Management

GET /admin/api/v1/config

Get current configuration.

Query Parameters:

  • sanitized (boolean, default: true) - Mask sensitive values (API keys)

Response:

{
  "success": true,
  "data": {
    "config": {
      "server": {
        "address": ":8080",
        "max_request_body_size": 10485760
      },
      "logging": {
        "format": "json",
        "level": "info"
      },
      "providers": {
        "openai": {
          "type": "openai",
          "api_key": "sk-*********************xyz",
          "endpoint": "https://api.openai.com/v1"
        }
      },
      "models": [
        {
          "name": "gpt-4",
          "provider": "openai"
        }
      ]
    },
    "source": "file",
    "last_modified": "2026-03-05T09:00:00Z"
  }
}
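The masked key in the response above keeps a short prefix and suffix and replaces the rest with asterisks. One way this could be done is sketched below, assuming three visible characters at each end and full masking of short values. `maskSecret` is a hypothetical helper, not the gateway's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// maskSecret keeps the first and last three characters of a secret and
// replaces everything in between with asterisks. Secrets of six
// characters or fewer are masked completely.
func maskSecret(s string) string {
	const keep = 3
	r := []rune(s)
	if len(r) <= 2*keep {
		return strings.Repeat("*", len(r))
	}
	return string(r[:keep]) + strings.Repeat("*", len(r)-2*keep) + string(r[len(r)-keep:])
}

func main() {
	fmt.Println(maskSecret("sk-12345xyz")) // prints sk-*****xyz
	fmt.Println(maskSecret("abc"))         // prints ***
}
```

Note that this variant reveals the secret's length. A fixed-width mask would avoid that at the cost of a less recognizable display.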

POST /admin/api/v1/config/validate

Validate configuration without applying.

Request:

{
  "config": {
    "server": { "address": ":8081" }
  }
}

Response:

{
  "success": true,
  "data": {
    "valid": true,
    "warnings": [
      "Changing server address requires restart"
    ],
    "errors": []
  }
}

POST /admin/api/v1/config/diff

Generate diff between current and proposed config.

Request:

{
  "new_config": { /* full or partial config */ }
}

Response:

{
  "success": true,
  "data": {
    "diff": [
      {
        "path": "server.address",
        "old_value": ":8080",
        "new_value": ":8081",
        "type": "modified"
      },
      {
        "path": "providers.anthropic",
        "old_value": null,
        "new_value": { "type": "anthropic", "api_key": "***" },
        "type": "added"
      }
    ],
    "requires_restart": true
  }
}
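A diff of this shape could be produced by walking both configurations recursively. The sketch below assumes the configs are already decoded into nested `map[string]any` values and that the proposed config is complete (with a partial config, absent keys would be reported as removed). `Change` and `diffConfig` are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// Change describes one entry of the diff array.
type Change struct {
	Path     string `json:"path"`
	OldValue any    `json:"old_value"`
	NewValue any    `json:"new_value"`
	Type     string `json:"type"` // "added", "removed", or "modified"
}

// diffConfig compares two nested maps and returns changes sorted by key.
// Nested maps present on both sides are compared key by key.
func diffConfig(prefix string, oldCfg, newCfg map[string]any) []Change {
	seen := map[string]struct{}{}
	for k := range oldCfg {
		seen[k] = struct{}{}
	}
	for k := range newCfg {
		seen[k] = struct{}{}
	}
	keys := make([]string, 0, len(seen))
	for k := range seen {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var out []Change
	for _, k := range keys {
		path := k
		if prefix != "" {
			path = prefix + "." + k
		}
		o, inOld := oldCfg[k]
		n, inNew := newCfg[k]
		switch {
		case !inOld:
			out = append(out, Change{Path: path, NewValue: n, Type: "added"})
		case !inNew:
			out = append(out, Change{Path: path, OldValue: o, Type: "removed"})
		default:
			om, oIsMap := o.(map[string]any)
			nm, nIsMap := n.(map[string]any)
			if oIsMap && nIsMap {
				out = append(out, diffConfig(path, om, nm)...)
			} else if !reflect.DeepEqual(o, n) {
				out = append(out, Change{Path: path, OldValue: o, NewValue: n, Type: "modified"})
			}
		}
	}
	return out
}

// exampleDiff reproduces the two changes from the response above.
func exampleDiff() []Change {
	oldCfg := map[string]any{
		"server":    map[string]any{"address": ":8080"},
		"providers": map[string]any{"openai": map[string]any{"type": "openai"}},
	}
	newCfg := map[string]any{
		"server": map[string]any{"address": ":8081"},
		"providers": map[string]any{
			"openai":    map[string]any{"type": "openai"},
			"anthropic": map[string]any{"type": "anthropic", "api_key": "***"},
		},
	}
	return diffConfig("", oldCfg, newCfg)
}

func main() {
	for _, c := range exampleDiff() {
		fmt.Printf("%s %s: %v -> %v\n", c.Type, c.Path, c.OldValue, c.NewValue)
	}
}
```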

PUT /admin/api/v1/config

Update configuration.

Request:

{
  "config": { /* new configuration */ },
  "apply_method": "hot_reload",  // or "restart"
  "backup": true
}

Response:

{
  "success": true,
  "data": {
    "applied": true,
    "method": "hot_reload",
    "backup_file": "/app/backups/config.yaml.2026-03-05-103000.bak",
    "changes": [ /* diff */ ]
  }
}

GET /admin/api/v1/config/export

Export configuration as YAML.

Response: (Content-Type: application/x-yaml)

server:
  address: ":8080"
# ... full config

Provider Management

GET /admin/api/v1/providers

List all providers.

Response:

{
  "success": true,
  "data": {
    "providers": [
      {
        "name": "openai",
        "type": "openai",
        "status": "healthy",
        "circuit_breaker_state": "closed",
        "endpoint": "https://api.openai.com/v1",
        "metrics": {
          "total_requests": 1523,
          "error_count": 12,
          "error_rate": 0.0079,
          "avg_latency_ms": 342,
          "p95_latency_ms": 876
        },
        "last_request_at": "2026-03-05T10:29:45Z",
        "last_error_at": "2026-03-05T09:15:22Z"
      }
    ]
  }
}

GET /admin/api/v1/providers/{name}

Get provider details.

Response:

{
  "success": true,
  "data": {
    "name": "openai",
    "type": "openai",
    "config": {
      "api_key": "sk-*********************xyz",
      "endpoint": "https://api.openai.com/v1"
    },
    "status": "healthy",
    "circuit_breaker": {
      "state": "closed",
      "consecutive_failures": 0,
      "last_state_change": "2026-03-05T08:00:00Z"
    },
    "metrics": { /* detailed metrics */ }
  }
}

POST /admin/api/v1/providers

Add new provider.

Request:

{
  "name": "anthropic-prod",
  "type": "anthropic",
  "config": {
    "api_key": "sk-ant-...",
    "endpoint": "https://api.anthropic.com"
  }
}

Response:

{
  "success": true,
  "data": {
    "name": "anthropic-prod",
    "created": true
  }
}

PUT /admin/api/v1/providers/{name}

Update provider configuration.

Request:

{
  "config": {
    "api_key": "new-key",
    "endpoint": "https://api.anthropic.com"
  }
}

DELETE /admin/api/v1/providers/{name}

Delete provider.

Response:

{
  "success": true,
  "data": {
    "deleted": true,
    "affected_models": ["claude-3-opus", "claude-3-sonnet"]
  }
}

POST /admin/api/v1/providers/{name}/test

Test provider connectivity.

Request:

{
  "test_message": "Hello, test",
  "model": "gpt-4"  // optional, uses default
}

Response:

{
  "success": true,
  "data": {
    "reachable": true,
    "latency_ms": 342,
    "response": "Test successful",
    "error": null
  }
}

POST /admin/api/v1/providers/{name}/circuit-breaker/reset

Reset circuit breaker state.

Response:

{
  "success": true,
  "data": {
    "previous_state": "open",
    "new_state": "closed"
  }
}
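The reset operation implies a small state machine behind each provider. The sketch below shows one possible shape, assuming the breaker opens after a fixed number of consecutive failures and that a manual reset always returns it to closed. The half-open state and timing logic are left out, and `Breaker` with its methods is a hypothetical type.

```go
package main

import (
	"fmt"
	"sync"
)

// Breaker is a simplified circuit breaker with only "closed" and "open".
type Breaker struct {
	mu        sync.Mutex
	state     string
	failures  int // consecutive failures since the last success or reset
	threshold int // failures needed to open the breaker (assumed setting)
}

// NewBreaker returns a closed breaker.
func NewBreaker(threshold int) *Breaker {
	return &Breaker{state: "closed", threshold: threshold}
}

// RecordFailure counts a failure and opens the breaker at the threshold.
func (b *Breaker) RecordFailure() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.failures++
	if b.failures >= b.threshold {
		b.state = "open"
	}
}

// RecordSuccess clears the consecutive failure count.
func (b *Breaker) RecordSuccess() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.failures = 0
}

// State returns the current state.
func (b *Breaker) State() string {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.state
}

// Reset forces the breaker closed and reports the transition, matching
// the previous_state and new_state fields of the response.
func (b *Breaker) Reset() (previous, current string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	previous = b.state
	b.state = "closed"
	b.failures = 0
	return previous, b.state
}

func main() {
	cb := NewBreaker(5)
	for i := 0; i < 5; i++ {
		cb.RecordFailure()
	}
	prev, cur := cb.Reset()
	fmt.Println(prev, cur) // prints open closed
}
```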

Model Management

GET /admin/api/v1/models

List all model configurations.

Response:

{
  "success": true,
  "data": {
    "models": [
      {
        "name": "gpt-4",
        "provider": "openai",
        "provider_model_id": null,
        "metrics": {
          "total_requests": 856,
          "avg_latency_ms": 1234
        }
      },
      {
        "name": "gpt-4-azure",
        "provider": "azure-openai",
        "provider_model_id": "gpt-4-deployment-001",
        "metrics": {
          "total_requests": 234,
          "avg_latency_ms": 987
        }
      }
    ]
  }
}

POST /admin/api/v1/models

Add new model mapping.

Request:

{
  "name": "claude-opus",
  "provider": "anthropic-prod",
  "provider_model_id": "claude-3-opus-20240229"
}

PUT /admin/api/v1/models/{name}

Update model mapping.

DELETE /admin/api/v1/models/{name}

Delete model mapping.

Conversation Management

GET /admin/api/v1/conversations

List conversations with pagination.

Query Parameters:

  • page (int, default: 1)
  • page_size (int, default: 50, max: 200)
  • search (string) - Search by conversation ID or metadata (per FR5.3)
  • sort (string) - Sort field (created_at, updated_at)
  • order (string) - asc or desc

Response:

{
  "success": true,
  "data": {
    "conversations": [
      {
        "id": "conv_abc123",
        "created_at": "2026-03-05T10:00:00Z",
        "updated_at": "2026-03-05T10:15:00Z",
        "message_count": 6,
        "total_tokens": 2456,
        "model": "gpt-4",
        "metadata": {}
      }
    ],
    "pagination": {
      "page": 1,
      "page_size": 50,
      "total_count": 1234,
      "total_pages": 25
    }
  }
}
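The pagination object follows from the query parameters by simple arithmetic. A sketch, assuming out-of-range values are clamped to the documented default (50) and maximum (200) rather than rejected; `Pagination`, `paginate`, and `offset` are hypothetical names.

```go
package main

import "fmt"

// Pagination mirrors the "pagination" object in the response.
type Pagination struct {
	Page       int `json:"page"`
	PageSize   int `json:"page_size"`
	TotalCount int `json:"total_count"`
	TotalPages int `json:"total_pages"`
}

// paginate clamps the inputs and derives the total page count
// with ceiling division.
func paginate(page, pageSize, totalCount int) Pagination {
	if pageSize <= 0 {
		pageSize = 50 // documented default
	}
	if pageSize > 200 {
		pageSize = 200 // documented maximum
	}
	if page < 1 {
		page = 1
	}
	return Pagination{
		Page:       page,
		PageSize:   pageSize,
		TotalCount: totalCount,
		TotalPages: (totalCount + pageSize - 1) / pageSize,
	}
}

// offset returns the zero-based index of the first item on the page.
func offset(p Pagination) int {
	return (p.Page - 1) * p.PageSize
}

func main() {
	p := paginate(1, 50, 1234)
	fmt.Println(p.TotalPages, offset(p)) // prints 25 0
}
```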

GET /admin/api/v1/conversations/{id}

Get conversation details.

Response:

{
  "success": true,
  "data": {
    "id": "conv_abc123",
    "created_at": "2026-03-05T10:00:00Z",
    "updated_at": "2026-03-05T10:15:00Z",
    "messages": [
      {
        "role": "user",
        "content": "Hello",
        "timestamp": "2026-03-05T10:00:00Z"
      },
      {
        "role": "assistant",
        "content": "Hi there!",
        "timestamp": "2026-03-05T10:00:02Z"
      }
    ],
    "metadata": {},
    "total_tokens": 2456
  }
}

DELETE /admin/api/v1/conversations/{id}

Delete specific conversation.

POST /admin/api/v1/conversations/bulk-delete

Bulk delete conversations.

Request:

{
  "criteria": {
    "older_than_days": 30,
    "model": "gpt-3.5-turbo"  // optional filter
  },
  "dry_run": true  // preview without deleting
}

Response:

{
  "success": true,
  "data": {
    "matched_count": 456,
    "deleted_count": 0,  // 0 if dry_run
    "dry_run": true
  }
}
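The dry-run behaviour can be sketched as a single pass that counts matches and deletes only when `dry_run` is false. The example assumes an in-memory map as the store and that age is measured from the last update; the spec does not say whether `older_than_days` refers to creation or update time. All type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Conversation holds only the fields needed for matching.
type Conversation struct {
	ID        string
	Model     string
	UpdatedAt time.Time
}

// Criteria mirrors the "criteria" object of the request.
type Criteria struct {
	OlderThanDays int
	Model         string // empty means any model
}

// bulkDelete counts conversations matching the criteria and removes
// them from the store unless dryRun is set.
func bulkDelete(store map[string]Conversation, c Criteria, dryRun bool, now time.Time) (matched, deleted int) {
	cutoff := now.AddDate(0, 0, -c.OlderThanDays)
	for id, conv := range store {
		if !conv.UpdatedAt.Before(cutoff) {
			continue
		}
		if c.Model != "" && conv.Model != c.Model {
			continue
		}
		matched++
		if !dryRun {
			delete(store, id) // deleting during range is safe in Go
			deleted++
		}
	}
	return matched, deleted
}

// runExample applies the request from the spec to a three-item store.
func runExample(dryRun bool) (matched, deleted, remaining int) {
	now := time.Date(2026, 3, 5, 10, 30, 0, 0, time.UTC)
	store := map[string]Conversation{
		"conv_old":    {ID: "conv_old", Model: "gpt-3.5-turbo", UpdatedAt: now.AddDate(0, 0, -45)},
		"conv_recent": {ID: "conv_recent", Model: "gpt-3.5-turbo", UpdatedAt: now.AddDate(0, 0, -2)},
		"conv_other":  {ID: "conv_other", Model: "gpt-4", UpdatedAt: now.AddDate(0, 0, -90)},
	}
	matched, deleted = bulkDelete(store, Criteria{OlderThanDays: 30, Model: "gpt-3.5-turbo"}, dryRun, now)
	return matched, deleted, len(store)
}

func main() {
	fmt.Println(runExample(true))  // prints 1 0 3
	fmt.Println(runExample(false)) // prints 1 1 2
}
```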

GET /admin/api/v1/conversations/stats

Get conversation statistics.

Response:

{
  "success": true,
  "data": {
    "total_conversations": 1234,
    "total_messages": 7890,
    "total_tokens": 1234567,
    "by_model": {
      "gpt-4": 856,
      "claude-3-opus": 378
    },
    "by_date": [
      { "date": "2026-03-05", "count": 123 },
      { "date": "2026-03-04", "count": 98 }
    ],
    "storage_size_bytes": 52428800
  }
}

Metrics

GET /admin/api/v1/metrics/summary

Get aggregated metrics summary.

Query Parameters:

  • duration (string, default: "1h") - Time window (1m, 5m, 1h, 24h)

Response:

{
  "success": true,
  "data": {
    "time_window": "1h",
    "request_count": 1523,
    "error_count": 12,
    "error_rate": 0.0079,
    "requests_per_second": 0.42,
    "latency": {
      "p50": 234,
      "p90": 567,
      "p95": 876,
      "p99": 1234
    },
    "by_provider": {
      "openai": {
        "request_count": 1200,
        "error_count": 8,
        "avg_latency_ms": 342
      },
      "anthropic": {
        "request_count": 323,
        "error_count": 4,
        "avg_latency_ms": 567
      }
    },
    "by_model": {
      "gpt-4": { "request_count": 856, "error_count": 5 },
      "claude-3-opus": { "request_count": 323, "error_count": 4 }
    }
  }
}

GET /admin/api/v1/metrics/timeseries

Get time-series metrics for charting.

Query Parameters:

  • metric (string) - request_count, error_rate, latency_p95
  • duration (string) - 1h, 6h, 24h, 7d
  • interval (string) - 1m, 5m, 1h
  • provider (string, optional) - Filter by provider
  • model (string, optional) - Filter by model

Response:

{
  "success": true,
  "data": {
    "metric": "request_count",
    "interval": "5m",
    "data_points": [
      { "timestamp": "2026-03-05T10:00:00Z", "value": 42 },
      { "timestamp": "2026-03-05T10:05:00Z", "value": 38 },
      { "timestamp": "2026-03-05T10:10:00Z", "value": 51 }
    ]
  }
}

Logs

GET /admin/api/v1/logs

Get recent logs (last N entries).

Query Parameters:

  • limit (int, default: 100, max: 1000)
  • level (string) - Filter by level (debug, info, warn, error)
  • search (string) - Search in message

Response:

{
  "success": true,
  "data": {
    "logs": [
      {
        "timestamp": "2026-03-05T10:30:15Z",
        "level": "info",
        "message": "Request completed",
        "fields": {
          "method": "POST",
          "path": "/v1/responses",
          "status": 200,
          "duration_ms": 342
        }
      }
    ],
    "total_count": 100,
    "truncated": false
  }
}

GET /admin/api/v1/logs/stream

Stream logs via Server-Sent Events (SSE).

Response: (text/event-stream)

data: {"timestamp":"2026-03-05T10:30:15Z","level":"info","message":"..."}

data: {"timestamp":"2026-03-05T10:30:16Z","level":"error","message":"..."}
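Each SSE event is a `data:` line followed by a blank line, and the handler must flush after every event so the browser receives it immediately. A minimal sketch using only the standard library follows. It assumes log entries arrive on a channel; `LogEntry`, `formatSSE`, and `streamLogs` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// LogEntry mirrors the JSON payload of one streamed event.
type LogEntry struct {
	Timestamp string         `json:"timestamp"`
	Level     string         `json:"level"`
	Message   string         `json:"message"`
	Fields    map[string]any `json:"fields,omitempty"`
}

// formatSSE renders one entry as a single SSE "data" event.
func formatSSE(e LogEntry) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return "data: " + string(b) + "\n\n", nil
}

// streamLogs writes entries from the channel until it is closed or the
// client disconnects.
func streamLogs(entries <-chan LogEntry) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for {
			select {
			case <-r.Context().Done():
				return
			case e, open := <-entries:
				if !open {
					return
				}
				event, err := formatSSE(e)
				if err != nil {
					continue // skip entries that cannot be encoded
				}
				fmt.Fprint(w, event)
				flusher.Flush()
			}
		}
	}
}

func main() {
	ch := make(chan LogEntry, 2)
	ch <- LogEntry{Timestamp: "2026-03-05T10:30:15Z", Level: "info", Message: "Request completed"}
	ch <- LogEntry{Timestamp: "2026-03-05T10:30:16Z", Level: "error", Message: "Provider timeout"}
	close(ch)

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/admin/api/v1/logs/stream", nil)
	streamLogs(ch)(rec, req)
	fmt.Print(rec.Body.String())
}
```

On the browser side, the standard `EventSource` API can consume such a stream. It cannot set an Authorization header, so the authentication approach for this endpoint may need separate consideration.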

Audit Log

GET /admin/api/v1/audit

Get audit log of admin actions.

Query Parameters:

  • page (int)
  • page_size (int)
  • user (string) - Filter by user
  • action (string) - Filter by action type

Response:

{
  "success": true,
  "data": {
    "events": [
      {
        "id": "audit_xyz789",
        "timestamp": "2026-03-05T10:25:00Z",
        "user": "admin@example.com",
        "action": "config.update",
        "resource": "server.address",
        "changes": {
          "old_value": ":8080",
          "new_value": ":8081"
        },
        "ip_address": "192.168.1.100",
        "user_agent": "Mozilla/5.0..."
      }
    ],
    "pagination": { /* ... */ }
  }
}

UI Design

Design Principles

  1. Clarity over Complexity - Show what matters, hide what doesn't
  2. Progressive Disclosure - Surface details on demand
  3. Immediate Feedback - Loading states, success/error messages
  4. Consistency - Reuse patterns across views
  5. Accessibility - Keyboard navigation, screen reader support

Layout Structure

┌────────────────────────────────────────────────────────────┐
│  Header: [Logo] go-llm-gateway Admin  [User] [Dark Mode]  │
├──────────┬─────────────────────────────────────────────────┤
│          │                                                  │
│ Sidebar  │              Main Content Area                  │
│          │                                                  │
│ ☰ Dash   │  ┌─────────────────────────────────────────┐   │
│ 📊 Prov  │  │                                          │   │
│ 🔧 Model │  │                                          │   │
│ ⚙️  Conf │  │                                          │   │
│ 💬 Conv  │  │         Page-Specific Content            │   │
│ 📈 Metr  │  │                                          │   │
│ 📝 Logs  │  │                                          │   │
│ 🖥️  Sys  │  │                                          │   │
│          │  └─────────────────────────────────────────┘   │
│          │                                                  │
└──────────┴─────────────────────────────────────────────────┘

Page Wireframes

1. Dashboard (Home)

┌─────────────────────────────────────────────────────────────┐
│  Dashboard                                                   │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐          │
│  │  Uptime     │ │  Requests   │ │  Error Rate │          │
│  │  2d 14h     │ │  1,523      │ │  0.79%      │          │
│  │  ✓ Healthy  │ │  ↑ 12% 1h   │ │  ↓ 0.3% 1h  │          │
│  └─────────────┘ └─────────────┘ └─────────────┘          │
│                                                              │
│  Provider Status                                            │
│  ┌───────────────────────────────────────────────────────┐ │
│  │ openai       ✓ Healthy     │ 1,200 req │  342ms      │ │
│  │ anthropic    ✓ Healthy     │   323 req │  567ms      │ │
│  │ google       ⚠ Degraded    │     0 req │    0ms      │ │
│  └───────────────────────────────────────────────────────┘ │
│                                                              │
│  Request Rate (Last Hour)                                   │
│  ┌───────────────────────────────────────────────────────┐ │
│  │      📊 [Line Chart]                                   │ │
│  │      requests/sec over time                            │ │
│  └───────────────────────────────────────────────────────┘ │
│                                                              │
│  Recent Activity                                            │
│  ┌───────────────────────────────────────────────────────┐ │
│  │ 10:30:15 INFO  Request completed (gpt-4, 342ms)       │ │
│  │ 10:30:10 INFO  Request completed (claude-3, 567ms)    │ │
│  │ 10:29:58 ERROR Provider timeout (google)              │ │
│  └───────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

2. Providers

┌─────────────────────────────────────────────────────────────┐
│  Providers                            [+ Add Provider]      │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ ┌─┐ openai                                ✓ Healthy    ││
│  │ │▼│ Type: OpenAI                          [Test] [Edit]││
│  │ └─┘ Endpoint: https://api.openai.com/v1  [Delete]      ││
│  │                                                          ││
│  │     Circuit Breaker: Closed (0 failures)                ││
│  │     Metrics: 1,200 requests, 0.67% errors, 342ms avg    ││
│  │     Last request: 2 seconds ago                         ││
│  │                                                          ││
│  │     ┌──────────────────────────────────────────────┐   ││
│  │     │ Request Count:  [Mini chart ↗]               │   ││
│  │     │ Latency P95:    [Mini chart →]               │   ││
│  │     └──────────────────────────────────────────────┘   ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ ┌─┐ anthropic-prod                       ✓ Healthy    ││
│  │ │▶│ Type: Anthropic                      [Test] [Edit]││
│  │ └─┘ Endpoint: https://api.anthropic.com  [Delete]      ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ ┌─┐ google                                ⚠ Degraded   ││
│  │ │▶│ Type: Google Generative AI           [Test] [Edit]││
│  │ └─┘ Circuit Breaker: OPEN (5 failures)   [Delete]      ││
│  │                                           [Reset CB]    ││
│  └────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘

Add/Edit Provider Modal:

┌─────────────────────────────────────────────────────┐
│  Add Provider                              [X]      │
├─────────────────────────────────────────────────────┤
│                                                      │
│  Provider Name *                                    │
│  [openai-prod              ]                        │
│                                                      │
│  Provider Type *                                    │
│  [OpenAI        ▼]                                  │
│                                                      │
│  API Key *                                          │
│  [sk-••••••••••••••••••••xyz]  [Show] [Test]       │
│                                                      │
│  Endpoint (optional)                                │
│  [https://api.openai.com/v1]                        │
│                                                      │
│  ⓘ Leave blank to use default endpoint              │
│                                                      │
│                          [Cancel]  [Save Provider]  │
└─────────────────────────────────────────────────────┘

3. Models

┌─────────────────────────────────────────────────────────────┐
│  Models                                  [+ Add Model]      │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Search: [          🔍]     Filter: [All Providers ▼]       │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ Name              Provider      Model ID      Requests ││
│  ├────────────────────────────────────────────────────────┤│
│  │ gpt-4             openai        (default)     856      ││
│  │ gpt-4-turbo       openai        (default)     432      ││
│  │ gpt-4-azure       azure-openai  gpt4-dep-001  234      ││
│  │ claude-3-opus     anthropic     claude-3-...  323      ││
│  │ claude-3-sonnet   anthropic     claude-3-...  189      ││
│  │ gemini-pro        google        (default)     56       ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  [← Prev]  Page 1 of 1  [Next →]                           │
└─────────────────────────────────────────────────────────────┘

4. Configuration

┌─────────────────────────────────────────────────────────────┐
│  Configuration                                              │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  [Server] [Logging] [Rate Limit] [Auth] [Observability]    │
│  ─────────────────────────────────────────────────────────  │
│                                                              │
│  Server Configuration                                       │
│  ┌────────────────────────────────────────────────────────┐│
│  │                                                          ││
│  │  Listen Address                                         ││
│  │  [:8080              ]                                  ││
│  │                                                          ││
│  │  Max Request Body Size (bytes)                          ││
│  │  [10485760           ]  (10 MB)                         ││
│  │                                                          ││
│  │  Read Timeout (seconds)                                 ││
│  │  [15                 ]                                  ││
│  │                                                          ││
│  │  Write Timeout (seconds)                                ││
│  │  [60                 ]                                  ││
│  │                                                          ││
│  │  Idle Timeout (seconds)                                 ││
│  │  [120                ]                                  ││
│  │                                                          ││
│  │  ⚠ Changing these settings requires a restart           ││
│  │                                                          ││
│  │                       [Reset]  [Save Configuration]     ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  Advanced Options                                           │
│  [View as YAML]  [Export Config]  [Import Config]          │
└─────────────────────────────────────────────────────────────┘

YAML Editor View:

┌─────────────────────────────────────────────────────────────┐
│  Configuration (YAML)              [Switch to Form View]   │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │  1  server:                                            ││
│  │  2    address: ":8080"                                 ││
│  │  3    max_request_body_size: 10485760                  ││
│  │  4                                                      ││
│  │  5  logging:                                           ││
│  │  6    format: "json"                                   ││
│  │  7    level: "info"                                    ││
│  │  8                                                      ││
│  │  9  providers:                                         ││
│  │ 10    openai:                                          ││
│  │ 11      type: "openai"                                 ││
│  │ 12      api_key: "${OPENAI_API_KEY}"                   ││
│  │                                                         ││
│  │ [Syntax highlighting and validation]                   ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  ✓ Configuration is valid                                  │
│                                                              │
│  [Show Diff]  [Validate]  [Save Configuration]             │
└─────────────────────────────────────────────────────────────┘

5. Conversations

┌─────────────────────────────────────────────────────────────┐
│  Conversations                                              │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Search: [conv_abc123     🔍]   [Bulk Delete...]           │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ ID            Created    Messages  Model       Actions ││
│  ├────────────────────────────────────────────────────────┤│
│  │ conv_abc123   2h ago     6         gpt-4       [View]  ││
│  │ conv_def456   3h ago     12        claude-3    [View]  ││
│  │ conv_ghi789   5h ago     3         gpt-4       [View]  ││
│  │ conv_jkl012   1d ago     8         gemini-pro  [View]  ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  [← Prev]  Page 1 of 25 (1,234 total)  [Next →]           │
│                                                              │
│  Statistics                                                 │
│  Total: 1,234 conversations  |  7,890 messages  |  52 MB   │
└─────────────────────────────────────────────────────────────┘

Conversation Detail Modal:

┌─────────────────────────────────────────────────────────────┐
│  Conversation: conv_abc123                    [Delete] [X] │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Created: 2026-03-05 08:15:30  |  Model: gpt-4             │
│  Messages: 6  |  Tokens: 2,456  |  Updated: 08:30:15       │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ 👤 User (08:15:30)                                     ││
│  │ Hello, can you help me with a coding question?         ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ 🤖 Assistant (08:15:32)                                ││
│  │ Of course! I'd be happy to help. What's your question?││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ 👤 User (08:16:10)                                     ││
│  │ How do I implement a binary search in Python?          ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  [... more messages ...]                                    │
│                                                              │
│                                              [Close]        │
└─────────────────────────────────────────────────────────────┘

6. Metrics

┌─────────────────────────────────────────────────────────────┐
│  Metrics                    Time: [Last Hour ▼]  [Refresh] │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Overview                                                   │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐       │
│  │ Total Req    │ │ Requests/sec │ │ Error Rate   │       │
│  │ 1,523        │ │ 0.42         │ │ 0.79%        │       │
│  └──────────────┘ └──────────────┘ └──────────────┘       │
│                                                              │
│  Request Rate                                               │
│  ┌────────────────────────────────────────────────────────┐│
│  │  50 ┤                                                   ││
│  │  40 ┤              ╭─╮                                  ││
│  │  30 ┤         ╭────╯ ╰─╮                               ││
│  │  20 ┤    ╭────╯        ╰──╮                            ││
│  │  10 ┤────╯                ╰────                         ││
│  │   0 ┼────────────────────────────────────              ││
│  │     9:30   10:00   10:30   11:00                       ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  Latency (P95)                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ 1200ms ┤                                                ││
│  │  900ms ┤         ╭─────╮                               ││
│  │  600ms ┤─────────╯     ╰─────────                      ││
│  │  300ms ┤                                                ││
│  │      0 ┼────────────────────────────────────            ││
│  │        9:30   10:00   10:30   11:00                    ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  By Provider                                                │
│  ┌────────────────────────────────────────────────────────┐│
│  │ Provider    Requests  Errors  Avg Latency  P95        ││
│  ├────────────────────────────────────────────────────────┤│
│  │ openai      1,200     8       342ms        876ms      ││
│  │ anthropic   323       4       567ms        1234ms     ││
│  │ google      0         0       -            -          ││
│  └────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘

7. Logs

┌─────────────────────────────────────────────────────────────┐
│  Logs                   [Auto-refresh: ON]  [Download]     │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Level: [All ▼]  Search: [          🔍]                     │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ 10:30:45 INFO  Request completed                       ││
│  │              method=POST path=/v1/responses status=200 ││
│  │              duration=342ms model=gpt-4                ││
│  │                                                         ││
│  │ 10:30:42 INFO  Provider request started                ││
│  │              provider=openai model=gpt-4               ││
│  │                                                         ││
│  │ 10:30:30 ERROR Provider request failed                 ││
│  │              provider=google error="connection timeout"││
│  │              circuit_breaker=open                      ││
│  │                                                         ││
│  │ 10:30:15 INFO  Request completed                       ││
│  │              method=POST path=/v1/responses status=200 ││
│  │                                                         ││
│  │ 10:29:58 WARN  Rate limit exceeded                     ││
│  │              ip=192.168.1.100 path=/v1/responses       ││
│  │                                                         ││
│  │ [... scrollable log entries ...]                       ││
│  │                                                         ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  Showing last 100 entries  |  [Load More]                  │
└─────────────────────────────────────────────────────────────┘

8. System

┌─────────────────────────────────────────────────────────────┐
│  System Information                                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Application                                                │
│  ┌────────────────────────────────────────────────────────┐│
│  │ Version:        1.2.0                                   ││
│  │ Build Time:     2026-03-01 08:00:00 UTC                ││
│  │ Git Commit:     59ded10                                ││
│  │ Go Version:     1.25.7                                 ││
│  │ Platform:       linux/amd64                            ││
│  │ Uptime:         2 days 14 hours 23 minutes             ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  Configuration                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ Config File:    /app/config.yaml                       ││
│  │ Last Modified:  2026-03-05 09:00:00 UTC                ││
│  │ File Size:      4.2 KB                                 ││
│  │ Valid:          ✓ Yes                                  ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  Health Checks                                              │
│  ┌────────────────────────────────────────────────────────┐│
│  │ ✓ Server               Healthy                         ││
│  │ ⚠ Providers            2/3 healthy (google degraded)   ││
│  │ ✓ Conversation Store   Connected (Redis)               ││
│  │ ✓ Metrics              Collecting                      ││
│  │ ✓ Tracing              Enabled (OTLP)                  ││
│  └────────────────────────────────────────────────────────┘│
│                                                              │
│  Operations                                                 │
│  [Download Diagnostic Bundle]  [Restart Service...]        │
│                                                              │
│  Environment (Sanitized)                                    │
│  [View Environment Variables]                              │
└─────────────────────────────────────────────────────────────┘

UI Components Library

Reusable Components:

  1. StatusBadge - Color-coded status indicators

    • Healthy (green), Degraded (yellow), Unhealthy (red), Unknown (gray)
  2. MetricCard - Display single metric with trend

    • Large number, label, trend arrow, sparkline
  3. ProviderCard - Provider summary with expand/collapse

  4. DataTable - Sortable, filterable table with pagination

  5. Chart - Line/bar charts for time-series data

    • Use lightweight charting library (Chart.js or Apache ECharts)
  6. CodeEditor - Syntax-highlighted YAML/JSON editor

    • Monaco Editor (VS Code engine) or CodeMirror
  7. Modal - Overlay dialogs for forms and details

  8. Toast - Success/error notifications

  9. ConfirmDialog - Confirmation for destructive actions


Security

Authentication & Authorization

Authentication:

  • Reuse the existing OIDC/OAuth2 middleware from internal/auth/auth.go
  • All /admin/* routes require a valid JWT
  • Support the same identity providers as the gateway API

Authorization (RBAC):

Introduce role-based access control with two roles:

  1. Admin Role (admin)

    • Full read/write access
    • Can modify configuration
    • Can delete resources (conversations, providers)
    • Can restart service
  2. Viewer Role (viewer)

    • Read-only access
    • Can view all pages
    • Cannot modify configuration
    • Cannot delete resources
    • Cannot restart service

Role Assignment:

  • Roles extracted from JWT claims (e.g., roles or groups claim)
  • Configurable claim name in config.yaml:
    auth:
      enabled: true
      issuer: "https://auth.example.com"
      audience: "gateway-admin"
      roles_claim: "roles"  # JWT claim containing roles
      admin_roles:          # Values that grant admin access
        - "admin"
        - "gateway-admin"
    

Implementation:

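// internal/admin/middleware.go

// RequireRole rejects requests whose JWT lacks requiredRole.
func RequireRole(requiredRole string) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            claims := auth.ClaimsFromContext(r.Context())

            // JSON-decoded claims hold arrays as []interface{}; asserting
            // []string directly would panic, so convert element by element.
            // The claim name should come from auth.roles_claim in config.yaml.
            rawRoles, _ := claims["roles"].([]interface{})
            userRoles := make([]string, 0, len(rawRoles))
            for _, v := range rawRoles {
                if s, ok := v.(string); ok {
                    userRoles = append(userRoles, s)
                }
            }

            if !hasRole(userRoles, requiredRole) {
                http.Error(w, "Forbidden", http.StatusForbidden)
                return
            }

            next.ServeHTTP(w, r)
        })
    }
}

// Usage in routes. hasRole must treat admin as a superset of viewer,
// otherwise admins are locked out of viewer-level routes. Write methods
// on a viewer-level route still need a separate admin check.
mux.Handle("/admin/api/v1/config", RequireRole("admin")(configHandler))
mux.Handle("/admin/api/v1/providers", RequireRole("viewer")(providersHandler))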
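
The role-assignment rules above (a configurable claim name plus a list of values that grant admin access) could be sketched as follows. This is a minimal illustration, not the gateway's actual code: `AuthConfig`, `rolesFromClaims`, and `ResolveRole` are hypothetical names, the claims are assumed to be decoded into a `map[string]interface{}`, and treating every authenticated user without an admin role as a viewer is an assumption the spec does not state.

```go
package main

// AuthConfig mirrors the auth settings shown in config.yaml (hypothetical type).
type AuthConfig struct {
	RolesClaim string   // name of the JWT claim that carries roles, e.g. "roles"
	AdminRoles []string // claim values that grant admin access
}

// rolesFromClaims reads the configured claim and returns its string values.
// JSON-decoded arrays arrive as []interface{}; a single string is also accepted.
func rolesFromClaims(claims map[string]interface{}, claimName string) []string {
	switch v := claims[claimName].(type) {
	case string:
		return []string{v}
	case []interface{}:
		roles := make([]string, 0, len(v))
		for _, item := range v {
			if s, ok := item.(string); ok {
				roles = append(roles, s)
			}
		}
		return roles
	default:
		return nil
	}
}

// ResolveRole returns "admin" if any claim value is listed in AdminRoles,
// otherwise "viewer" (assumption: every authenticated user may at least view).
func ResolveRole(cfg AuthConfig, claims map[string]interface{}) string {
	for _, role := range rolesFromClaims(claims, cfg.RolesClaim) {
		for _, admin := range cfg.AdminRoles {
			if role == admin {
				return "admin"
			}
		}
	}
	return "viewer"
}
```

Resolving the role once per request and storing it in the request context would let both the middleware and the handlers reuse it.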

Input Validation & Sanitization

Configuration Validation:

  • Validate all config changes before applying
  • Use strong typing (Go structs) for validation
  • Reject invalid YAML syntax
  • Validate provider-specific fields (API key format, endpoint URLs)
  • Prevent path traversal in file operations

API Input Validation:

  • Validate all request bodies against expected schemas
  • Sanitize user input (conversation search, log search)
  • Limit input sizes (prevent DoS via large payloads)
  • Validate pagination parameters (reject zero or negative page numbers and oversized page sizes)

Secret Management

Masking Secrets:

  • Always mask API keys and sensitive values in UI displays
  • Show format: sk-*********************xyz (first 3 + last 3 chars)
  • Never log full API keys in audit logs
  • Sanitize secrets before returning in API responses
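
The masking rule above (first 3 + last 3 characters visible) could be implemented along these lines. `MaskSecret` is a hypothetical helper, and fully masking values of 8 characters or fewer is an added assumption so that short secrets do not reveal most of their content.

```go
package main

import "strings"

// MaskSecret keeps the first 3 and last 3 characters and replaces the rest
// with '*'. Values of 8 characters or fewer (an assumed threshold) are masked
// entirely, because showing 6 of them would reveal most of the secret.
func MaskSecret(secret string) string {
	runes := []rune(secret)
	if len(runes) <= 8 {
		return strings.Repeat("*", len(runes))
	}
	return string(runes[:3]) + strings.Repeat("*", len(runes)-6) + string(runes[len(runes)-3:])
}
```

Applying the helper at the point where API responses and audit records are built, rather than in the frontend, keeps the full key from ever leaving the server.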

Storage:

  • config.yaml stores environment variable references (e.g., ${OPENAI_API_KEY}) rather than literal secret values
  • Never commit secrets to version control
  • Support secret management systems (future: Vault, AWS Secrets Manager)

CSRF Protection

Protection Strategy:

  • Generate CSRF token on admin UI load
  • Include token in all state-changing requests (POST, PUT, DELETE)
  • Validate token on server before processing request
  • Use SameSite cookies for additional protection

Implementation:

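// Double Submit Cookie pattern (uses crypto/subtle for the comparison)
func CSRFMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Safe methods do not change state; OPTIONS covers CORS preflight requests.
        switch r.Method {
        case http.MethodGet, http.MethodHead, http.MethodOptions:
            next.ServeHTTP(w, r)
            return
        }

        tokenHeader := r.Header.Get("X-CSRF-Token")
        tokenCookie, err := r.Cookie("csrf_token")

        // Constant-time comparison avoids leaking the token through timing.
        if err != nil || tokenHeader == "" ||
            subtle.ConstantTimeCompare([]byte(tokenHeader), []byte(tokenCookie.Value)) != 1 {
            http.Error(w, "CSRF token mismatch", http.StatusForbidden)
            return
        }

        next.ServeHTTP(w, r)
    })
}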
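
The middleware only validates the token; issuing it on admin UI load could be sketched as below. The helper names, the 32-byte token length, and the `/admin` cookie path are assumptions. The cookie is deliberately not HttpOnly, because the double-submit pattern requires the frontend to read the value and echo it in the X-CSRF-Token header.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"net/http"
)

// newCSRFToken returns 32 random bytes encoded as unpadded URL-safe base64.
func newCSRFToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// issueCSRFCookie sets the csrf_token cookie that the CSRF middleware
// compares against the X-CSRF-Token request header.
func issueCSRFCookie(w http.ResponseWriter) (string, error) {
	token, err := newCSRFToken()
	if err != nil {
		return "", err
	}
	http.SetCookie(w, &http.Cookie{
		Name:     "csrf_token",
		Value:    token,
		Path:     "/admin", // assumed to match admin.base_path
		Secure:   true,     // only sent over HTTPS
		HttpOnly: false,    // the frontend must read it to set the header
		SameSite: http.SameSiteStrictMode,
	})
	return token, nil
}
```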

Content Security Policy

CSP Headers (the # annotations are explanatory and must not appear in the actual header value):

Content-Security-Policy:
  default-src 'self';
  script-src 'self' 'unsafe-inline';  # Allow inline Vue scripts
  style-src 'self' 'unsafe-inline';   # Allow inline styles
  img-src 'self' data:;
  connect-src 'self';                 # API calls to same origin
  frame-ancestors 'none';             # Prevent clickjacking
  base-uri 'self';
  form-action 'self';

Rate Limiting

Admin API Rate Limiting:

  • Separate rate limits for admin API vs gateway API
  • Higher limits for read operations, lower for writes
  • Per-user rate limiting (based on JWT subject)
  • Example: 100 req/min for reads, 20 req/min for writes

Audit Logging

Log All Admin Actions:

  • Configuration changes (before/after values)
  • Provider additions/deletions
  • Model changes
  • Bulk deletions
  • Service restarts
  • Authentication failures

Audit Log Format:

{
  "timestamp": "2026-03-05T10:25:00Z",
  "event_type": "config.update",
  "user": "admin@example.com",
  "user_ip": "192.168.1.100",
  "resource": "providers.openai.api_key",
  "action": "update",
  "old_value": "sk-***old***",
  "new_value": "sk-***new***",
  "success": true,
  "error": null
}

Storage:

  • Write to separate audit log file (/var/log/gateway-audit.log)
  • Structured JSON format for easy parsing
  • Rotate logs daily, retain for 90 days
  • Optional: Send to external SIEM system

TLS/HTTPS

Production Requirements:

  • Admin UI MUST be served over HTTPS in production
  • Support TLS 1.2+ only
  • Strong cipher suites only
  • HSTS headers: Strict-Transport-Security: max-age=31536000; includeSubDomains

Configuration:

server:
  address: ":8443"
  tls:
    enabled: true
    cert_file: "/etc/gateway/tls/cert.pem"
    key_file: "/etc/gateway/tls/key.pem"

Implementation Phases

Phase 1: Foundation (Week 1)

Goal: Basic admin API and static UI serving

Backend Tasks:

  1. Create internal/admin/ package structure
  2. Implement basic HTTP handlers for system info and health
  3. Add static file serving for admin UI assets (using embed.FS)
  4. Set up admin-specific middleware (auth, CORS, CSRF)
  5. Implement audit logging infrastructure

Frontend Tasks:

  1. Set up Vue 3 + Vite project in frontend/admin/
  2. Create basic layout (header, sidebar, main content)
  3. Implement routing (Vue Router)
  4. Create API client wrapper (Axios)
  5. Build Dashboard page (system info, health status)

Deliverables:

  • Admin UI accessible at /admin/
  • System info and health endpoints working
  • Basic authentication enforced
  • Static assets served from embedded FS

Phase 2: Configuration Management (Week 2)

Goal: View and edit configuration

Backend Tasks:

  1. Create internal/configmanager/ package
  2. Implement config CRUD operations
  3. Add config validation logic
  4. Implement diff generation
  5. Add config export/import endpoints
  6. Implement hot-reload for config changes (where possible)

Frontend Tasks:

  1. Build Configuration page with tabbed interface
  2. Implement form-based config editor
  3. Build YAML editor with syntax highlighting (Monaco Editor)
  4. Add config validation UI
  5. Implement diff viewer before applying changes
  6. Add export/import functionality

Deliverables:

  • View current configuration (sanitized)
  • Edit configuration via forms or YAML
  • Validate configuration before saving
  • Preview changes before applying
  • Export configuration as YAML file

Phase 3: Provider & Model Management (Week 3)

Goal: Manage providers and models

Backend Tasks:

  1. Implement provider CRUD endpoints
  2. Add provider test connectivity endpoint
  3. Implement circuit breaker reset endpoint
  4. Add model CRUD endpoints
  5. Aggregate provider metrics from Prometheus

Frontend Tasks:

  1. Build Providers page with expandable cards
  2. Implement provider add/edit forms
  3. Add provider connection testing
  4. Display provider metrics and circuit breaker status
  5. Build Models page with data table
  6. Implement model add/edit functionality

Deliverables:

  • List all providers with status
  • Add/edit/delete providers
  • Test provider connectivity
  • Reset circuit breakers
  • Manage model mappings

Phase 4: Metrics & Monitoring (Week 4)

Goal: Real-time metrics visualization

Backend Tasks:

  1. Implement metrics aggregation endpoints
  2. Add time-series data endpoints
  3. Implement metrics filtering (by provider, model)
  4. Add circuit breaker state change history

Frontend Tasks:

  1. Build Metrics page with charts (Chart.js)
  2. Implement real-time metrics (auto-refresh)
  3. Add interactive time range selection
  4. Build provider-specific metric views
  5. Add latency percentile charts

Deliverables:

  • Real-time request rate charts
  • Error rate visualization
  • Latency percentile charts
  • Provider-specific metrics
  • Auto-refreshing dashboard

Phase 5: Conversations & Logs (Week 5)

Goal: Conversation management and log viewing

Backend Tasks:

  1. Implement internal/logbuffer/ for log buffering
  2. Add conversation list/search endpoints
  3. Implement conversation detail endpoint
  4. Add bulk delete functionality
  5. Implement log streaming (SSE)

Frontend Tasks:

  1. Build Conversations page with pagination
  2. Implement conversation search
  3. Add conversation detail modal
  4. Build bulk delete interface
  5. Build Logs page with filtering
  6. Implement real-time log streaming

Deliverables:

  • Browse and search conversations
  • View conversation details
  • Delete conversations (single and bulk)
  • View application logs with filtering
  • Real-time log streaming

Phase 6: Polish & Production Readiness (Week 6)

Goal: Security hardening, testing, documentation

Tasks:

  1. Implement RBAC (admin vs viewer roles)
  2. Add comprehensive input validation
  3. Implement CSRF protection
  4. Add CSP headers
  5. Write unit tests (backend handlers)
  6. Write integration tests (API endpoints)
  7. Add E2E tests (Playwright)
  8. Performance optimization (bundle size, lazy loading)
  9. Accessibility audit and fixes
  10. Documentation (user guide, API docs)
  11. Docker image updates (include frontend build)

Deliverables:

  • Production-ready security hardening
  • Comprehensive test coverage
  • Performance optimized
  • Fully documented
  • Docker deployment ready

Testing Strategy

Backend Testing

Unit Tests:

  • Test all handler functions with mock dependencies
  • Test config validation logic
  • Test audit logging
  • Target: 80%+ code coverage

Integration Tests:

  • Test API endpoints with real HTTP requests
  • Test authentication/authorization flows
  • Test RBAC enforcement
  • Test configuration hot-reload

Example:

func TestProviderHandler(t *testing.T) {
    tests := []struct {
        name           string
        method         string
        path           string
        body           string
        expectedStatus int
    }{
        {
            name:           "List providers",
            method:         "GET",
            path:           "/admin/api/v1/providers",
            expectedStatus: http.StatusOK,
        },
        {
            name:           "Add provider",
            method:         "POST",
            path:           "/admin/api/v1/providers",
            body:           `{"name":"test","type":"openai","config":{"api_key":"sk-test"}}`,
            expectedStatus: http.StatusCreated,
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            // Test implementation
        })
    }
}

Frontend Testing

Unit Tests (Vitest):

  • Test Vue components in isolation
  • Test API client functions
  • Test utility functions
  • Target: 70%+ component coverage

Component Tests:

  • Test user interactions
  • Test form validation
  • Test state management (Pinia stores)

E2E Tests (Playwright):

  • Test complete user workflows
  • Test authentication flow
  • Test config editing flow
  • Test provider management

Example:

// tests/e2e/providers.spec.ts
test('should add new provider', async ({ page }) => {
  await page.goto('/admin/providers');
  await page.click('text=Add Provider');
  await page.fill('input[name="name"]', 'test-provider');
  await page.selectOption('select[name="type"]', 'openai');
  await page.fill('input[name="api_key"]', 'sk-test-key');
  await page.click('button:has-text("Save Provider")');

  await expect(page.locator('.toast-success')).toBeVisible();
  await expect(page.locator('text=test-provider')).toBeVisible();
});

Performance Testing

Load Testing:

  • Test admin API under load (Apache Bench, k6)
  • Ensure < 1% CPU overhead while the admin UI is active
  • Test with 10 concurrent admin users
  • Verify no impact on gateway API performance

Frontend Performance:

  • Lighthouse audit (target: 90+ performance score)
  • Bundle size analysis (target: < 500KB gzipped)
  • Time to Interactive (target: < 2s)

Security Testing

Automated Scans:

  • OWASP ZAP scan for common vulnerabilities
  • npm audit (frontend) and govulncheck (Go) for dependency vulnerabilities
  • CodeQL static analysis

Manual Testing:

  • Test RBAC enforcement
  • Test CSRF protection
  • Test secret masking
  • Test input validation
  • Test audit logging

Deployment

Build Process

Frontend Build:

cd frontend/admin
npm install
npm run build  # Outputs to frontend/admin/dist/

Embed Frontend in Go Binary:

// internal/admin/assets.go
package admin

import "embed"

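// Embed paths are relative to this package directory, so the build must
// place the Vite output in internal/admin/frontend/dist before compiling.
//go:embed frontend/dist/*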
var frontendAssets embed.FS
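
Serving the embedded assets for a single-page app usually needs a fallback to index.html so that client-side routes resolve in the browser. The sketch below shows one way to do that; `assetPath` and `spaHandler` are hypothetical names, and `http.ServeFileFS` requires Go 1.22 or newer.

```go
package main

import (
	"io/fs"
	"net/http"
	"path"
	"strings"
)

// assetPath maps a request path to the file to serve: the requested asset if
// it exists, otherwise index.html so Vue Router can handle the route.
func assetPath(urlPath string, exists func(name string) bool) string {
	// path.Clean on a rooted path removes any ".." elements.
	name := strings.TrimPrefix(path.Clean("/"+urlPath), "/")
	if name != "" && exists(name) {
		return name
	}
	return "index.html"
}

// spaHandler serves files from fsys with an index.html fallback.
func spaHandler(fsys fs.FS) http.Handler {
	exists := func(name string) bool {
		info, err := fs.Stat(fsys, name)
		return err == nil && !info.IsDir()
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeFileFS(w, r, fsys, assetPath(r.URL.Path, exists))
	})
}
```

With the embedded variable above, the handler could be wired up by calling `fs.Sub(frontendAssets, "frontend/dist")` and mounting the result with `mux.Handle("/admin/", http.StripPrefix("/admin", spaHandler(sub)))`.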

Full Build:

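# Build frontend
cd frontend/admin && npm run build && cd ../..

# Copy the build output to the path that go:embed reads (relative to internal/admin/)
rm -rf internal/admin/frontend/dist
mkdir -p internal/admin/frontend
cp -r frontend/admin/dist internal/admin/frontend/dist

# Build Go binary (includes embedded frontend)
go build -o gateway ./cmd/gateway

# Result: Single binary with admin UI embedded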

Docker Image

Updated Dockerfile:

# Stage 1: Build frontend
FROM node:20-alpine AS frontend-builder
WORKDIR /app/frontend/admin
COPY frontend/admin/package*.json ./
RUN npm ci
COPY frontend/admin/ ./
RUN npm run build

# Stage 2: Build Go binary
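FROM golang:1.25.7-alpine AS go-builder
# CGO_ENABLED=1 below requires a C toolchain, which the Alpine Go image does not include
RUN apk add --no-cache gcc musl-dev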
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
COPY --from=frontend-builder /app/frontend/admin/dist ./internal/admin/frontend/dist
RUN CGO_ENABLED=1 go build -o gateway ./cmd/gateway

# Stage 3: Runtime
FROM alpine:3.19
RUN apk --no-cache add ca-certificates
WORKDIR /app
COPY --from=go-builder /app/gateway /app/gateway
COPY config.example.yaml /app/config.yaml
EXPOSE 8080
USER 1000:1000
ENTRYPOINT ["/app/gateway"]

Build Command:

docker build -t go-llm-gateway:latest .

Kubernetes Deployment

Updated Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gateway
  template:
    metadata:
      labels:
        app: gateway
    spec:
      containers:
      - name: gateway
        image: go-llm-gateway:latest
        ports:
        - containerPort: 8080
          name: http
        env:
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: gateway-secrets
              key: openai-api-key
        volumeMounts:
        - name: config
          mountPath: /app/config.yaml
          subPath: config.yaml
      volumes:
      - name: config
        configMap:
          name: gateway-config
---
apiVersion: v1
kind: Service
metadata:
  name: gateway
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
    name: http
  selector:
    app: gateway
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gateway-admin
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - admin.gateway.example.com
    secretName: gateway-admin-tls
  rules:
  - host: admin.gateway.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: gateway
            port:
              number: 80

Configuration Management

Production Config:

# config.yaml
server:
  address: ":8080"
  tls:
    enabled: false  # Terminated at ingress

auth:
  enabled: true
  issuer: "https://auth.example.com"
  audience: "gateway-admin"
  roles_claim: "roles"
  admin_roles: ["admin", "gateway-admin"]

admin:
  enabled: true
  base_path: "/admin"
  cors:
    allowed_origins:
      - "https://admin.gateway.example.com"
    allowed_methods: ["GET", "POST", "PUT", "DELETE"]
    allowed_headers: ["Authorization", "Content-Type", "X-CSRF-Token"]

Monitoring

Prometheus Metrics:

New metrics for the admin UI:

# Admin API request count
gateway_admin_requests_total{endpoint, method, status}

# Admin API request duration
gateway_admin_request_duration_seconds{endpoint, method}

# Configuration changes
gateway_admin_config_changes_total{user, resource}

# Authentication failures
gateway_admin_auth_failures_total{reason}
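
The request counter above could be fed by a small HTTP middleware. The sketch below uses only the standard library and a plain map as a stand-in for a Prometheus counter vector; a real implementation would presumably use the project's existing metrics client instead. All type and function names, and the route pattern in the demo, are hypothetical.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"strconv"
	"sync"
)

// labels mirrors the endpoint, method, and status labels of
// gateway_admin_requests_total.
type labels struct{ endpoint, method, status string }

// requestCounter is a stdlib stand-in for a labeled counter.
type requestCounter struct {
	mu     sync.Mutex
	counts map[labels]int
}

// statusWriter remembers the status code written by the wrapped handler.
type statusWriter struct {
	http.ResponseWriter
	status int
}

func (w *statusWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// instrument counts one observation per request. endpoint should be the
// route pattern, not the raw URL path, to keep label cardinality bounded.
func (c *requestCounter) instrument(endpoint string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		sw := &statusWriter{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(sw, r)
		c.mu.Lock()
		defer c.mu.Unlock()
		if c.counts == nil {
			c.counts = make(map[labels]int)
		}
		c.counts[labels{endpoint, r.Method, strconv.Itoa(sw.status)}]++
	})
}

// demoCount sends two requests for different providers through one
// instrumented route and returns the count recorded for GET/404.
func demoCount() int {
	counter := &requestCounter{}
	const route = "/admin/api/providers/{name}" // hypothetical route pattern
	h := counter.instrument(route, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNotFound)
	}))
	for _, path := range []string{"/admin/api/providers/a", "/admin/api/providers/b"} {
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, path, nil))
	}
	return counter.counts[labels{route, http.MethodGet, "404"}]
}
```

Both requests land in the same series because the label is the route pattern. The same cardinality concern applies to the `user` label on `gateway_admin_config_changes_total`, which stays manageable only while the set of admin users is small.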

Grafana Dashboard:

Create a dedicated admin UI dashboard with panels for:

  • Admin API request rate
  • Admin API error rate
  • Configuration change timeline
  • Active admin sessions
  • Authentication failures

Backup & Recovery

Configuration Backup:

  • Automatic backup before applying config changes
  • Stored in /app/backups/config.yaml.TIMESTAMP.bak
  • Retain last 10 backups
  • Restore via UI or CLI

Audit Log Backup:

  • Rotate audit logs daily
  • Compress and archive old logs
  • Retain for 90 days (configurable)
  • Optional: Ship to external storage (S3, GCS)

Future Enhancements

Phase 2 Features (Post-MVP)

  1. Multi-Instance Management

    • Manage multiple gateway instances from a single UI
    • Fleet view with aggregate metrics
    • Centralized configuration management
  2. Advanced Monitoring

    • Custom alerting rules
    • Anomaly detection (ML-based)
    • Cost tracking per provider/model
    • Token usage forecasting
  3. Enhanced Security

    • SSO integration (SAML, LDAP)
    • Fine-grained permissions (resource-level RBAC)
    • API key rotation automation
    • Secret management integration (HashiCorp Vault)
  4. Configuration Templates

    • Pre-built provider templates
    • Environment-specific configs (dev, staging, prod)
    • Config versioning and rollback
    • Git integration for config-as-code
  5. Testing & Debugging

    • Interactive API playground (Swagger UI style)
    • Request/response inspector
    • Provider response comparison
    • Load testing tools
  6. Conversation Analytics

    • Conversation analytics dashboard
    • Topic clustering
    • Sentiment analysis
    • Export conversations to CSV/JSON
  7. User Management

    • Multi-user support (not just admins)
    • Team workspaces
    • Usage quotas per user/team
    • Billing integration
  8. Notifications

    • Email/Slack alerts for errors
    • Webhook support for events
    • Scheduled reports (daily/weekly summaries)
  9. Mobile Support

    • Progressive Web App (PWA)
    • Native mobile app (React Native)
    • Push notifications
  10. AI-Powered Features

    • Automatic provider selection based on query type
    • Cost optimization suggestions
    • Performance recommendations
    • Anomaly detection in logs

Technical Debt & Improvements

  1. Performance Optimizations

    • Server-side pagination for large datasets
    • Caching layer (Redis) for metrics
    • WebSocket for real-time updates (replace polling)
    • GraphQL API (alternative to REST)
  2. Developer Experience

    • Admin API SDK (TypeScript, Python)
    • Terraform provider for config management
    • CLI tool for admin operations
    • OpenAPI/Swagger spec for API
  3. Observability

    • Distributed tracing for admin operations
    • Request correlation IDs
    • Detailed error tracking (Sentry integration)
    • User session replay (LogRocket style)
  4. Internationalization

    • Multi-language UI support
    • Localized date/time formats
    • Currency formatting for costs

Appendix

Technology Choices Rationale

Why Vue 3?

  • Lightweight (smaller baseline bundle than React + ReactDOM; the exact gap depends on version and tree-shaking)
  • Progressive framework (can start simple, add complexity as needed)
  • Excellent TypeScript support
  • Single-file components (easy to understand)
  • Strong ecosystem (Vue Router, Pinia)

Why embed.FS?

  • Single binary deployment (no separate asset hosting)
  • Simplifies Docker images
  • No CDN dependencies
  • Faster initial load (no external requests)

Why Monaco Editor?

  • Full VS Code editing experience
  • Excellent YAML/JSON support
  • Syntax validation built-in
  • Auto-completion

Why Chart.js?

  • Simple API
  • Good performance for real-time updates
  • Small bundle size (~40KB)
  • Responsive by default

Alternative Architectures Considered

  1. Server-Side Rendering (SSR)

    • Pros: Better SEO, faster initial load
    • Cons: More complex deployment, slower interactions
    • Decision: Not needed for admin UI (auth-required, no SEO needs)
  2. Separate Admin Service

    • Pros: True separation of concerns, independent scaling
    • Cons: More infrastructure, harder deployment, network latency
    • Decision: Embedded admin (simpler, one binary)
  3. GraphQL API

    • Pros: Flexible queries, reduced over-fetching
    • Cons: Added complexity, overkill for admin use case
    • Decision: REST API (simpler, adequate)
  4. WebSockets for Real-Time

    • Pros: True bi-directional real-time
    • Cons: Connection management complexity, harder to scale
    • Decision: SSE + polling (simpler, sufficient)
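
For the SSE half of that decision, the wire format is simple enough to sketch. The helper below renders one Server-Sent Events frame; the name `formatSSE` and the `metrics` event name used in the note are hypothetical.

```go
package main

import "strings"

// formatSSE renders one Server-Sent Events frame. Each line of data gets its
// own "data:" field, and a blank line terminates the event. The event name
// must not contain newlines.
func formatSSE(event, data string) string {
	var b strings.Builder
	if event != "" {
		b.WriteString("event: " + event + "\n")
	}
	for _, line := range strings.Split(data, "\n") {
		b.WriteString("data: " + line + "\n")
	}
	b.WriteString("\n")
	return b.String()
}
```

A handler would set `Content-Type: text/event-stream`, write a frame such as `formatSSE("metrics", payload)` on each tick, and flush through `http.Flusher` so the browser's `EventSource` receives it immediately. Polling remains the fallback for clients or proxies that buffer streamed responses.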

Security Considerations Summary

| Threat | Mitigation |
| --- | --- |
| Unauthorized access | OIDC authentication required |
| Privilege escalation | RBAC with admin/viewer roles |
| CSRF attacks | Double-submit cookie pattern |
| XSS attacks | CSP headers, Vue auto-escaping |
| Secret exposure | Mask secrets in UI, audit logs |
| Injection attacks | Input validation, parameterized queries |
| DoS attacks | Rate limiting, request size limits |
| Man-in-the-middle | HTTPS/TLS required in production |
| Session hijacking | Secure cookies, short JWT expiry |
| Brute force auth | Rate limiting on auth endpoints |

Performance Benchmarks (Targets)

| Metric | Target | Notes |
| --- | --- | --- |
| Dashboard load time | < 2s | On modern browsers, 4G network |
| API response time (p95) | < 500ms | For most endpoints |
| Concurrent admin users | 10+ | Without degradation |
| CPU overhead | < 1% | When admin UI active |
| Memory overhead | < 50MB | For admin UI components |
| Frontend bundle size | < 500KB | Gzipped, with code splitting |
| Time to Interactive (TTI) | < 3s | Lighthouse metric |

Success Metrics

Adoption Metrics

  • Number of active admin users per week
  • Frequency of configuration changes
  • Time spent in admin UI per session

Efficiency Metrics

  • Reduction in configuration errors (target: 50%)
  • Time to configure new provider (target: < 2 minutes)
  • Time to diagnose issues (target: < 5 minutes)

Reliability Metrics

  • Admin UI uptime (target: 99.9%)
  • No measurable impact on gateway API performance (within the CPU and memory overhead targets in Performance Benchmarks)
  • Admin API error rate (target: < 0.1%)

User Satisfaction

  • User feedback score (target: 4.5/5)
  • Feature adoption rate (target: 80% use within 1 month)
  • Support ticket reduction (target: 30% reduction)


Document Version: 1.0 Last Updated: 2026-03-05 Authors: Development Team Status: Draft - Pending Review