867 lines
19 KiB
Markdown
867 lines
19 KiB
Markdown
# Kubernetes Deployment Guide
|
|
|
|
> Production-ready Kubernetes manifests for deploying the LLM Gateway with high availability, monitoring, and security.
|
|
|
|
## Table of Contents
|
|
|
|
- [Quick Start](#quick-start)
|
|
- [Prerequisites](#prerequisites)
|
|
- [Deployment](#deployment)
|
|
- [Configuration](#configuration)
|
|
- [Secrets Management](#secrets-management)
|
|
- [Monitoring](#monitoring)
|
|
- [Storage Options](#storage-options)
|
|
- [Scaling](#scaling)
|
|
- [Updates and Rollbacks](#updates-and-rollbacks)
|
|
- [Security](#security)
|
|
- [Cloud Provider Guides](#cloud-provider-guides)
|
|
- [Troubleshooting](#troubleshooting)
|
|
|
|
## Quick Start
|
|
|
|
Deploy with default settings using pre-built images:
|
|
|
|
```bash
|
|
# Update kustomization.yaml with your image
|
|
cd k8s/
|
|
vim kustomization.yaml # Set image to ghcr.io/yourusername/llm-gateway:v1.0.0
|
|
|
|
# Create secrets
|
|
kubectl create namespace llm-gateway
|
|
kubectl create secret generic llm-gateway-secrets \
|
|
--from-literal=OPENAI_API_KEY="sk-your-key" \
|
|
--from-literal=ANTHROPIC_API_KEY="sk-ant-your-key" \
|
|
--from-literal=GOOGLE_API_KEY="your-key" \
|
|
-n llm-gateway
|
|
|
|
# Deploy
|
|
kubectl apply -k .
|
|
|
|
# Verify
|
|
kubectl get pods -n llm-gateway
|
|
kubectl logs -n llm-gateway -l app=llm-gateway
|
|
```
|
|
|
|
## Prerequisites
|
|
|
|
- **Kubernetes**: v1.24+ cluster
|
|
- **kubectl**: Configured and authenticated
|
|
- **Container images**: Access to `ghcr.io/yourusername/llm-gateway`
|
|
|
|
**Optional but recommended:**
|
|
- **Prometheus Operator**: For metrics and alerting
|
|
- **cert-manager**: For automatic TLS certificates
|
|
- **Ingress Controller**: nginx, ALB, or GCE
|
|
- **External Secrets Operator**: For secrets management
|
|
|
|
## Deployment
|
|
|
|
### Using Kustomize (Recommended)
|
|
|
|
```bash
|
|
# Review and customize
|
|
cd k8s/
|
|
vim kustomization.yaml # Update image, namespace, etc.
|
|
vim configmap.yaml # Configure gateway settings
|
|
vim ingress.yaml # Set your domain
|
|
|
|
# Deploy all resources
|
|
kubectl apply -k .
|
|
|
|
# Deploy with Kustomize overlays
|
|
kubectl apply -k overlays/production/
|
|
```
|
|
|
|
### Using kubectl
|
|
|
|
```bash
|
|
kubectl apply -f namespace.yaml
|
|
kubectl apply -f serviceaccount.yaml
|
|
kubectl apply -f secret.yaml
|
|
kubectl apply -f configmap.yaml
|
|
kubectl apply -f redis.yaml
|
|
kubectl apply -f deployment.yaml
|
|
kubectl apply -f service.yaml
|
|
kubectl apply -f ingress.yaml
|
|
kubectl apply -f hpa.yaml
|
|
kubectl apply -f pdb.yaml
|
|
kubectl apply -f networkpolicy.yaml
|
|
```
|
|
|
|
### With Monitoring
|
|
|
|
If Prometheus Operator is installed:
|
|
|
|
```bash
|
|
kubectl apply -f servicemonitor.yaml
|
|
kubectl apply -f prometheusrule.yaml
|
|
```
|
|
|
|
## Configuration
|
|
|
|
### Image Configuration
|
|
|
|
Update `kustomization.yaml`:
|
|
|
|
```yaml
|
|
images:
|
|
- name: llm-gateway
|
|
newName: ghcr.io/yourusername/llm-gateway
|
|
newTag: v1.2.3 # Or 'latest', 'main', 'sha-abc123'
|
|
```
|
|
|
|
### Gateway Configuration
|
|
|
|
Edit `configmap.yaml` for gateway settings:
|
|
|
|
```yaml
|
|
apiVersion: v1
|
|
kind: ConfigMap
|
|
metadata:
|
|
name: llm-gateway-config
|
|
data:
|
|
config.yaml: |
|
|
server:
|
|
address: ":8080"
|
|
|
|
logging:
|
|
level: info
|
|
format: json
|
|
|
|
rate_limit:
|
|
enabled: true
|
|
requests_per_second: 10
|
|
burst: 20
|
|
|
|
observability:
|
|
enabled: true
|
|
metrics:
|
|
enabled: true
|
|
tracing:
|
|
enabled: true
|
|
exporter:
|
|
type: otlp
|
|
endpoint: tempo:4317
|
|
|
|
conversations:
|
|
store: redis
|
|
dsn: redis://redis:6379/0
|
|
ttl: 1h
|
|
```
|
|
|
|
### Resource Limits
|
|
|
|
Default resources (adjust based on load testing):
|
|
|
|
```yaml
|
|
resources:
|
|
requests:
|
|
cpu: 100m
|
|
memory: 128Mi
|
|
limits:
|
|
cpu: 1000m
|
|
memory: 512Mi
|
|
```
|
|
|
|
### Ingress Configuration
|
|
|
|
Edit `ingress.yaml` for your domain:
|
|
|
|
```yaml
|
|
apiVersion: networking.k8s.io/v1
|
|
kind: Ingress
|
|
metadata:
|
|
name: llm-gateway
|
|
annotations:
|
|
cert-manager.io/cluster-issuer: letsencrypt-prod
|
|
nginx.ingress.kubernetes.io/ssl-redirect: "true"
|
|
spec:
|
|
ingressClassName: nginx
|
|
tls:
|
|
- hosts:
|
|
- llm-gateway.yourdomain.com
|
|
secretName: llm-gateway-tls
|
|
rules:
|
|
- host: llm-gateway.yourdomain.com
|
|
http:
|
|
paths:
|
|
- path: /
|
|
pathType: Prefix
|
|
backend:
|
|
service:
|
|
name: llm-gateway
|
|
port:
|
|
number: 80
|
|
```
|
|
|
|
## Secrets Management
|
|
|
|
### Option 1: kubectl (Development)
|
|
|
|
```bash
|
|
kubectl create secret generic llm-gateway-secrets \
|
|
--from-literal=OPENAI_API_KEY="sk-..." \
|
|
--from-literal=ANTHROPIC_API_KEY="sk-ant-..." \
|
|
--from-literal=GOOGLE_API_KEY="..." \
|
|
--from-literal=OIDC_AUDIENCE="your-client-id" \
|
|
-n llm-gateway
|
|
```
|
|
|
|
### Option 2: External Secrets Operator (Production)
|
|
|
|
Install ESO, then create ExternalSecret:
|
|
|
|
```yaml
|
|
apiVersion: external-secrets.io/v1beta1
|
|
kind: ExternalSecret
|
|
metadata:
|
|
name: llm-gateway-secrets
|
|
namespace: llm-gateway
|
|
spec:
|
|
refreshInterval: 1h
|
|
secretStoreRef:
|
|
name: aws-secretsmanager # or vault, gcpsm, etc.
|
|
kind: ClusterSecretStore
|
|
target:
|
|
name: llm-gateway-secrets
|
|
data:
|
|
- secretKey: OPENAI_API_KEY
|
|
remoteRef:
|
|
key: llm-gateway/openai-key
|
|
- secretKey: ANTHROPIC_API_KEY
|
|
remoteRef:
|
|
key: llm-gateway/anthropic-key
|
|
- secretKey: GOOGLE_API_KEY
|
|
remoteRef:
|
|
key: llm-gateway/google-key
|
|
```
|
|
|
|
### Option 3: Sealed Secrets
|
|
|
|
```bash
|
|
# Encrypt secrets
|
|
echo -n "sk-your-key" | kubectl create secret generic llm-gateway-secrets \
|
|
--dry-run=client --from-file=OPENAI_API_KEY=/dev/stdin -o yaml | \
|
|
kubeseal -o yaml > sealed-secret.yaml
|
|
|
|
# Commit sealed-secret.yaml to git
|
|
kubectl apply -f sealed-secret.yaml
|
|
```
|
|
|
|
## Monitoring
|
|
|
|
### Metrics
|
|
|
|
ServiceMonitor for Prometheus Operator:
|
|
|
|
```yaml
|
|
apiVersion: monitoring.coreos.com/v1
|
|
kind: ServiceMonitor
|
|
metadata:
|
|
name: llm-gateway
|
|
spec:
|
|
selector:
|
|
matchLabels:
|
|
app: llm-gateway
|
|
endpoints:
|
|
- port: http
|
|
path: /metrics
|
|
interval: 30s
|
|
```
|
|
|
|
**Available metrics:**
|
|
- `gateway_requests_total` - Total requests by provider/model
|
|
- `gateway_request_duration_seconds` - Request latency histogram
|
|
- `gateway_provider_errors_total` - Errors by provider
|
|
- `gateway_circuit_breaker_state` - Circuit breaker state changes
|
|
- `gateway_rate_limit_hits_total` - Rate limit violations
|
|
|
|
### Alerts
|
|
|
|
PrometheusRule with common alerts:
|
|
|
|
```yaml
|
|
apiVersion: monitoring.coreos.com/v1
|
|
kind: PrometheusRule
|
|
metadata:
|
|
name: llm-gateway-alerts
|
|
spec:
|
|
groups:
|
|
- name: llm-gateway
|
|
interval: 30s
|
|
rules:
|
|
- alert: HighErrorRate
|
|
expr: rate(gateway_requests_total{status=~"5.."}[5m]) > 0.05
|
|
for: 5m
|
|
annotations:
|
|
summary: High error rate detected
|
|
|
|
- alert: PodDown
|
|
expr: kube_deployment_status_replicas_available{deployment="llm-gateway"} < 2
|
|
for: 5m
|
|
annotations:
|
|
summary: Less than 2 gateway pods running
|
|
```
|
|
|
|
### Logging
|
|
|
|
View logs:
|
|
|
|
```bash
|
|
# Tail logs
|
|
kubectl logs -n llm-gateway -l app=llm-gateway -f
|
|
|
|
# Filter by level
|
|
kubectl logs -n llm-gateway -l app=llm-gateway | jq 'select(.level=="error")'
|
|
|
|
# Search logs
|
|
kubectl logs -n llm-gateway -l app=llm-gateway | grep "circuit.*open"
|
|
```
|
|
|
|
### Tracing
|
|
|
|
Configure OpenTelemetry collector:
|
|
|
|
```yaml
|
|
observability:
|
|
tracing:
|
|
enabled: true
|
|
exporter:
|
|
type: otlp
|
|
endpoint: tempo:4317 # or jaeger-collector:4317
|
|
```
|
|
|
|
## Storage Options
|
|
|
|
### In-Memory (Default)
|
|
|
|
No persistence, lost on pod restart:
|
|
|
|
```yaml
|
|
conversations:
|
|
store: memory
|
|
```
|
|
|
|
### Redis (Recommended)
|
|
|
|
Deploy Redis StatefulSet:
|
|
|
|
```bash
|
|
kubectl apply -f redis.yaml
|
|
```
|
|
|
|
Configure gateway:
|
|
|
|
```yaml
|
|
conversations:
|
|
store: redis
|
|
dsn: redis://redis:6379/0
|
|
ttl: 1h
|
|
```
|
|
|
|
### External Redis
|
|
|
|
For production, use managed Redis:
|
|
|
|
```yaml
|
|
conversations:
|
|
store: redis
|
|
dsn: redis://:password@redis.example.com:6379/0
|
|
ttl: 1h
|
|
```
|
|
|
|
**Cloud providers:**
|
|
- **AWS**: ElastiCache for Redis
|
|
- **GCP**: Memorystore for Redis
|
|
- **Azure**: Azure Cache for Redis
|
|
|
|
### PostgreSQL
|
|
|
|
```yaml
|
|
conversations:
|
|
store: sql
|
|
driver: pgx
|
|
dsn: postgres://user:pass@postgres:5432/llm_gateway?sslmode=require
|
|
ttl: 1h
|
|
```
|
|
|
|
## Scaling
|
|
|
|
### Horizontal Pod Autoscaler
|
|
|
|
Default HPA configuration:
|
|
|
|
```yaml
|
|
apiVersion: autoscaling/v2
|
|
kind: HorizontalPodAutoscaler
|
|
metadata:
|
|
name: llm-gateway
|
|
spec:
|
|
scaleTargetRef:
|
|
apiVersion: apps/v1
|
|
kind: Deployment
|
|
name: llm-gateway
|
|
minReplicas: 3
|
|
maxReplicas: 20
|
|
metrics:
|
|
- type: Resource
|
|
resource:
|
|
name: cpu
|
|
target:
|
|
type: Utilization
|
|
averageUtilization: 70
|
|
- type: Resource
|
|
resource:
|
|
name: memory
|
|
target:
|
|
type: Utilization
|
|
averageUtilization: 80
|
|
```
|
|
|
|
Monitor HPA:
|
|
|
|
```bash
|
|
kubectl get hpa -n llm-gateway
|
|
kubectl describe hpa llm-gateway -n llm-gateway
|
|
```
|
|
|
|
### Manual Scaling
|
|
|
|
```bash
|
|
# Scale to specific replica count
|
|
kubectl scale deployment/llm-gateway --replicas=10 -n llm-gateway
|
|
|
|
# Check status
|
|
kubectl get deployment llm-gateway -n llm-gateway
|
|
```
|
|
|
|
### Pod Disruption Budget
|
|
|
|
Ensures availability during disruptions:
|
|
|
|
```yaml
|
|
apiVersion: policy/v1
|
|
kind: PodDisruptionBudget
|
|
metadata:
|
|
name: llm-gateway
|
|
spec:
|
|
minAvailable: 2
|
|
selector:
|
|
matchLabels:
|
|
app: llm-gateway
|
|
```
|
|
|
|
## Updates and Rollbacks
|
|
|
|
### Rolling Updates
|
|
|
|
```bash
|
|
# Update image
|
|
kubectl set image deployment/llm-gateway \
|
|
gateway=ghcr.io/yourusername/llm-gateway:v1.2.3 \
|
|
-n llm-gateway
|
|
|
|
# Watch rollout
|
|
kubectl rollout status deployment/llm-gateway -n llm-gateway
|
|
|
|
# Pause rollout if issues
|
|
kubectl rollout pause deployment/llm-gateway -n llm-gateway
|
|
|
|
# Resume rollout
|
|
kubectl rollout resume deployment/llm-gateway -n llm-gateway
|
|
```
|
|
|
|
### Rollback
|
|
|
|
```bash
|
|
# Rollback to previous version
|
|
kubectl rollout undo deployment/llm-gateway -n llm-gateway
|
|
|
|
# Rollback to specific revision
|
|
kubectl rollout history deployment/llm-gateway -n llm-gateway
|
|
kubectl rollout undo deployment/llm-gateway --to-revision=3 -n llm-gateway
|
|
```
|
|
|
|
### Blue-Green Deployment
|
|
|
|
```bash
|
|
# Deploy new version with different label
|
|
kubectl apply -f deployment-v2.yaml
|
|
|
|
# Test new version
|
|
kubectl port-forward -n llm-gateway deployment/llm-gateway-v2 8080:8080
|
|
|
|
# Switch service to new version
|
|
kubectl patch service llm-gateway -n llm-gateway \
|
|
-p '{"spec":{"selector":{"version":"v2"}}}'
|
|
|
|
# Delete old version after verification
|
|
kubectl delete deployment llm-gateway-v1 -n llm-gateway
|
|
```
|
|
|
|
## Security
|
|
|
|
### Pod Security
|
|
|
|
Deployment includes security best practices:
|
|
|
|
```yaml
|
|
securityContext:
|
|
runAsNonRoot: true
|
|
runAsUser: 1000
|
|
fsGroup: 1000
|
|
seccompProfile:
|
|
type: RuntimeDefault
|
|
|
|
containers:
|
|
- name: gateway
|
|
securityContext:
|
|
allowPrivilegeEscalation: false
|
|
readOnlyRootFilesystem: true
|
|
capabilities:
|
|
drop:
|
|
- ALL
|
|
```
|
|
|
|
### Network Policies
|
|
|
|
Restrict traffic to/from gateway pods:
|
|
|
|
```yaml
|
|
apiVersion: networking.k8s.io/v1
|
|
kind: NetworkPolicy
|
|
metadata:
|
|
name: llm-gateway
|
|
spec:
|
|
podSelector:
|
|
matchLabels:
|
|
app: llm-gateway
|
|
policyTypes:
|
|
- Ingress
|
|
- Egress
|
|
ingress:
|
|
- from:
|
|
- namespaceSelector:
|
|
matchLabels:
|
|
name: ingress-nginx
|
|
ports:
|
|
- protocol: TCP
|
|
port: 8080
|
|
egress:
|
|
- to: # Allow DNS
|
|
- namespaceSelector: {}
|
|
podSelector:
|
|
matchLabels:
|
|
k8s-app: kube-dns
|
|
ports:
|
|
- protocol: UDP
|
|
port: 53
|
|
- to: # Allow Redis
|
|
- podSelector:
|
|
matchLabels:
|
|
app: redis
|
|
ports:
|
|
- protocol: TCP
|
|
port: 6379
|
|
- to: # Allow external LLM providers (HTTPS)
|
|
- namespaceSelector: {}
|
|
ports:
|
|
- protocol: TCP
|
|
port: 443
|
|
```
|
|
|
|
### RBAC
|
|
|
|
ServiceAccount with minimal permissions:
|
|
|
|
```yaml
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: llm-gateway
|
|
---
|
|
apiVersion: rbac.authorization.k8s.io/v1
|
|
kind: Role
|
|
metadata:
|
|
name: llm-gateway
|
|
rules:
|
|
- apiGroups: [""]
|
|
resources: ["configmaps"]
|
|
verbs: ["get", "list", "watch"]
|
|
---
|
|
apiVersion: rbac.authorization.k8s.io/v1
|
|
kind: RoleBinding
|
|
metadata:
|
|
name: llm-gateway
|
|
roleRef:
|
|
apiGroup: rbac.authorization.k8s.io
|
|
kind: Role
|
|
name: llm-gateway
|
|
subjects:
|
|
- kind: ServiceAccount
|
|
name: llm-gateway
|
|
```
|
|
|
|
## Cloud Provider Guides
|
|
|
|
### AWS EKS
|
|
|
|
```bash
|
|
# Install AWS Load Balancer Controller
|
|
kubectl apply -k "github.com/aws/eks-charts/stable/aws-load-balancer-controller//crds?ref=master"
|
|
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
|
|
-n kube-system \
|
|
--set clusterName=my-cluster
|
|
|
|
# Update ingress for ALB
|
|
# Add annotations to ingress.yaml:
|
|
metadata:
|
|
annotations:
|
|
kubernetes.io/ingress.class: alb
|
|
alb.ingress.kubernetes.io/scheme: internet-facing
|
|
alb.ingress.kubernetes.io/target-type: ip
|
|
```
|
|
|
|
**IRSA for secrets:**
|
|
|
|
```bash
|
|
# Create IAM role and associate with ServiceAccount
|
|
eksctl create iamserviceaccount \
|
|
--name llm-gateway \
|
|
--namespace llm-gateway \
|
|
--cluster my-cluster \
|
|
--attach-policy-arn arn:aws:iam::aws:policy/SecretsManagerReadWrite \
|
|
--approve
|
|
```
|
|
|
|
**ElastiCache Redis:**
|
|
|
|
```yaml
|
|
conversations:
|
|
store: redis
|
|
dsn: redis://my-cluster.cache.amazonaws.com:6379/0
|
|
```
|
|
|
|
### GCP GKE
|
|
|
|
```bash
|
|
# Enable Workload Identity
|
|
gcloud container clusters update my-cluster \
|
|
--workload-pool=PROJECT_ID.svc.id.goog
|
|
|
|
# Create service account with Secret Manager access
|
|
gcloud iam service-accounts create llm-gateway
|
|
|
|
gcloud projects add-iam-policy-binding PROJECT_ID \
|
|
--member "serviceAccount:llm-gateway@PROJECT_ID.iam.gserviceaccount.com" \
|
|
--role "roles/secretmanager.secretAccessor"
|
|
|
|
# Bind K8s SA to GCP SA
|
|
kubectl annotate serviceaccount llm-gateway \
|
|
-n llm-gateway \
|
|
iam.gke.io/gcp-service-account=llm-gateway@PROJECT_ID.iam.gserviceaccount.com
|
|
```
|
|
|
|
**Memorystore Redis:**
|
|
|
|
```yaml
|
|
conversations:
|
|
store: redis
|
|
dsn: redis://10.0.0.3:6379/0 # Private IP from Memorystore
|
|
```
|
|
|
|
### Azure AKS
|
|
|
|
```bash
|
|
# Install Application Gateway Ingress Controller
|
|
az aks enable-addons \
|
|
--resource-group myResourceGroup \
|
|
--name myAKSCluster \
|
|
--addons ingress-appgw \
|
|
--appgw-name myApplicationGateway
|
|
|
|
# Configure Azure AD Workload Identity
|
|
az aks update \
|
|
--resource-group myResourceGroup \
|
|
--name myAKSCluster \
|
|
--enable-oidc-issuer \
|
|
--enable-workload-identity
|
|
```
|
|
|
|
**Azure Key Vault with ESO:**
|
|
|
|
```yaml
|
|
apiVersion: external-secrets.io/v1beta1
|
|
kind: SecretStore
|
|
metadata:
|
|
name: azure-keyvault
|
|
spec:
|
|
provider:
|
|
azurekv:
|
|
authType: WorkloadIdentity
|
|
vaultUrl: https://my-vault.vault.azure.net
|
|
```
|
|
|
|
## Troubleshooting
|
|
|
|
### Pods Not Starting
|
|
|
|
```bash
|
|
# Check pod status
|
|
kubectl get pods -n llm-gateway
|
|
|
|
# Describe pod for events
|
|
kubectl describe pod llm-gateway-xxx -n llm-gateway
|
|
|
|
# Check logs
|
|
kubectl logs -n llm-gateway llm-gateway-xxx
|
|
|
|
# Check previous container logs (if crashed)
|
|
kubectl logs -n llm-gateway llm-gateway-xxx --previous
|
|
```
|
|
|
|
**Common issues:**
|
|
- Image pull errors: Check registry credentials
|
|
- CrashLoopBackOff: Check logs for startup errors
|
|
- Pending: Check resource quotas and node capacity
|
|
|
|
### Health Check Failures
|
|
|
|
```bash
|
|
# Port-forward to test locally
|
|
kubectl port-forward -n llm-gateway svc/llm-gateway 8080:80
|
|
|
|
# Test endpoints
|
|
curl http://localhost:8080/health
|
|
curl http://localhost:8080/ready
|
|
|
|
# Check from inside pod
|
|
kubectl exec -n llm-gateway deployment/llm-gateway -- wget -O- http://localhost:8080/health
|
|
```
|
|
|
|
### Provider Connection Issues
|
|
|
|
```bash
|
|
# Test egress from pod
|
|
kubectl exec -n llm-gateway deployment/llm-gateway -- wget -O- https://api.openai.com
|
|
|
|
# Check secrets
|
|
kubectl get secret llm-gateway-secrets -n llm-gateway -o jsonpath='{.data.OPENAI_API_KEY}' | base64 -d
|
|
|
|
# Verify network policies
|
|
kubectl get networkpolicy -n llm-gateway
|
|
kubectl describe networkpolicy llm-gateway -n llm-gateway
|
|
```
|
|
|
|
### Redis Connection Issues
|
|
|
|
```bash
|
|
# Test Redis connectivity
|
|
kubectl exec -n llm-gateway deployment/llm-gateway -- nc -zv redis 6379
|
|
|
|
# Connect to Redis
|
|
kubectl exec -it -n llm-gateway redis-0 -- redis-cli
|
|
|
|
# Check Redis logs
|
|
kubectl logs -n llm-gateway redis-0
|
|
```
|
|
|
|
### Performance Issues
|
|
|
|
```bash
|
|
# Check resource usage
|
|
kubectl top pods -n llm-gateway
|
|
kubectl top nodes
|
|
|
|
# Check HPA status
|
|
kubectl describe hpa llm-gateway -n llm-gateway
|
|
|
|
# Check for throttling
|
|
kubectl describe pod llm-gateway-xxx -n llm-gateway | grep -i throttl
|
|
```
|
|
|
|
### Debug Container
|
|
|
|
For distroless/minimal images:
|
|
|
|
```bash
|
|
# Use ephemeral debug container
|
|
kubectl debug -it -n llm-gateway llm-gateway-xxx --image=busybox --target=gateway
|
|
|
|
# Or use debug pod
|
|
kubectl run debug --rm -it --image=nicolaka/netshoot -n llm-gateway -- /bin/bash
|
|
```
|
|
|
|
## Useful Commands
|
|
|
|
```bash
|
|
# View all resources
|
|
kubectl get all -n llm-gateway
|
|
|
|
# Check deployment status
|
|
kubectl rollout status deployment/llm-gateway -n llm-gateway
|
|
|
|
# Tail logs from all pods
|
|
kubectl logs -n llm-gateway -l app=llm-gateway -f --max-log-requests=10
|
|
|
|
# Get events
|
|
kubectl get events -n llm-gateway --sort-by='.lastTimestamp'
|
|
|
|
# Check resource quotas
|
|
kubectl describe resourcequota -n llm-gateway
|
|
|
|
# Export current config
|
|
kubectl get deployment llm-gateway -n llm-gateway -o yaml > deployment-backup.yaml
|
|
|
|
# Force pod restart
|
|
kubectl rollout restart deployment/llm-gateway -n llm-gateway
|
|
|
|
# Delete and recreate deployment
|
|
kubectl delete deployment llm-gateway -n llm-gateway
|
|
kubectl apply -f deployment.yaml
|
|
```
|
|
|
|
## Architecture Overview
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────┐
|
|
│ Internet / Load Balancer │
|
|
└────────────────────┬────────────────────────────┘
|
|
│
|
|
▼
|
|
┌──────────────────────┐
|
|
│ Ingress Controller │
|
|
│ (TLS/SSL) │
|
|
└──────────┬───────────┘
|
|
│
|
|
▼
|
|
┌──────────────────────┐
|
|
│ Gateway Service │
|
|
│ (ClusterIP:80) │
|
|
└──────────┬───────────┘
|
|
│
|
|
┌────────────┼────────────┐
|
|
▼ ▼ ▼
|
|
┌─────┐ ┌─────┐ ┌─────┐
|
|
│ Pod │ │ Pod │ │ Pod │
|
|
│ 1 │ │ 2 │ │ 3 │
|
|
└──┬──┘ └──┬──┘ └──┬──┘
|
|
│ │ │
|
|
└────────────┼────────────┘
|
|
│
|
|
┌────────────┼────────────┐
|
|
▼ ▼ ▼
|
|
┌──────┐ ┌──────┐ ┌──────┐
|
|
│Redis │ │Prom │ │Tempo │
|
|
└──────┘ └──────┘ └──────┘
|
|
```
|
|
|
|
## Additional Resources
|
|
|
|
- [Main Documentation](../README.md)
|
|
- [Docker Deployment](../docs/DOCKER_DEPLOYMENT.md)
|
|
- [Kubernetes Best Practices](https://kubernetes.io/docs/concepts/configuration/overview/)
|
|
- [Prometheus Operator](https://prometheus-operator.dev/)
|
|
- [External Secrets Operator](https://external-secrets.io/)
|
|
- [cert-manager](https://cert-manager.io/)
|