feat: migrate backend from rust to go
This commit replaces the Axum/Rust backend with a Gin/Go implementation. The original Rust code has been archived in the 'rust' branch.
deployment.md
@@ -1,322 +1,52 @@
# LLM Proxy Gateway - Deployment Guide

## Overview

A unified LLM proxy gateway supporting OpenAI, Google Gemini, DeepSeek, and xAI Grok, with token tracking, cost calculation, and an admin dashboard.

This guide covers deploying the Go-based LLM Proxy Gateway.

## System Requirements

- **CPU**: 2 cores minimum
- **RAM**: 512MB minimum (1GB recommended)
- **Storage**: 10GB minimum
- **OS**: Linux (tested on Arch Linux, Ubuntu, Debian)
- **Runtime**: Rust 1.70+ with Cargo

## Environment Setup

## Deployment Options

### Option 1: Docker (Recommended)

```dockerfile
FROM rust:1.70-alpine AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

FROM alpine:latest
RUN apk add --no-cache libgcc
COPY --from=builder /app/target/release/llm-proxy /usr/local/bin/
COPY --from=builder /app/static /app/static
WORKDIR /app
EXPOSE 8080
CMD ["llm-proxy"]
```

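If you prefer Compose, the same container can be described declaratively. This is a sketch: the service name, volume path, and `.env` usage are illustrative, not project conventions.

```yaml
services:
  llm-proxy:
    build: .
    ports:
      - "8080:8080"
    volumes:
      - ./data:/app/data   # persist the SQLite database on the host
    env_file: .env
    restart: unless-stopped
```

Start it with `docker compose up -d`.
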
### Option 2: Systemd Service (Bare Metal/LXC)

```ini
# /etc/systemd/system/llm-proxy.service
[Unit]
Description=LLM Proxy Gateway
After=network.target

[Service]
Type=simple
User=llmproxy
Group=llmproxy
WorkingDirectory=/opt/llm-proxy
ExecStart=/opt/llm-proxy/llm-proxy
Restart=always
RestartSec=10
Environment="RUST_LOG=info"
Environment="LLM_PROXY__SERVER__PORT=8080"
Environment="LLM_PROXY__SERVER__AUTH_TOKENS=sk-test-123,sk-test-456"

[Install]
WantedBy=multi-user.target
```

### Option 3: LXC Container (Proxmox)

1. Create an Alpine Linux LXC container
2. Install Rust: `apk add rust cargo`
3. Copy application files
4. Build: `cargo build --release`
5. Run: `./target/release/llm-proxy`

## Configuration

### Environment Variables

```bash
# Required API Keys
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...
DEEPSEEK_API_KEY=sk-...
GROK_API_KEY=gk-...  # Optional

# Server Configuration (with LLM_PROXY__ prefix)
LLM_PROXY__SERVER__PORT=8080
LLM_PROXY__SERVER__HOST=0.0.0.0
LLM_PROXY__SERVER__AUTH_TOKENS=sk-test-123,sk-test-456

# Database Configuration
LLM_PROXY__DATABASE__PATH=./data/llm_proxy.db
LLM_PROXY__DATABASE__MAX_CONNECTIONS=10

# Provider Configuration
LLM_PROXY__PROVIDERS__OPENAI__ENABLED=true
LLM_PROXY__PROVIDERS__GEMINI__ENABLED=true
LLM_PROXY__PROVIDERS__DEEPSEEK__ENABLED=true
LLM_PROXY__PROVIDERS__GROK__ENABLED=false
```

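Tokens in `AUTH_TOKENS` act as passwords, so they should be generated randomly rather than hand-picked. One way to do this is with `openssl`; the `sk-` prefix mirrors the examples above, and nothing shown here enforces a particular format:

```shell
# Generate a 16-byte (32 hex character) random bearer token
TOKEN="sk-$(openssl rand -hex 16)"
echo "$TOKEN"
```

Add the result to `LLM_PROXY__SERVER__AUTH_TOKENS` (comma-separated) and distribute it to clients over a secure channel.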
### Configuration File (config.toml)

Create `config.toml` in the application directory:

```toml
[server]
port = 8080
host = "0.0.0.0"
auth_tokens = ["sk-test-123", "sk-test-456"]

[database]
path = "./data/llm_proxy.db"
max_connections = 10

[providers.openai]
enabled = true
base_url = "https://api.openai.com/v1"
default_model = "gpt-4o"

[providers.gemini]
enabled = true
base_url = "https://generativelanguage.googleapis.com/v1"
default_model = "gemini-2.0-flash"

[providers.deepseek]
enabled = true
base_url = "https://api.deepseek.com"
default_model = "deepseek-reasoner"

[providers.grok]
enabled = false
base_url = "https://api.x.ai/v1"
default_model = "grok-beta"
```

## Nginx Reverse Proxy Configuration

**Important for SSE/Streaming:** Disable buffering and configure timeouts for proper SSE support.

```nginx
server {
    listen 80;
    server_name llm-proxy.yourdomain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;

        # SSE/Streaming support
        proxy_buffering off;
        chunked_transfer_encoding on;
        proxy_set_header Connection '';

        # Timeouts for long-running streams
        proxy_connect_timeout 7200s;
        proxy_read_timeout 7200s;
        proxy_send_timeout 7200s;

        # Disable gzip for streaming
        gzip off;

        # Headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # SSL configuration (recommended)
    listen 443 ssl http2;
    ssl_certificate /etc/letsencrypt/live/llm-proxy.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/llm-proxy.yourdomain.com/privkey.pem;
}
```

### NGINX Proxy Manager

If using NGINX Proxy Manager, add this to **Advanced Settings**:

```nginx
proxy_buffering off;
proxy_http_version 1.1;
proxy_set_header Connection '';
chunked_transfer_encoding on;
proxy_connect_timeout 7200s;
proxy_read_timeout 7200s;
proxy_send_timeout 7200s;
gzip off;
```

## Security Considerations

### 1. Authentication
- Use strong Bearer tokens
- Rotate tokens regularly
- Consider implementing JWT for production

### 2. Rate Limiting
- Implement per-client rate limiting
- Consider using the `governor` crate for advanced rate limiting

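Until per-client limiting exists in the application, a coarse per-IP limit can be enforced at the nginx layer. A sketch; the zone name, rate, and burst values are placeholders to tune:

```nginx
# In the http {} block:
limit_req_zone $binary_remote_addr zone=llmproxy:10m rate=10r/s;

# Inside the location / block of the reverse proxy:
limit_req zone=llmproxy burst=20 nodelay;
```
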
### 3. Network Security
- Run behind a reverse proxy (nginx)
- Enable HTTPS
- Restrict access by IP if needed
- Use firewall rules

### 4. Data Security
- Database encryption (SQLCipher for SQLite)
- Secure API key storage
- Regular backups

## Monitoring & Maintenance

### Logging
- Application logs: `RUST_LOG=info` (or `debug` for troubleshooting)
- Access logs via nginx
- Database logs for audit trail

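When running under the systemd unit above, output goes to the journal and needs no extra rotation. If you instead log to files under `/var/log/llm-proxy/` (as the Support section assumes), a logrotate fragment along these lines keeps them bounded; the retention values are illustrative:

```
# /etc/logrotate.d/llm-proxy
/var/log/llm-proxy/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
}
```
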
### Health Checks
```bash
# Health endpoint
curl http://localhost:8080/health

# Database check
sqlite3 ./data/llm_proxy.db "SELECT COUNT(*) FROM llm_requests;"
```

### Backup Strategy
```bash
#!/bin/bash
# backup.sh
BACKUP_DIR="/backups/llm-proxy"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"

# Backup database
sqlite3 ./data/llm_proxy.db ".backup $BACKUP_DIR/llm_proxy_$DATE.db"

# Backup configuration
cp config.toml "$BACKUP_DIR/config_$DATE.toml"

# Rotate old backups (keep 30 days)
find "$BACKUP_DIR" -name "*.db" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.toml" -mtime +30 -delete
```

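To run the backup nightly, schedule it from cron. Note that the script uses paths relative to the application directory, so change into it first; the install path `/opt/llm-proxy` matches the systemd example and is an assumption:

```
# crontab -e (as the llmproxy user)
0 3 * * * cd /opt/llm-proxy && ./backup.sh >> /var/log/llm-proxy-backup.log 2>&1
```
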
## Performance Tuning

### Database Optimization
```sql
-- Run these SQL commands periodically
VACUUM;
ANALYZE;
```

### Memory Management
- Monitor memory usage with `htop` or `ps aux`
- Adjust `max_connections` based on load
- Consider connection pooling for high traffic

### Scaling
1. **Vertical Scaling**: Increase container resources
2. **Horizontal Scaling**: Deploy multiple instances behind a load balancer
3. **Database**: Migrate to PostgreSQL for high-volume usage

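For horizontal scaling, nginx can balance across instances. A minimal upstream sketch; the addresses are placeholders, and note that multiple instances cannot safely share one SQLite file, which is why the guide suggests PostgreSQL for high volume:

```nginx
upstream llm_proxy_backends {
    least_conn;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}
```

Point the existing `proxy_pass` at `http://llm_proxy_backends` instead of `http://localhost:8080`.
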
## Troubleshooting

### Common Issues

1. **Port already in use**
   ```bash
   netstat -tulpn | grep :8080
   kill <PID>  # or change port in config
   ```

2. **Database permissions**
   ```bash
   chown -R llmproxy:llmproxy /opt/llm-proxy/data
   chmod 600 /opt/llm-proxy/data/llm_proxy.db
   ```

3. **API key errors**
   - Verify environment variables are set
   - Check provider status (dashboard)
   - Test connectivity: `curl https://api.openai.com/v1/models`

4. **High memory usage**
   - Check for memory leaks
   - Reduce `max_connections`
   - Implement connection timeouts

### Debug Mode
```bash
# Run with debug logging
RUST_LOG=debug ./llm-proxy

# Check system logs
journalctl -u llm-proxy -f
```

## Binary Deployment

1. **Mandatory Configuration:**
   Create a `.env` file from the example:
   ```bash
   cp .env.example .env
   ```
   Ensure `LLM_PROXY__ENCRYPTION_KEY` is set to a secure 32-byte string.

2. **Data Directory:**
   The proxy stores its database in `./data/llm_proxy.db` by default. Ensure this directory exists and is writable.

### 1. Build
```bash
go build -o llm-proxy ./cmd/llm-proxy
```

### 2. Run
```bash
./llm-proxy
```

## Integration

### Open-WebUI Compatibility
The proxy provides an OpenAI-compatible API, so configure Open-WebUI with:
```
API Base URL: http://your-proxy-address:8080
API Key: sk-test-123 (or your configured token)
```

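A quick way to smoke-test the OpenAI-compatible endpoint from the shell; this assumes the proxy is running locally on port 8080 and that the token and model name match your configuration:

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-test-123" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```

Add `"stream": true` to the payload to exercise the SSE path through the reverse proxy as well.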
### Custom Clients
```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-test-123"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)
```

## Docker Deployment

The project includes a multi-stage `Dockerfile` for minimal image size.

### 1. Build Image
```bash
docker build -t llm-proxy .
```

### 2. Run Container
```bash
docker run -d \
  --name llm-proxy \
  -p 8080:8080 \
  -v $(pwd)/data:/app/data \
  --env-file .env \
  llm-proxy
```

## Updates & Upgrades

1. **Backup** current configuration and database
2. **Stop** the service: `systemctl stop llm-proxy`
3. **Update** code: `git pull` or copy new binaries
4. **Migrate** the database if needed (check `migrations/`)
5. **Restart**: `systemctl start llm-proxy`
6. **Verify**: Check logs and test endpoints

## Production Considerations

- **SSL/TLS:** It is recommended to run the proxy behind a reverse proxy like Nginx or Caddy for SSL termination.
- **Backups:** Regularly back up the `data/llm_proxy.db` file.
- **Monitoring:** Monitor the `/health` endpoint for system status.

## Support

- Check logs in `/var/log/llm-proxy/`
- Monitor the dashboard at `http://your-server:8080`
- Review database metrics in the dashboard
- Enable debug logging for troubleshooting

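If Caddy is chosen for TLS termination, the whole reverse-proxy setup can be a few lines. A sketch; the domain is a placeholder, and `flush_interval -1` disables response buffering so SSE streams flow through:

```
llm-proxy.yourdomain.com {
    reverse_proxy localhost:8080 {
        flush_interval -1
    }
}
```
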