# LLM Proxy Gateway - Deployment Guide

## Overview

A unified LLM proxy gateway supporting OpenAI, Google Gemini, DeepSeek, and xAI Grok, with token tracking, cost calculation, and an admin dashboard.
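Token tracking and cost calculation can be illustrated with a short sketch. The price table below uses placeholder values, not real provider pricing, and the function name is illustrative:

```python
# Illustrative cost calculation; the prices are placeholders, not real
# provider pricing -- substitute your providers' current rates.
PRICES_PER_1M_TOKENS = {
    # model: (input_price_usd, output_price_usd) per 1M tokens
    "gpt-4o": (2.50, 10.00),
    "deepseek-reasoner": (0.55, 2.19),
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request under the placeholder price table."""
    input_price, output_price = PRICES_PER_1M_TOKENS[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000

print(round(request_cost("gpt-4o", 1000, 500), 6))  # → 0.0075
```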

## System Requirements

- **CPU**: 2 cores minimum
- **RAM**: 512MB minimum (1GB recommended)
- **Storage**: 10GB minimum
- **OS**: Linux (tested on Arch Linux, Ubuntu, Debian)
- **Runtime**: Rust 1.70+ with Cargo

## Deployment Options

### Option 1: Docker (Recommended)

```dockerfile
FROM rust:1.70-alpine AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

FROM alpine:latest
RUN apk add --no-cache libgcc
COPY --from=builder /app/target/release/llm-proxy /usr/local/bin/
COPY --from=builder /app/static /app/static
WORKDIR /app
EXPOSE 8080
CMD ["llm-proxy"]
```

### Option 2: Systemd Service (Bare Metal/LXC)

```ini
# /etc/systemd/system/llm-proxy.service
[Unit]
Description=LLM Proxy Gateway
After=network.target

[Service]
Type=simple
User=llmproxy
Group=llmproxy
WorkingDirectory=/opt/llm-proxy
ExecStart=/opt/llm-proxy/llm-proxy
Restart=always
RestartSec=10
Environment="RUST_LOG=info"
Environment="LLM_PROXY__SERVER__PORT=8080"
Environment="LLM_PROXY__SERVER__AUTH_TOKENS=sk-test-123,sk-test-456"

[Install]
WantedBy=multi-user.target
```

### Option 3: LXC Container (Proxmox)

1. Create an Alpine Linux LXC container
2. Install Rust: `apk add rust cargo`
3. Copy the application files
4. Build: `cargo build --release`
5. Run: `./target/release/llm-proxy`

## Configuration

### Environment Variables

```bash
# Required API keys
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...
DEEPSEEK_API_KEY=sk-...
GROK_API_KEY=gk-...  # Optional

# Server configuration (with the LLM_PROXY__ prefix)
LLM_PROXY__SERVER__PORT=8080
LLM_PROXY__SERVER__HOST=0.0.0.0
LLM_PROXY__SERVER__AUTH_TOKENS=sk-test-123,sk-test-456

# Database configuration
LLM_PROXY__DATABASE__PATH=./data/llm_proxy.db
LLM_PROXY__DATABASE__MAX_CONNECTIONS=10

# Provider configuration
LLM_PROXY__PROVIDERS__OPENAI__ENABLED=true
LLM_PROXY__PROVIDERS__GEMINI__ENABLED=true
LLM_PROXY__PROVIDERS__DEEPSEEK__ENABLED=true
LLM_PROXY__PROVIDERS__GROK__ENABLED=false
```
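The double-underscore convention maps environment variables to nested configuration keys (e.g. `LLM_PROXY__SERVER__PORT` becomes `server.port`). A minimal sketch of that mapping, assuming the config layer simply splits on `__` and lowercases:

```python
PREFIX = "LLM_PROXY__"

def env_to_config(environ: dict) -> dict:
    """Fold LLM_PROXY__A__B=value entries into a nested {a: {b: value}} dict."""
    config: dict = {}
    for key, value in environ.items():
        if not key.startswith(PREFIX):
            continue
        path = key[len(PREFIX):].lower().split("__")
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return config

print(env_to_config({"LLM_PROXY__SERVER__PORT": "8080"}))
# → {'server': {'port': '8080'}}
```

The actual Rust configuration crate also coerces types (`"8080"` → integer); this sketch only shows the key nesting.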

### Configuration File (config.toml)

Create `config.toml` in the application directory:

```toml
[server]
port = 8080
host = "0.0.0.0"
auth_tokens = ["sk-test-123", "sk-test-456"]

[database]
path = "./data/llm_proxy.db"
max_connections = 10

[providers.openai]
enabled = true
base_url = "https://api.openai.com/v1"
default_model = "gpt-4o"

[providers.gemini]
enabled = true
base_url = "https://generativelanguage.googleapis.com/v1"
default_model = "gemini-2.0-flash"

[providers.deepseek]
enabled = true
base_url = "https://api.deepseek.com"
default_model = "deepseek-reasoner"

[providers.grok]
enabled = false
base_url = "https://api.x.ai/v1"
default_model = "grok-beta"
```

## Nginx Reverse Proxy Configuration

```nginx
server {
    listen 80;
    server_name llm-proxy.yourdomain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;

        # WebSocket support
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_cache_bypass $http_upgrade;

        # Forwarding headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # SSL configuration (recommended)
    listen 443 ssl http2;
    ssl_certificate /etc/letsencrypt/live/llm-proxy.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/llm-proxy.yourdomain.com/privkey.pem;
}
```

## Security Considerations

### 1. Authentication

- Use strong Bearer tokens
- Rotate tokens regularly
- Consider implementing JWT for production
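One way to generate strong Bearer tokens, using Python's `secrets` module (the `sk-` prefix just mirrors the examples in this guide):

```python
import secrets

def make_token(prefix: str = "sk-") -> str:
    """Generate a URL-safe Bearer token with 32 bytes of randomness."""
    return prefix + secrets.token_urlsafe(32)

print(make_token())
```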

### 2. Rate Limiting

- Implement per-client rate limiting
- Consider the `governor` crate for advanced in-process rate limiting
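Until application-level rate limiting is in place, a coarse per-client limit can be enforced in nginx itself; the zone name and rate below are illustrative, not tuned values:

```nginx
# In the http {} block
limit_req_zone $binary_remote_addr zone=llmproxy:10m rate=10r/s;

# Inside the location / {} block of the server above
limit_req zone=llmproxy burst=20 nodelay;
```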

### 3. Network Security

- Run behind a reverse proxy (nginx)
- Enable HTTPS
- Restrict access by IP if needed
- Use firewall rules

### 4. Data Security

- Encrypt the database (SQLCipher for SQLite)
- Store API keys securely
- Take regular backups

## Monitoring & Maintenance

### Logging

- Application logs: `RUST_LOG=info` (or `debug` for troubleshooting)
- Access logs via nginx
- Database logs for an audit trail

### Health Checks

```bash
# Health endpoint
curl http://localhost:8080/health

# Database check
sqlite3 ./data/llm_proxy.db "SELECT COUNT(*) FROM llm_requests;"
```

### Backup Strategy

```bash
#!/bin/bash
# backup.sh
BACKUP_DIR="/backups/llm-proxy"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"

# Back up the database (SQLite .backup is safe while the service is running)
sqlite3 ./data/llm_proxy.db ".backup $BACKUP_DIR/llm_proxy_$DATE.db"

# Back up the configuration
cp config.toml "$BACKUP_DIR/config_$DATE.toml"

# Rotate old backups (keep 30 days)
find "$BACKUP_DIR" -name "*.db" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.toml" -mtime +30 -delete
```
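To run the script nightly, a crontab entry can be added for the service user; the script and log paths below are assumptions to adapt to your install:

```
# m h dom mon dow  command
0 2 * * * /opt/llm-proxy/backup.sh >> /var/log/llm-proxy/backup.log 2>&1
```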

## Performance Tuning

### Database Optimization

```sql
-- Run these commands periodically
VACUUM;   -- reclaim space from deleted rows
ANALYZE;  -- refresh query-planner statistics
```

### Memory Management

- Monitor memory usage with `htop` or `ps aux`
- Adjust `max_connections` based on load
- Consider connection pooling for high traffic

### Scaling

1. **Vertical scaling**: increase container resources
2. **Horizontal scaling**: deploy multiple instances behind a load balancer
3. **Database**: migrate to PostgreSQL for high-volume usage

## Troubleshooting

### Common Issues

1. **Port already in use**
   ```bash
   netstat -tulpn | grep :8080
   kill <PID>  # or change the port in config
   ```

2. **Database permissions**
   ```bash
   chown -R llmproxy:llmproxy /opt/llm-proxy/data
   chmod 600 /opt/llm-proxy/data/llm_proxy.db
   ```

3. **API key errors**
   - Verify environment variables are set
   - Check provider status in the dashboard
   - Test connectivity: `curl https://api.openai.com/v1/models`

4. **High memory usage**
   - Check for memory leaks
   - Reduce `max_connections`
   - Implement connection timeouts

### Debug Mode

```bash
# Run with debug logging
RUST_LOG=debug ./llm-proxy

# Check system logs
journalctl -u llm-proxy -f
```

## Integration

### Open-WebUI Compatibility

The proxy exposes an OpenAI-compatible API, so point Open-WebUI at it:

```
API Base URL: http://your-proxy-address:8080
API Key: sk-test-123 (or your configured token)
```

### Custom Clients

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-test-123",
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
```

## Updates & Upgrades

1. **Back up** the current configuration and database
2. **Stop** the service: `systemctl stop llm-proxy`
3. **Update** the code: `git pull` or copy new binaries
4. **Migrate** the database if needed (check `migrations/`)
5. **Restart**: `systemctl start llm-proxy`
6. **Verify**: check logs and test endpoints

## Support

- Check logs in `/var/log/llm-proxy/`
- Monitor the dashboard at `http://your-server:8080`
- Review database metrics in the dashboard
- Enable debug logging for troubleshooting