feat: migrate backend from rust to go
This commit replaces the Axum/Rust backend with a Gin/Go implementation. The original Rust code has been archived in the 'rust' branch.
deployment.md
@@ -1,322 +1,52 @@
# LLM Proxy Gateway - Deployment Guide

## Overview

A unified LLM proxy gateway supporting OpenAI, Google Gemini, DeepSeek, and xAI Grok, with token tracking, cost calculation, and an admin dashboard.

This guide covers deploying the Go-based LLM Proxy Gateway.

## System Requirements

- **CPU**: 2 cores minimum
- **RAM**: 512MB minimum (1GB recommended)
- **Storage**: 10GB minimum
- **OS**: Linux (tested on Arch Linux, Ubuntu, Debian)
- **Runtime**: Rust 1.70+ with Cargo

## Environment Setup

## Deployment Options

### Option 1: Docker (Recommended)

```dockerfile
FROM rust:1.70-alpine AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

FROM alpine:latest
RUN apk add --no-cache libgcc
COPY --from=builder /app/target/release/llm-proxy /usr/local/bin/
COPY --from=builder /app/static /app/static
WORKDIR /app
EXPOSE 8080
CMD ["llm-proxy"]
```

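If you prefer Compose, the same container can be described declaratively. This is a sketch: the service name, volume path, and `.env` usage are illustrative, not project conventions.

```yaml
services:
  llm-proxy:
    build: .
    ports:
      - "8080:8080"
    volumes:
      - ./data:/app/data   # persist the SQLite database on the host
    env_file: .env
    restart: unless-stopped
```

Start it with `docker compose up -d`.
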
### Option 2: Systemd Service (Bare Metal/LXC)

```ini
# /etc/systemd/system/llm-proxy.service
[Unit]
Description=LLM Proxy Gateway
After=network.target

[Service]
Type=simple
User=llmproxy
Group=llmproxy
WorkingDirectory=/opt/llm-proxy
ExecStart=/opt/llm-proxy/llm-proxy
Restart=always
RestartSec=10
Environment="RUST_LOG=info"
Environment="LLM_PROXY__SERVER__PORT=8080"
Environment="LLM_PROXY__SERVER__AUTH_TOKENS=sk-test-123,sk-test-456"

[Install]
WantedBy=multi-user.target
```

### Option 3: LXC Container (Proxmox)

1. Create an Alpine Linux LXC container
2. Install Rust: `apk add rust cargo`
3. Copy application files
4. Build: `cargo build --release`
5. Run: `./target/release/llm-proxy`

## Configuration

### Environment Variables

```bash
# Required API Keys
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...
DEEPSEEK_API_KEY=sk-...
GROK_API_KEY=gk-...  # Optional

# Server Configuration (with LLM_PROXY__ prefix)
LLM_PROXY__SERVER__PORT=8080
LLM_PROXY__SERVER__HOST=0.0.0.0
LLM_PROXY__SERVER__AUTH_TOKENS=sk-test-123,sk-test-456

# Database Configuration
LLM_PROXY__DATABASE__PATH=./data/llm_proxy.db
LLM_PROXY__DATABASE__MAX_CONNECTIONS=10

# Provider Configuration
LLM_PROXY__PROVIDERS__OPENAI__ENABLED=true
LLM_PROXY__PROVIDERS__GEMINI__ENABLED=true
LLM_PROXY__PROVIDERS__DEEPSEEK__ENABLED=true
LLM_PROXY__PROVIDERS__GROK__ENABLED=false
```

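Tokens in `AUTH_TOKENS` act as passwords, so they should be generated randomly rather than hand-picked. One way to do this is with `openssl`; the `sk-` prefix mirrors the examples above, and nothing shown here enforces a particular format:

```shell
# Generate a 16-byte (32 hex character) random bearer token
TOKEN="sk-$(openssl rand -hex 16)"
echo "$TOKEN"
```

Add the result to `LLM_PROXY__SERVER__AUTH_TOKENS` (comma-separated) and distribute it to clients over a secure channel.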
### Configuration File (config.toml)

Create `config.toml` in the application directory:

```toml
[server]
port = 8080
host = "0.0.0.0"
auth_tokens = ["sk-test-123", "sk-test-456"]

[database]
path = "./data/llm_proxy.db"
max_connections = 10

[providers.openai]
enabled = true
base_url = "https://api.openai.com/v1"
default_model = "gpt-4o"

[providers.gemini]
enabled = true
base_url = "https://generativelanguage.googleapis.com/v1"
default_model = "gemini-2.0-flash"

[providers.deepseek]
enabled = true
base_url = "https://api.deepseek.com"
default_model = "deepseek-reasoner"

[providers.grok]
enabled = false
base_url = "https://api.x.ai/v1"
default_model = "grok-beta"
```

## Nginx Reverse Proxy Configuration

**Important for SSE/Streaming:** Disable buffering and configure timeouts for proper SSE support.

```nginx
server {
    listen 80;
    server_name llm-proxy.yourdomain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;

        # SSE/Streaming support
        proxy_buffering off;
        chunked_transfer_encoding on;
        proxy_set_header Connection '';

        # Timeouts for long-running streams
        proxy_connect_timeout 7200s;
        proxy_read_timeout 7200s;
        proxy_send_timeout 7200s;

        # Disable gzip for streaming
        gzip off;

        # Headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # SSL configuration (recommended)
    listen 443 ssl http2;
    ssl_certificate /etc/letsencrypt/live/llm-proxy.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/llm-proxy.yourdomain.com/privkey.pem;
}
```

### NGINX Proxy Manager

If using NGINX Proxy Manager, add this to **Advanced Settings**:

```nginx
proxy_buffering off;
proxy_http_version 1.1;
proxy_set_header Connection '';
chunked_transfer_encoding on;
proxy_connect_timeout 7200s;
proxy_read_timeout 7200s;
proxy_send_timeout 7200s;
gzip off;
```

## Security Considerations

### 1. Authentication
- Use strong Bearer tokens
- Rotate tokens regularly
- Consider implementing JWT for production

### 2. Rate Limiting
- Implement per-client rate limiting
- Consider using the `governor` crate for advanced rate limiting

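Until per-client limiting exists in the application, a coarse per-IP limit can be enforced at the nginx layer. A sketch; the zone name, rate, and burst values are placeholders to tune:

```nginx
# In the http {} block:
limit_req_zone $binary_remote_addr zone=llmproxy:10m rate=10r/s;

# Inside the location / block of the reverse proxy:
limit_req zone=llmproxy burst=20 nodelay;
```
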
### 3. Network Security
- Run behind a reverse proxy (nginx)
- Enable HTTPS
- Restrict access by IP if needed
- Use firewall rules

### 4. Data Security
- Database encryption (SQLCipher for SQLite)
- Secure API key storage
- Regular backups

## Monitoring & Maintenance

### Logging
- Application logs: `RUST_LOG=info` (or `debug` for troubleshooting)
- Access logs via nginx
- Database logs for audit trail

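When running under the systemd unit above, output goes to the journal and needs no extra rotation. If you instead log to files under `/var/log/llm-proxy/` (as the Support section assumes), a logrotate fragment along these lines keeps them bounded; the retention values are illustrative:

```
# /etc/logrotate.d/llm-proxy
/var/log/llm-proxy/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
}
```
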
### Health Checks
```bash
# Health endpoint
curl http://localhost:8080/health

# Database check
sqlite3 ./data/llm_proxy.db "SELECT COUNT(*) FROM llm_requests;"
```

### Backup Strategy
```bash
#!/bin/bash
# backup.sh
BACKUP_DIR="/backups/llm-proxy"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"

# Backup database
sqlite3 ./data/llm_proxy.db ".backup $BACKUP_DIR/llm_proxy_$DATE.db"

# Backup configuration
cp config.toml "$BACKUP_DIR/config_$DATE.toml"

# Rotate old backups (keep 30 days)
find "$BACKUP_DIR" -name "*.db" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.toml" -mtime +30 -delete
```

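To run the backup nightly, schedule it from cron. Note that the script uses paths relative to the application directory, so change into it first; the install path `/opt/llm-proxy` matches the systemd example and is an assumption:

```
# crontab -e (as the llmproxy user)
0 3 * * * cd /opt/llm-proxy && ./backup.sh >> /var/log/llm-proxy-backup.log 2>&1
```
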
## Performance Tuning

### Database Optimization
```sql
-- Run these SQL commands periodically
VACUUM;
ANALYZE;
```

### Memory Management
- Monitor memory usage with `htop` or `ps aux`
- Adjust `max_connections` based on load
- Consider connection pooling for high traffic

### Scaling
1. **Vertical Scaling**: Increase container resources
2. **Horizontal Scaling**: Deploy multiple instances behind a load balancer
3. **Database**: Migrate to PostgreSQL for high-volume usage

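For horizontal scaling, nginx can balance across instances. A minimal upstream sketch; the addresses are placeholders, and note that multiple instances cannot safely share one SQLite file, which is why the guide suggests PostgreSQL for high volume:

```nginx
upstream llm_proxy_backends {
    least_conn;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}
```

Point the existing `proxy_pass` at `http://llm_proxy_backends` instead of `http://localhost:8080`.
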
## Troubleshooting

### Common Issues

1. **Port already in use**
   ```bash
   netstat -tulpn | grep :8080
   kill <PID>  # or change port in config
   ```

2. **Database permissions**
   ```bash
   chown -R llmproxy:llmproxy /opt/llm-proxy/data
   chmod 600 /opt/llm-proxy/data/llm_proxy.db
   ```

3. **API key errors**
   - Verify environment variables are set
   - Check provider status (dashboard)
   - Test connectivity: `curl https://api.openai.com/v1/models`

4. **High memory usage**
   - Check for memory leaks
   - Reduce `max_connections`
   - Implement connection timeouts

### Debug Mode
```bash
# Run with debug logging
RUST_LOG=debug ./llm-proxy

# Check system logs
journalctl -u llm-proxy -f
```

## Binary Deployment

1. **Mandatory Configuration:**
   Create a `.env` file from the example:
   ```bash
   cp .env.example .env
   ```
   Ensure `LLM_PROXY__ENCRYPTION_KEY` is set to a secure 32-byte string.

2. **Data Directory:**
   The proxy stores its database in `./data/llm_proxy.db` by default. Ensure this directory exists and is writable.

### 1. Build
```bash
go build -o llm-proxy ./cmd/llm-proxy
```

### 2. Run
```bash
./llm-proxy
```

## Integration

### Open-WebUI Compatibility
The proxy provides an OpenAI-compatible API, so configure Open-WebUI with:
```
API Base URL: http://your-proxy-address:8080
API Key: sk-test-123 (or your configured token)
```

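A quick way to smoke-test the OpenAI-compatible endpoint from the shell; this assumes the proxy is running locally on port 8080 and that the token and model name match your configuration:

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-test-123" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```

Add `"stream": true` to the payload to exercise the SSE path through the reverse proxy as well.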
### Custom Clients
```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-test-123"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)
```

## Docker Deployment

The project includes a multi-stage `Dockerfile` for minimal image size.

### 1. Build Image
```bash
docker build -t llm-proxy .
```

### 2. Run Container
```bash
docker run -d \
  --name llm-proxy \
  -p 8080:8080 \
  -v $(pwd)/data:/app/data \
  --env-file .env \
  llm-proxy
```

## Updates & Upgrades

1. **Backup** current configuration and database
2. **Stop** the service: `systemctl stop llm-proxy`
3. **Update** code: `git pull` or copy new binaries
4. **Migrate** the database if needed (check `migrations/`)
5. **Restart**: `systemctl start llm-proxy`
6. **Verify**: Check logs and test endpoints

## Production Considerations

- **SSL/TLS:** It is recommended to run the proxy behind a reverse proxy like Nginx or Caddy for SSL termination.
- **Backups:** Regularly back up the `data/llm_proxy.db` file.
- **Monitoring:** Monitor the `/health` endpoint for system status.

## Support

- Check logs in `/var/log/llm-proxy/`
- Monitor the dashboard at `http://your-server:8080`
- Review database metrics in the dashboard
- Enable debug logging for troubleshooting

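If Caddy is chosen for TLS termination, the whole reverse-proxy setup can be a few lines. A sketch; the domain is a placeholder, and `flush_interval -1` disables response buffering so SSE streams flow through:

```
llm-proxy.yourdomain.com {
    reverse_proxy localhost:8080 {
        flush_interval -1
    }
}
```
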