# LLM Proxy Gateway - Deployment Guide

## Overview

A unified LLM proxy gateway supporting OpenAI, Google Gemini, DeepSeek, and xAI Grok, with token tracking, cost calculation, and an admin dashboard.
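Token tracking and cost calculation can be illustrated with a short sketch. The price table below uses placeholder values, not real provider pricing, and the function name is illustrative:

```python
# Illustrative cost calculation; the prices are placeholders, not real
# provider pricing -- substitute your providers' current rates.
PRICES_PER_1M_TOKENS = {
    # model: (input_price_usd, output_price_usd) per 1M tokens
    "gpt-4o": (2.50, 10.00),
    "deepseek-reasoner": (0.55, 2.19),
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request under the placeholder price table."""
    input_price, output_price = PRICES_PER_1M_TOKENS[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000

print(round(request_cost("gpt-4o", 1000, 500), 6))  # → 0.0075
```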

## System Requirements

- **CPU**: 2 cores minimum
- **RAM**: 512MB minimum (1GB recommended)
- **Storage**: 10GB minimum
- **OS**: Linux (tested on Arch Linux, Ubuntu, Debian)
- **Runtime**: Rust 1.70+ with Cargo

## Deployment Options

### Option 1: Docker (Recommended)

```dockerfile
FROM rust:1.70-alpine AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

FROM alpine:latest
RUN apk add --no-cache libgcc
COPY --from=builder /app/target/release/llm-proxy /usr/local/bin/
COPY --from=builder /app/static /app/static
WORKDIR /app
EXPOSE 8080
CMD ["llm-proxy"]
```

### Option 2: Systemd Service (Bare Metal/LXC)

```ini
# /etc/systemd/system/llm-proxy.service
[Unit]
Description=LLM Proxy Gateway
After=network.target

[Service]
Type=simple
User=llmproxy
Group=llmproxy
WorkingDirectory=/opt/llm-proxy
ExecStart=/opt/llm-proxy/llm-proxy
Restart=always
RestartSec=10
Environment="RUST_LOG=info"
Environment="LLM_PROXY__SERVER__PORT=8080"
Environment="LLM_PROXY__SERVER__AUTH_TOKENS=sk-test-123,sk-test-456"

[Install]
WantedBy=multi-user.target
```

### Option 3: LXC Container (Proxmox)

1. Create an Alpine Linux LXC container
2. Install Rust: `apk add rust cargo`
3. Copy the application files
4. Build: `cargo build --release`
5. Run: `./target/release/llm-proxy`

## Configuration

### Environment Variables

```bash
# Required API keys
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...
DEEPSEEK_API_KEY=sk-...
GROK_API_KEY=gk-...  # Optional

# Server configuration (with the LLM_PROXY__ prefix)
LLM_PROXY__SERVER__PORT=8080
LLM_PROXY__SERVER__HOST=0.0.0.0
LLM_PROXY__SERVER__AUTH_TOKENS=sk-test-123,sk-test-456

# Database configuration
LLM_PROXY__DATABASE__PATH=./data/llm_proxy.db
LLM_PROXY__DATABASE__MAX_CONNECTIONS=10

# Provider configuration
LLM_PROXY__PROVIDERS__OPENAI__ENABLED=true
LLM_PROXY__PROVIDERS__GEMINI__ENABLED=true
LLM_PROXY__PROVIDERS__DEEPSEEK__ENABLED=true
LLM_PROXY__PROVIDERS__GROK__ENABLED=false
```
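The double-underscore convention maps environment variables to nested configuration keys (e.g. `LLM_PROXY__SERVER__PORT` becomes `server.port`). A minimal sketch of that mapping, assuming the config layer simply splits on `__` and lowercases:

```python
PREFIX = "LLM_PROXY__"

def env_to_config(environ: dict) -> dict:
    """Fold LLM_PROXY__A__B=value entries into a nested {a: {b: value}} dict."""
    config: dict = {}
    for key, value in environ.items():
        if not key.startswith(PREFIX):
            continue
        path = key[len(PREFIX):].lower().split("__")
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return config

print(env_to_config({"LLM_PROXY__SERVER__PORT": "8080"}))
# → {'server': {'port': '8080'}}
```

The actual Rust configuration crate also coerces types (`"8080"` → integer); this sketch only shows the key nesting.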

### Configuration File (config.toml)

Create `config.toml` in the application directory:

```toml
[server]
port = 8080
host = "0.0.0.0"
auth_tokens = ["sk-test-123", "sk-test-456"]

[database]
path = "./data/llm_proxy.db"
max_connections = 10

[providers.openai]
enabled = true
base_url = "https://api.openai.com/v1"
default_model = "gpt-4o"

[providers.gemini]
enabled = true
base_url = "https://generativelanguage.googleapis.com/v1"
default_model = "gemini-2.0-flash"

[providers.deepseek]
enabled = true
base_url = "https://api.deepseek.com"
default_model = "deepseek-reasoner"

[providers.grok]
enabled = false
base_url = "https://api.x.ai/v1"
default_model = "grok-beta"
```

## Nginx Reverse Proxy Configuration

```nginx
server {
    listen 80;
    server_name llm-proxy.yourdomain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;

        # WebSocket support
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_cache_bypass $http_upgrade;

        # Forwarding headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # SSL configuration (recommended)
    listen 443 ssl http2;
    ssl_certificate /etc/letsencrypt/live/llm-proxy.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/llm-proxy.yourdomain.com/privkey.pem;
}
```

## Security Considerations

### 1. Authentication

- Use strong Bearer tokens
- Rotate tokens regularly
- Consider implementing JWT for production
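One way to generate strong Bearer tokens, using Python's `secrets` module (the `sk-` prefix just mirrors the examples in this guide):

```python
import secrets

def make_token(prefix: str = "sk-") -> str:
    """Generate a URL-safe Bearer token with 32 bytes of randomness."""
    return prefix + secrets.token_urlsafe(32)

print(make_token())
```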

### 2. Rate Limiting

- Implement per-client rate limiting
- Consider the `governor` crate for advanced in-process rate limiting
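Until application-level rate limiting is in place, a coarse per-client limit can be enforced in nginx itself; the zone name and rate below are illustrative, not tuned values:

```nginx
# In the http {} block
limit_req_zone $binary_remote_addr zone=llmproxy:10m rate=10r/s;

# Inside the location / {} block of the server above
limit_req zone=llmproxy burst=20 nodelay;
```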

### 3. Network Security

- Run behind a reverse proxy (nginx)
- Enable HTTPS
- Restrict access by IP if needed
- Use firewall rules

### 4. Data Security

- Encrypt the database (SQLCipher for SQLite)
- Store API keys securely
- Take regular backups

## Monitoring & Maintenance

### Logging

- Application logs: `RUST_LOG=info` (or `debug` for troubleshooting)
- Access logs via nginx
- Database logs for an audit trail

### Health Checks

```bash
# Health endpoint
curl http://localhost:8080/health

# Database check
sqlite3 ./data/llm_proxy.db "SELECT COUNT(*) FROM llm_requests;"
```

### Backup Strategy

```bash
#!/bin/bash
# backup.sh
BACKUP_DIR="/backups/llm-proxy"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"

# Back up the database (SQLite .backup is safe while the service is running)
sqlite3 ./data/llm_proxy.db ".backup $BACKUP_DIR/llm_proxy_$DATE.db"

# Back up the configuration
cp config.toml "$BACKUP_DIR/config_$DATE.toml"

# Rotate old backups (keep 30 days)
find "$BACKUP_DIR" -name "*.db" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.toml" -mtime +30 -delete
```
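To run the script nightly, a crontab entry can be added for the service user; the script and log paths below are assumptions to adapt to your install:

```
# m h dom mon dow  command
0 2 * * * /opt/llm-proxy/backup.sh >> /var/log/llm-proxy/backup.log 2>&1
```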

## Performance Tuning

### Database Optimization

```sql
-- Run these commands periodically
VACUUM;   -- reclaim space from deleted rows
ANALYZE;  -- refresh query-planner statistics
```

### Memory Management

- Monitor memory usage with `htop` or `ps aux`
- Adjust `max_connections` based on load
- Consider connection pooling for high traffic

### Scaling

1. **Vertical scaling**: increase container resources
2. **Horizontal scaling**: deploy multiple instances behind a load balancer
3. **Database**: migrate to PostgreSQL for high-volume usage

## Troubleshooting

### Common Issues

1. **Port already in use**
   ```bash
   netstat -tulpn | grep :8080
   kill <PID>  # or change the port in config
   ```

2. **Database permissions**
   ```bash
   chown -R llmproxy:llmproxy /opt/llm-proxy/data
   chmod 600 /opt/llm-proxy/data/llm_proxy.db
   ```

3. **API key errors**
   - Verify environment variables are set
   - Check provider status in the dashboard
   - Test connectivity: `curl https://api.openai.com/v1/models`

4. **High memory usage**
   - Check for memory leaks
   - Reduce `max_connections`
   - Implement connection timeouts

### Debug Mode

```bash
# Run with debug logging
RUST_LOG=debug ./llm-proxy

# Check system logs
journalctl -u llm-proxy -f
```

## Integration

### Open-WebUI Compatibility

The proxy exposes an OpenAI-compatible API, so point Open-WebUI at it:

```
API Base URL: http://your-proxy-address:8080
API Key: sk-test-123 (or your configured token)
```

### Custom Clients

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-test-123",
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
```

## Updates & Upgrades

1. **Back up** the current configuration and database
2. **Stop** the service: `systemctl stop llm-proxy`
3. **Update** the code: `git pull` or copy new binaries
4. **Migrate** the database if needed (check `migrations/`)
5. **Restart**: `systemctl start llm-proxy`
6. **Verify**: check logs and test endpoints

## Support

- Check logs in `/var/log/llm-proxy/`
- Monitor the dashboard at `http://your-server:8080`
- Review database metrics in the dashboard
- Enable debug logging for troubleshooting