GopherGate/README.md
hobokenchicken 633b69a07b
docs: sync documentation with current implementation and archive stale plan
2026-03-06 14:28:04 -05:00

# LLM Proxy Gateway
A unified, high-performance LLM proxy gateway built in Rust. It provides a single OpenAI-compatible API to access multiple providers (OpenAI, Gemini, DeepSeek, Grok, Ollama) with built-in token tracking, real-time cost calculation, multi-user authentication, and a management dashboard.
## Features
- **Unified API:** OpenAI-compatible `/v1/chat/completions` and `/v1/models` endpoints.
- **Multi-Provider Support:**
  - **OpenAI:** GPT-4o, GPT-4o Mini, o1, o3 reasoning models.
  - **Google Gemini:** Gemini 2.0 Flash, Pro, and vision models.
  - **DeepSeek:** DeepSeek Chat and Reasoner models.
  - **xAI Grok:** Grok-beta models.
  - **Ollama:** Local LLMs running on your network.
- **Observability & Tracking:**
  - **Real-time Costing:** Fetches live pricing and context specs from `models.dev` on startup.
  - **Token Counting:** Precise estimation using `tiktoken-rs`.
  - **Database Logging:** Every request logged to SQLite for historical analysis.
- **Streaming Support:** Full SSE (Server-Sent Events) with `[DONE]` termination for client compatibility.
- **Multimodal (Vision):** Image processing (Base64 and remote URLs) across compatible providers.
- **Multi-User Access Control:**
  - **Admin Role:** Full access to all dashboard features, user management, and system configuration.
  - **Viewer Role:** Read-only access to usage analytics, costs, and monitoring.
  - **Client API Keys:** Create and manage multiple client tokens for external integrations.
- **Reliability:**
  - **Circuit Breaking:** Automatically stops routing requests to a provider while it is down.
  - **Rate Limiting:** Per-client and global rate limits.
  - **Cache-Aware Costing:** Tracks cache hit/miss tokens for accurate billing.
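
As a rough illustration of cache-aware costing, the sketch below prices a single request by billing cached prompt tokens at a discounted rate. The `ModelPricing` layout, the `request_cost` function, and the prices are assumptions made up for this example; they are not the gateway's implementation or real `models.dev` data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical prices in USD per one million tokens."""
    input_per_mtok: float
    cached_input_per_mtok: float
    output_per_mtok: float


def request_cost(pricing: ModelPricing, prompt_tokens: int,
                 cached_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request, billing cache hits at the cached rate."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * pricing.input_per_mtok
        + cached_tokens * pricing.cached_input_per_mtok
        + completion_tokens * pricing.output_per_mtok
    )
    return total / 1_000_000


# Made-up prices, chosen only to keep the arithmetic easy to follow.
EXAMPLE_PRICING = ModelPricing(
    input_per_mtok=2.0, cached_input_per_mtok=1.0, output_per_mtok=8.0
)
cost = request_cost(
    EXAMPLE_PRICING, prompt_tokens=1000, cached_tokens=400, completion_tokens=500
)
```

Treating cached tokens as a subset of the prompt tokens keeps the three terms from double counting; a provider that reports cache hits separately from prompt tokens would need a different split.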
## Security
LLM Proxy is designed with security in mind:
- **HMAC Session Tokens:** Management dashboard sessions are secured using HMAC-SHA256 signed tokens.
- **Encrypted Provider Keys:** Sensitive LLM provider API keys are stored encrypted (AES-256-GCM) in the database.
- **Session Refresh:** Activity-based session extension keeps active sessions alive while idle ones expire, which narrows the window for session hijacking without inconveniencing users.
- **XSS Prevention:** Standardized frontend escaping using `window.api.escapeHtml`.
**Note:** You must define a `SESSION_SECRET` in your `.env` file for secure session signing.
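
The general idea behind HMAC-SHA256 signed session tokens can be sketched in a few lines. The gateway itself is written in Rust, so this Python sketch only illustrates the technique: the `payload.signature` token layout and the function names `sign_session` and `verify_session` are assumptions, not the project's actual format. The last line shows one way to generate a strong random value for `SESSION_SECRET`.

```python
import base64
import hashlib
import hmac
import secrets
from typing import Optional


def sign_session(payload: bytes, secret: bytes) -> str:
    """Return 'payload.signature', both URL-safe base64 encoded (assumed layout)."""
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(signature).decode())


def verify_session(token: str, secret: bytes) -> Optional[bytes]:
    """Return the payload if the signature is valid, otherwise None."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        # Wrong number of parts or invalid base64.
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # compare_digest runs in constant time to avoid timing side channels.
    return payload if hmac.compare_digest(signature, expected) else None


# 32 random bytes, hex encoded: a reasonable candidate for SESSION_SECRET.
example_secret = secrets.token_hex(32)
```

Because the signature covers the whole payload, changing either the payload or the secret makes verification fail.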
## Tech Stack
- **Runtime:** Rust with Tokio.
- **Web Framework:** Axum.
- **Database:** SQLx with SQLite.
- **Frontend:** Vanilla JS/CSS with Chart.js for visualizations.
## Getting Started
### Prerequisites
- Rust (1.80+)
- SQLite3
- Docker (optional, for containerized deployment)
### Quick Start
1. Clone and build:
```bash
git clone ssh://git.dustin.coffee:2222/hobokenchicken/llm-proxy.git
cd llm-proxy
cargo build --release
```
2. Configure environment:
```bash
cp .env.example .env
# Edit .env and add your API keys:
# SESSION_SECRET=... (Generate a strong random secret)
# OPENAI_API_KEY=sk-...
# GEMINI_API_KEY=AIza...
```
3. Run the proxy:
```bash
cargo run --release
```
The server starts on `http://localhost:8080` by default.
### Deployment (Docker)
A multi-stage `Dockerfile` is provided for efficient deployment:
```bash
# Build the image
docker build -t llm-proxy .
# Run the container (an absolute host path for the volume works on older Docker versions too)
docker run -p 8080:8080 \
  -e SESSION_SECRET=your-secure-secret \
  -v "$(pwd)/data:/app/data" \
  llm-proxy
```
## Management Dashboard
Access the dashboard at `http://localhost:8080`. The dashboard architecture has been refactored into modular sub-components for better maintainability:
- **Auth (`/api/auth`):** Login, session management, and password changes.
- **Usage (`/api/usage`):** Summary stats, time-series analytics, and provider breakdown.
- **Clients (`/api/clients`):** API key management and per-client usage tracking.
- **Providers (`/api/providers`):** Provider configuration, status monitoring, and connection testing.
- **System (`/api/system`):** Health metrics, live logs, database backups, and global settings.
- **Monitoring:** Live request stream via WebSocket.
### Default Credentials
- **Username:** `admin`
- **Password:** `admin123`
Change the admin password in the dashboard after first login!
## API Usage
The proxy is a drop-in replacement for the OpenAI API. Point your client's base URL at the proxy and authenticate with a client API key:
### Python
```python
from openai import OpenAI

# Point the official OpenAI client at the proxy instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="YOUR_CLIENT_API_KEY",  # Create in dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
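
For clients that do not use the OpenAI SDK, a streamed response can be read by hand. The sketch below assumes the proxy emits OpenAI-style chunks (`choices[].delta.content`) as `data:` lines and ends the stream with `data: [DONE]`, as described under Features. The helper name `iter_stream_text` and the sample lines are illustrative, not captured output from the proxy.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines until `[DONE]`."""
    for raw_line in lines:
        line = raw_line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines standing in for a real HTTP response body.
SAMPLE_STREAM = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "",
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
reply = "".join(iter_stream_text(SAMPLE_STREAM))
```

In real use, `lines` would be the decoded lines of the HTTP response to a request sent with `"stream": true`.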
## License
MIT OR Apache-2.0