# LLM Proxy Gateway

A unified, high-performance LLM proxy gateway built in Rust. It provides a single OpenAI-compatible API to access multiple providers (OpenAI, Gemini, DeepSeek, Grok, Ollama) with built-in token tracking, real-time cost calculation, multi-user authentication, and a management dashboard.

## Features

- **Unified API:** OpenAI-compatible `/v1/chat/completions` and `/v1/models` endpoints.
- **Multi-Provider Support:**
  - **OpenAI:** GPT-4o, GPT-4o Mini, o1, and o3 reasoning models.
  - **Google Gemini:** Gemini 2.0 Flash, Pro, and vision models.
  - **DeepSeek:** DeepSeek Chat and Reasoner models.
  - **xAI Grok:** Grok-beta models.
  - **Ollama:** Local LLMs running on your network.
- **Observability & Tracking:**
  - **Real-time Costing:** Fetches live pricing and context specs from `models.dev` on startup.
  - **Token Counting:** Precise estimation using `tiktoken-rs`.
  - **Database Logging:** Every request is logged to SQLite for historical analysis.
- **Streaming Support:** Full SSE (Server-Sent Events) with `[DONE]` termination for client compatibility.
- **Multimodal (Vision):** Image processing (Base64 and remote URLs) across compatible providers.
- **Multi-User Access Control:**
  - **Admin Role:** Full access to all dashboard features, user management, and system configuration.
  - **Viewer Role:** Read-only access to usage analytics, costs, and monitoring.
  - **Client API Keys:** Create and manage multiple client tokens for external integrations.
- **Reliability:**
  - **Circuit Breaking:** Automatically stops routing requests to providers that are failing.
  - **Rate Limiting:** Per-client and global rate limits.
  - **Cache-Aware Costing:** Tracks cache hit/miss tokens for accurate billing.

## Security

LLM Proxy is designed with security in mind:

- **HMAC Session Tokens:** Management dashboard sessions are secured using HMAC-SHA256-signed tokens.
- **Encrypted Provider Keys:** Sensitive LLM provider API keys are stored encrypted (AES-256-GCM) in the database.
- **Session Refresh:** Activity-based session extension limits the window for session hijacking while keeping users logged in.
- **XSS Prevention:** Standardized frontend escaping using `window.api.escapeHtml`.

**Note:** You must define a `SESSION_SECRET` in your `.env` file for secure session signing.

## Tech Stack

- **Runtime:** Rust with Tokio.
- **Web Framework:** Axum.
- **Database:** SQLx with SQLite.
- **Frontend:** Vanilla JS/CSS with Chart.js for visualizations.

## Getting Started

### Prerequisites

- Rust (1.80+)
- SQLite3
- Docker (optional, for containerized deployment)

### Quick Start

1. Clone and build:

   ```bash
   git clone ssh://git.dustin.coffee:2222/hobokenchicken/llm-proxy.git
   cd llm-proxy
   cargo build --release
   ```

2. Configure environment:

   ```bash
   cp .env.example .env
   # Edit .env and add your API keys:
   # SESSION_SECRET=...  (generate a strong random secret)
   # OPENAI_API_KEY=sk-...
   # GEMINI_API_KEY=AIza...
   ```

3. Run the proxy:

   ```bash
   cargo run --release
   ```

The server starts on `http://localhost:8080` by default.

### Deployment (Docker)

A multi-stage `Dockerfile` is provided for efficient deployment:

```bash
# Build the container
docker build -t llm-proxy .

# Run the container
docker run -p 8080:8080 \
  -e SESSION_SECRET=your-secure-secret \
  -v ./data:/app/data \
  llm-proxy
```

## Management Dashboard

Access the dashboard at `http://localhost:8080`. The dashboard architecture has been refactored into modular sub-components for better maintainability:

- **Auth (`/api/auth`):** Login, session management, and password changes.
- **Usage (`/api/usage`):** Summary stats, time-series analytics, and provider breakdown.
- **Clients (`/api/clients`):** API key management and per-client usage tracking.
- **Providers (`/api/providers`):** Provider configuration, status monitoring, and connection testing.
- **System (`/api/system`):** Health metrics, live logs, database backups, and global settings.
- **Monitoring:** Live request stream via WebSocket.
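The Quick Start above asks you to set a strong random `SESSION_SECRET` in `.env`. One way to generate one is with Python's standard `secrets` module (a sketch; any string derived from a cryptographically secure random source of similar length works):

```python
import secrets

# 32 random bytes encoded as 64 hex characters -- long enough for an
# HMAC-SHA256 signing key.
secret = secrets.token_hex(32)
print(f"SESSION_SECRET={secret}")
```

Paste the printed line into your `.env` file before starting the proxy.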
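For clients that don't use an SDK, the streaming endpoint emits Server-Sent Events terminated by `data: [DONE]`, as noted under Features. A minimal parse of that wire format might look like this (a sketch; the chunk JSON shape assumes the standard OpenAI streaming schema, which this proxy emulates):

```python
import json

def parse_sse_stream(lines):
    """Yield content deltas from OpenAI-style SSE lines until [DONE]."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":  # the proxy's end-of-stream marker
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta

# Sample lines in the shape of a streamed completion:
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
text = "".join(parse_sse_stream(sample))
```

In a real client you would feed `parse_sse_stream` the response body lines from an HTTP request to `/v1/chat/completions` with `"stream": true`.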
### Default Credentials

- **Username:** `admin`
- **Password:** `admin123`

Change the admin password in the dashboard after first login!

## API Usage

The proxy is a drop-in replacement for OpenAI. Configure your client:

### Python

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="YOUR_CLIENT_API_KEY",  # Create in dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

## License

MIT OR Apache-2.0