docs: add comprehensive project README

Includes features overview, tech stack, getting started guide, and API usage examples.
# LLM Proxy Gateway
A unified, high-performance LLM proxy gateway built in Rust. It provides a single OpenAI-compatible API to access multiple providers (OpenAI, Gemini, DeepSeek, Grok) with built-in token tracking, real-time cost calculation, and a management dashboard.
## 🚀 Features
- **Unified API:** Fully OpenAI-compatible `/v1/chat/completions` endpoint.
- **Multi-Provider Support:**
  * **OpenAI:** Standard models (GPT-4o, GPT-3.5, etc.) and reasoning models (o1, o3).
  * **Google Gemini:** Support for the latest Gemini 2.0 models.
  * **DeepSeek:** High-performance, low-cost integration.
  * **xAI Grok:** Integration for Grok-series models.
- **Observability & Tracking:**
  * **Real-time Costing:** Fetches live pricing and context specs from `models.dev` on startup.
  * **Token Counting:** Precise estimation using `tiktoken-rs`.
  * **Database Logging:** Every request is logged to SQLite for historical analysis.
  * **Streaming Support:** Full SSE (Server-Sent Events) support with aggregated token tracking (see the streaming sketch after this list).
- **Multimodal (Vision):** Support for image processing (Base64 and remote URLs) across compatible providers.
- **Reliability:**
  * **Circuit Breaking:** Automatically protects your system when providers are down.
  * **Rate Limiting:** Granular per-client and global rate limits.
- **Management Dashboard:** A modern, real-time web interface to monitor usage, costs, and system health.
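Because the `/v1/chat/completions` endpoint is OpenAI-compatible, streaming works with the stock OpenAI SDK. A minimal sketch, assuming the proxy is running on the default port and a client key has been created in the dashboard:

```python
from openai import OpenAI

# Point the stock OpenAI SDK at the proxy (default port assumed).
client = OpenAI(base_url="http://localhost:3000/v1", api_key="your-proxy-client-id")

# stream=True yields SSE chunks; the proxy aggregates token usage server-side.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Stream a haiku about proxies."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```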
## 🛠️ Tech Stack
- **Runtime:** Rust (2024 Edition) with [Tokio](https://tokio.rs/).
- **Web Framework:** [Axum](https://github.com/tokio-rs/axum).
- **Database:** [SQLx](https://github.com/launchbadge/sqlx) with SQLite.
- **Frontend:** Vanilla JS/CSS (no heavyweight framework required).
## 🚦 Getting Started
### Prerequisites
- Rust (1.80+)
- SQLite3
### Installation
1. Clone the repository:
   ```bash
   git clone https://github.com/yourusername/llm-proxy.git
   cd llm-proxy
   ```
2. Set up your environment:
   ```bash
   cp .env.example .env
   # Edit .env and add your API keys
   ```
3. Configure providers and server:
   Edit `config.toml` to customize models, pricing fallbacks, and port settings.
4. Run the proxy:
   ```bash
   cargo run --release
   ```
By default, the server starts at `http://localhost:3000`.
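As a quick sanity check that the proxy came up, the root URL serves the dashboard and should answer with HTTP 200 (default port assumed):

```python
import urllib.request

# The root URL serves the management dashboard; a 200 means the proxy is up.
with urllib.request.urlopen("http://localhost:3000/") as resp:
    print("Proxy is up, status:", resp.status)
```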
## 📊 Management Dashboard
Access the built-in dashboard at `http://localhost:3000` to see:
- **Usage Summary:** Total requests, tokens, and USD spent (the cost arithmetic is sketched after this list).
- **Trend Charts:** 24-hour request and cost distributions.
- **Live Logs:** Real-time stream of incoming LLM requests via WebSockets.
- **Provider Health:** Monitor which providers are online or degraded.
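The USD figures follow the usual per-million-token pricing model, with live prices pulled from `models.dev` (see Features). As an illustration of the arithmetic only, using hypothetical prices:

```python
# Hypothetical per-million-token prices; the proxy fetches real values from models.dev.
INPUT_PRICE_PER_M = 2.50    # USD per 1M prompt tokens
OUTPUT_PRICE_PER_M = 10.00  # USD per 1M completion tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """USD cost of a single request under per-million-token pricing."""
    return (prompt_tokens * INPUT_PRICE_PER_M
            + completion_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

print(f"${request_cost(1200, 350):.6f}")  # 1200*2.50 + 350*10.00 = 6500 -> $0.006500
```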
## 🔌 API Usage
The proxy is designed as a drop-in replacement for the OpenAI API. Simply change your base URL:
**Example Request (Python):**
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="your-proxy-client-id",  # Hashed sk- keys are managed in the dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from the proxy!"}],
)
print(response.choices[0].message.content)
```
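Vision requests use the same OpenAI message format. A minimal sketch reusing the `client` above; the model must be vision-capable, and the image URL is a placeholder (Base64 data URIs are also accepted, per Features):

```python
response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable model routed by the proxy
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            # Remote URLs and Base64 data URIs are both supported.
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```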
## ⚖️ License
MIT OR Apache-2.0