# LLM Proxy Gateway

A unified, high-performance LLM proxy gateway built in Rust. It provides a single OpenAI-compatible API to access multiple providers (OpenAI, Gemini, DeepSeek, Grok) with built-in token tracking, real-time cost calculation, and a management dashboard.

## 🚀 Features

- **Unified API:** Fully OpenAI-compatible `/v1/chat/completions` endpoint.
- **Multi-Provider Support:**
  * **OpenAI:** Standard models (GPT-4o, GPT-3.5, etc.) and reasoning models (o1, o3).
  * **Google Gemini:** Support for the latest Gemini 2.0 models.
  * **DeepSeek:** High-performance, low-cost integration.
  * **xAI Grok:** Integration for Grok-series models.
- **Observability & Tracking:**
  * **Real-time Costing:** Fetches live pricing and context specs from `models.dev` on startup.
  * **Token Counting:** Precise estimation using `tiktoken-rs`.
  * **Database Logging:** Every request is logged to SQLite for historical analysis.
  * **Streaming Support:** Full SSE (Server-Sent Events) support with aggregated token tracking.
- **Multimodal (Vision):** Support for image inputs (Base64 and remote URLs) across compatible providers.
- **Reliability:**
  * **Circuit Breaking:** Automatically stops routing traffic to providers that are down.
  * **Rate Limiting:** Granular per-client and global rate limits.
- **Management Dashboard:** A modern, real-time web interface to monitor usage, costs, and system health.
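Multimodal requests follow the standard OpenAI chat format, where a message's `content` is a list of text and `image_url` parts. A minimal sketch of building such a payload (the placeholder image bytes and data URL are purely illustrative; how each provider adapter forwards the image is up to the proxy):

```python
import base64
import json

# Placeholder bytes standing in for a real image file — illustration only.
fake_image = base64.b64encode(b"not-a-real-image").decode()
data_url = f"data:image/png;base64,{fake_image}"

# OpenAI-style multimodal message: text part plus an inline Base64 image.
# A remote URL works the same way — put it in "url" instead of a data URL.
payload = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ],
}

# POST this JSON to the proxy's /v1/chat/completions endpoint
# with any HTTP client or the OpenAI SDK.
body = json.dumps(payload)
```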
## 🛠️ Tech Stack

- **Runtime:** Rust (2024 Edition) with [Tokio](https://tokio.rs/).
- **Web Framework:** [Axum](https://github.com/tokio-rs/axum).
- **Database:** [SQLx](https://github.com/launchbadge/sqlx) with SQLite.
- **Frontend:** Vanilla JS/CSS (no heavyweight framework required).
## 🚦 Getting Started

### Prerequisites

- Rust (1.80+)
- SQLite3

### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/llm-proxy.git
   cd llm-proxy
   ```

2. Set up your environment:

   ```bash
   cp .env.example .env
   # Edit .env and add your API keys
   ```

3. Configure providers and server:

   Edit `config.toml` to customize models, pricing fallbacks, and port settings.

4. Run the proxy:

   ```bash
   cargo run --release
   ```

The server will start at `http://localhost:3000` (by default).
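For step 2, a hypothetical sketch of what the populated `.env` might look like. The variable names here are assumptions for illustration only — check `.env.example` for the actual keys the proxy reads:

```bash
# Hypothetical .env sketch — names are illustrative, not confirmed;
# copy .env.example and fill in the real keys it defines.
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
DEEPSEEK_API_KEY=...
XAI_API_KEY=...
```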
## 📊 Management Dashboard

Access the built-in dashboard at `http://localhost:3000` to see:

- **Usage Summary:** Total requests, tokens, and USD spent.
- **Trend Charts:** 24-hour request and cost distributions.
- **Live Logs:** Real-time stream of incoming LLM requests via WebSockets.
- **Provider Health:** Monitor which providers are online or degraded.
## 🔌 API Usage

The proxy is designed to be a drop-in replacement for OpenAI. Simply change your base URL:

**Example Request (Python):**

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="your-proxy-client-id",  # Hashed sk- keys are managed in the dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from the proxy!"}],
)
```
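With `stream=True`, responses arrive as OpenAI-style SSE chunks whose `delta` fields carry incremental content. A minimal client-side sketch of aggregating those deltas — the chunk strings below illustrate the standard OpenAI streaming format, not captured proxy output:

```python
import json

def aggregate_sse(lines):
    """Collect assistant text from OpenAI-style `data:` SSE lines."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments, blank keep-alives, etc.
        data = line[len("data: "):]
        if data == "[DONE]":  # sentinel marking end of stream
            break
        delta = json.loads(data)["choices"][0]["delta"]
        text.append(delta.get("content", ""))  # role-only chunks have no content
    return "".join(text)

# Illustrative chunks in the standard OpenAI streaming format.
chunks = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    'data: [DONE]',
]
print(aggregate_sse(chunks))  # Hello world
```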
## ⚖️ License

MIT OR Apache-2.0