# LLM Proxy Gateway

A unified, high-performance LLM proxy gateway built in Rust. It provides a single OpenAI-compatible API to access multiple providers (OpenAI, Gemini, DeepSeek, Grok) with built-in token tracking, real-time cost calculation, and a management dashboard.

## 🚀 Features

- **Unified API:** Fully OpenAI-compatible `/v1/chat/completions` endpoint.
- **Multi-Provider Support:**
  * **OpenAI:** Standard models (GPT-4o, GPT-3.5, etc.) and reasoning models (o1, o3).
  * **Google Gemini:** Support for the latest Gemini 2.0 models.
  * **DeepSeek:** High-performance, low-cost integration.
  * **xAI Grok:** Integration for Grok-series models.
  * **Ollama:** Support for local LLMs running on your machine or another host.
- **Observability & Tracking:**
  * **Real-time Costing:** Fetches live pricing and context specs from `models.dev` on startup.
  * **Token Counting:** Precise estimation using `tiktoken-rs`.
  * **Database Logging:** Every request is logged to SQLite for historical analysis.
  * **Streaming Support:** Full SSE (Server-Sent Events) support with aggregated token tracking.
- **Multimodal (Vision):** Support for image processing (Base64 and remote URLs) across compatible providers.
- **Reliability:**
  * **Circuit Breaking:** Automatically protects your system when providers are down.
  * **Rate Limiting:** Granular per-client and global rate limits.
- **Management Dashboard:** A modern, real-time web interface to monitor usage, costs, and system health.

## 🛠️ Tech Stack

- **Runtime:** Rust (2024 Edition) with [Tokio](https://tokio.rs/).
- **Web Framework:** [Axum](https://github.com/tokio-rs/axum).
- **Database:** [SQLx](https://github.com/launchbadge/sqlx) with SQLite.
- **Frontend:** Vanilla JS/CSS (no heavyweight framework required).

## 🚦 Getting Started

### Prerequisites

- Rust (1.80+)
- SQLite3

### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/llm-proxy.git
   cd llm-proxy
   ```

2. Set up your environment:

   ```bash
   cp .env.example .env
   # Edit .env and add your API keys
   ```

3. Configure providers and server:

   Edit `config.toml` to customize models, pricing fallbacks, and port settings.

   **Ollama Example (config.toml):**

   ```toml
   [providers.ollama]
   enabled = true
   base_url = "http://192.168.1.50:11434/v1"
   models = ["llama3", "mistral"]
   ```

4. Run the proxy:

   ```bash
   cargo run --release
   ```

   The server will start at `http://localhost:3000` by default.

## 📊 Management Dashboard

Access the built-in dashboard at `http://localhost:3000` to see:

- **Usage Summary:** Total requests, tokens, and USD spent.
- **Trend Charts:** 24-hour request and cost distributions.
- **Live Logs:** Real-time stream of incoming LLM requests via WebSockets.
- **Provider Health:** Monitor which providers are online or degraded.

## 🔌 API Usage

The proxy is designed to be a drop-in replacement for OpenAI. Simply change your base URL:

**Example Request (Python):**

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="your-proxy-client-id"  # Hashed sk- keys are managed in the dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from the proxy!"}]
)
```

## ⚖️ License

MIT OR Apache-2.0
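## 💰 How Cost Tracking Works

The real-time costing feature described under Observability comes down to multiplying each request's token counts by the provider's per-token prices. A minimal Python sketch of that arithmetic is below; the model names and USD-per-million-token prices here are illustrative placeholders, not the live figures the proxy fetches from `models.dev`:

```python
# Illustrative per-request cost calculation, similar in spirit to what
# the proxy logs for each request. Prices are hypothetical placeholder
# values (USD per million tokens), not live pricing data.

PRICING = {
    # model: (input $/1M tokens, output $/1M tokens) -- placeholders
    "gpt-4o": (2.50, 10.00),
    "deepseek-chat": (0.27, 1.10),
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of a single request for the given model."""
    in_price, out_price = PRICING[model]
    return (prompt_tokens * in_price + completion_tokens * out_price) / 1_000_000

cost = request_cost("gpt-4o", prompt_tokens=1_200, completion_tokens=300)
print(f"${cost:.6f}")  # 1200*2.50/1e6 + 300*10.00/1e6 = 0.006 -> "$0.006000"
```

The same calculation applies to streamed responses: the proxy aggregates token usage across SSE chunks first, then prices the totals once the stream completes.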