
LLM Proxy Gateway

A unified, high-performance LLM proxy gateway built in Rust. It provides a single OpenAI-compatible API to access multiple providers (OpenAI, Gemini, DeepSeek, Grok) with built-in token tracking, real-time cost calculation, and a management dashboard.

🚀 Features

  • Unified API: Fully OpenAI-compatible /v1/chat/completions endpoint.
  • Multi-Provider Support:
    • OpenAI: Standard models (GPT-4o, GPT-3.5, etc.) and reasoning models (o1, o3).
    • Google Gemini: Support for the latest Gemini 2.0 models.
    • DeepSeek: High-performance, low-cost integration.
    • xAI Grok: Integration for Grok-series models.
    • Ollama: Support for local LLMs running on your machine or another host.
  • Observability & Tracking:
    • Real-time Costing: Fetches live pricing and context specs from models.dev on startup.
    • Token Counting: Accurate token estimation with tiktoken-rs.
    • Database Logging: Every request is logged to SQLite for historical analysis.
    • Streaming Support: Full SSE (Server-Sent Events) support with aggregated token tracking.
  • Multimodal (Vision): Support for image processing (Base64 and remote URLs) across compatible providers.
  • Reliability:
    • Circuit Breaking: Automatically stops routing to a provider that is failing, protecting your system from cascading errors.
    • Rate Limiting: Granular per-client and global rate limits.
  • Management Dashboard: A modern, real-time web interface to monitor usage, costs, and system health.
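
The real-time costing above boils down to simple per-token arithmetic. A minimal sketch in Python with hypothetical per-million-token prices (the proxy fetches live prices from models.dev at startup):

```python
# Hypothetical per-1M-token prices in USD (real values come from models.dev).
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """USD cost of one request: tokens / 1M, times the per-1M-token price."""
    p = PRICES[model]
    return (prompt_tokens / 1_000_000) * p["input"] + \
           (completion_tokens / 1_000_000) * p["output"]

# 1,000 prompt tokens and 500 completion tokens on gpt-4o:
print(round(request_cost("gpt-4o", 1_000, 500), 6))  # → 0.0075
```

The dashboard aggregates these per-request figures into the totals and trend charts described below.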

🛠️ Tech Stack

  • Runtime: Rust (2024 Edition) with Tokio.
  • Web Framework: Axum.
  • Database: SQLx with SQLite.
  • Frontend: Vanilla JS/CSS (no heavyweight framework required).

🚦 Getting Started

Prerequisites

  • Rust (1.85+, required by the 2024 edition)
  • SQLite3

Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/llm-proxy.git
    cd llm-proxy
    
  2. Set up your environment:

    cp .env.example .env
    # Edit .env and add your API keys
    
  3. Configure providers and server: Edit config.toml to customize models, pricing fallbacks, and port settings.

    Ollama Example (config.toml):

    [providers.ollama]
    enabled = true
    base_url = "http://192.168.1.50:11434/v1"
    models = ["llama3", "mistral"]
    
  4. Run the proxy:

    cargo run --release
    

The server will start at http://localhost:3000 (by default).
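
A fuller config.toml might enable several providers at once. This is a sketch only: the [providers.ollama] table mirrors the example above, while the [server] table and the other field names are assumptions extrapolated from it, not confirmed keys.

```toml
# Hypothetical layout extrapolated from the Ollama example; check config.toml
# in the repository for the authoritative keys.
[server]
port = 3000

[providers.openai]
enabled = true
models = ["gpt-4o", "gpt-4o-mini"]

[providers.ollama]
enabled = true
base_url = "http://192.168.1.50:11434/v1"
models = ["llama3", "mistral"]
```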

📊 Management Dashboard

Access the built-in dashboard at http://localhost:3000 to see:

  • Usage Summary: Total requests, tokens, and USD spent.
  • Trend Charts: 24-hour request and cost distributions.
  • Live Logs: Real-time stream of incoming LLM requests via WebSockets.
  • Provider Health: Monitor which providers are online or degraded.

🔌 API Usage

The proxy is designed to be a drop-in replacement for OpenAI. Simply change your base URL:

Example Request (Python):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="your-proxy-client-id" # Hashed sk- keys are managed in the dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from the proxy!"}]
)
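
The same endpoint accepts multimodal content on vision-capable models. A sketch of the OpenAI-style message shape for a Base64 image (payload construction only; the model name and client setup are as in the example above):

```python
import base64

# Stand-in for real image bytes; in practice, open(path, "rb").read().
image_bytes = b"\x89PNG\r\n\x1a\n"  # placeholder, not a full image
image_b64 = base64.b64encode(image_bytes).decode("ascii")

messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url",
         "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
    ],
}]

# client.chat.completions.create(model="gpt-4o", messages=messages)
```

Remote images work the same way: put the http(s) URL directly in the image_url field. For streaming, pass stream=True to create() and iterate over the returned chunks; the proxy relays the SSE stream and tracks tokens on your behalf.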

⚖️ License

MIT OR Apache-2.0
