# LLM Proxy Gateway
A unified, high-performance LLM proxy gateway built in Rust. It provides a single OpenAI-compatible API to access multiple providers (OpenAI, Gemini, DeepSeek, Grok, Ollama) with built-in token tracking, real-time cost calculation, multi-user authentication, and a management dashboard.
## Features
- Unified API: OpenAI-compatible `/v1/chat/completions` and `/v1/models` endpoints.
- Multi-Provider Support:
  - OpenAI: GPT-4o, GPT-4o Mini, o1, o3 reasoning models.
  - Google Gemini: Gemini 2.0 Flash, Pro, and vision models.
  - DeepSeek: DeepSeek Chat and Reasoner models.
  - xAI Grok: Grok-beta models.
  - Ollama: Local LLMs running on your network.
- Observability & Tracking:
  - Real-time Costing: Fetches live pricing and context specs from models.dev on startup.
  - Token Counting: Precise estimation using `tiktoken-rs`.
  - Database Logging: Every request is logged to SQLite for historical analysis.
  - Streaming Support: Full SSE (Server-Sent Events) with `[DONE]` termination for client compatibility.
- Multimodal (Vision): Image processing (Base64 and remote URLs) across compatible providers.
- Multi-User Access Control:
  - Admin Role: Full access to all dashboard features, user management, and system configuration.
  - Viewer Role: Read-only access to usage analytics, costs, and monitoring.
  - Client API Keys: Create and manage multiple client tokens for external integrations.
- Reliability:
  - Circuit Breaking: Automatically stops routing to providers that are failing.
  - Rate Limiting: Per-client and global rate limits.
  - Cache-Aware Costing: Tracks cache hit/miss tokens for accurate billing.
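Cache-aware costing of the kind listed above can be sketched as follows. The function name, pricing fields, and rates are illustrative assumptions, not the proxy's actual implementation:

```python
# Illustrative sketch of cache-aware cost calculation (assumed field
# names and rates; not the proxy's actual code).

def request_cost(prompt_tokens: int, cached_tokens: int,
                 completion_tokens: int, pricing: dict) -> float:
    """Compute USD cost for one request, billing cached prompt
    tokens at a discounted rate (prices are per 1M tokens)."""
    uncached = prompt_tokens - cached_tokens
    return (
        uncached * pricing["input"]
        + cached_tokens * pricing["cached_input"]
        + completion_tokens * pricing["output"]
    ) / 1_000_000

# Hypothetical pricing entry (per 1M tokens)
pricing = {"input": 2.50, "cached_input": 1.25, "output": 10.00}
print(f"${request_cost(1000, 400, 200, pricing):.6f}")  # $0.004000
```

The key point is that cached prompt tokens are subtracted from the uncached total and billed at their own rate, which is why the proxy tracks cache hits and misses separately.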
## Security
LLM Proxy is designed with security in mind:
- HMAC Session Tokens: Management dashboard sessions are secured using HMAC-SHA256 signed tokens.
- Encrypted Provider Keys: Sensitive LLM provider API keys are stored encrypted (AES-256-GCM) in the database.
- Session Refresh: Activity-based session extension prevents session hijacking while maintaining user convenience.
- XSS Prevention: Standardized frontend escaping using `window.api.escapeHtml`.

> Note: You must define a `SESSION_SECRET` in your `.env` file for secure session signing.
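HMAC-SHA256 token signing of the kind described above can be sketched like this. The `payload.signature` token format and function names are illustrative assumptions, not the proxy's actual scheme:

```python
import hashlib
import hmac
import secrets

# SESSION_SECRET would come from .env; generated here for illustration.
SESSION_SECRET = secrets.token_hex(32)

def sign(payload: str) -> str:
    """Return 'payload.signature' signed with HMAC-SHA256."""
    sig = hmac.new(SESSION_SECRET.encode(), payload.encode(),
                   hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def verify(token: str) -> bool:
    """Recompute the signature and compare in constant time."""
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SESSION_SECRET.encode(), payload.encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

token = sign("user=admin")
print(verify(token))        # True  (valid token)
print(verify(token + "x"))  # False (tampered token)
```

Because the signature depends on the secret, a token cannot be forged or altered without knowing `SESSION_SECRET`, which is why it must be strong and random.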
## Tech Stack
- Runtime: Rust with Tokio.
- Web Framework: Axum.
- Database: SQLx with SQLite.
- Frontend: Vanilla JS/CSS with Chart.js for visualizations.
## Getting Started

### Prerequisites
- Rust (1.80+)
- SQLite3
- Docker (optional, for containerized deployment)
### Quick Start
1. Clone and build:

   ```sh
   git clone ssh://git.dustin.coffee:2222/hobokenchicken/llm-proxy.git
   cd llm-proxy
   cargo build --release
   ```

2. Configure environment:

   ```sh
   cp .env.example .env
   # Edit .env and add your API keys:
   # SESSION_SECRET=...      (generate a strong random secret)
   # OPENAI_API_KEY=sk-...
   # GEMINI_API_KEY=AIza...
   ```

3. Run the proxy:

   ```sh
   cargo run --release
   ```
The server starts on http://localhost:8080 by default.
## Deployment (Docker)
A multi-stage Dockerfile is provided for efficient deployment:
```sh
# Build the container
docker build -t llm-proxy .

# Run the container
docker run -p 8080:8080 \
  -e SESSION_SECRET=your-secure-secret \
  -v ./data:/app/data \
  llm-proxy
```
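For longer-running deployments, the same flags can be expressed as a Compose file. This is a sketch mirroring the `docker run` command above; the service name and restart policy are illustrative assumptions:

```yaml
# Hypothetical docker-compose.yml equivalent of the docker run flags above.
services:
  llm-proxy:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SESSION_SECRET=your-secure-secret
    volumes:
      - ./data:/app/data
    restart: unless-stopped
```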
## Management Dashboard

Access the dashboard at http://localhost:8080. The dashboard API is organized into modular sub-components:
- Auth (`/api/auth`): Login, session management, and password changes.
- Usage (`/api/usage`): Summary stats, time-series analytics, and provider breakdown.
- Clients (`/api/clients`): API key management and per-client usage tracking.
- Providers (`/api/providers`): Provider configuration, status monitoring, and connection testing.
- System (`/api/system`): Health metrics, live logs, database backups, and global settings.
- Monitoring: Live request stream via WebSocket.
### Default Credentials

- Username: `admin`
- Password: `admin123`

Change the admin password in the dashboard after first login!
## API Usage

The proxy is a drop-in replacement for the OpenAI API. Configure your client:

### Python
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="YOUR_CLIENT_API_KEY",  # Create in dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
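Since the proxy supports multimodal (vision) input, image requests use the standard OpenAI content-parts message format. The sketch below only builds and inspects such a payload locally (no server needed); the model name, URL, and helper name are placeholders:

```python
# Build an OpenAI-style multimodal message (text + remote image URL).
# This only constructs the payload; pass it as `messages=` to the
# client above to send it through the proxy.
def vision_message(text: str, image_url: str) -> dict:
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = vision_message("What is in this image?", "https://example.com/cat.png")
print(msg["content"][1]["type"])  # image_url
```

Base64 images follow the same shape, with a `data:image/...;base64,...` URL in place of the remote one.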
## License
MIT OR Apache-2.0