# GopherGate
A unified, high-performance LLM proxy gateway built in Go. It provides OpenAI-compatible `/v1/chat/completions`, `/v1/images/generations`, `/v1/responses`, and `/v1/models` endpoints to access multiple providers (OpenAI, Gemini, DeepSeek, Moonshot, Grok, Ollama) with built-in token tracking, real-time cost calculation, multi-user authentication, and a management dashboard.
## Features
- **Unified API:** OpenAI-compatible `/v1/chat/completions`, `/v1/images/generations`, `/v1/responses`, and `/v1/models` endpoints.
  - The `/v1/responses` endpoint (OpenAI Responses API) is currently supported for OpenAI models only. Non-OpenAI providers (Gemini, DeepSeek, Moonshot, Grok, Ollama) return a "not supported" response.
- **Multi-Provider Support:**
  - **OpenAI:** GPT-4o, GPT-4o Mini, o1, o3 reasoning models, DALL-E 2/3 image generation.
  - **Google Gemini:** Gemini 2.0 Flash, Pro, and vision models (with native CoT support), Imagen 3 image generation.
  - **DeepSeek:** DeepSeek Chat and Reasoner (R1) models.
  - **Moonshot:** Kimi K2.5 and other Kimi models.
  - **xAI Grok:** Grok-4 models.
  - **Ollama:** Local LLMs running on your network.
- **Observability & Tracking:**
  - **Asynchronous Logging:** Non-blocking request logging to SQLite using background workers.
  - **Token Counting:** Precise estimation and tracking of prompt, completion, and reasoning tokens.
  - **Database Persistence:** Every request logged to SQLite for historical analysis and dashboard analytics.
- **Streaming Support:** Full SSE (Server-Sent Events) support for all providers.
- **Multimodal (Vision):** Image processing (Base64 and remote URLs) across compatible providers.
- **Image Generation:** DALL-E 2/3 (OpenAI) and Imagen 3 (Gemini) via OpenAI-compatible `/v1/images/generations` endpoint.
- **Multi-User Access Control:**
  - **Admin Role:** Full access to all dashboard features, user management, and system configuration.
  - **Viewer Role:** Read-only access to usage analytics, costs, and monitoring.
  - **Client API Keys:** Create and manage multiple client tokens for external integrations.
- **Reliability:**
  - **Circuit Breaking:** Automatically stops sending requests to providers that are down (coming soon).
  - **Rate Limiting:** Per-client and global rate limits (coming soon).
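
The asynchronous logging listed under Observability & Tracking can be pictured with the following sketch. GopherGate itself is written in Go, so this is not its implementation; it is a minimal Python illustration of the general pattern (a queue drained by one background worker that writes to SQLite). The class name `AsyncRequestLogger`, the `requests` table, and its columns are hypothetical.

```python
import queue
import sqlite3
import threading


class AsyncRequestLogger:
    """Queue log records and write them to SQLite from one background thread."""

    def __init__(self, db_path=":memory:"):
        # check_same_thread=False lets the worker thread use this connection;
        # only the worker touches it between start-up and close().
        self._db = sqlite3.connect(db_path, check_same_thread=False)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS requests ("
            "model TEXT, prompt_tokens INTEGER, completion_tokens INTEGER)"
        )
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def log(self, model, prompt_tokens, completion_tokens):
        # Returns immediately; the request path never waits on disk I/O.
        self._queue.put((model, prompt_tokens, completion_tokens))

    def _run(self):
        while True:
            record = self._queue.get()
            if record is None:  # sentinel: stop the worker
                break
            self._db.execute("INSERT INTO requests VALUES (?, ?, ?)", record)
            self._db.commit()

    def close(self):
        # Everything queued before the sentinel is written before the worker exits.
        self._queue.put(None)
        self._worker.join()

    def count(self):
        # Call only after close(), when the worker no longer uses the connection.
        return self._db.execute("SELECT COUNT(*) FROM requests").fetchone()[0]
```

The request handler only pays for a queue insert; the database write happens later on the worker thread.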
## Security
GopherGate is designed with security in mind:
- **Signed Session Tokens:** Management dashboard sessions are secured using HMAC-SHA256 signed tokens.
- **Encrypted Storage:** Support for encrypted provider API keys in the database.
- **Auth Middleware:** Secure client authentication via database-backed API keys.
**Note:** You must define an `LLM_PROXY__ENCRYPTION_KEY` in your `.env` file for secure session signing and encryption.
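
To illustrate what an HMAC-SHA256 signed session token involves, here is a minimal Python sketch. The token layout (`base64(payload).base64(signature)`), the `exp` field, and the function names are assumptions made for illustration; GopherGate's actual token format is not documented here and may differ.

```python
import base64
import hashlib
import hmac
import json
import time


def sign_session(payload, key):
    """Return 'body.signature', where signature = HMAC-SHA256(key, body)."""
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    sig = hmac.new(key, body, hashlib.sha256).digest()
    return body.decode() + "." + base64.urlsafe_b64encode(sig).decode()


def verify_session(token, key, now=None):
    """Return the payload if the signature is valid and unexpired, else None."""
    try:
        body_b64, sig_b64 = token.split(".")
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(key, body_b64.encode(), hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(sig, expected):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body_b64))
    if payload.get("exp", 0) < (time.time() if now is None else now):
        return None
    return payload
```

The payload is parsed only after the signature check passes, so a tampered token is rejected before any of its content is trusted.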
## Tech Stack
- **Runtime:** Go 1.22+
- **Web Framework:** Gin Gonic
- **Database:** sqlx with SQLite (CGO-free via `modernc.org/sqlite`)
- **Frontend:** Vanilla JS/CSS with Chart.js for visualizations
## Getting Started
### Prerequisites
- Go (1.22+)
- SQLite3 (optional, driver is built-in)
- Docker (optional, for containerized deployment)
### Quick Start
1. Clone and build:
```bash
git clone <repository-url>
cd gophergate
go build -o gophergate ./cmd/gophergate
```
2. Configure environment:
```bash
cp .env.example .env
# Edit .env and add your configuration:
# LLM_PROXY__ENCRYPTION_KEY=... (32-byte hex or base64 string)
# OPENAI_API_KEY=sk-...
# GEMINI_API_KEY=AIza...
# MOONSHOT_API_KEY=...
# For Ollama (optional): Set base URL and enable
# LLM_PROXY__PROVIDERS__OLLAMA__BASE_URL=http://localhost:11434/v1
# LLM_PROXY__PROVIDERS__OLLAMA__ENABLED=true
# LLM_PROXY__PROVIDERS__OLLAMA__MODELS=llama3,gemma2,mistral
```
3. Run the proxy:
```bash
./gophergate
```
The server starts on `http://0.0.0.0:8080` by default.
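
Step 2 asks for a 32-byte key as a hex or base64 string. One way to produce such a value is sketched below with Python's standard library. The helper name `generate_key` is hypothetical, and which of the two encodings to prefer is not specified in this README.

```python
import base64
import secrets


def generate_key():
    """Return a random 32-byte key as (hex_string, base64_string)."""
    raw = secrets.token_bytes(32)  # cryptographically secure random bytes
    return raw.hex(), base64.b64encode(raw).decode()


key_hex, key_b64 = generate_key()
print(f"LLM_PROXY__ENCRYPTION_KEY={key_hex}")
```

Paste the printed line into `.env`, and keep the key out of version control.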
### Deployment (Docker)
```bash
# Build the container
docker build -t gophergate .
# Run the container
docker run -p 8080:8080 \
  -e LLM_PROXY__ENCRYPTION_KEY=your-secure-key \
  -v ./data:/app/data \
  gophergate
```
## Management Dashboard
Access the dashboard at `http://localhost:8080`.
- **Auth:** Login, session management, and status tracking.
- **Usage:** Summary stats, time-series analytics, and provider breakdown.
- **Clients:** API key management and per-client usage tracking.
- **Providers:** Provider configuration and status monitoring.
- **Users:** Admin-only user management for dashboard access.
- **Monitoring:** Live request stream via WebSocket.
### Default Credentials
- **Username:** `admin`
- **Password:** `admin123` (You will be prompted to change this on first login)

**Forgot Password?**
Reset the admin password to its default by running:
```bash
./gophergate -reset-admin
```
## API Usage
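The proxy is a drop-in replacement for the OpenAI API: point any OpenAI-compatible client at the proxy's `/v1` base URL and authenticate with a client API key created in the dashboard.
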
Moonshot models are available through the same OpenAI-compatible endpoint. For
example, use `kimi-k2.5` as the model name after setting `MOONSHOT_API_KEY` in
your environment.
Ollama models (like `llama3`, `gemma2`, `mistral`) are also available through the same
endpoint after enabling Ollama in configuration and setting the base URL to your
Ollama server (default: `http://localhost:11434/v1`).
### Python
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="YOUR_CLIENT_API_KEY",  # Create in dashboard
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
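
Because the proxy advertises SSE streaming for all providers, a client that does not use the SDK needs to parse the event stream itself. The sketch below assumes the proxy relays OpenAI-style chat completion chunks (`data: {...}` lines carrying `choices[].delta.content`, terminated by `data: [DONE]`); that format is the OpenAI convention and has not been verified against GopherGate. The function name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text deltas from OpenAI-style SSE lines of a chat completion stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

With the `openai` SDK this parsing is handled for you: pass `stream=True` to `client.chat.completions.create(...)` and iterate over the returned chunks.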
### Responses API
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="YOUR_CLIENT_API_KEY",
)

# OpenAI Responses API (supported for OpenAI models only)
response = client.responses.create(
    model="gpt-4o",
    input="Explain quantum computing in one paragraph.",
    instructions="You are a helpful assistant.",
    temperature=0.7,
    max_output_tokens=500,
)
print(response.output_text)
```
**Note:** The `/v1/responses` endpoint is currently supported for OpenAI models only. Requests routed to Gemini, DeepSeek, Moonshot, Grok, or Ollama models return a "not supported" error.
### Image Generation (DALL-E / Imagen)
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="YOUR_CLIENT_API_KEY",
)

# DALL-E 3 (OpenAI)
resp = client.images.generate(
    model="dall-e-3",
    prompt="A cute gopher wearing a top hat",
    n=1,
    size="1024x1024",
)
print(resp.data[0].url)

# Imagen 3 (Gemini) uses the same endpoint
resp = client.images.generate(
    model="imagen-3.0-generate-001",
    prompt="A gopher coding in Go",
    n=1,
    size="1024x1024",
)
print(resp.data[0].url)  # A data URI, because Gemini returns base64 image data
```
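
Since the Imagen response carries a data URI rather than a downloadable link, client code typically needs to decode it before saving the image. The helper below is a sketch under the assumption that the URI has the standard `data:<mime>;base64,<payload>` form; the function name `image_bytes_from_data_uri` is hypothetical.

```python
import base64


def image_bytes_from_data_uri(uri):
    """Return (mime_type, raw_bytes) for a base64 data URI, or None for a plain URL."""
    if not uri.startswith("data:"):
        return None  # e.g. an https:// URL that must be downloaded instead
    header, _, payload = uri.partition(",")
    meta = header[len("data:"):].split(";")
    if "base64" not in meta[1:]:
        raise ValueError("only base64-encoded data URIs are handled")
    mime = meta[0] or "text/plain"  # the data URI default when no type is given
    return mime, base64.b64decode(payload)
```

A `None` result signals that the value is an ordinary URL (as in the DALL-E example) and should be fetched over HTTP instead.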
## License
MIT