# Backend Architecture (Go)
The LLM Proxy backend is implemented in Go, focusing on high performance, clear concurrency patterns, and maintainability.
## Core Technologies
- **Runtime:** Go 1.22+
- **Web Framework:** [Gin Gonic](https://github.com/gin-gonic/gin) - Fast and lightweight HTTP routing.
- **Database:** [sqlx](https://github.com/jmoiron/sqlx) - Lightweight wrapper for standard `database/sql`.
- **SQLite Driver:** [modernc.org/sqlite](https://modernc.org/sqlite) - CGO-free SQLite implementation for ease of cross-compilation.
- **Config:** [Viper](https://github.com/spf13/viper) - Robust configuration management supporting environment variables and files.
- **Metrics:** [gopsutil](https://github.com/shirou/gopsutil) - System-level resource monitoring.
## Project Structure
```text
├── cmd/
│   └── llm-proxy/        # Entry point (main.go)
├── internal/
│   ├── config/           # Configuration loading and validation
│   ├── db/               # Database schema, migrations, and models
│   ├── middleware/       # Auth and logging middleware
│   ├── models/           # Unified request/response structs
│   ├── providers/        # LLM provider implementations (OpenAI, Gemini, etc.)
│   ├── server/           # HTTP server, dashboard handlers, and WebSocket hub
│   └── utils/            # Common utilities (registry, pricing, etc.)
└── static/               # Frontend assets (served by the backend)
```
## Key Components
### 1. Provider Interface (`internal/providers/provider.go`)
Standardized interface for all LLM backends. Implementations handle mapping between the unified format and provider-specific APIs (OpenAI, Gemini, DeepSeek, Grok).
### 2. Model Registry & Pricing (`internal/utils/registry.go`)
Integrates with `models.dev/api.json` to provide up-to-date model metadata and pricing.
- **Fuzzy Matching:** Supports matching versioned model IDs (e.g., `gpt-4o-2024-08-06`) to base registry entries.
- **Automatic Refreshes:** The registry is fetched at startup and refreshed every 24 hours via a background goroutine.
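One way the fuzzy matching described above might work is a longest-prefix lookup: try the exact ID first, then fall back to the longest registry ID that the requested ID extends with a `-` suffix. This is a sketch under that assumption, not the actual algorithm in `registry.go`; `ModelInfo`, `Lookup`, and the price values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelInfo is a hypothetical subset of a registry entry.
type ModelInfo struct {
	ID              string
	InputPerMillion float64 // placeholder values below, not real pricing
}

// Lookup tries an exact match first, then falls back to the longest registry
// ID that is a "-"-delimited prefix of the requested ID.
func Lookup(registry map[string]ModelInfo, id string) (ModelInfo, bool) {
	if m, ok := registry[id]; ok {
		return m, true
	}
	best := ""
	for base := range registry {
		if strings.HasPrefix(id, base+"-") && len(base) > len(best) {
			best = base
		}
	}
	if best == "" {
		return ModelInfo{}, false
	}
	return registry[best], true
}

func main() {
	registry := map[string]ModelInfo{
		"gpt-4":  {ID: "gpt-4", InputPerMillion: 1.0},
		"gpt-4o": {ID: "gpt-4o", InputPerMillion: 2.0},
	}
	m, ok := Lookup(registry, "gpt-4o-2024-08-06")
	fmt.Println(m.ID, ok) // prints "gpt-4o true"
}
```

Requiring the `-` delimiter keeps `gpt-4o-2024-08-06` from matching the shorter `gpt-4` entry.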
### 3. Asynchronous Logging (`internal/server/logging.go`)
Uses a buffered channel and background worker to log every request to SQLite without blocking the client response. It also broadcasts logs to the WebSocket hub for real-time dashboard updates.
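The buffered-channel pattern can be sketched as follows. `LogEntry`, `AsyncLogger`, and the `write`/`broadcast` callbacks are hypothetical stand-ins for the SQLite insert and the WebSocket hub, and dropping entries when the buffer is full is an assumed policy, not necessarily what `logging.go` does.

```go
package main

import (
	"fmt"
	"sync"
)

// LogEntry is a hypothetical record; the real schema is not shown here.
type LogEntry struct {
	Model  string
	Status int
}

// AsyncLogger queues entries on a buffered channel and handles them from a
// single background goroutine, so writes are serialized.
type AsyncLogger struct {
	logChan chan LogEntry
	wg      sync.WaitGroup
}

// NewAsyncLogger starts the worker. write stands in for the database insert
// and broadcast stands in for the WebSocket hub.
func NewAsyncLogger(buffer int, write, broadcast func(LogEntry)) *AsyncLogger {
	l := &AsyncLogger{logChan: make(chan LogEntry, buffer)}
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		for e := range l.logChan {
			write(e)
			broadcast(e)
		}
	}()
	return l
}

// Log never blocks the request path: if the buffer is full, the entry is
// dropped and false is returned (an assumed policy).
func (l *AsyncLogger) Log(e LogEntry) bool {
	select {
	case l.logChan <- e:
		return true
	default:
		return false
	}
}

// Close stops accepting entries and waits for the worker to drain the queue.
func (l *AsyncLogger) Close() {
	close(l.logChan)
	l.wg.Wait()
}

func main() {
	var written []LogEntry
	logger := NewAsyncLogger(16, func(e LogEntry) { written = append(written, e) }, func(LogEntry) {})
	logger.Log(LogEntry{Model: "gpt-4o", Status: 200})
	logger.Close()
	fmt.Println(len(written)) // prints 1
}
```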
### 4. Session Management (`internal/server/sessions.go`)
Implements HMAC-SHA256 signed tokens for dashboard authentication. These session tokens protect the management interface, while standard Bearer tokens are used for LLM API access.
### 5. WebSocket Hub (`internal/server/websocket.go`)
A centralized hub for managing WebSocket connections, allowing real-time broadcast of system events, system metrics, and request logs to the dashboard.
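The hub pattern can be sketched without any WebSocket library by modelling each connection as an outbound channel. The `Client` and `Hub` types and their method names below are hypothetical; in a real server each client would also run a write loop that drains `Send` into its socket.

```go
package main

import "fmt"

// Client stands in for a WebSocket connection.
type Client struct {
	Send chan []byte
}

// Hub owns the client set. Only the Run goroutine touches the map, so no
// mutex is needed.
type Hub struct {
	register   chan *Client
	unregister chan *Client
	broadcast  chan []byte
	clients    map[*Client]bool
}

func NewHub() *Hub {
	return &Hub{
		register:   make(chan *Client),
		unregister: make(chan *Client),
		broadcast:  make(chan []byte),
		clients:    make(map[*Client]bool),
	}
}

func (h *Hub) Register(c *Client)   { h.register <- c }
func (h *Hub) Unregister(c *Client) { h.unregister <- c }
func (h *Hub) Broadcast(msg []byte) { h.broadcast <- msg }

// Run processes registrations and broadcasts until done is closed.
func (h *Hub) Run(done <-chan struct{}) {
	for {
		select {
		case c := <-h.register:
			h.clients[c] = true
		case c := <-h.unregister:
			if h.clients[c] {
				delete(h.clients, c)
				close(c.Send)
			}
		case msg := <-h.broadcast:
			for c := range h.clients {
				select {
				case c.Send <- msg:
				default:
					// Slow client: drop it rather than stall everyone else.
					delete(h.clients, c)
					close(c.Send)
				}
			}
		case <-done:
			return
		}
	}
}

func main() {
	hub := NewHub()
	done := make(chan struct{})
	go hub.Run(done)

	c := &Client{Send: make(chan []byte, 1)}
	hub.Register(c)
	hub.Broadcast([]byte(`{"type":"metrics"}`))
	fmt.Println(string(<-c.Send))
	close(done)
}
```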
## Concurrency Model
Go's goroutines and channels are used extensively:
- **Streaming:** Each streaming request uses a goroutine to read and parse the provider's response, feeding chunks into a channel for SSE delivery.
- **Logging:** A single background worker processes the `logChan` to perform serial database writes.
- **WebSocket:** The `Hub` runs in a dedicated goroutine, handling registration and broadcasting.
- **Maintenance:** Background tasks handle registry refreshes and status monitoring.
## Security
- **Encryption Key:** A mandatory 32-byte key is used for both session signing and encryption of sensitive data.
- **Auth Middleware:** Scoped to `/v1` routes to verify client API keys against the database.
- **Bcrypt:** Passwords for dashboard users are hashed using Bcrypt with a work factor of 12.
- **Database Hardening:** Automatic migrations ensure the schema is always current with the code.
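To illustrate why the key must be exactly 32 bytes, here is a sketch that encrypts a secret with AES-256-GCM from the standard library. The choice of AES-GCM is an assumption (this document does not name the cipher), and `Encrypt`, `Decrypt`, and `newGCM` are hypothetical helper names.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM rejects any key that is not 32 bytes, which selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("encryption key must be exactly 32 bytes")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Encrypt returns nonce || ciphertext, using a fresh random nonce per call.
func Encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// Decrypt splits off the nonce and authenticates before returning plaintext.
func Decrypt(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // 32-byte demo key
	sealed, err := Encrypt(key, []byte("provider-api-key"))
	if err != nil {
		panic(err)
	}
	plain, err := Decrypt(key, sealed)
	fmt.Println(string(plain), err)
}
```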