This commit replaces the Axum/Rust backend with a Gin/Go implementation. The original Rust code has been archived in the 'rust' branch.
Backend Architecture (Go)
The LLM Proxy backend is implemented in Go, focusing on high performance, clear concurrency patterns, and maintainability.
Core Technologies
- Runtime: Go 1.22+
- Web Framework: Gin Gonic - Fast and lightweight HTTP routing.
- Database: sqlx - Lightweight wrapper for the standard database/sql package.
- SQLite Driver: modernc.org/sqlite - CGO-free SQLite implementation for ease of cross-compilation.
- Config: Viper - Robust configuration management supporting environment variables and files.
Project Structure
├── cmd/
│ └── llm-proxy/ # Entry point (main.go)
├── internal/
│ ├── config/ # Configuration loading and validation
│ ├── db/ # Database schema, migrations, and models
│ ├── middleware/ # Auth and logging middleware
│ ├── models/ # Unified request/response structs
│ ├── providers/ # LLM provider implementations (OpenAI, Gemini, etc.)
│ ├── server/ # HTTP server, dashboard handlers, and WebSocket hub
│ └── utils/ # Common utilities (multimodal, etc.)
└── static/ # Frontend assets (served by the backend)
Key Components
1. Provider Interface (internal/providers/provider.go)
Standardized interface for all LLM backends:
type Provider interface {
	Name() string
	ChatCompletion(ctx context.Context, req *models.UnifiedRequest) (*models.ChatCompletionResponse, error)
	ChatCompletionStream(ctx context.Context, req *models.UnifiedRequest) (<-chan *models.ChatCompletionStreamResponse, error)
}
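To make the contract concrete, here is a minimal sketch of a provider that satisfies this interface. The struct fields, the echoProvider type, and the complete helper are hypothetical stand-ins: the real types live in internal/models and are not shown in this document, so the sketch declares simplified local versions to stay self-contained.

```go
package main

import (
	"context"
	"fmt"
)

// Hypothetical, simplified stand-ins for the types in internal/models;
// the real structs are not shown here and will have more fields.
type UnifiedRequest struct {
	Model  string
	Prompt string
}

type ChatCompletionResponse struct{ Content string }

type ChatCompletionStreamResponse struct{ Delta string }

// Provider mirrors the interface above, using the local stand-in types.
type Provider interface {
	Name() string
	ChatCompletion(ctx context.Context, req *UnifiedRequest) (*ChatCompletionResponse, error)
	ChatCompletionStream(ctx context.Context, req *UnifiedRequest) (<-chan *ChatCompletionStreamResponse, error)
}

// echoProvider is a hypothetical backend that returns the prompt unchanged.
type echoProvider struct{}

func (echoProvider) Name() string { return "echo" }

func (echoProvider) ChatCompletion(ctx context.Context, req *UnifiedRequest) (*ChatCompletionResponse, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // honor cancellation before doing any work
	}
	return &ChatCompletionResponse{Content: req.Prompt}, nil
}

func (echoProvider) ChatCompletionStream(ctx context.Context, req *UnifiedRequest) (<-chan *ChatCompletionStreamResponse, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	// A single buffered chunk, then close so that range loops terminate.
	out := make(chan *ChatCompletionStreamResponse, 1)
	out <- &ChatCompletionStreamResponse{Delta: req.Prompt}
	close(out)
	return out, nil
}

// complete runs a non-streaming request against any Provider.
func complete(p Provider, prompt string) (string, error) {
	resp, err := p.ChatCompletion(context.Background(), &UnifiedRequest{Model: "demo", Prompt: prompt})
	if err != nil {
		return "", err
	}
	return resp.Content, nil
}

func main() {
	var p Provider = echoProvider{}
	text, err := complete(p, "hello")
	fmt.Println(p.Name(), text, err)
}
```

Because callers depend only on the interface, a stub like this can also stand in for a real backend in handler tests.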
2. Asynchronous Logging (internal/server/logging.go)
Uses a buffered channel and background worker to log every request to SQLite without blocking the client response. It also broadcasts logs to the WebSocket hub for real-time dashboard updates.
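The pattern described above can be sketched roughly as follows. LogEntry, AsyncLogger, and the sink callback are hypothetical names: the sink stands in for the SQLite insert plus the WebSocket broadcast. Dropping entries when the buffer is full is an assumption of this sketch, not a documented behavior of the project.

```go
package main

import (
	"fmt"
	"sync"
)

// LogEntry is a hypothetical, simplified request log record.
type LogEntry struct {
	Model  string
	Status int
}

// AsyncLogger queues entries on a buffered channel and hands them to a sink
// from a single background goroutine.
type AsyncLogger struct {
	logChan chan LogEntry
	wg      sync.WaitGroup
}

// NewAsyncLogger starts the worker. sink stands in for the database write
// and the dashboard broadcast.
func NewAsyncLogger(buffer int, sink func(LogEntry)) *AsyncLogger {
	l := &AsyncLogger{logChan: make(chan LogEntry, buffer)}
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		for e := range l.logChan { // exits once logChan is closed and drained
			sink(e)
		}
	}()
	return l
}

// Log enqueues without blocking the request path. It reports false when the
// buffer is full and the entry was dropped (an assumption of this sketch).
func (l *AsyncLogger) Log(e LogEntry) bool {
	select {
	case l.logChan <- e:
		return true
	default:
		return false
	}
}

// Close stops accepting entries and waits for the worker to drain the queue.
func (l *AsyncLogger) Close() {
	close(l.logChan)
	l.wg.Wait()
}

func main() {
	logger := NewAsyncLogger(16, func(e LogEntry) {
		fmt.Printf("logged model=%s status=%d\n", e.Model, e.Status)
	})
	logger.Log(LogEntry{Model: "demo", Status: 200})
	logger.Close()
}
```

Keeping all writes on one worker also serializes access to SQLite, which avoids write contention between request handlers.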
3. Session Management (internal/server/sessions.go)
Implements HMAC-SHA256 signed tokens for dashboard authentication. Sessions are stored in-memory with configurable TTL.
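A minimal sketch of this scheme, using only the standard library, might look like the following. The SessionStore type, its method names, and the "<id>.<signature>" token layout are assumptions made for illustration; the actual token format in sessions.go is not described here.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strings"
	"sync"
	"time"
)

// SessionStore is a hypothetical in-memory store with a fixed TTL.
type SessionStore struct {
	mu       sync.Mutex
	key      []byte
	ttl      time.Duration
	sessions map[string]time.Time // session ID -> expiry time
}

func NewSessionStore(key []byte, ttl time.Duration) *SessionStore {
	return &SessionStore{key: key, ttl: ttl, sessions: make(map[string]time.Time)}
}

// sign returns the base64url-encoded HMAC-SHA256 of id.
func (s *SessionStore) sign(id string) string {
	mac := hmac.New(sha256.New, s.key)
	mac.Write([]byte(id))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// Issue records the session and returns a token of the form "<id>.<signature>".
func (s *SessionStore) Issue(id string) string {
	s.mu.Lock()
	s.sessions[id] = time.Now().Add(s.ttl)
	s.mu.Unlock()
	return id + "." + s.sign(id)
}

// Verify checks the signature in constant time, then the in-memory expiry.
func (s *SessionStore) Verify(token string) bool {
	i := strings.LastIndex(token, ".")
	if i < 0 {
		return false
	}
	id, sig := token[:i], token[i+1:]
	if !hmac.Equal([]byte(sig), []byte(s.sign(id))) {
		return false
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	expiry, found := s.sessions[id]
	if !found {
		return false
	}
	if time.Now().After(expiry) {
		delete(s.sessions, id) // evict expired sessions lazily
		return false
	}
	return true
}

func main() {
	store := NewSessionStore([]byte("0123456789abcdef0123456789abcdef"), 30*time.Minute)
	token := store.Issue("session-1")
	fmt.Println(store.Verify(token), store.Verify(token+"x"))
}
```

A real implementation would generate the session ID from crypto/rand rather than accept it from the caller. Because sessions live in memory, they are lost when the process restarts.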
4. WebSocket Hub (internal/server/websocket.go)
A centralized hub for managing WebSocket connections, allowing real-time broadcast of system events and request logs to the dashboard.
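The hub pattern can be sketched without any WebSocket library by modeling each connection as a channel. Everything below is a hypothetical illustration: the Client type, the field names, and the policy of disconnecting clients whose send buffer is full are assumptions, and a real hub would pair each Client with a goroutine that writes to the actual socket.

```go
package main

import "fmt"

// Client is a hypothetical stand-in for one WebSocket connection: messages
// placed on send would be written to the socket by a per-connection goroutine.
type Client struct {
	send chan []byte
}

// Hub owns the set of connected clients. Only Run touches the map, so no
// mutex is needed.
type Hub struct {
	register   chan *Client
	unregister chan *Client
	broadcast  chan []byte
	clients    map[*Client]bool
}

func NewHub() *Hub {
	return &Hub{
		register:   make(chan *Client),
		unregister: make(chan *Client),
		broadcast:  make(chan []byte),
		clients:    make(map[*Client]bool),
	}
}

// Run processes events until done is closed. It is meant to run in its own
// goroutine.
func (h *Hub) Run(done <-chan struct{}) {
	for {
		select {
		case c := <-h.register:
			h.clients[c] = true
		case c := <-h.unregister:
			if h.clients[c] {
				delete(h.clients, c)
				close(c.send)
			}
		case msg := <-h.broadcast:
			for c := range h.clients {
				select {
				case c.send <- msg:
				default:
					// Assumed policy: drop clients that cannot keep up.
					delete(h.clients, c)
					close(c.send)
				}
			}
		case <-done:
			return
		}
	}
}

func main() {
	hub := NewHub()
	done := make(chan struct{})
	go hub.Run(done)

	c := &Client{send: make(chan []byte, 4)}
	hub.register <- c
	hub.broadcast <- []byte(`{"event":"request_logged"}`)
	fmt.Println(string(<-c.send))
	close(done)
}
```

Funneling registration, removal, and broadcast through one goroutine means the client map is never accessed concurrently.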
Concurrency Model
Go's goroutines and channels are used extensively:
- Streaming: Each streaming request uses a goroutine to read and parse the provider's response, feeding chunks into a channel.
- Logging: A single background worker processes the logChan to perform database writes.
- WebSocket: The Hub runs in a dedicated goroutine, handling registration and broadcasting.
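As an illustration of the streaming item, the sketch below reads a provider response in a goroutine and feeds parsed chunks into a channel. It assumes an OpenAI-style server-sent-events body ("data: ..." lines terminated by "data: [DONE]"); the streamChunks and collect functions are hypothetical names, and real providers would decode each payload as JSON rather than pass it through as a string.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"strings"
)

// streamChunks reads SSE lines from body in a goroutine and sends each
// "data:" payload on the returned channel. The channel is closed when the
// body ends, the "[DONE]" sentinel arrives, or ctx is cancelled.
func streamChunks(ctx context.Context, body io.Reader) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		sc := bufio.NewScanner(body)
		for sc.Scan() {
			payload, ok := strings.CutPrefix(sc.Text(), "data:")
			if !ok {
				continue // skip blank separators and non-data fields
			}
			payload = strings.TrimSpace(payload)
			if payload == "[DONE]" {
				return
			}
			select {
			case out <- payload:
			case <-ctx.Done():
				return // the client went away; stop reading
			}
		}
	}()
	return out
}

// collect drains a stream into a slice; a handler would instead forward
// each chunk to the client as it arrives.
func collect(raw string) []string {
	var chunks []string
	for c := range streamChunks(context.Background(), strings.NewReader(raw)) {
		chunks = append(chunks, c)
	}
	return chunks
}

func main() {
	fmt.Println(collect("data: Hel\n\ndata: lo\n\ndata: [DONE]\n"))
}
```

Selecting on ctx.Done() alongside the send keeps the goroutine from leaking when the consumer stops reading early.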
Security
- Encryption Key: A mandatory 32-byte key is used for both session signing and encryption of sensitive data in the database.
- Auth Middleware: Verifies client API keys against the database before proxying requests to LLM providers.
- Bcrypt: Passwords for dashboard users are hashed using Bcrypt with a work factor of 12.
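The document does not name the cipher used with the 32-byte key. As one plausible sketch, a key of that length fits AES-256-GCM from the standard library, shown below. The encrypt and decrypt helpers and the nonce-prefixed ciphertext layout are assumptions for illustration, not the project's confirmed scheme.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD and enforces the 32-byte key length.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("encryption key must be exactly 32 bytes")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext, using a fresh random nonce per call.
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and authenticates the rest; tampered data
// yields an error instead of garbage plaintext.
func decrypt(key, data []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(data) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := data[:gcm.NonceSize()], data[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // demo key only
	sealed, err := encrypt(key, []byte("sk-provider-secret"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, sealed)
	fmt.Println(string(plain), err)
}
```

Validating the key length at startup turns a misconfigured key into an immediate error rather than a failure on the first encrypted write.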