docs: sync documentation with current implementation and archive stale plan
Some checks failed
CI / Check (push) Has been cancelled
CI / Clippy (push) Has been cancelled
CI / Formatting (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Release Build (push) Has been cancelled

2026-03-06 14:28:04 -05:00
parent 975ae124d1
commit 633b69a07b
8 changed files with 767 additions and 78 deletions

README.md

@@ -26,6 +26,17 @@ A unified, high-performance LLM proxy gateway built in Rust. It provides a singl
- **Rate Limiting:** Per-client and global rate limits.
- **Cache-Aware Costing:** Tracks cache hit/miss tokens for accurate billing.
## Security
LLM Proxy is designed with security in mind:
- **HMAC Session Tokens:** Management dashboard sessions are secured using HMAC-SHA256 signed tokens.
- **Encrypted Provider Keys:** Sensitive LLM provider API keys are stored encrypted (AES-256-GCM) in the database.
- **Session Refresh:** Activity-based session extension keeps tokens short-lived, narrowing the window for session hijacking while maintaining user convenience.
- **XSS Prevention:** Standardized frontend escaping using `window.api.escapeHtml`.
**Note:** You must define a `SESSION_SECRET` in your `.env` file for secure session signing.
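The session-token scheme above can be sketched with Python's standard library. This is an illustrative model only, not the proxy's actual Rust implementation, and the `payload.signature` token layout is an assumption:

```python
import hashlib
import hmac
import secrets

# Generate a strong random SESSION_SECRET (64 hex chars = 256 bits).
session_secret = secrets.token_hex(32)

def sign_token(payload: str, secret: str) -> str:
    """Append an HMAC-SHA256 signature: '<payload>.<hex signature>'."""
    sig = hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def verify_token(token: str, secret: str) -> bool:
    """Constant-time check that the signature matches the payload."""
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

token = sign_token("session-id=abc123", session_secret)
assert verify_token(token, session_secret)
assert not verify_token(token + "0", session_secret)  # tampered signature fails
```

`hmac.compare_digest` matters here: a naive `==` comparison can leak timing information an attacker could exploit to forge signatures byte by byte.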
## Tech Stack
- **Runtime:** Rust with Tokio.
@@ -39,6 +50,7 @@ A unified, high-performance LLM proxy gateway built in Rust. It provides a singl
- Rust (1.80+)
- SQLite3
- Docker (optional, for containerized deployment)
### Quick Start
@@ -53,10 +65,9 @@ A unified, high-performance LLM proxy gateway built in Rust. It provides a singl
```bash
cp .env.example .env
# Edit .env and add your API keys:
# SESSION_SECRET=... (Generate a strong random secret)
# OPENAI_API_KEY=sk-...
# GEMINI_API_KEY=AIza...
# DEEPSEEK_API_KEY=sk-...
# GROK_API_KEY=gk-... (optional)
```
3. Run the proxy:
@@ -66,50 +77,31 @@ A unified, high-performance LLM proxy gateway built in Rust. It provides a singl
The server starts on `http://localhost:8080` by default.
### Configuration

Edit `config.toml` to customize providers, models, and settings:

```toml
[server]
port = 8080
host = "0.0.0.0"

[database]
path = "./data/llm_proxy.db"

[providers.openai]
enabled = true
default_model = "gpt-4o"

[providers.gemini]
enabled = true
default_model = "gemini-2.0-flash"

[providers.deepseek]
enabled = true
default_model = "deepseek-reasoner"

[providers.grok]
enabled = false
default_model = "grok-beta"

[providers.ollama]
enabled = false
base_url = "http://localhost:11434/v1"
```

### Deployment (Docker)

A multi-stage `Dockerfile` is provided for efficient deployment:

```bash
# Build the container
docker build -t llm-proxy .

# Run the container
docker run -p 8080:8080 \
  -e SESSION_SECRET=your-secure-secret \
  -v ./data:/app/data \
  llm-proxy
```
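For long-running deployments, the build and run commands can be captured in a Compose file. This sketch assumes the image name, port, environment, and volume layout from the commands above:

```yaml
services:
  llm-proxy:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SESSION_SECRET=your-secure-secret
    volumes:
      - ./data:/app/data
    restart: unless-stopped
```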
## Management Dashboard

Access the dashboard at `http://localhost:8080`:

- **Overview:** Real-time request counters, system health, provider status.
- **Analytics:** Time-series charts, filterable by date, client, provider, and model.
- **Costs:** Budget tracking, cost breakdown by provider/client/model, projections.
- **Clients:** Create, revoke, and rotate API tokens; per-client usage stats.
- **Providers:** Enable/disable providers, test connections, configure API keys.
- **Monitoring:** Live request stream via WebSocket, response times, error rates.
- **Users:** Admin/user management with role-based access control.

The dashboard architecture has been refactored into modular sub-components for better maintainability:

- **Auth (`/api/auth`):** Login, session management, and password changes.
- **Usage (`/api/usage`):** Summary stats, time-series analytics, and provider breakdown.
- **Clients (`/api/clients`):** API key management and per-client usage tracking.
- **Providers (`/api/providers`):** Provider configuration, status monitoring, and connection testing.
- **System (`/api/system`):** Health metrics, live logs, database backups, and global settings.
- **Monitoring:** Live request stream via WebSocket.
### Default Credentials
@@ -137,46 +129,6 @@ response = client.chat.completions.create(
)
```
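The Python snippet above is truncated by the diff view; it points an OpenAI-compatible client at the proxy. A self-contained sketch of the request that call produces (the base URL and API key are placeholders):

```python
import json

# Placeholder: substitute your proxy host.
PROXY_BASE = "http://localhost:8080/v1"

def chat_request(api_key: str, model: str, prompt: str, stream: bool = False):
    """Build (url, headers, body) for the proxy's OpenAI-compatible chat endpoint."""
    url = f"{PROXY_BASE}/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    })
    return url, headers, body

url, headers, body = chat_request("YOUR_CLIENT_API_KEY", "gpt-4o", "Hello!")
```

Any HTTP client can then POST `body` with `headers` to `url`; this mirrors the cURL example below.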
### Open WebUI
```
API Base URL: http://your-server:8080/v1
API Key: YOUR_CLIENT_API_KEY
```
### cURL
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_CLIENT_API_KEY" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": false
}'
```
## Model Discovery
The proxy exposes `/v1/models` for OpenAI-compatible client model discovery:
```bash
curl http://localhost:8080/v1/models \
-H "Authorization: Bearer YOUR_CLIENT_API_KEY"
```
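Assuming the standard OpenAI list schema (`{"object": "list", "data": [{"id": ...}]}`) for the `/v1/models` response, the model IDs can be extracted like this; the sample payload is illustrative, not actual proxy output:

```python
import json

# Illustrative response in the OpenAI-compatible list schema.
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "gpt-4o", "object": "model", "owned_by": "openai"},
    {"id": "gemini-2.0-flash", "object": "model", "owned_by": "gemini"}
  ]
}
""")

def model_ids(models_response: dict) -> list[str]:
    """Return the model IDs from a /v1/models response body."""
    return [m["id"] for m in models_response.get("data", [])]

print(model_ids(sample))  # -> ['gpt-4o', 'gemini-2.0-flash']
```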
## Troubleshooting
### Streaming Issues
If clients time out or report `TransferEncodingError`, ensure:
1. Proxy buffering is disabled in nginx: `proxy_buffering off;`
2. Chunked transfer is enabled: `chunked_transfer_encoding on;`
3. Timeouts are sufficient: `proxy_read_timeout 7200s;`
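A minimal `nginx` location block combining the three settings above. The upstream address is a placeholder, and `proxy_http_version 1.1` is an added assumption commonly required for streamed responses:

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;
    proxy_http_version 1.1;
    proxy_buffering off;
    chunked_transfer_encoding on;
    proxy_read_timeout 7200s;
}
```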
### Provider Errors
- Check that API keys are set in `.env`
- Test provider in dashboard (Settings → Providers → Test)
- Review logs: `journalctl -u llm-proxy -f`
## License
MIT OR Apache-2.0