From 0ce5f4f4908c581eab60cd27a9c8b015150ffa2f Mon Sep 17 00:00:00 2001
From: hobokenchicken <dustin@dustin.coffee>
Date: Thu, 19 Mar 2026 13:26:31 -0400
Subject: [PATCH] docs: finalize documentation for Go migration

Update the README, architecture doc, and TODO to reflect feature parity progress, system metrics, and registry integration.
---
 BACKEND_ARCHITECTURE.md | 37 +++++++++++++++++++------------------
 README.md               |  8 +++++++-
 TODO.md                 | 16 +++++++++++-----
 3 files changed, 37 insertions(+), 24 deletions(-)
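
Note on the "Fuzzy Matching" bullet added to BACKEND_ARCHITECTURE.md: the doc says versioned model IDs (e.g., `gpt-4o-2024-08-06`) resolve to base registry entries but does not say how. One way this could work is sketched below. It is an illustration only, not the code in `internal/utils/registry.go`; `ModelInfo` and `lookupModel` are hypothetical names, and the suffix-trimming rule is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelInfo is a hypothetical registry entry; the real struct may differ.
type ModelInfo struct {
	ID string
}

// lookupModel tries an exact match first, then repeatedly drops the last
// "-segment" of the ID until a registry key matches or nothing is left.
// This trimming rule is an assumption made for illustration.
func lookupModel(registry map[string]ModelInfo, id string) (ModelInfo, bool) {
	for candidate := id; candidate != ""; {
		if info, ok := registry[candidate]; ok {
			return info, true
		}
		cut := strings.LastIndex(candidate, "-")
		if cut < 0 {
			break
		}
		candidate = candidate[:cut]
	}
	return ModelInfo{}, false
}

func main() {
	registry := map[string]ModelInfo{"gpt-4o": {ID: "gpt-4o"}}
	info, ok := lookupModel(registry, "gpt-4o-2024-08-06")
	fmt.Println(info.ID, ok)
}
```

A trimming rule like this can over-match: an unregistered variant such as `gpt-4o-mini` would fall back to `gpt-4o` and be priced as the base model, so a real implementation may want to restrict trimming to date-like suffixes.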

diff --git a/BACKEND_ARCHITECTURE.md b/BACKEND_ARCHITECTURE.md
index 0ee3e0b9..4be40dfd 100644
--- a/BACKEND_ARCHITECTURE.md
+++ b/BACKEND_ARCHITECTURE.md
@@ -9,6 +9,7 @@ The LLM Proxy backend is implemented in Go, focusing on high performance, clear
 - **Database:** [sqlx](https://github.com/jmoiron/sqlx) - Lightweight wrapper for standard `database/sql`.
 - **SQLite Driver:** [modernc.org/sqlite](https://modernc.org/sqlite) - CGO-free SQLite implementation for ease of cross-compilation.
 - **Config:** [Viper](https://github.com/spf13/viper) - Robust configuration management supporting environment variables and files.
+- **Metrics:** [gopsutil](https://github.com/shirou/gopsutil) - System-level resource monitoring.
 
 ## Project Structure
 
@@ -22,40 +23,40 @@ The LLM Proxy backend is implemented in Go, focusing on high performance, clear
 │   ├── models/             # Unified request/response structs
 │   ├── providers/          # LLM provider implementations (OpenAI, Gemini, etc.)
 │   ├── server/             # HTTP server, dashboard handlers, and WebSocket hub
-│   └── utils/              # Common utilities (multimodal, etc.)
+│   └── utils/              # Common utilities (registry, pricing, etc.)
 └── static/                 # Frontend assets (served by the backend)
 ```
 
 ## Key Components
 
 ### 1. Provider Interface (`internal/providers/provider.go`)
-Standardized interface for all LLM backends:
-```go
-type Provider interface {
-	Name() string
-	ChatCompletion(ctx context.Context, req *models.UnifiedRequest) (*models.ChatCompletionResponse, error)
-	ChatCompletionStream(ctx context.Context, req *models.UnifiedRequest) (<-chan *models.ChatCompletionStreamResponse, error)
-}
-```
+Standardized interface for all LLM backends. Implementations handle mapping between the unified format and provider-specific APIs (OpenAI, Gemini, DeepSeek, Grok).
 
-### 2. Asynchronous Logging (`internal/server/logging.go`)
+### 2. Model Registry & Pricing (`internal/utils/registry.go`)
+Integrates with `models.dev/api.json` to provide up-to-date model metadata and pricing.
+- **Fuzzy Matching:** Supports matching versioned model IDs (e.g., `gpt-4o-2024-08-06`) to base registry entries.
+- **Automatic Refreshes:** The registry is fetched at startup and refreshed every 24 hours via a background goroutine.
+
+### 3. Asynchronous Logging (`internal/server/logging.go`)
 Uses a buffered channel and background worker to log every request to SQLite without blocking the client response. It also broadcasts logs to the WebSocket hub for real-time dashboard updates.
 
-### 3. Session Management (`internal/server/sessions.go`)
-Implements HMAC-SHA256 signed tokens for dashboard authentication. Sessions are stored in-memory with configurable TTL.
+### 4. Session Management (`internal/server/sessions.go`)
+Implements HMAC-SHA256 signed tokens for dashboard authentication. These session tokens protect the management interface; LLM API access uses standard Bearer tokens instead.
 
-### 4. WebSocket Hub (`internal/server/websocket.go`)
-A centralized hub for managing WebSocket connections, allowing real-time broadcast of system events and request logs to the dashboard.
+### 5. WebSocket Hub (`internal/server/websocket.go`)
+A centralized hub for managing WebSocket connections, allowing real-time broadcast of system events, system metrics, and request logs to the dashboard.
 
 ## Concurrency Model
 
 Go's goroutines and channels are used extensively:
-- **Streaming:** Each streaming request uses a goroutine to read and parse the provider's response, feeding chunks into a channel.
-- **Logging:** A single background worker processes the `logChan` to perform database writes.
+- **Streaming:** Each streaming request uses a goroutine to read and parse the provider's response, feeding chunks into a channel for SSE delivery.
+- **Logging:** A single background worker processes the `logChan` to perform serial database writes.
 - **WebSocket:** The `Hub` runs in a dedicated goroutine, handling registration and broadcasting.
+- **Maintenance:** Background tasks handle registry refreshes and status monitoring.
 
 ## Security
 
-- **Encryption Key:** A mandatory 32-byte key is used for both session signing and encryption of sensitive data in the database.
-- **Auth Middleware:** Verifies client API keys against the database before proxying requests to LLM providers.
+- **Encryption Key:** A mandatory 32-byte key is used for both session signing and encryption of sensitive data.
+- **Auth Middleware:** Scoped to `/v1` routes; verifies client API keys against the database.
 - **Bcrypt:** Passwords for dashboard users are hashed using Bcrypt with a work factor of 12.
+- **Database Hardening:** Automatic migrations ensure the schema is always current with the code.
diff --git a/README.md b/README.md
index 2a3176f8..a4ad34a0 100644
--- a/README.md
+++ b/README.md
@@ -102,7 +102,13 @@ Access the dashboard at `http://localhost:8080`.
 ### Default Credentials
 
 - **Username:** `admin`
-- **Password:** `admin` (You will be prompted to change this or should change it manually in the dashboard)
+- **Password:** `admin123` (You will be prompted to change this on first login)
+
+**Forgot Password?**
+You can reset the admin password to the default by running:
+```bash
+./llm-proxy -reset-admin
+```
 
 ## API Usage
 
diff --git a/TODO.md b/TODO.md
index 879a6b43..bb1df460 100644
--- a/TODO.md
+++ b/TODO.md
@@ -2,9 +2,9 @@
 
 ## Completed Tasks
 - [x] Initial Go project setup
-- [x] Database schema & migrations
+- [x] Database schema & migrations (hardcoded in `db.go`)
 - [x] Configuration loader (Viper)
-- [x] Auth Middleware
+- [x] Auth Middleware (scoped to `/v1`)
 - [x] Basic Provider implementations (OpenAI, Gemini, DeepSeek, Grok)
 - [x] Streaming Support (SSE & Gemini custom streaming)
 - [x] Archive Rust files to `rust` branch
@@ -12,16 +12,21 @@
 - [x] Enhanced `helpers.go` for Multimodal & Tool Calling (OpenAI compatible)
 - [x] Enhanced `server.go` for robust request conversion
 - [x] Dashboard Management APIs (Clients, Tokens, Users, Providers)
-- [x] Dashboard Analytics & Usage Summary
-- [x] WebSocket for real-time dashboard updates
+- [x] Dashboard Analytics & Usage Summary (improved SQL robustness)
+- [x] WebSocket for real-time dashboard updates (Hub with client counting)
 - [x] Asynchronous Request Logging to SQLite
 - [x] Update documentation (README, deployment, architecture)
+- [x] Cost Tracking accuracy (Registry integration with `models.dev`)
+- [x] Model Listing endpoint (`/v1/models`) with provider filtering
+- [x] System Metrics endpoint (`/api/system/metrics` using `gopsutil`)
+- [x] Fixed dashboard 404 and 500 errors
 
 ## Feature Parity Checklist (High Priority)
 
 ### OpenAI Provider
 - [x] Tool Calling
 - [x] Multimodal (Images) support
+- [x] Accurate usage parsing (cached & reasoning tokens)
 - [ ] Reasoning Content (CoT) support for `o1`, `o3` (need to ensure it's parsed in responses)
 - [ ] Support for `/v1/responses` API (required for some gpt-5/o1 models)
 
@@ -35,15 +40,16 @@
 - [x] Reasoning Content (CoT) support
 - [x] Parameter sanitization for `deepseek-reasoner`
 - [x] Tool Calling support
+- [x] Accurate usage parsing (cache hits & reasoning)
 
 ### Grok Provider
 - [x] Tool Calling support
 - [x] Multimodal support
+- [x] Accurate usage parsing (via OpenAI helper)
 
 ## Infrastructure & Middleware
 - [ ] Implement Rate Limiting (`golang.org/x/time/rate`)
 - [ ] Implement Circuit Breaker (`github.com/sony/gobreaker`)
-- [ ] Implement Model Cost Calculation logic (needs registry/pricing integration)
 
 ## Verification
 - [ ] Unit tests for feature-specific mapping (CoT, Tools, Images)