The app never read the LLM_PROXY__CONFIG_PATH env var, so the systemd
service couldn't find /etc/llm-proxy/config.toml and fell back to
./data/llm_proxy.db (owned by root, readonly for llmproxy user).
- Add LLM_PROXY__CONFIG_PATH support to config loader (checks env var
before falling back to ./config.toml)
- Add LLM_PROXY__DATABASE__PATH to service env so the DB path always
resolves to /var/lib/llm-proxy/llm_proxy.db regardless of config
- Add GIT_REPO config variable with actual Gitea remote URL
- Auto-detect nologin path for Arch/Debian compatibility
- Rewrite update() to build before stopping service (minimizes downtime)
- Abort update cleanly if build fails without interrupting service
- Fix all placeholder github.com/yourusername URLs
- Add status and logs convenience commands
Replace login-style 'form-group' class (absolute-positioned floating
labels) with dashboard-standard 'form-control' class (block labels,
proper input styling). Add missing 'modal-title' class to headings
and remove incorrect 'form-control' class from individual inputs.
The create, edit, and reset-password modals were using 'modal-overlay'
and 'modal' classes instead of the standard 'modal active' and
'modal-content' pattern, making them invisible due to CSS rules that
hide .modal elements without the .active class.
Add complete multi-user support with role-based access control:
Backend:
- Add users CRUD endpoints (GET/POST/PUT/DELETE /api/users) with admin-only guards
- Add display_name column to users table with ALTER TABLE migration
- Fix auth to use session-based user identity (not hardcoded 'admin')
- Add POST /api/auth/logout to revoke server-side sessions
- Add require_admin() and extract_session() helpers for clean RBAC
- Guard all mutating endpoints (clients, providers, models, settings, backup)
Frontend:
- Add Users management page with create/edit/reset-password/delete modals
- Add role gating: hide edit/delete buttons for viewers on clients, providers, models
- Settings page hides auth tokens and admin actions for viewers
- Logout now revokes server session before clearing localStorage
- Sidebar shows real display_name and formatted role (Administrator/Viewer)
- Fix sidebar header: single logo with onerror fallback, renamed to 'LLM Proxy'
- Add badge and btn-action CSS classes for role pills and action buttons
- Bump cache-bust to v=7
- All usage endpoints now accept ?period=today|24h|7d|30d|all|custom
with optional &from=ISO&to=ISO for custom ranges
- Time-series chart adapts granularity: hourly for today/24h, daily for
7d/30d/all
- Analytics and Costs pages have period selector buttons with custom
date-range picker
- Pricing table on Costs page now only shows models that have actually
been used (GET /models?used_only=true)
- Cache-bust version bumped to v=6
Add client_tokens table with auto-generated sk-{hex} tokens so clients
created in the dashboard get working API keys. Auth flow: DB token lookup
first, then env token fallback, then permissive mode. Includes token
management CRUD endpoints and copy-once reveal modal in the frontend.
Add PUT /api/clients/{id} with dynamic UPDATE for name, description,
is_active, and rate_limit_per_minute. Expose description and
rate_limit_per_minute in client list/detail responses. Replace the
frontend editClient stub with a modal that fetches, edits, and saves
client data.
Track cache_read_tokens and cache_write_tokens end-to-end: parse from
provider responses (OpenAI, DeepSeek, Grok, Gemini), persist to SQLite,
apply cache-aware pricing from the model registry, and surface in API
responses and the dashboard.
- Add cache fields to ProviderResponse, StreamUsage, RequestLog structs
- Parse cached_tokens (OpenAI/Grok), prompt_cache_hit/miss (DeepSeek),
cachedContentTokenCount (Gemini) from provider responses
- Send stream_options.include_usage for streaming; capture real usage
from final SSE chunk in AggregatingStream
- ALTER TABLE migration for cache_read_tokens/cache_write_tokens columns
- Cache-aware cost formula using registry cache_read/cache_write rates
- Update Provider trait calculate_cost signature across all providers
- Add cache_read_tokens/cache_write_tokens to Usage API response
- Dashboard: cache hit rate card, cache columns in pricing and usage
tables, cache token aggregation in SQL queries
- Remove API debug panel and verbose console logging from api.js
- Bump static asset cache-bust to v5
The model registry from models.dev uses 'google' and 'xai' as provider
IDs, but internal providers use 'gemini' and 'grok'. Added mapping so
all provider models appear in the listing.
Open WebUI and other OpenAI-compatible clients call GET /v1/models to
discover available models. Lists all models from enabled providers via
the model registry, respects disabled models, and handles Ollama models
from TOML config.
Added fixed height and position:relative to .chart-container so Chart.js
responsive mode has a bounded parent. Removed height attributes from all
canvas elements that conflict with Chart.js responsive sizing.
chart.tooltip is undefined during initial draw for radial chart types
(doughnut/pie), and chart.scales.y doesn't exist on non-cartesian charts.
This crashed chart creation, causing 'Failed to load' messages on all pages.
API endpoints return valid data via curl but browser JS shows 'Failed to load'.
Added verbose logging and a fixed-position debug panel to api.js that displays
the exact error type (JSON parse, API error, or network exception) on-page.
Bumped cache-bust to ?v=4.
- Add in-memory ModelConfigCache (30s refresh, explicit invalidation)
replacing 2 SQLite queries per request (model lookup + cost override)
- Configure all 5 provider HTTP clients with proper timeouts (300s),
connection pooling (4 idle/host, 90s idle timeout), and TCP keepalive
- Move client_usage update to tokio::spawn in non-streaming path
- Use fast chars/4 heuristic for token estimation on large inputs (>1KB)
- Generate single UUID/timestamp per SSE stream instead of per chunk
- Add shared LazyLock<Client> for image fetching in multimodal module
- Add proxy overhead timing instrumentation for both request paths
- Fix test helper to include new model_config_cache field
Backend: wrap SUM() queries with COALESCE in handle_time_series,
handle_clients_usage, and handle_detailed_usage to prevent NULL-induced
panics when no data exists for a time window.
Frontend: add showEmptyChart() empty-state messages and error feedback
across overview, analytics, costs, and clients pages. Rewrite analytics
loadCharts() to use Promise.allSettled() so each chart renders
independently on partial API failures.
messages_to_openai_json_text_only() was missing tool-calling support,
causing DeepSeek 400 errors when conversations included tool turns.
Now mirrors messages_to_openai_json() logic for tool-role messages
(tool_call_id, name) and assistant tool_calls, with images replaced
by "[Image]" text.
- Add /api/system/metrics endpoint reading real data from /proc (CPU, memory, disk, network, load avg, uptime, connections)
- Replace hardcoded fake monitoring metrics with live API data
- Replace random chart data with real latency/error-rate/client-request charts from DB logs
- Fix light-mode colors leaking into dark theme (monitoring stream bg, settings tokens, warning card)
- Add 'models' to page title map, fix System Health card structure
- Move inline styles to CSS classes (monitoring-layout, monitoring-stream, token-item, warning-card)
- Prevent duplicate style injection in monitoring page
INSERT into llm_requests was failing with FOREIGN KEY constraint (code 787)
because client_id (e.g. 'client_sk-hobok') didn't exist in the clients table.
Add INSERT OR IGNORE for the client row within the same transaction.
- overview.js: fix time-series chart crash (data is {series:[...]}, not array; field is 'time' not 'hour')
- monitoring.js: use fallback field names (total_tokens/tokens, duration_ms/duration) for WebSocket vs API compat
- monitoring.js: disable localhost demo data injection that mixed fake data with real
- websocket.js: fix duplicate condition and field name mismatches in dead-code handlers
- logging/mod.rs: add info! logs for successful DB insert and broadcast count for diagnostics
Implement full OpenAI-compatible tool-calling support across the proxy,
enabling OpenCode to use llm-proxy as its sole LLM backend.
- Add 9 tool-calling types (Tool, FunctionDef, ToolChoice, ToolCall, etc.)
- Update ChatCompletionRequest/ChatMessage/ChatStreamDelta with tool fields
- Update UnifiedRequest/UnifiedMessage to carry tool data through the pipeline
- Shared helpers: messages_to_openai_json handles tool messages, build_openai_body
includes tools/tool_choice, parse/stream extract tool_calls from responses
- Gemini: full OpenAI<->Gemini format translation (functionDeclarations,
functionCall/functionResponse, synthetic call IDs, tool_config mapping)
- Gemini: extract duplicated message-conversion into shared convert_messages()
- Server: SSE streams include tool_calls deltas, finish_reason='tool_calls'
- AggregatingStream: accumulate tool call deltas across stream chunks
- OpenAI provider: add o4- prefix to supports_model()
- Added 'users' table to database with bcrypt hashing.
- Refactored login to verify against the database.
- Implemented 'Security' section in settings to allow changing the admin password.
- Initialized system with default user 'admin' and password 'admin'.
- Added 'credit_balance' and 'low_credit_threshold' to 'provider_configs' table.
- Updated dashboard backend to support reading and updating provider credits.
- Implemented real-time credit deduction from provider balances on successful requests.
- Added visual balance indicators and configuration modal to the 'Providers' dashboard tab.
- Added 'provider_configs' and 'model_configs' tables to database.
- Refactored ProviderManager to support thread-safe dynamic updates and database overrides.
- Implemented 'Models' tab in dashboard to manage model visibility, mapping, and pricing.
- Added provider configuration modal to 'Providers' tab.
- Integrated database overrides into chat completion logic (enabled state, mapping, and cost).
- Set Grok to enabled: true by default.
- Updated AppState to include raw AppConfig.
- Refactored dashboard to show all supported providers, including their configuration and initialization status (online, disabled, or error).