- Removed binary 'gophergate'
- Removed 'data/llm_proxy.db' from source control (kept locally)
- Removed old database backups in 'data/backups/'
- Updated .gitignore to exclude data directory and gophergate binary
- Added AES-GCM encryption/decryption for provider API keys in the database.
- Implemented RefreshProviders to load provider configs from the database with precedence over environment variables.
- Updated dashboard handlers to encrypt keys on save and trigger in-memory provider refresh.
- Updated Grok test model to grok-3-mini for better compatibility.
- Fixed status indicator UI mapping in websocket.js and index.html.
- Added missing CSS for connection status indicator and pulse animation.
- Made initial model registry fetch asynchronous to prevent blocking server startup.
- Improved configuration loading to correctly handle LLM_PROXY__SERVER__PORT from environment.
Added JSON tags to the User struct to match frontend expectations and excluded sensitive fields.
Updated session management to include and persist DisplayName.
Unified user field names (using display_name) across backend, sessions, and frontend UI.
Updated handleGetProviders to include available models and last-used timestamps. Refined Model Pricing table to strictly filter by core providers and actual usage.
Added /api/system/metrics with CPU/Mem/Disk/Load data using gopsutil. Updated Hub to track active WebSocket listeners. Verified log format for monitoring charts.
Filtered registry iteration to only include openai, gemini, deepseek, and grok. Improved used_only logic to match specific (model, provider) pairs from logs.
Refined CalculateCost to correctly handle cached token discounts. Added fuzzy matching to model lookup. Robustified SQL date extraction using SUBSTR and LIKE for better SQLite compatibility.
Fixed '2025 years ago' issue by correctly handling zero-value timestamps. Improved SQL scanning logic to handle NULL values more safely across all analytics handlers.
Restricted AuthMiddleware to /v1 group to prevent dashboard session interference. Robustified analytics SQL queries with COALESCE to handle empty datasets.
Improved extraction of reasoning and cached tokens from OpenAI and DeepSeek responses (including streams). Ensured accurate cost calculation using registry metadata.
Updated handleGetModels to merge registry data with DB overrides and implemented handleUpdateModel. Verified API response format matches frontend requirements.
Removed .env and .env.backup from git tracking and consolidated configuration into .env.example. Updated .gitignore to robustly prevent accidental inclusion of sensitive files.
- Sequential next_tool_index is now used for both Responses API 'function_call' events and the proxy's 'tool_uses' JSON extraction.
- This ensures tool_calls arrays in the stream always start at index 0 and are dense, even if standard and embedded calls were somehow mixed.
- Fixed 'payload_idx' logic to correctly align argument chunks with their initialization chunks.
- The OpenAI Responses API uses 'output_index' to identify items in the response.
- If a response starts with text (output_index 0) followed by a tool call (output_index 1), the standard Chat Completions streaming format requires the first tool call to have index 0.
- Previously, the proxy was passing output_index (1) as the tool_call index, causing client-side SDKs to fail parsing the stream and silently drop the tool calls.
- Implemented a local mapping within the stream to ensure tool_call indexes are always dense and start at 0.
- Standard OpenAI clients expect tool_calls to be streamed as two parts:
1. Initialization chunk containing 'id', 'type', and 'name', with empty 'arguments'.
2. Payload chunk(s) containing 'arguments', with 'id', 'type', and 'name' omitted.
- Previously, the proxy was yielding all fields in a single chunk when parsing the custom 'tool_uses' JSON from gpt-5.4, causing strict clients like opencode to fail silently when delegating parallel tasks.
- The proxy now splits the extracted JSON into the correct two-chunk sequence, restoring subagent compatibility.
- Remove overzealous .trim() in strip_internal_metadata which destroyed whitespace between text stream chunks, causing client hangs
- Fix finish_reason logic to only yield once at the end of the stream
- Correctly yield finish_reason: 'tool_calls' instead of 'stop' when tool calls are generated
- The Responses API does not use 'response.item.delta' for tool calls.
- It uses 'response.output_item.added' to initialize the function call.
- It uses 'response.function_call_arguments.delta' for the payload stream.
- Updated the streaming parser to catch these events and correctly yield ToolCallDelta objects.
- This restores proper streaming of tool calls back to the client.
- The Responses API ends streams without a final '[DONE]' message.
- This causes reqwest_eventsource to return Error::StreamEnded.
- Previously, this was treated as a premature termination, triggering an error probe.
- We now explicitly match and break on Err(StreamEnded) for normal completion.
- The OpenAI Responses API actually requires the 'stream: true'
parameter in the JSON body, contrary to some documentation summaries.
- Omitting it caused the API to return a full application/json
response instead of SSE text/event-stream, leading to stream failures
and probe warnings in the proxy logs.
- Implement final buffer flush in streaming path to prevent data loss
- Increase probe response body logging to 500 characters
- Ensure internal metadata is stripped even on final flush
- Fix potential hang when stream ends without explicit [DONE] event
- Add strip_internal_metadata helper to remove prefixes like 'to=multi_tool_use.parallel'
- Clean up Thai text preambles reported in the journal
- Apply metadata stripping to both synchronous and streaming response paths
- Improve visual quality of proxied model responses