Removed .env and .env.backup from git tracking and consolidated configuration into .env.example. Updated .gitignore to robustly prevent accidental inclusion of sensitive files.
- Sequential next_tool_index is now used for both Responses API 'function_call' events and the proxy's 'tool_uses' JSON extraction.
- This ensures tool_calls arrays in the stream always start at index 0 and are dense, even when standard and embedded calls are mixed in one response.
- Fixed 'payload_idx' logic to correctly align argument chunks with their initialization chunks.
- The OpenAI Responses API uses 'output_index' to identify items in the response.
- If a response starts with text (output_index 0) followed by a tool call (output_index 1), the standard Chat Completions streaming format requires the first tool call to have index 0.
- Previously, the proxy was passing output_index (1) as the tool_call index, causing client-side SDKs to fail parsing the stream and silently drop the tool calls.
- Implemented a local mapping within the stream to ensure tool_call indexes are always dense and start at 0.
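
A minimal sketch of the per-stream mapping described above, assuming only tool-call items are passed to it (text items never consume an index). The type and method names are hypothetical and are not taken from the proxy source:

```rust
use std::collections::HashMap;

/// Maps sparse Responses API `output_index` values to dense, zero-based
/// Chat Completions `tool_calls` indexes. Hypothetical helper type.
#[derive(Default)]
struct ToolIndexMap {
    by_output_index: HashMap<u32, u32>,
    next_tool_index: u32,
}

impl ToolIndexMap {
    /// Returns the dense index for `output_index`, allocating the next
    /// sequential index the first time a given item is seen.
    fn dense_index(&mut self, output_index: u32) -> u32 {
        if let Some(&idx) = self.by_output_index.get(&output_index) {
            return idx;
        }
        let idx = self.next_tool_index;
        self.by_output_index.insert(output_index, idx);
        self.next_tool_index += 1;
        idx
    }
}
```

With this shape, a tool call arriving at output_index 1 after a text item at output_index 0 is emitted with tool_call index 0, and later argument chunks for the same item resolve to the same index.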
- Standard OpenAI clients expect tool_calls to be streamed as two parts:
1. Initialization chunk containing 'id', 'type', and 'name', with empty 'arguments'.
2. Payload chunk(s) containing 'arguments', with 'id', 'type', and 'name' omitted.
- Previously, the proxy was yielding all fields in a single chunk when parsing the custom 'tool_uses' JSON from gpt-5.4, causing strict clients like opencode to fail silently when delegating parallel tasks.
- The proxy now splits the extracted JSON into the correct two-chunk sequence, restoring subagent compatibility.
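
The two-chunk sequence above could be sketched roughly as follows. The struct is a hypothetical stand-in for the proxy's ToolCallDelta type, and split_tool_call is an illustrative name; `None` fields represent fields omitted from the serialized chunk:

```rust
/// One streamed tool-call fragment. `None` means the field is omitted.
/// Hypothetical stand-in for the proxy's ToolCallDelta type.
#[derive(Debug, Clone, PartialEq)]
struct ToolCallDelta {
    index: u32,
    id: Option<String>,
    call_type: Option<String>,
    name: Option<String>,
    arguments: String,
}

/// Splits one fully parsed tool call into an initialization chunk
/// (id, type, name, empty arguments) and a payload chunk (arguments only).
fn split_tool_call(index: u32, id: &str, name: &str, arguments: &str) -> [ToolCallDelta; 2] {
    let init = ToolCallDelta {
        index,
        id: Some(id.to_string()),
        call_type: Some("function".to_string()),
        name: Some(name.to_string()),
        arguments: String::new(),
    };
    let payload = ToolCallDelta {
        index,
        id: None,
        call_type: None,
        name: None,
        arguments: arguments.to_string(),
    };
    [init, payload]
}
```

Both chunks carry the same index so that a client can attach the payload to the call opened by the initialization chunk.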
- Remove overzealous .trim() in strip_internal_metadata, which destroyed whitespace between streamed text chunks and caused client hangs
- Fix finish_reason logic to yield only once, at the end of the stream
- Correctly yield finish_reason: 'tool_calls' instead of 'stop' when tool calls are generated
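
One way to picture the finish_reason rules above is a small per-stream tracker. This is a sketch under assumed names (FinishTracker, note_tool_call, finish), not the proxy's actual code:

```rust
/// Tracks whether a stream produced tool calls and whether the final
/// finish_reason has already been emitted. Hypothetical helper type.
#[derive(Default)]
struct FinishTracker {
    saw_tool_call: bool,
    finished: bool,
}

impl FinishTracker {
    /// Records that at least one tool call was streamed to the client.
    fn note_tool_call(&mut self) {
        self.saw_tool_call = true;
    }

    /// Returns the finish_reason on the first call and `None` afterwards,
    /// so the value is yielded at most once per stream.
    fn finish(&mut self) -> Option<&'static str> {
        if self.finished {
            return None;
        }
        self.finished = true;
        Some(if self.saw_tool_call { "tool_calls" } else { "stop" })
    }
}
```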
- The Responses API does not use 'response.item.delta' for tool calls.
- It uses 'response.output_item.added' to initialize the function call.
- It uses 'response.function_call_arguments.delta' for the payload stream.
- Updated the streaming parser to catch these events and correctly yield ToolCallDelta objects.
- This restores proper streaming of tool calls back to the client.
- The Responses API ends streams without a final '[DONE]' message.
- This causes reqwest_eventsource to return Error::StreamEnded.
- Previously, this was treated as a premature termination, triggering an error probe.
- We now explicitly match and break on Err(StreamEnded) for normal completion.
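
The control flow above can be illustrated without the real crate. In this sketch StreamError is a local stand-in that models only a StreamEnded variant and one generic failure, and collect_events is a hypothetical function working on a plain iterator instead of an async event source:

```rust
/// Local stand-in for the relevant part of the event source error type.
/// Only the variants needed for this sketch are modeled (assumption).
#[derive(Debug)]
enum StreamError {
    StreamEnded,
    Transport(String),
}

/// Drains events, treating `StreamEnded` as normal completion and any
/// other error as a real failure.
fn collect_events<I>(events: I) -> Result<Vec<String>, StreamError>
where
    I: IntoIterator<Item = Result<String, StreamError>>,
{
    let mut out = Vec::new();
    for event in events {
        match event {
            Ok(data) => out.push(data),
            // The upstream closed the stream without '[DONE]': not an error.
            Err(StreamError::StreamEnded) => break,
            Err(other) => return Err(other),
        }
    }
    Ok(out)
}
```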
- The OpenAI Responses API actually requires the 'stream: true'
parameter in the JSON body, contrary to some documentation summaries.
- Omitting it caused the API to return a full application/json
response instead of SSE text/event-stream, leading to stream failures
and probe warnings in the proxy logs.
- Implement final buffer flush in streaming path to prevent data loss
- Increase probe response body logging to 500 characters
- Ensure internal metadata is stripped even on final flush
- Fix potential hang when stream ends without explicit [DONE] event
- Add strip_internal_metadata helper to remove prefixes like 'to=multi_tool_use.parallel'
- Clean up Thai text preambles reported in the journal
- Apply metadata stripping to both synchronous and streaming response paths
- Improve visual quality of proxied model responses
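
A rough sketch of a prefix-stripping helper in the spirit of strip_internal_metadata. The prefix list is illustrative (only 'to=multi_tool_use.parallel' appears in this log), and the real helper may match more patterns or behave differently:

```rust
/// Prefixes treated as internal metadata. Only the first entry is named in
/// the log; the list as a whole is an assumption for illustration.
const INTERNAL_PREFIXES: &[&str] = &["to=multi_tool_use.parallel"];

/// Removes a known internal prefix from the start of `content`.
/// Whitespace is deliberately left untouched so that streamed chunks
/// keep the spacing between them.
fn strip_internal_metadata(content: &str) -> &str {
    for prefix in INTERNAL_PREFIXES {
        if let Some(rest) = content.strip_prefix(*prefix) {
            return rest;
        }
    }
    content
}
```

Because the function takes a plain string slice, the same call can serve both the synchronous and the streaming response path.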
- Implement cross-delta content buffering in streaming Responses API
- Wait for full 'tool_uses' JSON block before yielding to client
- Handle 'to=multi_tool_use.parallel' preamble by buffering
- Fix stream error probe to not request a new stream
- Remove raw JSON leakage from streaming content
- Add static parse_tool_uses_json helper to extract embedded tool calls
- Update synchronous and streaming Responses API parsers to detect tool_uses blocks
- Strip tool_uses JSON from content to prevent raw JSON leakage to client
- Resolve lifetime issues by avoiding &self capture in streaming closure
- Add 'tools' and 'tool_choice' parameters to streaming Responses API
- Include 'name' field in message items for Responses API input
- Use string content for text-only messages to improve instruction following
- Fix subagents not triggering and files not being created
This commit overhauls the dashboard styling to achieve a 'retro terminal' look matching the Gruvbox color palette. Changes include switching the global font to JetBrains Mono/Fira Code, removing rounded corners on all major elements (cards, buttons, inputs, modals, badges), and replacing modern soft shadows with sharp, hard-offset block shadows.
This commit adds a proper flexbox layout to the '.card-actions' CSS class. Previously, action buttons (like Export and Refresh on the Analytics page) were bunching up because they lacked a flex container with appropriate gap and wrapping rules. It also updates the '.card-header' to wrap gracefully on smaller screens.
This commit corrects the CSS for the login form's floating labels. Previously, the label floated too high and had a solid background that contrasted poorly with the input box. The label now floats exactly on the border line and uses a linear-gradient to seamlessly blend the card background into the input background, fixing the 'misframed' appearance.
This commit enhances the Model Registry UI by adding dropdown filters for Provider, Modality (Text/Image/Audio), and Capabilities (Tool Calling/Reasoning) alongside the existing text search. The filtering logic has been refactored to be non-destructive and apply instantly on the client side.
This commit modifies the /api/models endpoint so that when fetching 'used models' for the Cost Management view, it accurately pairs each model with the exact provider it was routed through (by querying SELECT DISTINCT provider, model FROM llm_requests). Previously, it relied on the global registry's mapping, which could falsely attribute usage to unconfigured or alternate providers.
This commit adds 'unsafe-inline' to the script-src CSP directive. The frontend dashboard heavily relies on inline event handlers (e.g., onclick=...) dynamically generated via template literals in its vanilla JavaScript architecture. Without this directive, modern browsers block these handlers, rendering interactive elements like the Config button completely inert.
This commit updates the Content Security Policy to allow scripts, styles, and fonts from cdn.jsdelivr.net, cdnjs.cloudflare.com, fonts.googleapis.com, and fonts.gstatic.com. This resolves the 'luxon is not defined' error and fixes the broken charts by allowing Chart.js, Luxon, FontAwesome, and Google Fonts to load properly in the dashboard.
This commit resolves the 'Failed to load statistics' issue where dashboard panels appeared empty. The dashboard makes 10+ concurrent API requests on load, which was instantly triggering the global rate limit's burst threshold (default 10). Internal dashboard endpoints are now exempt from this strict LLM-traffic rate limiting since they are already secured by admin authentication.
This commit updates the frontend API client to intercept authentication errors (like a stale session after a server restart) and immediately clear the local storage and show the login screen. It also adds an onsubmit handler to the login form in index.html to prevent the browser from defaulting to a GET request that puts credentials in the URL if JavaScript fails to initialize or encounters an error.
This commit adds the missing auth::require_admin check to all analytics, system info, and configuration list endpoints. It also improves error logging in the usage summary handler to aid in troubleshooting 'Failed to load statistics' errors.
Updated OpenAI Responses API to use a structured input format (array of objects) for better compatibility. Added a proactive error probe to chat_responses_stream to capture and log API error bodies on failure.
This commit adds support for the OpenAI Responses API in both streaming and non-streaming modes. It also implements proactive routing for gpt-5 and codex models and cleans up unused 'session' variable warnings across the dashboard source files.
This commit introduces:
- AES-256-GCM encryption for LLM provider API keys in the database.
- HMAC-SHA256 signed session tokens with activity-based refresh logic.
- Standardized frontend XSS protection using a global escapeHtml utility.
- Hardened security headers and request body size limits.
- Improved database integrity with foreign key enforcement and atomic transactions.
- Integration tests for the full encrypted key storage and proxy usage lifecycle.
This commit fixes the Gemini API 'Invalid value at thought_signature' error by ensuring synthetic 'call_' IDs are not passed into the TYPE_BYTES field. It also adds a pre-pass to correctly resolve function names from tool call IDs for tool responses.
- Ensure ALL assistant messages in history have reasoning_content and string content.
- Use a single space as a minimal placeholder for missing reasoning.
- Log full offending request bodies at ERROR level for detailed debugging.
- Fix DeepSeek R1 (reasoner) 400 errors by ensuring assistant messages with
tool_calls in history always have non-null 'content' and 'reasoning_content'.
- Implement deterministic tool call ID truncation (max 40 chars) for OpenAI
compatibility (fixes errors when history contains long Gemini signatures).
- Automatic transition from 'max_tokens' to 'max_completion_tokens' for newer
OpenAI models (o1, o3, gpt-5-nano).
- Added 'reasoning' and 'thought' aliases to reasoning_content for robust
deserialization from various clients.
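
For the tool call ID truncation listed above, the simplest deterministic approach is to keep the first 40 characters. The log does not say whether the proxy uses plain prefix truncation or a hash-based scheme, so the following is only a sketch with hypothetical names:

```rust
/// Maximum tool call ID length, per the limit stated in the log.
const MAX_TOOL_CALL_ID_LEN: usize = 40;

/// Returns `id` unchanged when it fits, otherwise its first 40 characters.
/// The same input always produces the same output. Slicing happens on a
/// character boundary, so multi-byte UTF-8 input cannot cause a panic.
fn truncate_tool_call_id(id: &str) -> &str {
    match id.char_indices().nth(MAX_TOOL_CALL_ID_LEN) {
        Some((byte_offset, _)) => &id[..byte_offset],
        None => id,
    }
}
```

Prefix truncation can map two long IDs with a shared prefix to the same value; a hash suffix would avoid that at the cost of readability.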
For newer OpenAI models (o1, o3, gpt-5), 'max_tokens' is deprecated in favor of
'max_completion_tokens'. The provider now automatically sends the requested limit
under the new name for these models to ensure compatibility and avoid 400 errors.
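
The parameter mapping could look roughly like this. Model detection by name prefix is an assumption made for the sketch, and TokenLimits, uses_max_completion_tokens, and map_token_limit are hypothetical names:

```rust
/// The two mutually exclusive token-limit fields of an outgoing request.
/// Hypothetical type for illustration.
#[derive(Debug, Default, PartialEq)]
struct TokenLimits {
    max_tokens: Option<u32>,
    max_completion_tokens: Option<u32>,
}

/// Assumption: affected models are recognized by name prefix.
fn uses_max_completion_tokens(model: &str) -> bool {
    ["o1", "o3", "gpt-5"].iter().any(|p| model.starts_with(*p))
}

/// Places the requested limit in whichever field the model accepts.
fn map_token_limit(model: &str, requested: Option<u32>) -> TokenLimits {
    if uses_max_completion_tokens(model) {
        TokenLimits { max_tokens: None, max_completion_tokens: requested }
    } else {
        TokenLimits { max_tokens: requested, max_completion_tokens: None }
    }
}
```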