The 40-character truncation of tool call IDs in helper.go caused collisions
when models (like deepseek-v4-flash) generated longer IDs, leading to
"Duplicate value for 'tool_call_id'" errors. Removed the limit to allow
full unique IDs.
DeepSeek: updated reasoning_content injection to use an empty string
instead of a space, better matching provider expectations for history.
Improved API error reporting across all providers by capturing raw body
content when response parsing fails or returns empty strings.
Previously, provider selection happened on the raw client-requested model
name (e.g. 'dispatcher') which defaulted to OpenAI. After routing resolved
it to 'deepseek-v4-flash', the provider was never re-selected.
Now prefix-stripping + routing runs first, then selectProvider() picks
the correct provider based on the resolved concrete model.
Extracted selectProvider() method from handleChatCompletions' inline
logic. The classifier callback now calls selectProvider(selectorModel)
instead of hardcoding openaiProvider.
This fixes the 'circuit breaker is open' error when dispatcher tries
to use deepseek-v4-flash as its selector model.
RouteToConcrete() recursively resolves group chains until a concrete
model is reached, with cycle detection and max depth (10) guard.
Example: all-purpose -> fast-flow -> deepseek-v4-flash
The dashboard log shows the full chain: 'deepseek-v4-flash (hierarchical:
fast-flow (default (first target)) -> deepseek-v4-flash (default (first target)))'
Schema: Added logic_level (INTEGER) and primary_use (TEXT) columns
to model_groups table with auto-migration for existing databases.
Seed: Three new default groups:
heavy-logic (level 9) — Complex Coding, Logic, Agents
standard-pro (level 5) — General Assistant, Long Docs
fast-flow (level 2) — Classification, JSON, Basic Q&A
Admin API: INSERT/UPDATE handlers now accept and persist the new fields.
Dashboard: Table shows Level and Primary Use columns; form includes
both fields with appropriate inputs and placeholders.
Add Groups() method to Router so handleListModels can append model
group IDs (e.g. 'deepseek-auto', 'openai-auto') to the model list,
marked with owned_by: 'gophergate'. This lets clients discover and
use groups via the standard OpenAI /v1/models endpoint.
When using model groups (e.g. 'deepseek-auto'), the dashboard logged the
group name instead of the concrete resolved model (e.g. 'deepseek-reasoner').
Now:
- logRequest passes the resolved modelID (concrete) + modelGroup (group name)
- RequestLog struct has a new ModelGroup field (omitempty)
- Dashboard displays resolved model (via group) when a group was used
Files changed:
internal/server/logging.go - add ModelGroup field
internal/server/server.go - pass resolved modelID, capture modelGroup
static/js/websocket.js - show group annotation in Recent Activity
static/js/pages/overview.js - show group annotation in overview table
static/js/pages/monitoring.js - show group annotation in stream
Logs what max_tokens the client sends, whether gophergate injects
one from the registry, and the final value forwarded to the provider.
Helps trace output truncation issues.
Go 1.26 changed NullTime.Scan to delegate to convertAssign,
which has no string->time.Time conversion path. The
modernc.org/sqlite driver returns raw strings for aggregate
expressions like MAX(last_used_at), causing silent scan failures
that made all clients/providers show 'Never' for last used.
Fixes by scanning into a string and parsing with time.Parse.
When a client omits max_tokens, providers (DeepSeek, etc.) apply
a low server-side default output cap. Now gophergate looks up the
model in the models.dev registry and injects the model's output
limit, preventing silent truncation.
- Use case-insensitive matching for model names and routing
- Default max_tokens/num_predict to 8192 for all Ollama models to prevent truncation
- Increase default context window and add more large-context model families
- Ensure DeepSeek routing handles Ollama-hosted variants correctly
- Increase Ollama timeout to 5m for larger models (e.g. gemma4)
- Set default max_tokens to 4096 for common Ollama models
- Expand stream scanner buffer to 10MB to prevent truncation
- Improve model routing and prefix stripping in server
- Add Ollama to allowed providers in model list endpoint
- Add model pattern detection for Ollama models (glm-, qwen, gemma, llama, mistral, codellama)
- This fixes 500 errors when using Ollama models via /v1/chat/completions
- Implement OllamaProvider with OpenAI-compatible API integration
- Add Ollama to provider initialization in server.go
- Update config.go to handle Ollama (no API key required)
- Configure .env with Ollama server at 172.20.1.222:11434
- Support models: glm-4.7-flash:latest, qwen3-coder:30b, gemma4:26b
- Added AES-GCM encryption/decryption for provider API keys in the database.
- Implemented RefreshProviders to load provider configs from the database with precedence over environment variables.
- Updated dashboard handlers to encrypt keys on save and trigger in-memory provider refresh.
- Updated Grok test model to grok-3-mini for better compatibility.
- Fixed status indicator UI mapping in websocket.js and index.html.
- Added missing CSS for connection status indicator and pulse animation.
- Made initial model registry fetch asynchronous to prevent blocking server startup.
- Improved configuration loading to correctly handle LLM_PROXY__SERVER__PORT from environment.
Added JSON tags to the User struct to match frontend expectations and excluded sensitive fields.
Updated session management to include and persist DisplayName.
Unified user field names (using display_name) across backend, sessions, and frontend UI.
Updated handleGetProviders to include available models and last-used timestamps. Refined Model Pricing table to strictly filter by core providers and actual usage.
Added /api/system/metrics with CPU/Mem/Disk/Load data using gopsutil. Updated Hub to track active WebSocket listeners. Verified log format for monitoring charts.
Filtered registry iteration to only include openai, gemini, deepseek, and grok. Improved used_only logic to match specific (model, provider) pairs from logs.
Refined CalculateCost to correctly handle cached token discounts. Added fuzzy matching to model lookup. Robustified SQL date extraction using SUBSTR and LIKE for better SQLite compatibility.
Fixed '2025 years ago' issue by correctly handling zero-value timestamps. Improved SQL scanning logic to handle NULL values more safely across all analytics handlers.