hobokenchicken
37949e560b
feat: add model groups dashboard page with CRUD UI
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-05-05 10:55:25 -04:00
hobokenchicken
f04cb6b8f2
feat: add model groups CRUD admin API endpoints
2026-05-05 10:50:33 -04:00
hobokenchicken
10262c0e5a
feat: wire model group router into chat completions handler
2026-05-05 10:47:32 -04:00
hobokenchicken
d345f8c41d
feat: add classifier routing strategy with LLM complexity rating
2026-05-05 10:40:26 -04:00
hobokenchicken
d1f7a57f58
feat: add router package with heuristic strategy
2026-05-05 10:37:36 -04:00
hobokenchicken
dc9af4d79c
feat: add model_groups table and default seed data
2026-05-05 10:33:35 -04:00
hobokenchicken
c009d401fb
docs: add Responses API endpoint to README
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-05-05 09:36:51 -04:00
hobokenchicken
e5ef39f327
feat: add OpenAI Responses API support (POST /v1/responses)
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
Add full Responses API endpoint alongside existing Chat Completions,
with identical logging/tracking/cost pipeline.
New:
- internal/models/responses.go — request/response/stream types + ToUsage() bridge
- internal/providers/openai_responses.go — OpenAI Responses/ResponsesStream
Modified:
- provider.go — Responses()+ResponsesStream() added to Provider interface
- helpers.go — BuildOpenAIResponsesBody, parsers, SSE stream reader
- circuit_breaker.go — CB wraps Responses, passthrough for stream
- server.go — POST /v1/responses route + handleResponses handler
- all non-OpenAI providers — stub methods with clear error messages
Logging: ResponsesUsage.ToUsage() bridges to models.Usage, feeding the same
logRequest() -> DB insert -> dashboard WS -> client stats -> cost calc
pipeline. No schema or logger changes are needed.
2026-05-02 16:38:17 -04:00
hobokenchicken
eb67287b56
fix: raise provider HTTP timeouts from 30s to 10min
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
The 30-second resty client timeout was killing long streaming responses
mid-generation. Models with large output windows (e.g. deepseek-v4-pro
at 384K max_tokens) routinely exceed 30s. Raised all providers to
10 minutes (Ollama was already at 15min and is unchanged). Circuit breaker
recovery timeout raised from 30s to 5min.
2026-04-30 10:17:45 -04:00
hobokenchicken
4aa17b4fd2
debug: add max_tokens trace logging to chat completions handler
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
Logs what max_tokens the client sends, whether gophergate injects
one from the registry, and the final value forwarded to the provider.
Helps trace output truncation issues.
2026-04-30 10:04:50 -04:00
hobokenchicken
79571c6bdc
fix: replace sql.NullTime with string scan for MAX() aggregate queries
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
Go 1.26 changed NullTime.Scan to delegate to convertAssign,
which has no string->time.Time conversion path. The
modernc.org/sqlite driver returns raw strings for aggregate
expressions like MAX(last_used_at), causing silent scan failures
that made all clients/providers show 'Never' for last used.
Fixed by scanning into a string and parsing with time.Parse.
2026-04-30 09:32:11 -04:00
hobokenchicken
d46a333249
feat: inject max_tokens from models.dev registry when not specified in request
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
When a client omits max_tokens, providers (DeepSeek, etc.) apply
a low server-side default output cap. Now gophergate looks up the
model in the models.dev registry and injects the model's output
limit, preventing silent truncation.
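
A minimal sketch of the injection step described above, under stated assumptions: ChatRequest, injectMaxTokens, and the outputLimits map are hypothetical stand-ins for the real request type and the models.dev registry lookup, and "example-model" is a made-up model ID.

```go
package main

import "fmt"

// ChatRequest is a hypothetical, trimmed-down request type.
type ChatRequest struct {
	Model     string
	MaxTokens *uint32 // nil when the client omitted max_tokens
}

// injectMaxTokens fills in MaxTokens from outputLimits when the client did
// not send one. It reports whether a value was injected. A client-supplied
// value is never overwritten.
func injectMaxTokens(req *ChatRequest, outputLimits map[string]uint32) bool {
	if req.MaxTokens != nil {
		return false
	}
	limit, ok := outputLimits[req.Model]
	if !ok || limit == 0 {
		return false // unknown model: leave the provider default in place
	}
	req.MaxTokens = &limit
	return true
}

func main() {
	limits := map[string]uint32{"example-model": 8192}
	req := &ChatRequest{Model: "example-model"}
	if injectMaxTokens(req, limits) {
		fmt.Println("injected max_tokens:", *req.MaxTokens)
	}
}
```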
2026-04-28 15:36:06 -04:00
hobokenchicken
7446f3463d
fix: add per-image cost tracking for DALL-E and Imagen
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-27 10:42:29 -04:00
hobokenchicken
b1a72f5a10
fix: estimate image gen tokens from prompt length instead of hardcoding
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-27 10:28:39 -04:00
hobokenchicken
5ee539d95c
feat: add image generation for OpenAI DALL-E and Gemini Imagen
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
New `/v1/images/generations` endpoint proxies DALL-E 2/3 (OpenAI)
and Imagen 3 (Gemini). Same auth/logging as chat completions.
- Add ImageGenerationRequest/Response models
- Extend Provider interface with ImageGeneration()
- OpenAI: forward to /v1/images/generations
- Gemini: call /v1beta/models/{model}:predict, map OpenAI params
- Circuit breaker wraps image gen like chat completions
- Model routing: dall-e* -> openai, imagen*/gemini* -> gemini
- Unsupported providers (deepseek/moonshot/grok/ollama) return error
- Fix pre-existing CachedContentTokenCount bug in StreamGemini
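
The prefix-based routing rule in the list above (dall-e* -> openai, imagen*/gemini* -> gemini) can be sketched as follows. The function name imageProviderFor is hypothetical, and lowercasing the model name before matching is an assumption of this sketch, not something this commit states.

```go
package main

import (
	"fmt"
	"strings"
)

// imageProviderFor (hypothetical name) maps an image model name to a
// provider key using prefix matching. ok is false when no provider
// supports the model.
func imageProviderFor(model string) (provider string, ok bool) {
	m := strings.ToLower(model) // case-insensitive matching is an assumption
	switch {
	case strings.HasPrefix(m, "dall-e"):
		return "openai", true
	case strings.HasPrefix(m, "imagen"), strings.HasPrefix(m, "gemini"):
		return "gemini", true
	}
	return "", false
}

func main() {
	for _, model := range []string{"dall-e-3", "imagen-3", "deepseek-chat"} {
		provider, ok := imageProviderFor(model)
		fmt.Printf("%s -> %q (supported: %v)\n", model, provider, ok)
	}
}
```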
2026-04-27 10:06:07 -04:00
hobokenchicken
14e26a4323
feat: capture Gemini cached content tokens in cost tracking
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Add CachedContentTokenCount to UsageMetadata parsing for both
streaming (helpers.go) and non-streaming (gemini.go) requests
- CacheReadTokens now populated from Gemini cachedContentTokenCount
- Add uint32Ptr helper for nil-safe uint32 pointer creation
2026-04-26 21:14:53 -04:00
hobokenchicken
1c3b1c6fe9
fix: FindModel reverse fuzzy match for date-suffixed model IDs
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
Add step between exact ID match and forward fuzzy match that checks
if registry model ID starts with the requested name. Fixes models like
'gpt-5.4-mini' not matching 'gpt-5.4-mini-2026-04-01' in registry.
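
The three-step lookup order described here can be sketched as below. This is an illustration, not the real ModelRegistry.FindModel: the function works on a plain slice of IDs, and the forward fuzzy step is assumed to mean "the requested name starts with a registry ID", which the commit message does not spell out.

```go
package main

import (
	"fmt"
	"strings"
)

// findModel (hypothetical, simplified) resolves a requested model name
// against registry IDs in three passes.
func findModel(registryIDs []string, requested string) (string, bool) {
	if requested == "" {
		return "", false
	}
	// 1. Exact ID match.
	for _, id := range registryIDs {
		if id == requested {
			return id, true
		}
	}
	// 2. Reverse fuzzy match: the registry ID starts with the requested
	//    name, e.g. a date-suffixed ID. The first hit in slice order wins.
	for _, id := range registryIDs {
		if strings.HasPrefix(id, requested) {
			return id, true
		}
	}
	// 3. Forward fuzzy match (assumed semantics): the requested name
	//    starts with a registry ID.
	for _, id := range registryIDs {
		if id != "" && strings.HasPrefix(requested, id) {
			return id, true
		}
	}
	return "", false
}

func main() {
	registry := []string{"gpt-5.4-mini-2026-04-01"}
	id, ok := findModel(registry, "gpt-5.4-mini")
	fmt.Println(id, ok)
}
```

If several registry IDs share the requested prefix, this sketch returns whichever comes first; a real implementation may want a deterministic tie-break such as the newest date suffix.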
2026-04-26 21:09:56 -04:00
hobokenchicken
5e0c10db01
fix: goimports — strip unused imports from all server files
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-26 15:00:04 -04:00
hobokenchicken
e598150d90
fix: trim imports per file — no unused imports in split files
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-26 14:58:53 -04:00
hobokenchicken
2fa6f0df62
fix: split dashboard.go properly — extract analytics + models_config
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- analytics.go: UsagePeriodFilter, UsageSummary, TimeSeries, ProvidersUsage, ClientsUsage, AnalyticsBreakdown, DetailedUsage
- models_config.go: handleGetModels, handleUpdateModel
- Fix all import blocks with missing closing parens
- Remove leftover fmt.Printf warnings in server.go
2026-04-26 14:57:28 -04:00
hobokenchicken
db76858072
fix: import block syntax in split dashboard files
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Add missing closing ) in clients.go, providers_admin.go, users.go, system.go
- Add SetTimeout(30s) to OpenAI provider (was resty.New() with no timeout)
2026-04-26 14:55:29 -04:00
hobokenchicken
af2c5b95f7
feat: Phase 3 - architecture & maintainability
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Split 1474-line dashboard.go into 5 domain files (clients, providers, users, system)
- Unit tests for ModelRegistry.FindModel and CalculateCost
- go mod tidy + verify (deps clean)
- .gitignore excludes tool cache dirs (.pi-lens/, .opencode/)
2026-04-26 14:52:10 -04:00
hobokenchicken
1f574d8134
feat: Phase 2 - reliability & observability
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Circuit breaker: proper thresholds (3 failures, 30s timeout)
- HTTP timeouts: 30s on all providers (was no timeout)
- Structured logging: slog replaces fmt.Printf throughout
- Stream errors: propagated as SSE error events to client
- Registry fetch: retry with backoff (3 attempts)
- Registry reads in dashboard protected by RWMutex
2026-04-26 14:48:56 -04:00
hobokenchicken
8a8d8d1477
fix: Phase 1 - security & stability patches
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- AuthMiddleware now requires auth on /v1/* routes (returns 401)
- WebSocket origin check configurable via WSAllowedOrigin
- Removed debug fmt.Printf leaks (config, ollama, server)
- Registry access protected by sync.RWMutex (race condition fix)
- Session cleanup goroutine runs every 15 min
- RevokeSession returns error instead of silent no-op
2026-04-26 14:45:22 -04:00
hobokenchicken
da074f52b4
fix: remove global auth middleware, causing webui login issues
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-09 12:21:02 -04:00
hobokenchicken
9b0aa4dbe8
fix: remove unused fmt import in circuit breaker provider
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-09 12:19:14 -04:00
hobokenchicken
212ac14a1b
feat: implement circuit breaker, fix auth vulnerability
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-09 12:17:18 -04:00
hobokenchicken
2929f51556
fixed model visibility
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-09 12:13:53 +00:00
hobokenchicken
e12418cc4c
fix(gemini): ensure strict 1:1 pairing of model calls and function responses
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Gemini requires function results to immediately follow the model message that called them
- Implemented look-ahead grouping to pair assistant calls with their tool results
- Standardized system and orphaned tool message handling for Gemini compatibility
2026-04-07 18:57:13 +00:00
hobokenchicken
be4ec3482a
fix(gemini): group adjacent tool messages and ensure correct role sequence
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Group consecutive 'tool' messages into a single Gemini content message with multiple 'functionResponse' parts
- Ensure assistant tool calls are properly mapped and sent
- Maintain v1beta for preview and newer models
- Added debug logging for API errors
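
The grouping of consecutive 'tool' messages described in this commit might be sketched as follows. Message and GeminiContent are hypothetical, simplified types (parts are plain strings rather than functionResponse objects), and the sketch keeps the OpenAI role names rather than mapping them to Gemini roles.

```go
package main

import "fmt"

// Message is a simplified OpenAI-style chat message.
type Message struct {
	Role    string
	Content string
}

// GeminiContent is a simplified stand-in for one Gemini content entry;
// each part here would be a functionResponse part in the real payload.
type GeminiContent struct {
	Role  string
	Parts []string
}

// groupToolMessages merges each run of adjacent "tool" messages into a
// single content entry with one part per tool result. Other messages are
// passed through one-to-one.
func groupToolMessages(msgs []Message) []GeminiContent {
	var out []GeminiContent
	prevWasTool := false
	for _, m := range msgs {
		if m.Role == "tool" && prevWasTool {
			last := &out[len(out)-1]
			last.Parts = append(last.Parts, m.Content)
			continue
		}
		out = append(out, GeminiContent{Role: m.Role, Parts: []string{m.Content}})
		prevWasTool = m.Role == "tool"
	}
	return out
}

func main() {
	msgs := []Message{
		{Role: "assistant", Content: "calls two tools"},
		{Role: "tool", Content: "result A"},
		{Role: "tool", Content: "result B"},
		{Role: "user", Content: "thanks"},
	}
	for _, c := range groupToolMessages(msgs) {
		fmt.Println(c.Role, len(c.Parts))
	}
}
```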
2026-04-07 18:50:48 +00:00
hobokenchicken
e67aafdac1
fix(gemini): improve tool-calling support and handle function_call response
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Support tool definitions in Gemini requests
- Map tool role to 'function' in Gemini content
- Ensure tool results are wrapped in JSON objects for Gemini compatibility
- Parse FunctionCall from Gemini response and map to OpenAI-compatible ToolCalls
- Correctly map finish_reason for tool calls
2026-04-07 18:37:57 +00:00
hobokenchicken
21e5204abd
fix(ollama): improve tool-calling support and restore gemma/llama context limits
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Explicitly set tool_choice: auto when tools are present to aid gemma/llama models
- Sync stop sequences into the options map for broader compatibility
- Restore gemma/llama to the high-context (32k) optimization list
2026-04-07 14:24:23 +00:00
hobokenchicken
4095c68822
fix(ollama): improve model detection and ensure robust token/context limits
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Use case-insensitive matching for model names and routing
- Default max_tokens/num_predict to 8192 for all Ollama models to prevent truncation
- Increase default context window and add more large-context model families
- Ensure DeepSeek routing handles Ollama-hosted variants correctly
2026-04-07 14:05:21 +00:00
hobokenchicken
ef37dc5af0
fix(ollama): significantly increase context and prediction limits
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Increase timeout to 15m
- Set num_ctx to 32k for common models
- Set default num_predict to 8192 for common models
2026-04-07 13:48:02 +00:00
hobokenchicken
fdbb068a6c
fix(ollama): map max_tokens to num_predict and increase context window
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Map MaxTokens to num_predict in options map
- Set default num_ctx to 8192 for common models (gemma, llama, etc.)
- This ensures Ollama doesn't cut off responses early due to default limits
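
A rough sketch of the options mapping described in this commit. num_predict and num_ctx are Ollama's option names; everything else (the function name buildOllamaOptions, matching model families by substring, and treating only gemma and llama as "common") is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// commonFamilies is an assumed, partial list of model families that get the
// larger default context window.
var commonFamilies = []string{"gemma", "llama"}

// buildOllamaOptions (hypothetical name) builds the Ollama options map:
// max_tokens becomes num_predict, and common model families get a default
// num_ctx of 8192.
func buildOllamaOptions(model string, maxTokens *int) map[string]any {
	opts := map[string]any{}
	if maxTokens != nil {
		opts["num_predict"] = *maxTokens
	}
	lower := strings.ToLower(model)
	for _, family := range commonFamilies {
		if strings.Contains(lower, family) {
			opts["num_ctx"] = 8192
			break
		}
	}
	return opts
}

func main() {
	limit := 4096
	fmt.Println(buildOllamaOptions("gemma4:26b", &limit))
}
```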
2026-04-07 13:44:17 +00:00
hobokenchicken
dbbf48cb14
fix(ollama): increase timeout and add default max_tokens for large models
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Increase Ollama timeout to 5m for larger models (e.g. gemma4)
- Set default max_tokens to 4096 for common Ollama models
- Expand stream scanner buffer to 10MB to prevent truncation
- Improve model routing and prefix stripping in server
2026-04-07 13:42:10 +00:00
hobokenchicken
1e13b0376b
feat(ollama): improve configuration and dashboard integration
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-04-07 12:53:17 +00:00
hobokenchicken
1b5cd2815e
fix(ollama): improve error handling and add timeouts
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Add timeouts (30s) and retries to resty client
- Add debug logging for Ollama requests and responses
- Import time package for timeout configuration
- This should fix 500 errors and provide better error messages
2026-04-06 15:05:31 -04:00
hobokenchicken
ba4c4af2f8
docs: update documentation for Ollama provider
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Add Ollama configuration instructions to README.md
- Update API usage section with Ollama examples
- Add Ollama to provider list in BACKEND_ARCHITECTURE.md
- All documentation now reflects complete Ollama support
2026-04-06 15:01:55 -04:00
hobokenchicken
e56a284415
docs: update TODO.md with Ollama provider completion
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Mark Ollama as completed provider implementation
- Add Ollama-specific feature checklist
- Update provider list in completed tasks
2026-04-06 14:46:32 -04:00
hobokenchicken
cbc9eeb453
fix(server): add Ollama model detection and registry support
- Add Ollama to allowed providers in model list endpoint
- Add model pattern detection for Ollama models (glm-, qwen, gemma, llama, mistral, codellama)
- This fixes 500 errors when using Ollama models via /v1/chat/completions
2026-04-06 14:45:57 -04:00
hobokenchicken
2f6b7deb2c
feat(providers): add Ollama provider support
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Implement OllamaProvider with OpenAI-compatible API integration
- Add Ollama to provider initialization in server.go
- Update config.go to handle Ollama (no API key required)
- Configure .env with Ollama server at 172.20.1.222:11434
- Support models: glm-4.7-flash:latest, qwen3-coder:30b, gemma4:26b
2026-04-06 14:38:35 -04:00
hobokenchicken
9375448087
fix(moonshot): resolve 401 Unauthorized errors by trimming API keys and improving request compatibility
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-03-26 17:09:27 +00:00
hobokenchicken
5be2f6f7aa
fix: use Moonshot test model
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-03-26 10:12:44 -04:00
hobokenchicken
eebcadcba1
fix: surface moonshot on providers page
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-03-25 09:35:41 -04:00
hobokenchicken
6b2bd13903
chore: remove tracked binary gophergate
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-03-25 13:32:51 +00:00
hobokenchicken
5dfda0a10c
merge: resolve conflicts in server.go and integrate moonshot support
2026-03-25 13:32:40 +00:00
hobokenchicken
a8a02d9e1c
feat: add moonshot kimi k2.5 support
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-03-25 09:28:52 -04:00
hobokenchicken
bd1d17cc4d
feat: add moonshot kimi k2.5 support
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-03-25 09:27:46 -04:00
hobokenchicken
9207a7231c
chore: update all grok-2 references to grok-4
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-03-25 13:17:06 +00:00