GopherGate

Author	SHA1	Message	Date
newkirk	40f055cb57	fix: correct deepseek pricing, gemini streaming tokens, and group-name logging - Add promo discount system for deepseek-v4-pro (75% off until 2026-05-31) - Rewrite StreamGemini to handle both SSE and JSON array response formats, fixing 0-token logging for gemini-3-flash and gemini-3-flash-preview - Fall back to model group name for cost lookup when concrete model isn't in the registry (fixes $0 cost on deepseek-auto entries) - Move registry lock before FindModel call to fix data race	2026-05-17 19:49:37 -04:00
hobokenchicken	477a811999	fix: remove tool call ID truncation and improve DeepSeek reasoning handling. The 40-character truncation of tool call IDs in helper.go caused collisions when models (like deepseek-v4-flash) generated longer IDs, leading to "Duplicate value for 'tool_call_id'" errors. Removed the limit to allow full unique IDs. DeepSeek: updated reasoning_content injection to use an empty string instead of a space, better matching provider expectations for history. Improved API error reporting across all providers by capturing raw body content when response parsing fails or returns empty strings.	2026-05-11 03:13:33 +00:00
hobokenchicken	d2b9da89d9	fix FindModel: prioritize canonical providers to prevent reseller limit overrides. FindModel iterates providers in random map order, so when deepseek-v4-pro exists in both 'deepseek' (output=384000) and 'ollama-cloud' (output=1048576), it sometimes returned the wrong metadata. The proxy then injected max_tokens=1048576 into DeepSeek's API, which rejected it with 400 (valid range is [1, 393216]). Fix: define CanonicalProviders list (deepseek, openai, google, xai, etc.) and search them in priority order before falling back to all providers. Each of the four lookup strategies (exact key, metadata ID, reverse fuzzy, forward fuzzy) checks canonical providers first.	2026-05-07 14:47:17 -04:00
hobokenchicken	28b8271c1d	fix: inject English system prompt for DeepSeek provider. DeepSeek models default to Chinese for some prompts. The ensureEnglish() function prepends 'Always respond in English' as a system message when no system prompt is already set. Applied to both ChatCompletion and ChatCompletionStream paths.	2026-05-07 14:03:39 -04:00
hobokenchicken	eb585c0001	fix: switch dispatcher classifier to gpt-5.4-nano. gpt-5.4-nano correctly discriminates complexity (1 vs 10) while deepseek-v4-flash rated everything as 1/10.	2026-05-07 14:00:19 -04:00
hobokenchicken	4aea7a3b4c	fix: select provider AFTER routing resolves model groups. Previously, provider selection happened on the raw client-requested model name (e.g. 'dispatcher') which defaulted to OpenAI. After routing resolved it to 'deepseek-v4-flash', the provider was never re-selected. Now prefix-stripping + routing runs first, then selectProvider() picks the correct provider based on the resolved concrete model.	2026-05-07 13:54:42 -04:00
hobokenchicken	330eaa57d1	fix: update model names to match current models.dev registry. heavy-logic: kimi-k2.5 -> kimi-k2.6; standard-pro: gemini-3-flash -> gemini-3-flash-preview	2026-05-07 13:48:33 -04:00
hobokenchicken	0ae30036f0	fix: classifier selector model now routes to correct provider. Extracted selectProvider() method from handleChatCompletions' inline logic. The classifier callback now calls selectProvider(selectorModel) instead of hardcoding openaiProvider. This fixes the 'circuit breaker is open' error when dispatcher tries to use deepseek-v4-flash as its selector model.	2026-05-07 13:37:19 -04:00
hobokenchicken	3c0b59622e	feat: classifier bucket mapping + dispatcher seed group. Classifier: When complexity_threshold is set (e.g. 10), uses it as the rating scale and maps ratings proportionally to target buckets instead of 1:1. Formula: idx = rating * len(targets) / (threshold + 1). With threshold=10 and 3 targets: 1-3→target[0], 4-7→target[1], 8-10→target[2]. Seed: Added 'dispatcher' group (classifier, threshold=10, selector=deepseek-v4-flash) that auto-routes to fast-flow/standard-pro/heavy-logic by complexity score. Combined with hierarchical routing, this enables two-level dispatch: dispatcher scores 1-10 → routes to tier group → tier picks concrete model.	2026-05-07 13:18:35 -04:00
hobokenchicken	7517307c11	feat: add hierarchical routing - groups can target other groups. RouteToConcrete() recursively resolves group chains until a concrete model is reached, with cycle detection and max depth (10) guard. Example: all-purpose -> fast-flow -> deepseek-v4-flash. The dashboard log shows the full chain: 'deepseek-v4-flash (hierarchical: fast-flow (default (first target)) -> deepseek-v4-flash (default (first target)))'	2026-05-07 12:28:31 -04:00
hobokenchicken	a3a6f765e7	feat: add logic_level and primary_use metadata to model groups. Schema: Added logic_level (INTEGER) and primary_use (TEXT) columns to model_groups table with auto-migration for existing databases. Seed: Three new default groups: heavy-logic (level 9): Complex Coding, Logic, Agents; standard-pro (level 5): General Assistant, Long Docs; fast-flow (level 2): Classification, JSON, Basic Q&A. Admin API: INSERT/UPDATE handlers now accept and persist the new fields. Dashboard: Table shows Level and Primary Use columns; form includes both fields with appropriate inputs and placeholders.	2026-05-07 12:01:28 -04:00
hobokenchicken	79dd122b56	feat: expose model groups in /v1/models endpoint. Add Groups() method to Router so handleListModels can append model group IDs (e.g. 'deepseek-auto', 'openai-auto') to the model list, marked with owned_by: 'gophergate'. This lets clients discover and use groups via the standard OpenAI /v1/models endpoint.	2026-05-07 11:26:05 -04:00
hobokenchicken	3021e4b2b4	fix: log resolved model name instead of group name in Recent Activity. When using model groups (e.g. 'deepseek-auto'), the dashboard logged the group name instead of the concrete resolved model (e.g. 'deepseek-reasoner'). Now: - logRequest passes the resolved modelID (concrete) + modelGroup (group name) - RequestLog struct has a new ModelGroup field (omitempty) - Dashboard displays resolved model (via group) when a group was used. Files changed: internal/server/logging.go - add ModelGroup field; internal/server/server.go - pass resolved modelID, capture modelGroup; static/js/websocket.js - show group annotation in Recent Activity; static/js/pages/overview.js - show group annotation in overview table; static/js/pages/monitoring.js - show group annotation in stream	2026-05-07 11:16:36 -04:00
hobokenchicken	14de7e9ebf	fix: wrap model-groups API responses in SuccessResponse for api.js client	2026-05-05 11:41:23 -04:00
hobokenchicken	f04cb6b8f2	feat: add model groups CRUD admin API endpoints	2026-05-05 10:50:33 -04:00
hobokenchicken	10262c0e5a	feat: wire model group router into chat completions handler	2026-05-05 10:47:32 -04:00
hobokenchicken	d345f8c41d	feat: add classifier routing strategy with LLM complexity rating	2026-05-05 10:40:26 -04:00
hobokenchicken	d1f7a57f58	feat: add router package with heuristic strategy	2026-05-05 10:37:36 -04:00
hobokenchicken	dc9af4d79c	feat: add model_groups table and default seed data	2026-05-05 10:33:35 -04:00
hobokenchicken	e5ef39f327	feat: add OpenAI Responses API support (POST /v1/responses). Add full Responses API endpoint alongside existing Chat Completions, with identical logging/tracking/cost pipeline. New: - internal/models/responses.go: request/response/stream types + ToUsage() bridge - internal/providers/openai_responses.go: OpenAI Responses/ResponsesStream. Modified: - provider.go: Responses()+ResponsesStream() added to Provider interface - helpers.go: BuildOpenAIResponsesBody, parsers, SSE stream reader - circuit_breaker.go: CB wraps Responses, passthrough for stream - server.go: POST /v1/responses route + handleResponses handler - all non-OpenAI providers: stub methods with clear error messages. Logging: ResponsesUsage.ToUsage() bridges to models.Usage, feeding same logRequest() -> DB insert -> dashboard WS -> client stats -> cost calc pipeline. No schema or logger changes needed.	2026-05-02 16:38:17 -04:00
hobokenchicken	eb67287b56	fix: raise provider HTTP timeouts from 30s to 10min. 30-second resty client timeout was killing long streaming responses mid-generation. Models with large output windows (e.g. deepseek-v4-pro at 384K max_tokens) routinely exceed 30s. Raised all providers to 10 minutes (Ollama already at 15min, unchanged). Circuit breaker recovery timeout raised from 30s to 5min.	2026-04-30 10:17:45 -04:00
hobokenchicken	4aa17b4fd2	debug: add max_tokens trace logging to chat completions handler. Logs what max_tokens the client sends, whether gophergate injects one from the registry, and the final value forwarded to the provider. Helps trace output truncation issues.	2026-04-30 10:04:50 -04:00
hobokenchicken	79571c6bdc	fix: replace sql.NullTime with string scan for MAX() aggregate queries. Go 1.26 changed NullTime.Scan to delegate to convertAssign, which has no string->time.Time conversion path. The modernc.org/sqlite driver returns raw strings for aggregate expressions like MAX(last_used_at), causing silent scan failures that made all clients/providers show 'Never' for last used. Fixes by scanning into a string and parsing with time.Parse.	2026-04-30 09:32:11 -04:00
hobokenchicken	d46a333249	feat: inject max_tokens from models.dev registry when not specified in request. When a client omits max_tokens, providers (DeepSeek, etc.) apply a low server-side default output cap. Now gophergate looks up the model in the models.dev registry and injects the model's output limit, preventing silent truncation.	2026-04-28 15:36:06 -04:00
hobokenchicken	7446f3463d	fix: add per-image cost tracking for DALL-E and Imagen	2026-04-27 10:42:29 -04:00
hobokenchicken	b1a72f5a10	fix: estimate image gen tokens from prompt length instead of hardcoding	2026-04-27 10:28:39 -04:00
hobokenchicken	5ee539d95c	feat: add image generation for OpenAI DALL-E and Gemini Imagen. New `/v1/images/generations` endpoint proxies DALL-E 2/3 (OpenAI) and Imagen 3 (Gemini). Same auth/logging as chat completions. - Add ImageGenerationRequest/Response models - Extend Provider interface with ImageGeneration() - OpenAI: forward to /v1/images/generations - Gemini: call /v1beta/models/{model}:predict, map OpenAI params - Circuit breaker wraps image gen like chat completions - Model routing: dall-e* -> openai, imagen/gemini -> gemini - Unsupported providers (deepseek/moonshot/grok/ollama) return error - Fix pre-existing CachedContentTokenCount bug in StreamGemini	2026-04-27 10:06:07 -04:00
hobokenchicken	14e26a4323	feat: capture Gemini cached content tokens in cost tracking - Add CachedContentTokenCount to UsageMetadata parsing for both streaming (helpers.go) and non-streaming (gemini.go) requests - CacheReadTokens now populated from Gemini cachedContentTokenCount - Add uint32Ptr helper for nil-safe uint32 pointer creation	2026-04-26 21:14:53 -04:00
hobokenchicken	1c3b1c6fe9	fix: FindModel reverse fuzzy match for date-suffixed model IDs. Add step between exact ID match and forward fuzzy match that checks if registry model ID starts with the requested name. Fixes models like 'gpt-5.4-mini' not matching 'gpt-5.4-mini-2026-04-01' in registry.	2026-04-26 21:09:56 -04:00
hobokenchicken	5e0c10db01	fix: goimports - strip unused imports from all server files	2026-04-26 15:00:04 -04:00
hobokenchicken	e598150d90	fix: trim imports per file - no unused imports in split files	2026-04-26 14:58:53 -04:00
hobokenchicken	2fa6f0df62	fix: split dashboard.go properly - extract analytics + models_config - analytics.go: UsagePeriodFilter, UsageSummary, TimeSeries, ProvidersUsage, ClientsUsage, AnalyticsBreakdown, DetailedUsage - models_config.go: handleGetModels, handleUpdateModel - Fix all import blocks with missing closing parens - Remove leftover fmt.Printf warnings in server.go	2026-04-26 14:57:28 -04:00
hobokenchicken	db76858072	fix: import block syntax in split dashboard files - Add missing closing ) in clients.go, providers_admin.go, users.go, system.go - Add SetTimeout(30s) to OpenAI provider (was resty.New() with no timeout)	2026-04-26 14:55:29 -04:00
hobokenchicken	af2c5b95f7	feat: Phase 3 - architecture & maintainability - Split 1474-line dashboard.go into 5 domain files (clients, providers, users, system) - Unit tests for ModelRegistry.FindModel and CalculateCost - go mod tidy + verify (deps clean) - .gitignore excludes tool cache dirs (.pi-lens/, .opencode/)	2026-04-26 14:52:10 -04:00
hobokenchicken	1f574d8134	feat: Phase 2 - reliability & observability - Circuit breaker: proper thresholds (3 failures, 30s timeout) - HTTP timeouts: 30s on all providers (was no timeout) - Structured logging: slog replaces fmt.Printf throughout - Stream errors: propagated as SSE error events to client - Registry fetch: retry with backoff (3 attempts) - Registry reads in dashboard protected by RWMutex	2026-04-26 14:48:56 -04:00
hobokenchicken	8a8d8d1477	fix: Phase 1 - security & stability patches - AuthMiddleware now requires auth on /v1/* routes (returns 401) - WebSocket origin check configurable via WSAllowedOrigin - Removed debug fmt.Printf leaks (config, ollama, server) - Registry access protected by sync.RWMutex (race condition fix) - Session cleanup goroutine runs every 15 min - RevokeSession returns error instead of silent no-op	2026-04-26 14:45:22 -04:00
hobokenchicken	da074f52b4	fix: remove global auth middleware, causing webui login issues	2026-04-09 12:21:02 -04:00
hobokenchicken	9b0aa4dbe8	fix: remove unused fmt import in circuit breaker provider	2026-04-09 12:19:14 -04:00
hobokenchicken	212ac14a1b	feat: implement circuit breaker, fix auth vulnerability	2026-04-09 12:17:18 -04:00
hobokenchicken	2929f51556	fixed model visibility	2026-04-09 12:13:53 +00:00
hobokenchicken	e12418cc4c	fix(gemini): ensure strict 1:1 pairing of model calls and function responses - Gemini requires function results to immediately follow the model message that called them - Implemented look-ahead grouping to pair assistant calls with their tool results - Standardized system and orphaned tool message handling for Gemini compatibility	2026-04-07 18:57:13 +00:00
hobokenchicken	be4ec3482a	fix(gemini): group adjacent tool messages and ensure correct role sequence - Group consecutive 'tool' messages into a single Gemini content message with multiple 'functionResponse' parts - Ensure assistant tool calls are properly mapped and sent - Maintain v1beta for preview and newer models - Added debug logging for API errors	2026-04-07 18:50:48 +00:00
hobokenchicken	e67aafdac1	fix(gemini): improve tool-calling support and handle function_call response - Support tool definitions in Gemini requests - Map tool role to 'function' in Gemini content - Ensure tool results are wrapped in JSON objects for Gemini compatibility - Parse FunctionCall from Gemini response and map to OpenAI-compatible ToolCalls - Correctly map finish_reason for tool calls	2026-04-07 18:37:57 +00:00
hobokenchicken	21e5204abd	fix(ollama): improve tool-calling support and restore gemma/llama context limits - Explicitly set tool_choice: auto when tools are present to aid gemma/llama models - Sync stop sequences into the options map for broader compatibility - Restore gemma/llama to the high-context (32k) optimization list	2026-04-07 14:24:23 +00:00
hobokenchicken	4095c68822	fix(ollama): improve model detection and ensure robust token/context limits - Use case-insensitive matching for model names and routing - Default max_tokens/num_predict to 8192 for all Ollama models to prevent truncation - Increase default context window and add more large-context model families - Ensure DeepSeek routing handles Ollama-hosted variants correctly	2026-04-07 14:05:21 +00:00
hobokenchicken	ef37dc5af0	fix(ollama): significantly increase context and prediction limits - Increase timeout to 15m - Set num_ctx to 32k for common models - Set default num_predict to 8192 for common models	2026-04-07 13:48:02 +00:00
hobokenchicken	fdbb068a6c	fix(ollama): map max_tokens to num_predict and increase context window - Map MaxTokens to num_predict in options map - Set default num_ctx to 8192 for common models (gemma, llama, etc.) - This ensures Ollama doesn't cut off responses early due to default limits	2026-04-07 13:44:17 +00:00
hobokenchicken	dbbf48cb14	fix(ollama): increase timeout and add default max_tokens for large models - Increase Ollama timeout to 5m for larger models (e.g. gemma4) - Set default max_tokens to 4096 for common Ollama models - Expand stream scanner buffer to 10MB to prevent truncation - Improve model routing and prefix stripping in server	2026-04-07 13:42:10 +00:00
hobokenchicken	1e13b0376b	feat(ollama): improve configuration and dashboard integration	2026-04-07 12:53:17 +00:00
hobokenchicken	1b5cd2815e	fix(ollama): improve error handling and add timeouts - Add timeouts (30s) and retries to resty client - Add debug logging for Ollama requests and responses - Import time package for timeout configuration - This should fix 500 errors and provide better error messages	2026-04-06 15:05:31 -04:00
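
Note on commit 3c0b59622e: the bucket formula quoted in that message (idx = rating * len(targets) / (threshold + 1), integer division) can be checked with a small Go sketch. The function name bucketIndex and the clamping of out-of-range ratings are assumptions made for illustration; the repository's actual code may differ.

```go
package main

import "fmt"

// bucketIndex maps a complexity rating onto an index into a list of
// numTargets routing targets, using the integer formula from the commit
// message: idx = rating * numTargets / (threshold + 1).
// Hypothetical helper: the name and the clamping are assumptions.
func bucketIndex(rating, threshold, numTargets int) int {
	if numTargets <= 0 || threshold <= 0 {
		return 0 // nothing sensible to map onto; caller must check targets
	}
	// Clamp the rating into [1, threshold] so the index stays in range.
	if rating < 1 {
		rating = 1
	}
	if rating > threshold {
		rating = threshold
	}
	return rating * numTargets / (threshold + 1)
}

func main() {
	targets := []string{"fast-flow", "standard-pro", "heavy-logic"}
	for rating := 1; rating <= 10; rating++ {
		idx := bucketIndex(rating, 10, len(targets))
		fmt.Printf("rating %2d -> target[%d] %s\n", rating, idx, targets[idx])
	}
}
```

With threshold=10 and three targets this gives ratings 1-3 for target[0], 4-7 for target[1], and 8-10 for target[2], matching the ranges stated in the commit message. Dividing by threshold + 1 rather than threshold keeps the top rating inside the last bucket.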


91 Commits