GopherGate

Author	SHA1	Message	Date
hobokenchicken	21e5204abd	fix(ollama): improve tool-calling support and restore gemma/llama context limits - Explicitly set tool_choice: auto when tools are present to aid gemma/llama models - Sync stop sequences into the options map for broader compatibility - Restore gemma/llama to the high-context (32k) optimization list	2026-04-07 14:24:23 +00:00
hobokenchicken	4095c68822	fix(ollama): improve model detection and ensure robust token/context limits - Use case-insensitive matching for model names and routing - Default max_tokens/num_predict to 8192 for all Ollama models to prevent truncation - Increase default context window and add more large-context model families - Ensure DeepSeek routing handles Ollama-hosted variants correctly	2026-04-07 14:05:21 +00:00
hobokenchicken	ef37dc5af0	fix(ollama): significantly increase context and prediction limits - Increase timeout to 15m - Set num_ctx to 32k for common models - Set default num_predict to 8192 for common models	2026-04-07 13:48:02 +00:00
hobokenchicken	fdbb068a6c	fix(ollama): map max_tokens to num_predict and increase context window - Map MaxTokens to num_predict in options map - Set default num_ctx to 8192 for common models (gemma, llama, etc.) - This ensures Ollama doesn't cut off responses early due to default limits	2026-04-07 13:44:17 +00:00
hobokenchicken	dbbf48cb14	fix(ollama): increase timeout and add default max_tokens for large models - Increase Ollama timeout to 5m for larger models (e.g. gemma4) - Set default max_tokens to 4096 for common Ollama models - Expand stream scanner buffer to 10MB to prevent truncation - Improve model routing and prefix stripping in server	2026-04-07 13:42:10 +00:00
hobokenchicken	1b5cd2815e	fix(ollama): improve error handling and add timeouts - Add timeouts (30s) and retries to resty client - Add debug logging for Ollama requests and responses - Import time package for timeout configuration - This should fix 500 errors and provide better error messages	2026-04-06 15:05:31 -04:00
hobokenchicken	2f6b7deb2c	feat(providers): add Ollama provider support - Implement OllamaProvider with OpenAI-compatible API integration - Add Ollama to provider initialization in server.go - Update config.go to handle Ollama (no API key required) - Configure .env with Ollama server at 172.20.1.222:11434 - Support models: glm-4.7-flash:latest, qwen3-coder:30b, gemma4:26b	2026-04-06 14:38:35 -04:00

7 Commits
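
Several of the commits above adjust the same Ollama options map: max_tokens is mapped to num_predict with a default of 8192, gemma/llama models get a 32k num_ctx, model names are matched case-insensitively, and stop sequences are copied into the map. The Go sketch below shows roughly how those pieces could fit together. It is not the repository's code: the names buildOllamaOptions, isLargeContextModel, and largeContextFamilies are hypothetical, "32k" is assumed to mean 32768, and the family list contains only the two families the commit messages name.

```go
package main

import (
	"fmt"
	"strings"
)

// Defaults taken from the commit messages. Reading "32k" as 32768 is an assumption.
const (
	defaultNumPredict = 8192
	largeNumCtx       = 32768
)

// largeContextFamilies is a hypothetical list. The commits name gemma and llama
// and mention "more large-context model families" without listing them.
var largeContextFamilies = []string{"gemma", "llama"}

// isLargeContextModel reports whether the model name contains a known
// family name, ignoring case.
func isLargeContextModel(model string) bool {
	name := strings.ToLower(model)
	for _, family := range largeContextFamilies {
		if strings.Contains(name, family) {
			return true
		}
	}
	return false
}

// buildOllamaOptions builds the "options" object for an Ollama request.
// num_predict, num_ctx, and stop are Ollama option keys; the function itself
// is illustrative only.
func buildOllamaOptions(model string, maxTokens int, stop []string) map[string]any {
	opts := map[string]any{}

	// Map max_tokens to num_predict, falling back to the default when unset.
	if maxTokens <= 0 {
		maxTokens = defaultNumPredict
	}
	opts["num_predict"] = maxTokens

	// Only the large-context families get an explicit num_ctx here. The commits
	// also raise the default for other models, but the value is not stated.
	if isLargeContextModel(model) {
		opts["num_ctx"] = largeNumCtx
	}

	// Copy stop sequences into the options map when any are given.
	if len(stop) > 0 {
		opts["stop"] = stop
	}
	return opts
}

func main() {
	fmt.Println(buildOllamaOptions("Gemma4:26b", 0, []string{"</s>"}))
	fmt.Println(buildOllamaOptions("qwen3-coder:30b", 1024, nil))
}
```

Keeping the defaults in one helper means a later change to a limit, like the 4096 to 8192 bump in this history, touches a single place.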