10 Commits

Author SHA1 Message Date
newkirk b3354a1bbc Add Xiaomi MiMo provider (mimo-v2.5) support
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
2026-05-29 12:19:24 -04:00
hobokenchicken d2b9da89d9 fix FindModel: prioritize canonical providers to prevent reseller limit overrides
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
FindModel iterates providers in random map order, so when deepseek-v4-pro
exists in both 'deepseek' (output=384000) and 'ollama-cloud' (output=1048576),
it sometimes returned the wrong metadata. The proxy then injected
max_tokens=1048576 into DeepSeek's API, which rejected it with 400
(valid range is [1, 393216]).

Fix: define CanonicalProviders list (deepseek, openai, google, xai, etc.)
and search them in priority order before falling back to all providers.
Each of the four lookup strategies (exact key, metadata ID, reverse fuzzy,
forward fuzzy) checks canonical providers first.
2026-05-07 14:47:17 -04:00
hobokenchicken e5ef39f327 feat: add OpenAI Responses API support (POST /v1/responses)
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
Add full Responses API endpoint alongside existing Chat Completions,
with identical logging/tracking/cost pipeline.

New:
- internal/models/responses.go — request/response/stream types + ToUsage() bridge
- internal/providers/openai_responses.go — OpenAI Responses/ResponsesStream

Modified:
- provider.go — Responses()+ResponsesStream() added to Provider interface
- helpers.go — BuildOpenAIResponsesBody, parsers, SSE stream reader
- circuit_breaker.go — CB wraps Responses, passthrough for stream
- server.go — POST /v1/responses route + handleResponses handler
- all non-OpenAI providers — stub methods with clear error messages

Logging: ResponsesUsage.ToUsage() bridges to models.Usage, feeding same
logRequest() -> DB insert -> dashboard WS -> client stats -> cost calc
pipeline. No schema or logger changes needed.
2026-05-02 16:38:17 -04:00
hobokenchicken 5ee539d95c feat: add image generation for OpenAI DALL-E and Gemini Imagen
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
New `/v1/images/generations` endpoint proxies DALL-E 2/3 (OpenAI)
and Imagen 3 (Gemini). Same auth/logging as chat completions.

- Add ImageGenerationRequest/Response models
- Extend Provider interface with ImageGeneration()
- OpenAI: forward to /v1/images/generations
- Gemini: call /v1beta/models/{model}:predict, map OpenAI params
- Circuit breaker wraps image gen like chat completions
- Model routing: dall-e* -> openai, imagen*/gemini* -> gemini
- Unsupported providers (deepseek/moonshot/grok/ollama) return error
- Fix pre-existing CachedContentTokenCount bug in StreamGemini
2026-04-27 10:06:07 -04:00
hobokenchicken 1c3b1c6fe9 fix: FindModel reverse fuzzy match for date-suffixed model IDs
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
Add step between exact ID match and forward fuzzy match that checks
if registry model ID starts with the requested name. Fixes models like
'gpt-5.4-mini' not matching 'gpt-5.4-mini-2026-04-01' in registry.
2026-04-26 21:09:56 -04:00
hobokenchicken af2c5b95f7 feat: Phase 3 - architecture & maintainability
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Split 1474-line dashboard.go into 5 domain files (clients, providers, users, system)
- Unit tests for ModelRegistry.FindModel and CalculateCost
- go mod tidy + verify (deps clean)
- .gitignore excludes tool cache dirs (.pi-lens/, .opencode/)
2026-04-26 14:52:10 -04:00
hobokenchicken 1f574d8134 feat: Phase 2 - reliability & observability
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
- Circuit breaker: proper thresholds (3 failures, 30s timeout)
- HTTP timeouts: 30s on all providers (was no timeout)
- Structured logging: slog replaces fmt.Printf throughout
- Stream errors: propagated as SSE error events to client
- Registry fetch: retry with backoff (3 attempts)
- Registry reads in dashboard protected by RWMutex
2026-04-26 14:48:56 -04:00
hobokenchicken 3f76a544e0 fix: improve analytics accuracy and cost calculation
CI / Lint (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Build (push) Has been cancelled
Refined CalculateCost to correctly handle cached token discounts. Added fuzzy matching to model lookup. Robustified SQL date extraction using SUBSTR and LIKE for better SQLite compatibility.
2026-03-19 12:58:08 -04:00
hobokenchicken 90874a6721 chore: consolidate env files and update gitignore
CI / Check (push) Has been cancelled
CI / Clippy (push) Has been cancelled
CI / Formatting (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Release Build (push) Has been cancelled
Removed .env and .env.backup from git tracking and consolidated configuration into .env.example. Updated .gitignore to robustly prevent accidental inclusion of sensitive files.
2026-03-19 10:44:22 -04:00
hobokenchicken 6b10d4249c feat: migrate backend from rust to go
CI / Check (push) Has been cancelled
CI / Clippy (push) Has been cancelled
CI / Formatting (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Release Build (push) Has been cancelled
This commit replaces the Axum/Rust backend with a Gin/Go implementation. The original Rust code has been archived in the 'rust' branch.
2026-03-19 10:30:05 -04:00