b7df3108fa
README: Added hierarchical routing, classifier bucket mapping, two-level dispatch, model groups table, DeepSeek language note, deploy script, and updated model names to match current models.dev registry. TODO: Added 15 completed items covering model groups, routing, dispatch, and provider fixes from May 7 session. deployment.md: Added deploy.sh instructions.
3.5 KiB
3.5 KiB
Migration TODO List
Completed Tasks
- Initial Go project setup
- Database schema & migrations (hardcoded in
db.go) - Configuration loader (Viper)
- Auth Middleware (scoped to
/v1) - Basic Provider implementations (OpenAI, Gemini, DeepSeek, Grok, Ollama)
- Streaming Support (SSE & Gemini custom streaming)
- Archive Rust files to
rustbranch - Clean root and set Go version as
main - Enhanced
helpers.gofor Multimodal & Tool Calling (OpenAI compatible) - Enhanced
server.gofor robust request conversion - Dashboard Management APIs (Clients, Tokens, Users, Providers)
- Dashboard Analytics & Usage Summary (Fixed SQL robustness)
- WebSocket for real-time dashboard updates (Hub with client counting)
- Asynchronous Request Logging to SQLite
- Cost Tracking accuracy (Registry integration with
models.dev) - Model Listing endpoint (
/v1/models) with provider filtering - System Metrics endpoint (
/api/system/metricsusinggopsutil) - Fixed dashboard 404s and 500s
- Model groups with heuristic and classifier routing strategies
- Hierarchical routing — groups can target other groups with cycle detection
- Classifier bucket mapping via complexity_threshold (1-10 scale -> N targets)
- Two-level dispatch — classifier router delegates to tier groups
- Model groups exposed in /v1/models endpoint (owned_by: gophergate)
- logic_level and primary_use metadata on model groups
- Model group CRUD dashboard page
- dispatcher, heavy-logic, standard-pro, fast-flow seed groups
- Provider selection moved after routing resolution (fixes group routing)
- Classifier selector model routed to correct provider (selectProvider)
- DeepSeek English system prompt injection (ensureEnglish)
- Deploy script (deploy.sh)
- Recent Activity pane shows resolved model + group annotation
- Model names aligned with models.dev registry
Planned Resolutions (High Priority)
Security Fixes
- Critical: Fix
AuthMiddlewareto reject invalid tokens instead of falling back to insecure prefix derivation.
Feature Parity Checklist (High Priority)
OpenAI Provider
- Tool Calling
- Multimodal (Images) support
- Accurate usage parsing (cached & reasoning tokens)
Feature Parity: OpenAI Provider Enhancements
- Reasoning Content (CoT) Support (
o1/o3):- Infrastructure verified.
reasoning_contentis mapped in request/response structures.
- Infrastructure verified.
- Support for
/v1/responsesAPI:- Implemented new route in
internal/server/server.go.
- Implemented new route in
Gemini Provider
- Tool Calling (mapping to Gemini format)
- Multimodal (Images) support
- Reasoning/Thought support
- Handle Tool Response role in unified format
DeepSeek Provider
- Reasoning Content (CoT) support
- Parameter sanitization for
deepseek-reasoner - Tool Calling support
- Accurate usage parsing (cache hits & reasoning)
Grok Provider
- Tool Calling support
- Multimodal support
- Accurate usage parsing (via OpenAI helper)
Ollama Provider
- OpenAI-compatible API integration
- Streaming support
- Model pattern detection for routing
- Zero cost calculation (local/free models)
Infrastructure & Middleware
- Implement Rate Limiting (
golang.org/x/time/rate) - Implement Circuit Breaker (
github.com/sony/gobreaker)
Verification
- Unit tests for feature-specific mapping (CoT, Tools, Images)
- Integration tests with live LLM APIs