README: Added hierarchical routing, classifier bucket mapping, two-level dispatch, model groups table, DeepSeek language note, deploy script, and updated model names to match current models.dev registry. TODO: Added 15 completed items covering model groups, routing, dispatch, and provider fixes from May 7 session. deployment.md: Added deploy.sh instructions.
# Migration TODO List

## Completed Tasks
- [x] Initial Go project setup
- [x] Database schema & migrations (hardcoded in `db.go`)
- [x] Configuration loader (Viper)
- [x] Auth Middleware (scoped to `/v1`)
- [x] Basic Provider implementations (OpenAI, Gemini, DeepSeek, Grok, Ollama)
- [x] Streaming Support (SSE & Gemini custom streaming)
- [x] Archive Rust files to `rust` branch
- [x] Clean root and set Go version as `main`
- [x] Enhanced `helpers.go` for Multimodal & Tool Calling (OpenAI compatible)
- [x] Enhanced `server.go` for robust request conversion
- [x] Dashboard Management APIs (Clients, Tokens, Users, Providers)
- [x] Dashboard Analytics & Usage Summary (Fixed SQL robustness)
- [x] WebSocket for real-time dashboard updates (Hub with client counting)
- [x] Asynchronous Request Logging to SQLite
- [x] Cost Tracking accuracy (Registry integration with `models.dev`)
- [x] Model Listing endpoint (`/v1/models`) with provider filtering
- [x] System Metrics endpoint (`/api/system/metrics` using `gopsutil`)
- [x] Fixed dashboard 404s and 500s
- [x] Model groups with heuristic and classifier routing strategies
- [x] Hierarchical routing: groups can target other groups, with cycle detection
- [x] Classifier bucket mapping via `complexity_threshold` (1-10 scale -> N targets)
- [x] Two-level dispatch: classifier router delegates to tier groups
- [x] Model groups exposed in `/v1/models` endpoint (`owned_by: gophergate`)
- [x] `logic_level` and `primary_use` metadata on model groups
- [x] Model group CRUD dashboard page
- [x] `dispatcher`, `heavy-logic`, `standard-pro`, `fast-flow` seed groups
- [x] Provider selection moved after routing resolution (fixes group routing)
- [x] Classifier selector model routed to correct provider (`selectProvider`)
- [x] DeepSeek English system prompt injection (`ensureEnglish`)
- [x] Deploy script (`deploy.sh`)
- [x] Recent Activity pane shows resolved model + group annotation
- [x] Model names aligned with `models.dev` registry
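For reference, the "1-10 scale -> N targets" bucket mapping listed above could look roughly like the sketch below. This is an illustration only, not the code in this repository: the function name `bucketFor`, the equal-width bands, the clamping of out-of-range scores, and the ordering of targets from least to most capable are all assumptions, and the sketch does not model how `complexity_threshold` itself is configured.

```go
package main

import (
	"errors"
	"fmt"
)

// bucketFor is a hypothetical helper. It maps a classifier score on a
// 1-10 scale onto one of the given targets by splitting the scale into
// len(targets) near-equal bands. Targets are assumed to be ordered from
// least to most capable.
func bucketFor(score int, targets []string) (string, error) {
	if len(targets) == 0 {
		return "", errors.New("no targets configured")
	}
	// Assumption: clamp out-of-range scores instead of rejecting them.
	if score < 1 {
		score = 1
	}
	if score > 10 {
		score = 10
	}
	// (score-1) ranges over 0..9, so idx always stays below len(targets).
	idx := (score - 1) * len(targets) / 10
	return targets[idx], nil
}

func main() {
	// Seed group names taken from the list above; their order is assumed.
	tiers := []string{"fast-flow", "standard-pro", "heavy-logic"}
	for _, s := range []int{1, 5, 10} {
		t, _ := bucketFor(s, tiers)
		fmt.Printf("score %d -> %s\n", s, t)
	}
}
```

With three targets, this split sends scores 1-4 to the first target, 5-7 to the second, and 8-10 to the third.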
## Planned Resolutions (High Priority)

### Security Fixes
- [x] **Critical:** Fix `AuthMiddleware` to reject invalid tokens instead of falling back to insecure prefix derivation.

### Feature Parity Checklist (High Priority)

### OpenAI Provider
- [x] Tool Calling
- [x] Multimodal (Images) support
- [x] Accurate usage parsing (cached & reasoning tokens)
### Feature Parity: OpenAI Provider Enhancements
- [x] **Reasoning Content (CoT) Support (`o1`/`o3`):**
  - [x] Infrastructure verified. `reasoning_content` is mapped in request/response structures.
- [x] **Support for `/v1/responses` API:**
  - [x] Implemented new route in `internal/server/server.go`.
### Gemini Provider
- [x] Tool Calling (mapping to Gemini format)
- [x] Multimodal (Images) support
- [x] Reasoning/Thought support
- [x] Handle Tool Response role in unified format

### DeepSeek Provider
- [x] Reasoning Content (CoT) support
- [x] Parameter sanitization for `deepseek-reasoner`
- [x] Tool Calling support
- [x] Accurate usage parsing (cache hits & reasoning)

### Grok Provider
- [x] Tool Calling support
- [x] Multimodal support
- [x] Accurate usage parsing (via OpenAI helper)

### Ollama Provider
- [x] OpenAI-compatible API integration
- [x] Streaming support
- [x] Model pattern detection for routing
- [x] Zero cost calculation (local/free models)
## Infrastructure & Middleware
- [ ] Implement Rate Limiting (`golang.org/x/time/rate`)
- [x] Implement Circuit Breaker (`github.com/sony/gobreaker`)

## Verification
- [ ] Unit tests for feature-specific mapping (CoT, Tools, Images)
- [ ] Integration tests with live LLM APIs