Files
GopherGate/PLAN.md
hobokenchicken e8955fd36c
Some checks failed
CI / Check (push) Has been cancelled
CI / Clippy (push) Has been cancelled
CI / Formatting (push) Has been cancelled
CI / Test (push) Has been cancelled
CI / Release Build (push) Has been cancelled
merge
2026-03-06 15:35:30 -05:00

4.6 KiB

Project Plan: LLM Proxy Enhancements & Security Upgrade

This document outlines the roadmap for standardizing frontend security, cleaning up the codebase, upgrading session management to HMAC-signed tokens, and extending integration testing.

Phase 1: Frontend Security Standardization

Primary Agent: frontend-developer

  • Audit static/js/pages/users.js for manual HTML string concatenation.
  • Replace custom escaping or unescaped injections with window.api.escapeHtml.
  • Verify user list and user detail rendering for XSS vulnerabilities.

Phase 2: Codebase Cleanup

Primary Agent: backend-developer

  • Identify and remove unused imports in src/config/mod.rs.
  • Identify and remove unused imports in src/providers/mod.rs.
  • Run cargo clippy and cargo fmt to ensure adherence to standards.

Phase 3: HMAC Architectural Upgrade

Primary Agents: fullstack-developer, security-auditor, backend-developer

3.1 Design (Security Auditor)

  • Define Token Structure: base64(payload).signature.
    • Payload: { "session_id": "...", "username": "...", "role": "...", "exp": ... }
  • Select HMAC algorithm (HMAC-SHA256).
  • Define environment variable for secret key: SESSION_SECRET.

3.2 Implementation (Backend Developer)

  • Refactor src/dashboard/sessions.rs:
    • Integrate hmac and sha2 crates (or similar).
    • Update create_session to return signed tokens.
    • Update validate_session to verify signature before checking store.
  • Implement activity-based session refresh:
    • If session is valid and >50% through its TTL, extend expires_at and issue new signed token.

3.3 Integration (Fullstack Developer)

  • Update dashboard API handlers to handle new token format.
  • Update frontend session storage/retrieval if necessary.

Phase 4: Extended Integration Testing

Primary Agent: qa-automation

  • Setup test environment with encrypted key storage enabled.
  • Implement end-to-end flow:
    1. Store encrypted provider key via API.
    2. Authenticate through Proxy.
    3. Make proxied LLM request (verifying decryption and usage).
  • Validate HMAC token expiration and refresh logic in automated tests.

Phase 5: Code Quality & Refactoring

Primary Agent: fullstack-developer

  • Refactor dashboard monolith into modular sub-modules (auth.rs, usage.rs, etc.).
  • Standardize error handling and remove unwrap() in production paths.
  • Implement system health metrics and backup functionality.

Phase 6: Cache Cost & Provider Audit (ACTIVE)

Primary Agents: frontend-developer, backend-developer, database-optimizer, lab-assistant

6.1 Dashboard UI Updates (@frontend-developer)

  • Update Models Page Modal: Add input fields for Cache Read Cost and Cache Write Cost in static/js/pages/models.js.
  • API Integration: Ensure window.api.put includes these new cost fields in the request body.
  • Verify Costs Page: Confirm static/js/pages/costs.js displays these rates correctly in the pricing table.

6.2 Provider Audit & Stream Fixes (@backend-developer)

  • Standard DeepSeek Fix: Modify src/providers/deepseek.rs to stop stripping stream_options for deepseek-chat.
  • Grok Audit: Verify if Grok correctly returns usage in streaming; it uses build_openai_body and doesn't seem to strip it.
  • Gemini Audit: Confirm Gemini returns usage_metadata reliably in the final chunk.
  • Anthropic Audit: Check if Anthropic streaming requires include_usage or similar flags.

6.3 Database & Migration Validation (@database-optimizer)

  • Test Migrations: Run the server to ensure ALTER TABLE logic in src/database/mod.rs applies the new columns correctly.
  • Schema Verification: Verify model_configs has cache_read_cost_per_m and cache_write_cost_per_m columns.

6.4 Token Estimation Refinement (@lab-assistant)

  • Analyze Heuristic: Review chars / 4 in src/utils/tokens.rs.
  • Background Precise Recount: Propose a mechanism for a precise token count (using Tiktoken) after the response is finalized.

Critical Path

Migration Validation → UI Fields → Provider Stream Usage Reporting.

gantt
  title Phase 6 Timeline
  dateFormat YYYY-MM-DD
  section Frontend
  Models Page UI :2026-03-06, 1d
  Costs Table Update:after Models Page UI, 1d
  section Backend
  DeepSeek Fix :2026-03-06, 1d
  Provider Audit (Grok/Gemini):after DeepSeek Fix, 2d
  section Database
  Migration Test :2026-03-06, 1d
  section Optimization
  Token Heuristic Review :2026-03-06, 1d