chore: remove obsolete files and update CI to Go

Removed old Rust-era documentation, scripts, and migrations. Updated GitHub Actions workflow to use Go 1.22.
2026-03-19 10:46:23 -04:00
parent 90874a6721
commit 4f5b55d40f
13 changed files with 25 additions and 1910 deletions


@@ -6,56 +6,44 @@ on:
   pull_request:
     branches: [main]
-env:
-  CARGO_TERM_COLOR: always
-  RUST_BACKTRACE: 1
 jobs:
-  check:
-    name: Check
+  lint:
+    name: Lint
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
-      - uses: dtolnay/rust-toolchain@stable
-      - uses: Swatinem/rust-cache@v2
-      - run: cargo check --all-targets
-  clippy:
-    name: Clippy
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: dtolnay/rust-toolchain@stable
+      - name: Set up Go
+        uses: actions/setup-go@v5
         with:
-          components: clippy
-      - uses: Swatinem/rust-cache@v2
-      - run: cargo clippy --all-targets -- -D warnings
+          go-version: '1.22'
+          cache: true
+      - name: golangci-lint
+        uses: golangci/golangci-lint-action@v4
-  fmt:
-    name: Formatting
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: dtolnay/rust-toolchain@stable
         with:
-          components: rustfmt
-      - run: cargo fmt --all -- --check
+          version: latest
   test:
     name: Test
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
-      - uses: dtolnay/rust-toolchain@stable
-      - uses: Swatinem/rust-cache@v2
-      - run: cargo test --all-targets
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+          cache: true
+      - name: Run Tests
+        run: go test -v ./...
-  build-release:
-    name: Release Build
+  build:
+    name: Build
     runs-on: ubuntu-latest
-    needs: [check, clippy, test]
     steps:
       - uses: actions/checkout@v4
-      - uses: dtolnay/rust-toolchain@stable
-      - uses: Swatinem/rust-cache@v2
-      - run: cargo build --release
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+          cache: true
+      - name: Build
+        run: go build -v -o llm-proxy ./cmd/llm-proxy


@@ -1,65 +0,0 @@
# LLM Proxy Code Review Plan
## Overview
The **LLM Proxy** project is a Rust-based middleware designed to provide a unified interface for multiple Large Language Models (LLMs). Based on the repository structure, the project aims to implement a high-performance proxy server (`src/`) that handles request routing, usage tracking, and billing logic. A static dashboard (`static/`) provides a management interface for monitoring consumption and managing API keys. The architecture leverages Rust's async capabilities for efficient request handling and SQLite for persistent state management.
## Review Phases
### Phase 1: Backend Architecture & Rust Logic (@code-reviewer)
- **Focus on:**
- **Core Proxy Logic:** Efficiency of the request/response pipeline and streaming support.
- **State Management:** Thread-safety and shared state patterns using `Arc` and `Mutex`/`RwLock`.
- **Error Handling:** Use of idiomatic Rust error types and propagation.
- **Async Performance:** Proper use of `tokio` or similar runtimes to avoid blocking the executor.
- **Rust Idioms:** Adherence to Clippy suggestions and standard Rust naming conventions.
### Phase 2: Security & Authentication Audit (@security-auditor)
- **Focus on:**
- **API Key Management:** Secure storage, masking in logs, and rotation mechanisms.
- **JWT Handling:** Validation logic, signature verification, and expiration checks.
- **Input Validation:** Sanitization of prompts and configuration parameters to prevent injection.
- **Dependency Audit:** Scanning for known vulnerabilities in the `Cargo.lock` using `cargo-audit`.
### Phase 3: Database & Data Integrity Review (@database-optimizer)
- **Focus on:**
- **Schema Design:** Efficiency of the SQLite schema for usage tracking and billing.
- **Migration Strategy:** Robustness of the migration scripts to prevent data loss.
- **Usage Tracking:** Accuracy of token counting and concurrency handling during increments.
- **Query Optimization:** Identifying potential bottlenecks in reporting queries.
### Phase 4: Frontend & Dashboard Review (@frontend-developer)
- **Focus on:**
- **Vanilla JS Patterns:** Review of Web Components and modular JS in `static/js`.
- **Security:** Protection against XSS in the dashboard and secure handling of local storage.
- **UI/UX Consistency:** Ensuring the management interface is intuitive and responsive.
- **API Integration:** Robustness of the frontend's communication with the Rust backend.
### Phase 5: Infrastructure & Deployment Review (@devops-engineer)
- **Focus on:**
- **Dockerfile Optimization:** Multi-stage builds to minimize image size and attack surface.
- **Resource Limits:** Configuration of CPU/Memory limits for the proxy container.
- **Deployment Docs:** Clarity of the setup process and environment variable documentation.
## Timeline (Gantt)
```mermaid
gantt
    title LLM Proxy Code Review Timeline (March 2026)
    dateFormat YYYY-MM-DD
    section Backend & Security
    Architecture & Rust Logic (Phase 1) :active, p1, 2026-03-06, 1d
    Security & Auth Audit (Phase 2)     :p2, 2026-03-07, 1d
    section Data & Frontend
    Database & Integrity (Phase 3)      :p3, 2026-03-07, 1d
    Frontend & Dashboard (Phase 4)      :p4, 2026-03-08, 1d
    section DevOps
    Infra & Deployment (Phase 5)        :p5, 2026-03-08, 1d
    Final Review & Sign-off             :2026-03-08, 4h
```
## Success Criteria
- **Security:** Zero high-priority vulnerabilities identified; all API keys masked in logs.
- **Performance:** Proxy overhead is minimal (<10ms latency addition); queries are indexed.
- **Maintainability:** Code passes all linting (`cargo clippy`) and formatting (`cargo fmt`) checks.
- **Documentation:** README and deployment guides are up-to-date and accurate.
- **Reliability:** Usage tracking matches actual API consumption with 99.9% accuracy.


@@ -1,220 +0,0 @@
# LLM Proxy Gateway - Admin Dashboard
## Overview
This is a comprehensive admin dashboard for the LLM Proxy Gateway, providing real-time monitoring, analytics, and management capabilities for the proxy service.
## Features
### 1. Dashboard Overview
- Real-time request counters and statistics
- System health indicators
- Provider status monitoring
- Recent requests stream
### 2. Usage Analytics
- Time series charts for requests, tokens, and costs
- Filter by date range, client, provider, and model
- Top clients and models analysis
- Export functionality to CSV/JSON
### 3. Cost Management
- Cost breakdown by provider, client, and model
- Budget tracking with alerts
- Cost projections
- Pricing configuration management
### 4. Client Management
- List, create, revoke, and rotate API tokens
- Client-specific rate limits
- Usage statistics per client
- Token management interface
### 5. Provider Configuration
- Enable/disable LLM providers
- Configure API keys (masked display)
- Test provider connections
- Model availability management
### 6. User Management (RBAC)
- **Admin Role:** Full access to all dashboard features, user management, system configuration
- **Viewer Role:** Read-only access to usage analytics, costs, and monitoring
- Create/manage dashboard users with role assignment
- Secure password management
### 7. Real-time Monitoring
- Live request stream via WebSocket
- System metrics dashboard
- Response time and error rate tracking
- Live system logs
### 8. System Settings
- General configuration
- Database management
- Logging settings
- Security settings
## Technology Stack
### Frontend
- **HTML5/CSS3**: Modern, responsive design with CSS Grid/Flexbox
- **JavaScript (ES6+)**: Vanilla JavaScript with modular architecture
- **Chart.js**: Interactive data visualizations
- **Luxon**: Date/time manipulation
- **WebSocket API**: Real-time updates
### Backend (Rust/Axum)
- **Axum**: Web framework with WebSocket support
- **Tokio**: Async runtime
- **Serde**: JSON serialization/deserialization
- **Broadcast channels**: Real-time event distribution
## Installation & Setup
### 1. Build and Run the Server
```bash
# Build the project
cargo build --release
# Run the server
cargo run --release
```
### 2. Access the Dashboard
Once the server is running, access the dashboard at:
```
http://localhost:8080
```
### 3. Default Login Credentials
- **Username**: `admin`
- **Password**: `admin123`
## API Endpoints
### Authentication
- `POST /api/auth/login` - Dashboard login
- `GET /api/auth/status` - Authentication status
### Analytics
- `GET /api/usage/summary` - Overall usage summary
- `GET /api/usage/time-series` - Time series data
- `GET /api/usage/clients` - Client breakdown
- `GET /api/usage/providers` - Provider breakdown
### Clients
- `GET /api/clients` - List all clients
- `POST /api/clients` - Create new client
- `PUT /api/clients/{id}` - Update client
- `DELETE /api/clients/{id}` - Revoke client
- `GET /api/clients/{id}/usage` - Client-specific usage
### Users (RBAC)
- `GET /api/users` - List all dashboard users
- `POST /api/users` - Create new user
- `PUT /api/users/{id}` - Update user (admin only)
- `DELETE /api/users/{id}` - Delete user (admin only)
### Providers
- `GET /api/providers` - List providers and status
- `PUT /api/providers/{name}` - Update provider config
- `POST /api/providers/{name}/test` - Test provider connection
### System
- `GET /api/system/health` - System health
- `GET /api/system/logs` - Recent logs
- `POST /api/system/backup` - Trigger backup
### WebSocket
- `GET /ws` - WebSocket endpoint for real-time updates
## Project Structure
```
llm-proxy/
├── src/
│ ├── dashboard/ # Dashboard backend module
│ │ └── mod.rs # Dashboard routes and handlers
│ ├── server/ # Main proxy server
│ ├── providers/ # LLM provider implementations
│ └── ... # Other modules
├── static/ # Frontend dashboard files
│ ├── index.html # Main dashboard HTML
│ ├── css/
│ │ └── dashboard.css # Dashboard styles
│ ├── js/
│ │ ├── auth.js # Authentication module
│ │ ├── dashboard.js # Main dashboard controller
│ │ ├── websocket.js # WebSocket manager
│ │ ├── charts.js # Chart.js utilities
│ │ └── pages/ # Page-specific modules
│ │ ├── overview.js
│ │ ├── analytics.js
│ │ ├── costs.js
│ │ ├── clients.js
│ │ ├── providers.js
│ │ ├── monitoring.js
│ │ ├── settings.js
│ │ └── logs.js
│ ├── img/ # Images and icons
│ └── fonts/ # Font files
└── Cargo.toml # Rust dependencies
```
## Development
### Adding New Pages
1. Create a new JavaScript module in `static/js/pages/`
2. Implement the page class with `init()` method
3. Register the page in `dashboard.js`
4. Add menu item in `index.html`
### Adding New API Endpoints
1. Add route in `src/dashboard/mod.rs`
2. Implement handler function
3. Update frontend JavaScript to call the endpoint
### Styling Guidelines
- Use CSS custom properties (variables) from `:root`
- Follow mobile-first responsive design
- Use BEM-like naming convention for CSS classes
- Maintain consistent spacing with CSS variables
## Security Considerations
1. **Authentication**: Simple password-based auth for demo; replace with proper auth in production
2. **API Keys**: Tokens are masked in the UI (only last 4 characters shown)
3. **CORS**: Configure appropriate CORS headers for production
4. **Rate Limiting**: Implement rate limiting for API endpoints
5. **HTTPS**: Always use HTTPS in production
## Performance Optimizations
1. **Code Splitting**: JavaScript modules are loaded on-demand
2. **Caching**: Static assets are served with cache headers
3. **WebSocket**: Real-time updates reduce polling overhead
4. **Lazy Loading**: Charts and tables load data as needed
5. **Compression**: Enable gzip/brotli compression for static files
## Browser Support
- Chrome 60+
- Firefox 55+
- Safari 11+
- Edge 79+
## License
MIT License - See LICENSE file for details.
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## Support
For issues and feature requests, please use the GitHub issue tracker.


@@ -1,480 +0,0 @@
# Database Review Report for LLM-Proxy Repository
**Review Date:** 2025-03-06
**Reviewer:** Database Optimization Expert
**Repository:** llm-proxy
**Focus Areas:** Schema Design, Query Optimization, Migration Strategy, Data Integrity, Usage Tracking Accuracy
## Executive Summary
The llm-proxy database implementation demonstrates a solid foundation, with appropriate table structures and clear separation of concerns. However, several areas require improvement to ensure scalability, data consistency, and performance as usage grows. Key findings include:
1. **Schema Design**: Generally normalized but missing foreign key enforcement and some critical indexes.
2. **Query Optimization**: Well-optimized for most queries but missing composite indexes for common filtering patterns.
3. **Migration Strategy**: Ad-hoc migration approach that may cause issues with schema evolution.
4. **Data Integrity**: Potential race conditions in usage tracking and missing transaction boundaries.
5. **Usage Tracking**: Generally accurate but risk of inconsistent state between related tables.
This report provides detailed analysis and actionable recommendations for each area.
## 1. Schema Design Review
### Tables Overview
The database consists of 6 main tables:
1. **clients**: Client management with usage aggregates
2. **llm_requests**: Request logging with token counts and costs
3. **provider_configs**: Provider configuration and credit balances
4. **model_configs**: Model-specific configuration and cost overrides
5. **users**: Dashboard user authentication
6. **client_tokens**: API token storage for client authentication
### Normalization Assessment
**Strengths:**
- Tables follow 3rd Normal Form (3NF) with appropriate separation
- Foreign key relationships properly defined
- No obvious data duplication across tables
**Areas for Improvement:**
- **Denormalized aggregates**: `clients.total_requests`, `total_tokens`, `total_cost` are derived from `llm_requests`. This introduces risk of inconsistency.
- **Provider credit balance**: Stored in `provider_configs` but also updated based on `llm_requests`. No audit trail for balance changes.
### Data Type Analysis
**Appropriate Choices:**
- INTEGER for token counts (cast from u32 to i64)
- REAL for monetary values
- DATETIME for timestamps using SQLite's CURRENT_TIMESTAMP
- TEXT for identifiers with appropriate length
**Potential Issues:**
- `llm_requests.request_body` and `response_body` defined as TEXT but always set to NULL - consider removing or making optional columns.
- `provider_configs.billing_mode` added via migration but default value not consistently applied to existing rows.
### Constraints and Foreign Keys
**Current Constraints:**
- Primary keys defined for all tables
- UNIQUE constraints on `clients.client_id`, `users.username`, `client_tokens.token`
- Foreign key definitions present but **not enforced** (SQLite default)
**Missing Constraints:**
- NOT NULL constraints missing on several columns where nullability not intended
- CHECK constraints for positive values (`credit_balance >= 0`)
- Foreign key enforcement not enabled
## 2. Query Optimization Analysis
### Indexing Strategy
**Existing Indexes:**
- `idx_clients_client_id` - Essential for client lookups
- `idx_clients_created_at` - Useful for chronological listing
- `idx_llm_requests_timestamp` - Critical for time-based queries
- `idx_llm_requests_client_id` - Supports client-specific queries
- `idx_llm_requests_provider` - Good for provider breakdowns
- `idx_llm_requests_status` - Low cardinality but acceptable
- `idx_client_tokens_token` UNIQUE - Essential for authentication
- `idx_client_tokens_client_id` - Supports token management
**Missing Critical Indexes:**
1. `model_configs.provider_id` - Foreign key column used in JOINs
2. `llm_requests(client_id, timestamp)` - Composite index for client time-series queries
3. `llm_requests(provider, timestamp)` - For provider performance analysis
4. `llm_requests(status, timestamp)` - For error trend analysis
### N+1 Query Detection
**Well-Optimized Areas:**
- Model configuration caching prevents repeated database hits
- Provider configs loaded in batch for dashboard display
- Client listing uses single efficient query
**Potential N+1 Patterns:**
- In `server/mod.rs` list_models function, cache lookup per model but this is in-memory
- No significant database N+1 issues identified
### Inefficient Query Patterns
**Query 1: Time-series aggregation with strftime()**
```sql
SELECT strftime('%Y-%m-%d', timestamp) as date, ...
FROM llm_requests
WHERE 1=1 {}
GROUP BY date, client_id, provider, model
ORDER BY date DESC
LIMIT 200
```
**Issue:** Function on indexed column prevents index utilization for the WHERE clause when filtering by timestamp range.
**Recommendation:** Store computed date column or use range queries on timestamp directly.
**Query 2: Today's stats using strftime()**
```sql
WHERE strftime('%Y-%m-%d', timestamp) = ?
```
**Issue:** Non-sargable query prevents index usage.
**Recommendation:** Use range query:
```sql
WHERE timestamp >= date(?) AND timestamp < date(?, '+1 day')
```
### Recommended Index Additions
```sql
-- Composite indexes for common query patterns
CREATE INDEX idx_llm_requests_client_timestamp ON llm_requests(client_id, timestamp);
CREATE INDEX idx_llm_requests_provider_timestamp ON llm_requests(provider, timestamp);
CREATE INDEX idx_llm_requests_status_timestamp ON llm_requests(status, timestamp);
-- Foreign key index
CREATE INDEX idx_model_configs_provider_id ON model_configs(provider_id);
-- Optional: Covering index for client usage queries
CREATE INDEX idx_clients_usage ON clients(client_id, total_requests, total_tokens, total_cost);
```
## 3. Migration Strategy Assessment
### Current Approach
The migration system uses a hybrid approach:
1. **Schema synchronization**: `CREATE TABLE IF NOT EXISTS` on startup
2. **Ad-hoc migrations**: `ALTER TABLE` statements with error suppression
3. **Single migration file**: `migrations/001-add-billing-mode.sql` with transaction wrapper
**Pros:**
- Simple to understand and maintain
- Automatic schema creation for new deployments
- Error suppression prevents crashes on column existence
**Cons:**
- No version tracking of applied migrations
- Potential for inconsistent schema across deployments
- `ALTER TABLE` error suppression hides genuine schema issues
- No rollback capability
### Risks and Limitations
1. **Schema Drift**: Different instances may have different schemas if migrations are applied out of order
2. **Data Loss Risk**: No backup/verification before schema changes
3. **Production Issues**: Error suppression could mask migration failures until runtime
### Recommendations
1. **Implement Proper Migration Tooling**: Use `sqlx migrate` or similar versioned migration system
2. **Add Migration Version Table**: Track applied migrations and checksum verification
3. **Separate Migration Scripts**: One file per migration with up/down directions
4. **Pre-deployment Validation**: Schema checks in CI/CD pipeline
5. **Backup Strategy**: Automatic backups before migration execution
## 4. Data Integrity Evaluation
### Foreign Key Enforcement
**Critical Issue:** Foreign key constraints are defined but **not enforced** in SQLite.
**Impact:** Orphaned records, inconsistent referential integrity.
**Solution:** Enable foreign key support in connection string:
```rust
let options = SqliteConnectOptions::from_str(&format!("sqlite:{}", database_path))?
.create_if_missing(true)
.pragma("foreign_keys", "ON");
```
### Transaction Usage
**Good Patterns:**
- Request logging uses transactions for insert + provider balance update
- Atomic UPDATE for client usage statistics
**Problematic Areas:**
1. **Split Transactions**: Client usage update and request logging are in separate transactions
- In `logging/mod.rs`: `insert_log` transaction includes provider balance update
- In `utils/streaming.rs`: Client usage updated separately after logging
- **Risk**: Partial updates if one transaction fails
2. **No Transaction for Client Creation**: Client and token creation not atomic
**Recommendations:**
- Wrap client usage update within the same transaction as request logging
- Use transaction for client + token creation
- Consider using savepoints for complex operations
### Race Conditions and Consistency
**Potential Race Conditions:**
1. **Provider credit balance**: Concurrent requests may cause lost updates
- Current: `UPDATE provider_configs SET credit_balance = credit_balance - ?`
- SQLite provides serializable isolation, but negative balances not prevented
2. **Client usage aggregates**: Concurrent updates to `total_requests`, `total_tokens`, `total_cost`
- Similar UPDATE pattern, generally safe but consider idempotency
**Recommendations:**
- Add check constraint: `CHECK (credit_balance >= 0)`
- Implement idempotent request logging with unique request IDs
- Consider optimistic concurrency control for critical balances
## 5. Usage Tracking Accuracy
### Token Counting Methodology
**Current Approach:**
- Prompt tokens: Estimated using provider-specific estimators
- Completion tokens: Estimated or from provider real usage data
- Cache tokens: Separately tracked for cache-aware pricing
**Strengths:**
- Fallback to estimation when provider doesn't report usage
- Cache token differentiation for accurate pricing
**Weaknesses:**
- Estimation may differ from actual provider counts
- No validation of provider-reported token counts
### Cost Calculation
**Well Implemented:**
- Model-specific cost overrides via `model_configs`
- Cache-aware pricing when supported by registry
- Provider fallback calculations
**Potential Issues:**
- Floating-point precision for monetary calculations
- No rounding strategy for fractional cents
### Update Consistency
**Inconsistency Risk:** Client aggregates updated separately from request logging.
**Example Flow:**
1. Request log inserted and provider balance updated (transaction)
2. Client usage updated (separate operation)
3. If step 2 fails, client stats undercount usage
**Solution:** Include client update in the same transaction:
```rust
// In insert_log function, add:
UPDATE clients
SET total_requests = total_requests + 1,
total_tokens = total_tokens + ?,
total_cost = total_cost + ?
WHERE client_id = ?;
```
### Financial Accuracy
**Good Practices:**
- Token-level granularity for cost calculation
- Separation of prompt/completion/cache pricing
- Database persistence for audit trail
**Recommendations:**
1. **Audit Trail**: Add `balance_transactions` table for provider credit changes
2. **Rounding Policy**: Define rounding strategy (e.g., to 6 decimal places)
3. **Validation**: Periodic reconciliation of aggregates vs. detail records
## 6. Performance Recommendations
### Schema Improvements
1. **Partitioning Strategy**: For high-volume `llm_requests`, consider:
- Monthly partitioning by timestamp
- Archive old data to separate tables
2. **Data Retention Policy**: Implement automatic cleanup of old request logs
```sql
DELETE FROM llm_requests WHERE timestamp < date('now', '-90 days');
```
3. **Column Optimization**: Remove unused `request_body`, `response_body` columns or implement compression
### Query Optimizations
1. **Avoid Functions on Indexed Columns**: Rewrite date queries as range queries
2. **Batch Updates**: Consider batch updates for client usage instead of per-request
3. **Read Replicas**: For dashboard queries, consider separate read connection
### Connection Pooling
**Current:** SQLx connection pool with default settings
**Recommendations:**
- Configure pool size based on expected concurrency
- Implement connection health checks
- Monitor pool utilization metrics
### Monitoring Setup
**Essential Metrics:**
- Query execution times (slow query logging)
- Index usage statistics
- Table growth trends
- Connection pool utilization
**Implementation:**
- Add `sqlx::metrics` integration
- Regular `ANALYZE` execution for query planner
- Dashboard for database health monitoring
## 7. Security Considerations
### Data Protection
**Sensitive Data:**
- `provider_configs.api_key` - Should be encrypted at rest
- `users.password_hash` - Already hashed with bcrypt
- `client_tokens.token` - Plain text storage
**Recommendations:**
- Encrypt API keys using libsodium or similar
- Implement token hashing (similar to password hashing)
- Regular security audits of authentication flows
### SQL Injection Prevention
**Good Practices:**
- Use sqlx query builder with parameter binding
- No raw SQL concatenation observed in code review
**Verification Needed:** Ensure all dynamic SQL uses parameterized queries
### Access Controls
**Database Level:**
- SQLite lacks built-in user management
- Consider file system permissions for database file
- Application-level authentication is primary control
## 8. Summary of Critical Issues
**Priority 1 (Critical):**
1. Foreign key constraints not enabled
2. Split transactions risking data inconsistency
3. Missing composite indexes for common queries
**Priority 2 (High):**
1. No proper migration versioning system
2. Potential race conditions in balance updates
3. Non-sargable date queries impacting performance
**Priority 3 (Medium):**
1. Denormalized aggregates without consistency guarantees
2. No data retention policy for request logs
3. Missing check constraints for data validation
## 9. Recommended Action Plan
### Phase 1: Immediate Fixes (1-2 weeks)
1. Enable foreign key constraints in database connection
2. Add composite indexes for common query patterns
3. Fix transaction boundaries for client usage updates
4. Rewrite non-sargable date queries
### Phase 2: Short-term Improvements (3-4 weeks)
1. Implement proper migration system with version tracking
2. Add check constraints for data validation
3. Implement connection pooling configuration
4. Create database monitoring dashboard
### Phase 3: Long-term Enhancements (2-3 months)
1. Implement data retention and archiving strategy
2. Add audit trail for provider balance changes
3. Consider partitioning for high-volume tables
4. Implement encryption for sensitive data
### Phase 4: Ongoing Maintenance
1. Regular index maintenance and query plan analysis
2. Periodic reconciliation of aggregate vs. detail data
3. Security audits and dependency updates
4. Performance benchmarking and optimization
---
## Appendices
### A. Sample Migration Implementation
```sql
-- Note: PRAGMA foreign_keys is per-connection and is not persisted,
-- so it cannot be enabled by a one-time migration; set it in the
-- connection options instead (see Section 4).
-- migrations/003-add-composite-indexes.sql
CREATE INDEX idx_llm_requests_client_timestamp ON llm_requests(client_id, timestamp);
CREATE INDEX idx_llm_requests_provider_timestamp ON llm_requests(provider, timestamp);
CREATE INDEX idx_model_configs_provider_id ON model_configs(provider_id);
```
### B. Transaction Fix Example
```rust
async fn insert_log(pool: &SqlitePool, log: RequestLog) -> Result<(), sqlx::Error> {
let mut tx = pool.begin().await?;
// Insert or ignore client
sqlx::query("INSERT OR IGNORE INTO clients (client_id, name, description) VALUES (?, ?, 'Auto-created from request')")
.bind(&log.client_id)
.bind(&log.client_id)
.execute(&mut *tx)
.await?;
// Insert request log
sqlx::query("INSERT INTO llm_requests ...")
.execute(&mut *tx)
.await?;
// Update provider balance
if log.cost > 0.0 {
sqlx::query("UPDATE provider_configs SET credit_balance = credit_balance - ? WHERE id = ? AND (billing_mode IS NULL OR billing_mode != 'postpaid')")
.bind(log.cost)
.bind(&log.provider)
.execute(&mut *tx)
.await?;
}
// Update client aggregates within same transaction
sqlx::query("UPDATE clients SET total_requests = total_requests + 1, total_tokens = total_tokens + ?, total_cost = total_cost + ? WHERE client_id = ?")
.bind(log.total_tokens as i64)
.bind(log.cost)
.bind(&log.client_id)
.execute(&mut *tx)
.await?;
tx.commit().await?;
Ok(())
}
```
### C. Monitoring Query Examples
```sql
-- Identify indexes on llm_requests with no collected statistics
-- (requires a prior ANALYZE; sqlite_stat1 lists indexes in its idx column)
SELECT name FROM sqlite_master
WHERE type = 'index'
  AND tbl_name = 'llm_requests'
  AND name NOT IN (
    SELECT idx FROM sqlite_stat1 WHERE tbl = 'llm_requests'
  );
-- Table size analysis (requires SQLITE_ENABLE_DBSTAT_VTAB)
SELECT name, SUM(pgsize) / 1024.0 / 1024.0 AS size_mb
FROM dbstat
WHERE name = 'llm_requests'
GROUP BY name;
-- Query performance analysis (requires EXPLAIN QUERY PLAN)
EXPLAIN QUERY PLAN
SELECT * FROM llm_requests
WHERE client_id = ? AND timestamp >= ?;
```
---
*This report provides a comprehensive analysis of the current database implementation and actionable recommendations for improvement. Regular review and iteration will ensure the database continues to meet performance, consistency, and scalability requirements as the application grows.*


@@ -1,232 +0,0 @@
# Optimization for 512MB RAM Environment
This document provides guidance for optimizing the LLM Proxy Gateway for deployment in resource-constrained environments (512MB RAM).
## Memory Optimization Strategies
### 1. Build Optimization
The project is already configured with optimized build settings in `Cargo.toml`:
```toml
[profile.release]
opt-level = 3 # Maximum optimization
lto = true # Link-time optimization
codegen-units = 1 # Single codegen unit for better optimization
strip = true # Strip debug symbols
```
**Additional optimizations you can apply:**
```bash
# Build with specific target for better optimization
cargo build --release --target x86_64-unknown-linux-musl
# Or for ARM (Raspberry Pi, etc.)
cargo build --release --target aarch64-unknown-linux-musl
```
### 2. Runtime Memory Management
#### Database Connection Pool
- Default: 10 connections
- Recommended for 512MB: 5 connections
Update `config.toml`:
```toml
[database]
max_connections = 5
```
#### Rate Limiting Memory Usage
- Client rate limit buckets: Store in memory
- Circuit breakers: Minimal memory usage
- Consider reducing burst capacity if memory is critical
#### Provider Management
- Only enable providers you actually use
- Disable unused providers in configuration
### 3. Configuration for Low Memory
Create a `config-low-memory.toml`:
```toml
[server]
port = 8080
host = "0.0.0.0"
[database]
path = "./data/llm_proxy.db"
max_connections = 3 # Reduced from default 10
[providers]
# Only enable providers you need
openai.enabled = true
gemini.enabled = false # Disable if not used
deepseek.enabled = false # Disable if not used
grok.enabled = false # Disable if not used
[rate_limiting]
# Reduce memory usage for rate limiting
client_requests_per_minute = 30 # Reduced from 60
client_burst_size = 5 # Reduced from 10
global_requests_per_minute = 300 # Reduced from 600
```
### 4. System-Level Optimizations
#### Linux Kernel Parameters
Add to `/etc/sysctl.conf`:
```bash
# Reduce TCP buffer sizes
net.ipv4.tcp_rmem = 4096 87380 174760
net.ipv4.tcp_wmem = 4096 65536 131072
# Reduce connection tracking
net.netfilter.nf_conntrack_max = 65536
net.netfilter.nf_conntrack_tcp_timeout_established = 1200
# Reduce socket buffer sizes
net.core.rmem_max = 131072
net.core.wmem_max = 131072
net.core.rmem_default = 65536
net.core.wmem_default = 65536
```
#### Systemd Service Configuration
Create `/etc/systemd/system/llm-proxy.service`:
```ini
[Unit]
Description=LLM Proxy Gateway
After=network.target
[Service]
Type=simple
User=llmproxy
Group=llmproxy
WorkingDirectory=/opt/llm-proxy
ExecStart=/opt/llm-proxy/llm-proxy
Restart=on-failure
RestartSec=5
# Memory limits
MemoryMax=400M
MemorySwapMax=100M
# CPU limits
CPUQuota=50%
# Process limits
LimitNOFILE=65536
LimitNPROC=512
Environment="RUST_LOG=info"
Environment="LLM_PROXY__DATABASE__MAX_CONNECTIONS=3"
[Install]
WantedBy=multi-user.target
```
### 5. Application-Specific Optimizations
#### Disable Unused Features
- **Multimodal support**: If not using images, disable image processing dependencies
- **Dashboard**: The dashboard uses WebSockets and additional memory. Consider disabling if not needed.
- **Detailed logging**: Reduce log verbosity in production
#### Memory Pool Sizes
The application uses several memory pools:
1. **Database connection pool**: Configured via `max_connections`
2. **HTTP client pool**: Reqwest client pool (defaults to reasonable values)
3. **Async runtime**: Tokio worker threads
Reduce Tokio worker threads for low-core systems:
```rust
// In main.rs, change the tokio runtime attribute.
// Single-threaded runtime (lowest memory overhead):
#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<()> {
    // ...
    Ok(())
}

// Or keep the multi-threaded runtime with a limited thread pool:
// #[tokio::main(worker_threads = 2)]
```
### 6. Monitoring and Profiling
#### Memory Usage Monitoring
```bash
# Install heaptrack for memory profiling (it is a system tool, not a cargo crate)
sudo apt-get install heaptrack   # Debian/Ubuntu; use dnf/pacman equivalents elsewhere
# Profile memory usage
heaptrack ./target/release/llm-proxy
# Monitor with ps
ps aux --sort=-%mem | head -10
# Monitor with top
top -p $(pgrep llm-proxy)
```
#### Performance Benchmarks
Test with different configurations:
```bash
# Test with 100 concurrent connections
wrk -t4 -c100 -d30s http://localhost:8080/health
# Test chat completion endpoint
ab -n 1000 -c 10 -p test_request.json -T application/json http://localhost:8080/v1/chat/completions
```
### 7. Deployment Checklist for 512MB RAM
- [ ] Build with release profile: `cargo build --release`
- [ ] Configure database with `max_connections = 3`
- [ ] Disable unused providers in configuration
- [ ] Set appropriate rate limiting limits
- [ ] Configure systemd with memory limits
- [ ] Set up log rotation to prevent disk space issues
- [ ] Monitor memory usage during initial deployment
- [ ] Consider using swap space (512MB-1GB) for safety
### 8. Troubleshooting High Memory Usage
#### Common Issues and Solutions:
1. **Database connection leaks**: Ensure connections are properly closed
2. **Memory fragmentation**: Switch to jemalloc or mimalloc as the global allocator
3. **Unbounded queues**: Check WebSocket message queues
4. **Cache growth**: Implement cache limits or TTL
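For the cache-growth item, the idea of combining a TTL with a size cap can be sketched as follows (illustrative only; this is not the project's cache implementation):

```python
import time

class TTLCache:
    """Tiny cache: entries expire after `ttl` seconds and size is capped at `max_entries`."""
    def __init__(self, ttl: float, max_entries: int):
        self.ttl = ttl
        self.max_entries = max_entries
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        if key not in self._store and len(self._store) >= self.max_entries:
            # Evict the entry closest to expiry to stay within the cap.
            oldest = min(self._store, key=lambda k: self._store[k][1])
            del self._store[oldest]
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        item = self._store.get(key)
        if item is None or item[1] < time.monotonic():
            self._store.pop(key, None)  # drop expired entries lazily
            return default
        return item[0]

cache = TTLCache(ttl=0.05, max_entries=2)
cache.set("a", 1)
cache.set("b", 2)
cache.set("c", 3)          # evicts the oldest entry ("a")
print(len(cache._store))   # 2: size stays bounded
time.sleep(0.1)
print(cache.get("c"))      # None: entry expired
```

Both bounds matter: the TTL alone does not stop growth under sustained load, and the size cap alone does not free memory for idle entries.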
#### Add to Cargo.toml for alternative allocator:
```toml
[dependencies]
mimalloc = { version = "0.1", default-features = false }
```
#### In main.rs:
```rust
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;
```
### 9. Expected Memory Usage
| Component | Baseline | With 10 clients | With 100 clients |
|-----------|----------|-----------------|------------------|
| Base executable | 15MB | 15MB | 15MB |
| Database connections | 5MB | 8MB | 15MB |
| Rate limiting | 2MB | 5MB | 20MB |
| HTTP clients | 3MB | 5MB | 10MB |
| **Total** | **25MB** | **33MB** | **60MB** |
**Note**: These are estimates. Actual usage depends on request volume, payload sizes, and configuration.
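As a sanity check, the column totals can be reproduced by summing the per-component estimates from the table above:

```python
# Rough per-component memory estimates (MB), taken from the table above.
components = {
    "base_executable":      {"baseline": 15, "clients_10": 15, "clients_100": 15},
    "database_connections": {"baseline": 5,  "clients_10": 8,  "clients_100": 15},
    "rate_limiting":        {"baseline": 2,  "clients_10": 5,  "clients_100": 20},
    "http_clients":         {"baseline": 3,  "clients_10": 5,  "clients_100": 10},
}

totals = {col: sum(c[col] for c in components.values())
          for col in ("baseline", "clients_10", "clients_100")}
print(totals)  # {'baseline': 25, 'clients_10': 33, 'clients_100': 60}
```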
### 10. Further Reading
- [Tokio performance guide](https://tokio.rs/tokio/topics/performance)
- [Rust performance book](https://nnethercote.github.io/perf-book/)
- [Linux memory management](https://www.kernel.org/doc/html/latest/admin-guide/mm/)
- [SQLite performance tips](https://www.sqlite.org/faq.html#q19)

PLAN.md

@@ -1,99 +0,0 @@
# Project Plan: LLM Proxy Enhancements & Security Upgrade
This document outlines the roadmap for standardizing frontend security, cleaning up the codebase, upgrading session management to HMAC-signed tokens, and extending integration testing.
## Phase 1: Frontend Security Standardization
**Primary Agent:** `frontend-developer`
- [x] Audit `static/js/pages/users.js` for manual HTML string concatenation.
- [x] Replace custom escaping or unescaped injections with `window.api.escapeHtml`.
- [x] Verify user list and user detail rendering for XSS vulnerabilities.
## Phase 2: Codebase Cleanup
**Primary Agent:** `backend-developer`
- [x] Identify and remove unused imports in `src/config/mod.rs`.
- [x] Identify and remove unused imports in `src/providers/mod.rs`.
- [x] Run `cargo clippy` and `cargo fmt` to ensure adherence to standards.
## Phase 3: HMAC Architectural Upgrade
**Primary Agents:** `fullstack-developer`, `security-auditor`, `backend-developer`
### 3.1 Design (Security Auditor)
- [x] Define Token Structure: `base64(payload).signature`.
- Payload: `{ "session_id": "...", "username": "...", "role": "...", "exp": ... }`
- [x] Select HMAC algorithm (HMAC-SHA256).
- [x] Define environment variable for secret key: `SESSION_SECRET`.
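The production implementation lives in `src/dashboard/sessions.rs` (Rust, using the `hmac`/`sha2` crates); the following Python sketch only illustrates the `base64(payload).signature` scheme described above (the secret value and helper names are illustrative):

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"test-secret-32-bytes-change-me!!"  # in production: from SESSION_SECRET

def sign_token(payload: dict, secret: bytes = SECRET) -> str:
    """Return base64(payload).signature with an HMAC-SHA256 signature."""
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    sig = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify_token(token: str, secret: bytes = SECRET):
    """Verify the signature first, then check expiry; return payload or None."""
    body, _, sig = token.rpartition(".")
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # signature mismatch: reject before touching the session store
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload.get("exp", 0) < time.time():
        return None  # expired
    return payload

token = sign_token({"session_id": "s1", "username": "admin",
                    "role": "admin", "exp": time.time() + 3600})
print(verify_token(token)["username"])  # admin
print(verify_token(token + "x"))        # None (tampered signature)
```

Note the constant-time comparison (`hmac.compare_digest`) and the order of checks: an invalid signature is rejected before the payload is parsed or the store is consulted.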
### 3.2 Implementation (Backend Developer)
- [x] Refactor `src/dashboard/sessions.rs`:
- Integrate `hmac` and `sha2` crates (or similar).
- Update `create_session` to return signed tokens.
- Update `validate_session` to verify signature before checking store.
- [x] Implement activity-based session refresh:
- If session is valid and >50% through its TTL, extend `expires_at` and issue new signed token.
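The refresh rule above can be stated as a small predicate (field names are illustrative):

```python
def should_refresh(issued_at: float, expires_at: float, now: float) -> bool:
    """Refresh when a valid session is more than 50% through its TTL."""
    ttl = expires_at - issued_at
    return (now - issued_at) > 0.5 * ttl

# One-hour TTL issued at t = 0:
print(should_refresh(0, 3600, 1000))  # False: ~28% of the TTL elapsed
print(should_refresh(0, 3600, 2000))  # True: ~56% of the TTL elapsed
```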
### 3.3 Integration (Fullstack Developer)
- [x] Update dashboard API handlers to handle new token format.
- [x] Update frontend session storage/retrieval if necessary.
## Phase 4: Extended Integration Testing
**Primary Agent:** `qa-automation`
- [ ] Setup test environment with encrypted key storage enabled.
- [ ] Implement end-to-end flow:
1. Store encrypted provider key via API.
2. Authenticate through Proxy.
3. Make proxied LLM request (verifying decryption and usage).
- [ ] Validate HMAC token expiration and refresh logic in automated tests.
## Phase 5: Code Quality & Refactoring
**Primary Agent:** `fullstack-developer`
- [x] Refactor dashboard monolith into modular sub-modules (`auth.rs`, `usage.rs`, etc.).
- [x] Standardize error handling and remove `unwrap()` in production paths.
- [x] Implement system health metrics and backup functionality.
---
## Phase 6: Cache Cost & Provider Audit (ACTIVE)
**Primary Agents:** `frontend-developer`, `backend-developer`, `database-optimizer`, `lab-assistant`
## 6.1 Dashboard UI Updates (@frontend-developer)
- [ ] **Update Models Page Modal:** Add input fields for `Cache Read Cost` and `Cache Write Cost` in `static/js/pages/models.js`.
- [ ] **API Integration:** Ensure `window.api.put` includes these new cost fields in the request body.
- [ ] **Verify Costs Page:** Confirm `static/js/pages/costs.js` displays these rates correctly in the pricing table.
## 6.2 Provider Audit & Stream Fixes (@backend-developer)
- [ ] **Standard DeepSeek Fix:** Modify `src/providers/deepseek.rs` to stop stripping `stream_options` for `deepseek-chat`.
- [ ] **Grok Audit:** Verify if Grok correctly returns usage in streaming; it uses `build_openai_body` and doesn't seem to strip it.
- [ ] **Gemini Audit:** Confirm Gemini returns `usage_metadata` reliably in the final chunk.
- [ ] **Anthropic Audit:** Check if Anthropic streaming requires `include_usage` or similar flags.
## 6.3 Database & Migration Validation (@database-optimizer)
- [ ] **Test Migrations:** Run the server to ensure `ALTER TABLE` logic in `src/database/mod.rs` applies the new columns correctly.
- [ ] **Schema Verification:** Verify `model_configs` has `cache_read_cost_per_m` and `cache_write_cost_per_m` columns.
## 6.4 Token Estimation Refinement (@lab-assistant)
- [ ] **Analyze Heuristic:** Review `chars / 4` in `src/utils/tokens.rs`.
- [ ] **Background Precise Recount:** Propose a mechanism for a precise token count (using Tiktoken) after the response is finalized.
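The heuristic under review is simple enough to state directly (the real code is in `src/utils/tokens.rs`; this Python version is for illustration):

```python
def estimate_tokens(text: str) -> int:
    """Cheap streaming-time heuristic: roughly 4 characters per token."""
    return max(1, len(text) // 4)

prompt = "Explain the difference between TCP and UDP in one paragraph."
print(estimate_tokens(prompt))  # 15 for this 60-character prompt
```

A background recount would replace this estimate with a precise figure (e.g. from a real tokenizer such as Tiktoken) once the response is finalized, keeping the hot path cheap.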
## Critical Path
Migration Validation → UI Fields → Provider Stream Usage Reporting.
```mermaid
gantt
title Phase 6 Timeline
dateFormat YYYY-MM-DD
section Frontend
Models Page UI :2026-03-06, 1d
Costs Table Update:after Models Page UI, 1d
section Backend
DeepSeek Fix :2026-03-06, 1d
Provider Audit (Grok/Gemini):after DeepSeek Fix, 2d
section Database
Migration Test :2026-03-06, 1d
section Optimization
Token Heuristic Review :2026-03-06, 1d
```


@@ -1,58 +0,0 @@
# LLM Proxy Security Audit Report
## Executive Summary
A comprehensive security audit of the `llm-proxy` repository was conducted. The audit identified **2 critical vulnerabilities**, **3 high-risk issues**, **4 medium-risk issues**, and **3 low-risk issues**. The most severe findings are Cross-Site Scripting (XSS) in the dashboard interface and insecure storage of provider API keys in the database.
## Detailed Findings
### Critical Risk Vulnerabilities
#### **CRITICAL-01: Cross-Site Scripting (XSS) in Dashboard Interface**
- **Location**: `static/js/pages/clients.js` (multiple locations).
- **Description**: User-controlled data (e.g., `client.id`) is inserted directly into HTML or `onclick` handlers without escaping.
- **Impact**: Arbitrary JavaScript execution in admin context, potentially stealing session tokens.
#### **CRITICAL-02: Insecure API Key Storage in Database**
- **Location**: `src/database/mod.rs`, `src/providers/mod.rs`, `src/dashboard/providers.rs`.
- **Description**: Provider API keys are stored in **plaintext** in the SQLite database.
- **Impact**: Compromised database file exposes all provider API keys.
### High Risk Vulnerabilities
#### **HIGH-01: Missing Input Validation and Size Limits**
- **Location**: `src/server/mod.rs`, `src/models/mod.rs`.
- **Impact**: Denial of Service via large payloads.
#### **HIGH-02: Sensitive Data Logging Without Encryption**
- **Location**: `src/database/mod.rs`, `src/logging/mod.rs`.
- **Description**: Full request and response bodies stored in `llm_requests` table without encryption or redaction.
#### **HIGH-03: Weak Default Credentials and Password Policy**
- **Description**: The default admin password is 'admin', and the minimum password length is only 4 characters.
### Medium Risk Vulnerabilities
#### **MEDIUM-01: Missing CSRF Protection**
- No CSRF tokens or SameSite cookie attributes for state-changing dashboard endpoints.
#### **MEDIUM-02: Insecure Session Management**
- Session tokens stored in localStorage without HttpOnly flag.
- Tokens use simple `session-{uuid}` format.
#### **MEDIUM-03: Error Information Leakage**
- Internal error details exposed to clients in some cases.
#### **MEDIUM-04: Outdated Dependencies**
- Outdated versions of `chrono`, `tokio`, and `reqwest`.
### Low Risk Vulnerabilities
- Missing security headers (CSP, HSTS, X-Frame-Options).
- Insufficient rate limiting on dashboard authentication.
- No database encryption at rest.
## Recommendations
### Immediate Actions
1. **Fix XSS Vulnerabilities:** Implement proper HTML escaping for all user-controlled data.
2. **Secure API Key Storage:** Encrypt API keys in database using a library like `ring`.
3. **Implement Input Validation:** Add maximum payload size limits (e.g., 10MB).
4. **Improve Data Protection:** Add option to disable request/response body logging.
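The dashboard's escaping helper is `window.api.escapeHtml` in JavaScript; the same transformation is shown here in Python for illustration of what "proper HTML escaping" must cover:

```python
import html

def escape_html(value) -> str:
    # Escapes &, <, >, and quotes so user data renders as text, not markup.
    return html.escape(str(value), quote=True)

malicious = '<img src=x onerror="alert(1)">'
print(escape_html(malicious))
# &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

Escaping quotes matters because unquoted or attribute contexts (like the `onclick` handlers flagged in CRITICAL-01) break out with `"` just as easily as with `<`.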
---
*Report generated by Security Auditor Agent on March 6, 2026*

deploy.sh

@@ -1,667 +0,0 @@
#!/bin/bash
# LLM Proxy Gateway Deployment Script
# This script automates the deployment of the LLM Proxy Gateway on a Linux server
set -e # Exit on error
set -u # Exit on undefined variable
# Configuration
APP_NAME="llm-proxy"
APP_USER="llmproxy"
APP_GROUP="llmproxy"
GIT_REPO="ssh://git.dustin.coffee:2222/hobokenchicken/llm-proxy.git"
INSTALL_DIR="/opt/$APP_NAME"
CONFIG_DIR="/etc/$APP_NAME"
DATA_DIR="/var/lib/$APP_NAME"
LOG_DIR="/var/log/$APP_NAME"
SERVICE_FILE="/etc/systemd/system/$APP_NAME.service"
ENV_FILE="$CONFIG_DIR/.env"
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Logging functions
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Check if running as root
check_root() {
if [[ $EUID -ne 0 ]]; then
log_error "This script must be run as root"
exit 1
fi
}
# Install system dependencies
install_dependencies() {
log_info "Installing system dependencies..."
# Detect package manager
if command -v apt-get &> /dev/null; then
# Debian/Ubuntu
apt-get update
apt-get install -y \
build-essential \
pkg-config \
libssl-dev \
sqlite3 \
curl \
git
elif command -v yum &> /dev/null; then
# RHEL/CentOS
yum groupinstall -y "Development Tools"
yum install -y \
openssl-devel \
sqlite \
curl \
git
elif command -v dnf &> /dev/null; then
# Fedora
dnf groupinstall -y "Development Tools"
dnf install -y \
openssl-devel \
sqlite \
curl \
git
elif command -v pacman &> /dev/null; then
# Arch Linux
pacman -Syu --noconfirm \
base-devel \
openssl \
sqlite \
curl \
git
else
log_warn "Could not detect package manager. Please install dependencies manually."
fi
}
# Install Rust if not present
install_rust() {
log_info "Checking for Rust installation..."
if ! command -v rustc &> /dev/null; then
log_info "Installing Rust..."
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
else
log_info "Rust is already installed"
fi
# Verify installation
rustc --version
cargo --version
}
# Create system user and directories
setup_directories() {
log_info "Creating system user and directories..."
# Create user and group if they don't exist
if ! id "$APP_USER" &>/dev/null; then
# Arch uses /usr/bin/nologin, Debian/Ubuntu use /usr/sbin/nologin
NOLOGIN=$(command -v nologin 2>/dev/null || echo "/usr/bin/nologin")
useradd -r -s "$NOLOGIN" -M "$APP_USER"
fi
# Create directories
mkdir -p "$INSTALL_DIR"
mkdir -p "$CONFIG_DIR"
mkdir -p "$DATA_DIR"
mkdir -p "$LOG_DIR"
# Set permissions
chown -R "$APP_USER:$APP_GROUP" "$INSTALL_DIR"
chown -R "$APP_USER:$APP_GROUP" "$CONFIG_DIR"
chown -R "$APP_USER:$APP_GROUP" "$DATA_DIR"
chown -R "$APP_USER:$APP_GROUP" "$LOG_DIR"
chmod 750 "$INSTALL_DIR"
chmod 750 "$CONFIG_DIR"
chmod 750 "$DATA_DIR"
chmod 750 "$LOG_DIR"
}
# Build the application
build_application() {
log_info "Building the application..."
# Clone or update repository
if [[ ! -d "$INSTALL_DIR/.git" ]]; then
log_info "Cloning repository..."
git clone "$GIT_REPO" "$INSTALL_DIR"
else
log_info "Updating repository..."
cd "$INSTALL_DIR"
git pull
fi
# Build in release mode
cd "$INSTALL_DIR"
log_info "Building release binary..."
cargo build --release
# Verify build
if [[ -f "target/release/$APP_NAME" ]]; then
log_info "Build successful"
else
log_error "Build failed"
exit 1
fi
}
# Create configuration files
create_configuration() {
log_info "Creating configuration files..."
# Create .env file with API keys
cat > "$ENV_FILE" << EOF
# LLM Proxy Gateway Environment Variables
# Add your API keys here
# OpenAI API Key
# OPENAI_API_KEY=sk-your-key-here
# Google Gemini API Key
# GEMINI_API_KEY=AIza-your-key-here
# DeepSeek API Key
# DEEPSEEK_API_KEY=sk-your-key-here
# xAI Grok API Key
# GROK_API_KEY=gk-your-key-here
# Authentication tokens (comma-separated)
# LLM_PROXY__SERVER__AUTH_TOKENS=token1,token2,token3
EOF
# Create config.toml
cat > "$CONFIG_DIR/config.toml" << EOF
# LLM Proxy Gateway Configuration
[server]
port = 8080
host = "0.0.0.0"
# auth_tokens = ["token1", "token2", "token3"] # Uncomment to enable authentication
[database]
path = "$DATA_DIR/llm_proxy.db"
max_connections = 5
[providers.openai]
enabled = true
api_key_env = "OPENAI_API_KEY"
base_url = "https://api.openai.com/v1"
default_model = "gpt-4o"
[providers.gemini]
enabled = true
api_key_env = "GEMINI_API_KEY"
base_url = "https://generativelanguage.googleapis.com/v1"
default_model = "gemini-2.0-flash"
[providers.deepseek]
enabled = true
api_key_env = "DEEPSEEK_API_KEY"
base_url = "https://api.deepseek.com"
default_model = "deepseek-reasoner"
[providers.grok]
enabled = false # Disabled by default until API is researched
api_key_env = "GROK_API_KEY"
base_url = "https://api.x.ai/v1"
default_model = "grok-beta"
[model_mapping]
"gpt-*" = "openai"
"gemini-*" = "gemini"
"deepseek-*" = "deepseek"
"grok-*" = "grok"
[pricing]
openai = { input = 0.01, output = 0.03 }
gemini = { input = 0.0005, output = 0.0015 }
deepseek = { input = 0.00014, output = 0.00028 }
grok = { input = 0.001, output = 0.003 }
EOF
# Set permissions
chown "$APP_USER:$APP_GROUP" "$ENV_FILE"
chown "$APP_USER:$APP_GROUP" "$CONFIG_DIR/config.toml"
chmod 640 "$ENV_FILE"
chmod 640 "$CONFIG_DIR/config.toml"
}
# Create systemd service
create_systemd_service() {
log_info "Creating systemd service..."
cat > "$SERVICE_FILE" << EOF
[Unit]
Description=LLM Proxy Gateway
Documentation=https://git.dustin.coffee/hobokenchicken/llm-proxy
After=network.target
Wants=network.target
[Service]
Type=simple
User=$APP_USER
Group=$APP_GROUP
WorkingDirectory=$INSTALL_DIR
EnvironmentFile=$ENV_FILE
Environment="RUST_LOG=info"
Environment="LLM_PROXY__CONFIG_PATH=$CONFIG_DIR/config.toml"
Environment="LLM_PROXY__DATABASE__PATH=$DATA_DIR/llm_proxy.db"
ExecStart=$INSTALL_DIR/target/release/$APP_NAME
Restart=on-failure
RestartSec=5
# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=$DATA_DIR $LOG_DIR
# Resource limits (adjust based on your server)
MemoryMax=400M
MemorySwapMax=100M
CPUQuota=50%
LimitNOFILE=65536
# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=$APP_NAME
[Install]
WantedBy=multi-user.target
EOF
# Reload systemd
systemctl daemon-reload
}
# Setup nginx reverse proxy (optional)
setup_nginx_proxy() {
if ! command -v nginx &> /dev/null; then
log_warn "nginx not installed. Skipping reverse proxy setup."
return
fi
log_info "Setting up nginx reverse proxy..."
cat > "/etc/nginx/sites-available/$APP_NAME" << EOF
server {
listen 80;
server_name your-domain.com; # Change to your domain
# Redirect to HTTPS (recommended)
return 301 https://\$server_name\$request_uri;
}
server {
listen 443 ssl http2;
server_name your-domain.com; # Change to your domain
# SSL certificates (adjust paths)
ssl_certificate /etc/letsencrypt/live/your-domain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem;
# SSL configuration
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
# Proxy to LLM Proxy Gateway
location / {
proxy_pass http://127.0.0.1:8080;
proxy_http_version 1.1;
proxy_set_header Upgrade \$http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host \$host;
proxy_set_header X-Real-IP \$remote_addr;
proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto \$scheme;
# Timeouts
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
}
# Health check endpoint
location /health {
proxy_pass http://127.0.0.1:8080/health;
access_log off;
}
# Dashboard
location /dashboard {
proxy_pass http://127.0.0.1:8080/dashboard;
}
}
EOF
# Enable site
ln -sf "/etc/nginx/sites-available/$APP_NAME" "/etc/nginx/sites-enabled/"
# Test nginx configuration
nginx -t
log_info "nginx configuration created. Please update the domain and SSL certificate paths."
}
# Setup firewall
setup_firewall() {
log_info "Configuring firewall..."
# Check for ufw (Ubuntu)
if command -v ufw &> /dev/null; then
ufw allow 22/tcp # SSH
ufw allow 80/tcp # HTTP
ufw allow 443/tcp # HTTPS
ufw --force enable
log_info "UFW firewall configured"
fi
# Check for firewalld (RHEL/CentOS)
if command -v firewall-cmd &> /dev/null; then
firewall-cmd --permanent --add-service=ssh
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --reload
log_info "Firewalld configured"
fi
}
# Initialize database
initialize_database() {
log_info "Initializing database..."
# Run the application once to create database
sudo -u "$APP_USER" "$INSTALL_DIR/target/release/$APP_NAME" --help &> /dev/null || true
log_info "Database initialized at $DATA_DIR/llm_proxy.db"
}
# Start and enable service
start_service() {
log_info "Starting $APP_NAME service..."
systemctl enable "$APP_NAME"
systemctl start "$APP_NAME"
# Check status
sleep 2
systemctl status "$APP_NAME" --no-pager
}
# Verify installation
verify_installation() {
log_info "Verifying installation..."
# Check if service is running
if systemctl is-active --quiet "$APP_NAME"; then
log_info "Service is running"
else
log_error "Service is not running"
journalctl -u "$APP_NAME" -n 20 --no-pager
exit 1
fi
# Test health endpoint
if curl -s http://localhost:8080/health | grep -q "OK"; then
log_info "Health check passed"
else
log_error "Health check failed"
exit 1
fi
# Test dashboard
if curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/dashboard | grep -q "200"; then
log_info "Dashboard is accessible"
else
log_warn "Dashboard may not be accessible (this is normal if not configured)"
fi
log_info "Installation verified successfully!"
}
# Print next steps
print_next_steps() {
cat << EOF
${GREEN}=== LLM Proxy Gateway Installation Complete ===${NC}
${YELLOW}Next steps:${NC}
1. ${GREEN}Configure API keys${NC}
Edit: $ENV_FILE
Add your API keys for the providers you want to use
2. ${GREEN}Configure authentication${NC}
Edit: $CONFIG_DIR/config.toml
Uncomment and set auth_tokens for client authentication
3. ${GREEN}Configure nginx${NC}
Edit: /etc/nginx/sites-available/$APP_NAME
Update domain name and SSL certificate paths
4. ${GREEN}Test the API${NC}
curl -X POST http://localhost:8080/v1/chat/completions \\
-H "Content-Type: application/json" \\
-H "Authorization: Bearer your-token" \\
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}]
}'
5. ${GREEN}Access the dashboard${NC}
Open: http://your-server-ip:8080/dashboard
Or: https://your-domain.com/dashboard (if nginx configured)
${YELLOW}Useful commands:${NC}
systemctl status $APP_NAME # Check service status
journalctl -u $APP_NAME -f # View logs
systemctl restart $APP_NAME # Restart service
${YELLOW}Configuration files:${NC}
Service: $SERVICE_FILE
Config: $CONFIG_DIR/config.toml
Environment: $ENV_FILE
Database: $DATA_DIR/llm_proxy.db
Logs: $LOG_DIR/
${GREEN}For more information, see:${NC}
https://git.dustin.coffee/hobokenchicken/llm-proxy
$INSTALL_DIR/README.md
$INSTALL_DIR/deployment.md
EOF
}
# Main deployment function
deploy() {
log_info "Starting LLM Proxy Gateway deployment..."
check_root
install_dependencies
install_rust
setup_directories
build_application
create_configuration
create_systemd_service
initialize_database
start_service
verify_installation
print_next_steps
# Optional steps (uncomment if needed)
# setup_nginx_proxy
# setup_firewall
log_info "Deployment completed successfully!"
}
# Update function
update() {
log_info "Updating LLM Proxy Gateway..."
check_root
# Pull latest changes (while service keeps running)
cd "$INSTALL_DIR"
log_info "Pulling latest changes..."
git pull
# Build new binary (service stays up on the old binary)
log_info "Building release binary (service still running)..."
if ! cargo build --release; then
log_error "Build failed — service was NOT interrupted. Fix the error and try again."
exit 1
fi
# Verify binary exists
if [[ ! -f "target/release/$APP_NAME" ]]; then
log_error "Binary not found after build — aborting."
exit 1
fi
# Restart service to pick up new binary
log_info "Build succeeded. Restarting service..."
systemctl restart "$APP_NAME"
sleep 2
if systemctl is-active --quiet "$APP_NAME"; then
log_info "Update completed successfully!"
systemctl status "$APP_NAME" --no-pager
else
log_error "Service failed to start after update. Check logs:"
journalctl -u "$APP_NAME" -n 20 --no-pager
exit 1
fi
}
# Uninstall function
uninstall() {
log_info "Uninstalling LLM Proxy Gateway..."
check_root
# Stop and disable service
systemctl stop "$APP_NAME" 2>/dev/null || true
systemctl disable "$APP_NAME" 2>/dev/null || true
rm -f "$SERVICE_FILE"
systemctl daemon-reload
# Remove application files
rm -rf "$INSTALL_DIR"
rm -rf "$CONFIG_DIR"
# Keep data and logs (comment out to remove)
log_warn "Data directory $DATA_DIR and logs $LOG_DIR have been preserved"
log_warn "Remove manually if desired:"
log_warn " rm -rf $DATA_DIR $LOG_DIR"
# Remove user (optional)
read -p "Remove user $APP_USER? [y/N]: " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
userdel "$APP_USER" 2>/dev/null || true
groupdel "$APP_GROUP" 2>/dev/null || true
fi
log_info "Uninstallation completed!"
}
# Show usage
usage() {
cat << EOF
LLM Proxy Gateway Deployment Script
Usage: $0 [command]
Commands:
deploy - Install and configure LLM Proxy Gateway
update - Pull latest changes, rebuild, and restart
status - Show service status and health check
logs - Tail the service logs (Ctrl+C to stop)
uninstall - Remove LLM Proxy Gateway
help - Show this help message
Examples:
$0 deploy # Full installation
$0 update # Update to latest version
$0 status # Check if service is healthy
$0 logs # Follow live logs
EOF
}
# Status function
status() {
echo ""
log_info "Service status:"
systemctl status "$APP_NAME" --no-pager 2>/dev/null || log_warn "Service not found"
echo ""
# Health check
if curl -sf http://localhost:8080/health &>/dev/null; then
log_info "Health check: OK"
else
log_warn "Health check: FAILED (service may not be running or port 8080 not responding)"
fi
# Show current git commit
if [[ -d "$INSTALL_DIR/.git" ]]; then
echo ""
log_info "Installed version:"
git -C "$INSTALL_DIR" log -1 --format=" %h %s (%cr)" 2>/dev/null
fi
}
# Logs function
logs() {
log_info "Tailing $APP_NAME logs (Ctrl+C to stop)..."
journalctl -u "$APP_NAME" -f
}
# Parse command line arguments
case "${1:-}" in
deploy)
deploy
;;
update)
update
;;
status)
status
;;
logs)
logs
;;
uninstall)
uninstall
;;
help|--help|-h)
usage
;;
*)
usage
exit 1
;;
esac


@@ -1,13 +0,0 @@
-- Migration: add billing_mode to provider_configs
-- Adds a billing_mode TEXT column with default 'prepaid'
-- After applying, set Gemini to postpaid with:
-- UPDATE provider_configs SET billing_mode = 'postpaid' WHERE id = 'gemini';
BEGIN TRANSACTION;
ALTER TABLE provider_configs ADD COLUMN billing_mode TEXT DEFAULT 'prepaid';
COMMIT;
-- NOTE: If you use a production SQLite file, run the following to set Gemini to postpaid:
-- sqlite3 /path/to/db.sqlite "UPDATE provider_configs SET billing_mode='postpaid' WHERE id='gemini';"


@@ -1,13 +0,0 @@
-- Migration: add composite indexes for query performance
-- Adds three composite indexes:
-- 1. idx_llm_requests_client_timestamp on llm_requests(client_id, timestamp)
-- 2. idx_llm_requests_provider_timestamp on llm_requests(provider, timestamp)
-- 3. idx_model_configs_provider_id on model_configs(provider_id)
BEGIN TRANSACTION;
CREATE INDEX IF NOT EXISTS idx_llm_requests_client_timestamp ON llm_requests(client_id, timestamp);
CREATE INDEX IF NOT EXISTS idx_llm_requests_provider_timestamp ON llm_requests(provider, timestamp);
CREATE INDEX IF NOT EXISTS idx_model_configs_provider_id ON model_configs(provider_id);
COMMIT;


@@ -1,11 +0,0 @@
2026-03-06T20:07:36.737914Z  INFO Starting LLM Proxy Gateway v0.1.0
2026-03-06T20:07:36.738903Z  INFO Configuration loaded from Some("/home/newkirk/Documents/projects/web_projects/llm-proxy/config.toml")
2026-03-06T20:07:36.738945Z  INFO Encryption initialized
2026-03-06T20:07:36.739124Z  INFO Connecting to database at ./data/llm_proxy.db
2026-03-06T20:07:36.753254Z  INFO Database migrations completed
2026-03-06T20:07:36.753294Z  INFO Database initialized at "./data/llm_proxy.db"
2026-03-06T20:07:36.755187Z  INFO Fetching model registry from https://models.dev/api.json
2026-03-06T20:07:37.000853Z  INFO Successfully loaded model registry
2026-03-06T20:07:37.001382Z  INFO Model config cache initialized
2026-03-06T20:07:37.001702Z  WARN SESSION_SECRET environment variable not set. Using a randomly generated secret. This will invalidate all sessions on restart. Set SESSION_SECRET to a fixed hex or base64 encoded 32-byte value.
2026-03-06T20:07:37.002898Z  INFO Server listening on http://0.0.0.0:8082


@@ -1 +0,0 @@
945904


@@ -1,14 +0,0 @@
gantt
title LLM Proxy Project Timeline
dateFormat YYYY-MM-DD
section Frontend
Standardize Escaping (users.js) :a1, 2026-03-06, 1d
section Backend Cleanup
Remove Unused Imports :b1, 2026-03-06, 1d
section HMAC Migration
Architecture Design :c1, 2026-03-07, 1d
Backend Implementation :c2, after c1, 2d
Session Refresh Logic :c3, after c2, 1d
section Testing
Integration Test (Encrypted Keys) :d1, 2026-03-09, 2d
HMAC Verification Tests :d2, after c3, 1d