kira/backend/requirements.txt at 08932068fd2c39de0064b4249d78064dfb1d97da - kira - EspressOps

hobokenchicken/kira

hobokenchicken e2332af8d0 feat: OpenAI Realtime API pipeline

Replaced the 3-step sequential pipeline (Whisper STT → DeepSeek LLM
→ OpenAI TTS) with a single OpenAI Realtime API WebSocket using
gpt-4o-mini-realtime-preview.

- Latency of ~300-800 ms, down from 1-3 s with the sequential pipeline
- Server VAD for automatic turn detection
- Streaming audio chunks during playback
- Interruptions: user can speak over Kira mid-response
- Honcho memory still injected into session instructions
- Frontend captures PCM16 mono 24kHz via AudioContext
- Backend relays client ↔ OpenAI Realtime API
- Supports both voice (PCM16) and text input
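
As a rough illustration of the relay described above, the sketch below builds two of the JSON events a backend might send over the Realtime WebSocket: a `session.update` that enables server VAD and PCM16 audio with memory text appended to the instructions, and an `input_audio_buffer.append` that carries base64-encoded PCM16. The helper names (`build_session_update`, `floats_to_pcm16`, `build_audio_append`) and the `memory` string are hypothetical and are not taken from this repository. The field names follow the beta Realtime event shape as commonly documented and should be checked against the API version in use.

```python
import base64
import json
import struct

SAMPLE_RATE_HZ = 24_000  # PCM16 mono at 24 kHz, as stated in the commit message


def build_session_update(base_instructions: str, memory: str) -> str:
    """Return a session.update event as JSON text.

    `memory` is a placeholder for whatever Honcho returns; fetching it is
    out of scope here (assumption, not the repository's code).
    """
    instructions = base_instructions
    if memory:
        instructions += "\n\nKnown context about the user:\n" + memory
    event = {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "modalities": ["audio", "text"],
            "input_audio_format": "pcm16",
            "output_audio_format": "pcm16",
            # Server-side voice activity detection handles turn-taking.
            "turn_detection": {"type": "server_vad"},
        },
    }
    return json.dumps(event)


def floats_to_pcm16(samples: list[float]) -> bytes:
    """Clamp samples to [-1, 1] and pack them as little-endian signed 16-bit."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def build_audio_append(pcm16: bytes) -> str:
    """Wrap raw PCM16 bytes in an input_audio_buffer.append event."""
    event = {
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm16).decode("ascii"),
    }
    return json.dumps(event)
```

In a relay, strings like these would be forwarded to the upstream WebSocket as text frames. The connection handling, authentication, and playback side are deliberately left out.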

2026-06-04 13:32:39 -04:00

fastapi>=0.115.0
uvicorn[standard]>=0.34.0
python-dotenv>=1.1.0
openai[realtime]>=2.41.0
websockets>=14.1
pydantic>=2.10.0
pydantic-settings>=2.7.0
httpx>=0.28.0
honcho-ai>=2.1.0