hobokenchicken/kira

Fork 0

Commit Graph

Author	SHA1	Message	Date
hobokenchicken	e2332af8d0	feat: OpenAI Realtime API pipeline Replaced the 3-step sequential pipeline (Whisper STT → DeepSeek LLM → OpenAI TTS) with a single OpenAI Realtime API WebSocket using gpt-4o-mini-realtime-preview. - ~300-800ms latency vs 1-3s - Server VAD for automatic turn detection - Streaming audio chunks during playback - Interruptions: user can speak over Kira mid-response - Honcho memory still injected into session instructions - Frontend captures PCM16 mono 24kHz via AudioContext - Backend relays client ↔ OpenAI Realtime API - Supports both voice (PCM16) and text input	2026-06-04 13:32:39 -04:00
hobokenchicken	97424cb98f	init: Kira — AI body double with Honcho memory Full voice pipeline (Whisper STT -> DeepSeek LLM -> OpenAI TTS), animated SVG avatar (Live2D-ready), girly-pop UI, lofi music, timer/notes/pets/wardrobe widgets, 10 background scenes with particle effects, Honcho cross-session memory.	2026-06-04 10:51:38 -04:00

Author

SHA1

Message

Date

hobokenchicken

e2332af8d0

feat: OpenAI Realtime API pipeline

Replaced the 3-step sequential pipeline (Whisper STT → DeepSeek LLM
→ OpenAI TTS) with a single OpenAI Realtime API WebSocket using
gpt-4o-mini-realtime-preview.

- ~300-800ms latency vs 1-3s
- Server VAD for automatic turn detection
- Streaming audio chunks during playback
- Interruptions: user can speak over Kira mid-response
- Honcho memory still injected into session instructions
- Frontend captures PCM16 mono 24kHz via AudioContext
- Backend relays client ↔ OpenAI Realtime API
- Supports both voice (PCM16) and text input

2026-06-04 13:32:39 -04:00

hobokenchicken

97424cb98f

init: Kira — AI body double with Honcho memory

Full voice pipeline (Whisper STT -> DeepSeek LLM -> OpenAI TTS),
animated SVG avatar (Live2D-ready), girly-pop UI, lofi music,
timer/notes/pets/wardrobe widgets, 10 background scenes with
particle effects, Honcho cross-session memory.

2026-06-04 10:51:38 -04:00

2 Commits