feat: hybrid pipeline — gpt-realtime-whisper + gpt-5.4-nano + TTS

Hybrid approach gives streaming STT at ~/usr/bin/bash.017/min + cheap brain at ~/usr/bin/bash.001/min + TTS at ~/usr/bin/bash.015/min = ~/usr/bin/bash.033/min total. - gpt-realtime-whisper handles streaming transcription with VAD - gpt-5.4-nano handles response generation (chat completions) - OpenAI TTS (nova) for voice output - Server VAD detects utterance boundaries - Honcho memory context injected into system prompt - Removed old full Realtime relay service
2026-06-04 13:48:06 -04:00
parent 1c15d42e06
commit 274d04ea10
4 changed files with 281 additions and 284 deletions
@@ -142,6 +142,10 @@ export function useConversation() {
        addMessage(msg.role === 'user' ? 'user' : 'kira', msg.text);
        break;

+      case 'transcript_delta':
+        // Streaming partial transcript — could show as typing indicator
+        break;
+
      case 'speaking_start':
        setIsKiraSpeaking(true);
        break;