feat(audio): Gemini Live API replaces Whisper+GPT+ElevenLabs

Single WebSocket proxy: frontend PCM16 16kHz → backend → Gemini Live API.
Gemini returns PCM16 24kHz audio plus text; playback goes through a Web Audio API queue.
Removed OpenAI/DeepSeek deps. Model: gemini-3.1-flash-live-preview.
Voice: Aoede. Streaming bidirectional audio with silence gating.
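
The commit does not say how or where silence gating is applied. As a rough illustration only, one common approach is to measure the RMS level of each PCM16 frame and skip frames below a threshold before forwarding them. The names `pcm16_rms`, `should_forward`, and `RMS_THRESHOLD` are hypothetical, and the threshold value is an assumption, not taken from this change.

```python
import math
import struct

SAMPLE_RATE_HZ = 16_000  # input rate named in the commit message
RMS_THRESHOLD = 500.0    # hypothetical cutoff on the int16 amplitude scale


def pcm16_rms(frame: bytes) -> float:
    """Return the RMS amplitude of a little-endian mono PCM16 frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    # Ignore a trailing odd byte, if any, so unpack sees whole samples only.
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def should_forward(frame: bytes, threshold: float = RMS_THRESHOLD) -> bool:
    """Gate: forward the frame only if it is at least as loud as the threshold."""
    return pcm16_rms(frame) >= threshold
```

A 20 ms frame at 16 kHz holds 320 samples (640 bytes), so a caller could run `should_forward` on each such chunk before sending it over the WebSocket.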
2026-06-05 23:36:29 -04:00
parent d2bde65645
commit 83a990e838
6 changed files with 331 additions and 286 deletions
-2
@@ -1,10 +1,8 @@
 fastapi>=0.115.0
 uvicorn[standard]>=0.34.0
 python-dotenv>=1.1.0
-openai>=1.55.0
 websockets>=14.1
 pydantic>=2.10.0
 pydantic-settings>=2.7.0
 httpx>=0.28.0
 honcho-ai>=2.1.0
-openai[realtime]>=2.41.0