feat(audio): Gemini Live API replaces Whisper+GPT+ElevenLabs
Single WebSocket proxy: frontend PCM16 16kHz → backend → Gemini Live API. Gemini returns PCM16 24kHz audio + text. Playback via Web Audio API queue. Removed OpenAI/DeepSeek deps. Model: gemini-3.1-flash-live-preview. Voice: Aoede. Streaming bidirectional audio with silence gating.
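The commit message mentions silence gating on the PCM16 16kHz input but does not show how it works. A minimal sketch of one common approach, an RMS threshold applied per frame, might look like the following. The 20 ms frame length, the threshold of 500, and the names `pcm16_rms` and `gate_silence` are illustrative assumptions, not taken from this commit.

```python
import math
import struct

SAMPLE_RATE_HZ = 16_000  # input rate named in the commit message
FRAME_MS = 20            # assumption: frame length is not stated in the commit
RMS_THRESHOLD = 500.0    # assumption: hypothetical gate level on the int16 scale


def pcm16_rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a little-endian mono PCM16 frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def gate_silence(frames, threshold=RMS_THRESHOLD):
    """Yield only the frames whose RMS reaches the threshold."""
    for frame in frames:
        if pcm16_rms(frame) >= threshold:
            yield frame


# Usage: one silent frame, one 440 Hz tone frame, one silent frame.
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000
silent = struct.pack(f"<{samples_per_frame}h", *([0] * samples_per_frame))
tone = struct.pack(
    f"<{samples_per_frame}h",
    *(
        int(8000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ))
        for n in range(samples_per_frame)
    ),
)
kept = list(gate_silence([silent, tone, silent]))
```

Here only the tone frame survives the gate, so the silent frames would never be forwarded upstream. A real gate would probably also add a short hangover period so that word endings are not clipped.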
@@ -1,10 +1,8 @@
 fastapi>=0.115.0
 uvicorn[standard]>=0.34.0
 python-dotenv>=1.1.0
-openai>=1.55.0
 websockets>=14.1
 pydantic>=2.10.0
 pydantic-settings>=2.7.0
 httpx>=0.28.0
 honcho-ai>=2.1.0
-openai[realtime]>=2.41.0