Kira ✨
AI body double — a girly-pop focus companion with real-time voice conversation, lo-fi music, ADHD tools, and a customizable avatar.
Your wife's very own focus bestie who's always there when her real body double isn't available.
Features
- 🎙️ Voice conversation — Push-to-talk microphone → STT (Whisper) → LLM (DeepSeek) → TTS (OpenAI Nova voice)
- 💬 Text chat fallback — Type when you don't want to speak
- 🎶 Lo-fi music — Streaming from Lofi Girl YouTube channel
- 🍅 Timer — Pomodoro, countdown, stopwatch modes
- 📝 Notes — Quick task list / check-in notes
- 🎨 10 background scenes — Cozy room, coffee shop, garden, rainy window, starry night, sakura, ocean, autumn, winter cabin, sunset
- ✨ Particle effects — Rain, stars, cherry petals, snow per scene
- 👘 Wardrobe — 5 outfits + 5 accessories (bow, glasses, flower crown, earrings, scarf)
- 🐱 Pet zone — Two cats: Mochi (orange fluffy) and Luna (sleepy black)
- 🕐 Live clock — Time + date
- 🌸 Animated avatar — Blinking, speaking mouth, waving, outfit colors (Live2D-ready)
Quick Start
1. Get API keys
| Service | What for | Where |
|---|---|---|
| OpenAI | Whisper STT + TTS (Nova voice) | https://platform.openai.com/api-keys |
| DeepSeek | LLM brain | https://platform.deepseek.com/api_keys |
2. Configure
cd ~/Projects/ai-body-double
cp .env.example .env
# Edit .env with your API keys:
# OPENAI_API_KEY=sk-...
# DEEPSEEK_API_KEY=sk-...
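A minimal sketch of how the backend could check these two keys at startup and fail early when one is missing. `Settings` and `load_settings` are hypothetical names used for illustration; this is not the actual content of `backend/config.py`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """API keys the backend needs (hypothetical container, not the real config.py)."""
    openai_api_key: str
    deepseek_api_key: str


def load_settings(env=None) -> Settings:
    """Read both keys from the environment and raise if either is missing or empty."""
    env = os.environ if env is None else env
    required = ("OPENAI_API_KEY", "DEEPSEEK_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        deepseek_api_key=env["DEEPSEEK_API_KEY"],
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to exercise without touching the real environment.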
3. Run
docker compose up -d
Open http://kira.hobokenchicken.com:3000 (or wherever you deploy it).
4. Add a Caddy entry (homelab)
kira.hobokenchicken.com {
    reverse_proxy 172.20.0.X:3000
}
Architecture
Browser ──WebSocket──▶ Backend (FastAPI)
│                        │
├─ Mic audio ──────────▶ ├─ Whisper API (STT)
│                        ├─ DeepSeek (LLM)
│ ◀── TTS audio ──────── ├─ OpenAI TTS
│                        │
├─ YouTube embed (lo-fi) │
├─ Timer / Notes / Cats  │
└─ Animated avatar       │
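The voice path in the diagram can be sketched as one async function per utterance: audio bytes go in, the transcript passes through the LLM, and synthesized audio comes back. The three stage functions below are stand-in stubs so the sketch runs without API keys; `handle_utterance` and the `fake_*` helpers are hypothetical names, and the real services live in `backend/services/`.

```python
import asyncio
from typing import Awaitable, Callable

# Signatures for the three stages (assumed shapes, not the repo's actual interfaces).
Transcribe = Callable[[bytes], Awaitable[str]]
Complete = Callable[[str], Awaitable[str]]
Synthesize = Callable[[str], Awaitable[bytes]]


async def handle_utterance(audio: bytes, stt: Transcribe, llm: Complete, tts: Synthesize) -> bytes:
    """Run one utterance through STT -> LLM -> TTS and return the reply audio."""
    transcript = await stt(audio)
    if not transcript.strip():
        return b""  # nothing was said, so skip the LLM and TTS calls
    reply = await llm(transcript)
    return await tts(reply)


# Stub stages, purely illustrative: they echo text instead of calling any API.
async def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


async def fake_llm(text: str) -> str:
    return f"You said: {text}"


async def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    print(asyncio.run(handle_utterance(b"time to focus", fake_stt, fake_llm, fake_tts)))
```

Passing the stages in as callables keeps the handler independent of any one provider, which makes it straightforward to swap a stage or test the flow offline.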
Live2D Model Setup
Kira currently uses a CSS/SVG animated placeholder avatar. To add a Live2D model:
- Commission or obtain a .model3.json (Cubism 4.x format) model
- Place the model directory in frontend/public/live2d/models/
- Rename the model entry point to kira.model3.json
- The WebGL renderer will auto-detect and switch
Required model files:
- kira.model3.json — model definition
- kira.moc3 — mesh/deformer data
- kira.cdi3.json — display info (optional)
- textures/ — PNG texture files
- motions/ — animation files (optional)
- expressions/ — face expression files (optional)
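Before dropping a model in, it can help to check the directory against the file list above. A small sketch, assuming the files sit directly inside the model directory under the names listed; `check_model_dir` is a hypothetical helper and not part of the repo.

```python
from pathlib import Path

# Names taken from the list above; entries not marked optional are treated as required.
REQUIRED = ["kira.model3.json", "kira.moc3", "textures"]
OPTIONAL = ["kira.cdi3.json", "motions", "expressions"]


def check_model_dir(model_dir):
    """Return (missing_required, missing_optional) for a Live2D model directory."""
    root = Path(model_dir)
    missing_required = [name for name in REQUIRED if not (root / name).exists()]
    missing_optional = [name for name in OPTIONAL if not (root / name).exists()]
    return missing_required, missing_optional
```

An empty first list means every required file is present; the second list is informational only.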
Recommended creators for custom Live2D models: search VGen, VTube Studio model artists, or Fiverr.
Color Palette
| Token | Hex | Usage |
|---|---|---|
| Kira Pink | #FFB6C1 | Primary accent |
| Kira Lavender | #D8B4FE | Secondary accent |
| Kira Mint | #A7F3D0 | Success/status |
| Background | #FFF5F5 | Card/comfy bg |
| Text Plum | #4A1942 | Body text |
| Text Violet | #7C3AED | Soft text |
Project Structure
ai-body-double/
├── docker-compose.yml
├── .env
├── backend/                        # FastAPI + WebSocket
│   ├── main.py                     # WS handler (STT→LLM→TTS pipeline)
│   ├── services/
│   │   ├── stt.py                  # OpenAI Whisper
│   │   ├── llm.py                  # DeepSeek
│   │   └── tts.py                  # OpenAI TTS
│   └── config.py                   # Env config
├── frontend/                       # React + Vite + TailwindCSS
│   ├── src/
│   │   ├── App.tsx                 # Main layout
│   │   ├── components/
│   │   │   ├── AnimatedAvatar.tsx  # SVG animated character
│   │   │   ├── KiraAvatar.tsx      # Live2D loader + fallback
│   │   │   ├── ChatBubble.tsx      # Conversation display
│   │   │   ├── Timer.tsx           # Pomodoro/stopwatch
│   │   │   ├── MusicPlayer.tsx     # Lofi Girl embed
│   │   │   ├── PetZone.tsx         # Two CSS cats
│   │   │   ├── Wardrobe.tsx        # Outfit + accessories
│   │   │   └── Particles.tsx       # Rain/stars/petals/snow
│   │   └── hooks/
│   │       └── useConversation.ts  # WS + audio management
│   └── public/live2d/              # Cubism SDK + model slot
└── README.md