hobokenchicken 97424cb98f init: Kira — AI body double with Honcho memory

Full voice pipeline (Whisper STT -> DeepSeek LLM -> OpenAI TTS),
animated SVG avatar (Live2D-ready), girly-pop UI, lofi music,
timer/notes/pets/wardrobe widgets, 10 background scenes with
particle effects, Honcho cross-session memory.

2026-06-04 10:51:38 -04:00

3.9 KiB


AI Body Double — Project Scope Questions

1. Avatar Style

What visual direction for the character companion?

Option A: Cute 2D illustrated character — Simple animated sprite (blink, sway, idle bounce, wave). Think a stylized flat-vector character. Fast to build, runs anywhere.
Option B: Live2D / Vtuber-style — Full rigged character with lip-sync, gestures, head tracking. Looks incredible but requires custom art assets and is significantly more work.
Option C: Pixel art character — Retro chibi sprite with simple animations. Cozy, low-fi aesthetic.
Option D: No character art yet — Start with a clean UI dashboard, add the character later.

2. Voice (TTS)

What level of voice quality?

Option A: Cloud API (ElevenLabs) — Best quality female voices, natural intonation, can do "girly-pop" vibe. ~$5/month.
Option B: Local Piper TTS — Free, self-hosted, no recurring cost. Lower quality, more robotic, but plenty of female voice models available.
Option C: OpenAI TTS — Good quality, pay-per-use (~$0.015/minute). Middle ground.
Option D: No voice yet — Start text-only, add TTS later.
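To compare Options A and C on cost, a rough break-even point can be worked out from the two figures quoted above. This is only a sketch: the function name is hypothetical, and the prices are the approximate ones from this list rather than current vendor pricing.

```python
# Approximate prices taken from the options above (assumptions, not vendor quotes).
FLAT_MONTHLY_USD = 5.00   # Option A: ~$5/month
PER_MINUTE_USD = 0.015    # Option C: ~$0.015 per minute of generated speech


def break_even_minutes(flat_monthly: float, per_minute: float) -> float:
    """Minutes of speech per month at which pay-per-use equals the flat plan."""
    if per_minute <= 0:
        raise ValueError("per_minute must be positive")
    return flat_monthly / per_minute


minutes = break_even_minutes(FLAT_MONTHLY_USD, PER_MINUTE_USD)
print(f"Break-even: {minutes:.0f} minutes/month (~{minutes / 30:.0f} minutes/day)")
```

Under these figures, pay-per-use stays cheaper below roughly 333 minutes of generated speech per month, or about 11 minutes per day.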

3. Platform

Where does this need to run?

Browser-based (web app) — Accessible from any laptop/phone/tablet on the home network. Easiest to build and iterate.
Desktop app (Electron/Tauri) — Native feel, offline capable. More work.
Mobile app — Phone-native experience. Most work.
Both — Browser for laptop, but also accessible on phone.

4. Start Scope

How should we slice the first working version?

Option A: MVP — Character + Lo-fi + Timer + Notes. Build the visual companion, lo-fi music player, Pomodoro/focus timer, and a notes widget. Get the "presence" and toolset right first. Add voice interactivity in phase 2.
Option B: Full build — Everything including voice. Go straight for the full pipeline: STT (microphone input) -> LLM (processes what she says) -> TTS (talks back) + all tools. Longer first delivery but one complete system.
Option C: Somewhere in between. Build the dashboard + character + tools, plus TTS only (so the assistant talks to her but doesn't listen yet). Add microphone input in phase 2.
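One way to keep these options from diverging in code is to treat each voice stage as an optional, swappable piece. The sketch below is illustrative only: every class and function name is hypothetical, and the three `fake_*` stages are stand-ins, not real STT, LLM, or TTS engines.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Each stage is a plain callable so a real engine can be swapped in later.
SpeechToText = Callable[[bytes], str]   # audio in -> transcript
Respond = Callable[[str], str]          # user text -> reply text
TextToSpeech = Callable[[str], bytes]   # reply text -> audio out


@dataclass
class Turn:
    heard: str
    reply: str
    audio: Optional[bytes]


@dataclass
class Assistant:
    respond: Respond
    stt: Optional[SpeechToText] = None  # None until microphone input ships
    tts: Optional[TextToSpeech] = None  # None for a text-only build

    def handle_text(self, text: str) -> Turn:
        reply = self.respond(text)
        audio = self.tts(reply) if self.tts else None
        return Turn(heard=text, reply=reply, audio=audio)

    def handle_audio(self, audio: bytes) -> Turn:
        if self.stt is None:
            raise RuntimeError("microphone input is not enabled in this build")
        return self.handle_text(self.stt(audio))


# Placeholder stages, for illustration only.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


# Option C: talks back but does not listen yet.
option_c = Assistant(respond=fake_respond, tts=fake_tts)
# Option B: full STT -> LLM -> TTS pipeline.
option_b = Assistant(respond=fake_respond, stt=fake_stt, tts=fake_tts)

print(option_c.handle_text("start a timer").reply)
print(option_b.handle_audio(b"hello").audio)
```

With this shape, moving from Option C to Option B in phase 2 means supplying an `stt` stage rather than restructuring the assistant.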

5. LLM Backend (Brain)

What powers the assistant's responses and personality?

Local (Ollama) — Self-hosted, free, private. You've got an AMD Radeon RX 6650 XT that can accelerate inference. Runs models like Llama 3, Mistral, Qwen.
Cloud API — Better personality/instruction following. OpenAI, Anthropic, or similar. Small cost per month.
Not needed at first — Start with scripted responses and canned encouragement, add AI conversational ability later.
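Whichever backend is chosen, putting it behind one small interface would let the project start with scripted responses and swap in a model later. A minimal sketch follows, assuming hypothetical names throughout (`Brain`, `ScriptedBrain`, `ModelBrain`); the `generate` callable stands in for whatever local or cloud client ends up being used, and the encouragement lines are placeholders.

```python
import random
from typing import Callable, Optional, Protocol


class Brain(Protocol):
    """Anything that can turn a user message into a reply."""

    def reply(self, message: str) -> str: ...


class ScriptedBrain:
    """Canned encouragement; needs no model at all."""

    LINES = (
        "You're doing great, keep going!",
        "One small step at a time.",
        "Proud of you for showing up today.",
    )

    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = random.Random(seed)

    def reply(self, message: str) -> str:
        # Ignores the message and picks a line at random.
        return self._rng.choice(self.LINES)


class ModelBrain:
    """Wraps any text-generation function, local or cloud."""

    def __init__(self, generate: Callable[[str], str], persona: str) -> None:
        self._generate = generate  # placeholder for a real client call
        self._persona = persona

    def reply(self, message: str) -> str:
        # Prepend the persona so the personality stays consistent across backends.
        return self._generate(f"{self._persona}\n\nUser: {message}")


def chat(brain: Brain, message: str) -> str:
    return brain.reply(message)


print(chat(ScriptedBrain(seed=0), "I can't focus today"))
```

The rest of the app would only ever call `chat`, so changing the backend would not touch the UI or the voice stages.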

6. Music & Audio

How should the lo-fi / white noise work?

Streaming (YouTube/SoundCloud URLs) — Curate playlists of lo-fi study beats. Free, endless variety.
Local audio files — Download lo-fi tracks and ambient sounds. Works offline.
Generated — Use AI music generation for custom tracks. Experimental.
Integrated web player — Embed something like Spotify or YouTube Music.

7. Virtual Pet

What kind of pet?

Cat — Fits the cozy/girly aesthetic. Classic.
Dog — Energetic companion.
Fantasy creature — A cute blob/slime/fairy/dragon.
Customizable — Let her pick, or unlock different ones.

8. Background Scenes

What environments should be available?

Cozy bedroom / study
Coffee shop / café
Garden / nature
Rainy window
Starry night / space
Underwater / aquarium
Minimalist / clean
Seasonal (winter cabin, spring garden, autumn library)

Project Name (optional)

What should we call this? Some ideas:

Buddy (simple)
CozyFocus / CoPilot (functional)
Luna / Mochi / Coco (character name)
Something else?

Once you answer these, I'll write up the full architecture plan and start building.
