hobokenchicken/kira

Author	SHA1	Message	Date
hobokenchicken	4641d74536	fix(welcome): make WelcomeScreen support isCompact prop to prevent full-screen CSS clash when rendering inside saved-ID wrapper card in App.tsx Per PLAN item 8. Saved users now get a clean compact welcome prompt without double min-h-screen divs.	2026-06-04 16:04:14 -04:00
hobokenchicken	eb5952adc6	fix(deprecations): remove dead ScriptProcessorNode PCM code (eliminates console warning); improve YouTube playerVars with origin/modestbranding (reduce postMessage spam); fix Timer stopwatch (now properly counts UP, clean display + interval) Per PLAN item 7.	2026-06-04 15:59:02 -04:00
hobokenchicken	59b72aa184	feat(white-noise): add Web Audio generated white/pink/brown/rain/cafe noise player Separate from lofi music per original spec. Toggleable, volume control, always available in focus column. Finishes item 5.	2026-06-04 15:49:18 -04:00
hobokenchicken	3f1497174d	feat(ui): integrate Notes component into main grid (was dead import) Per original spec and AUDIT/PLAN item 4. Notes now visible alongside Timer/Music.	2026-06-04 15:38:52 -04:00
hobokenchicken	771c00830a	feat(ui): display livePartial / transcript_delta in ChatBubble as 'Hearing:' indicator - REST STT now sends delta (full text) so UI lights up immediately with what was heard. - Works as 'live' for the final transcript (true partials would stream words if Realtime was available). - Per PLAN item 3.	2026-06-04 15:34:37 -04:00
hobokenchicken	77cbd91b93	fix(tts): play Opus chunks immediately as they arrive instead of buffering until speaking_end This makes the voice start playing the first words while the rest of the response is still generating (big win for perceived latency). Per PLAN item 2.	2026-06-04 15:28:40 -04:00
hobokenchicken	0e74a16b40	fix(stt): revert to reliable REST gpt-4o-transcribe + MediaRecorder full-blob (Realtime WS not accessible on key) - Backend: added transcribe_audio (gpt-4o-transcribe), switched audio handler to full blob -> REST -> LLM -> streaming TTS - Frontend: MediaRecorder (webm/opus) full recording sent on stop (one blob per utterance) - Removed dead WhisperStream callbacks and pending_transcript/lock - This unblocks voice per AUDIT item 1 (Option B fallback). Deltas will come in later item. - Also preps for deprecation fix (MediaRecorder is the good path).	2026-06-04 15:23:57 -04:00
hobokenchicken	7502f201c7	feat: Realtime WebSocket STT via gpt-realtime-whisper Replaces REST-based transcription (gpt-4o-transcribe) with WebSocket streaming via gpt-realtime-whisper. Frontend captures PCM16 audio and streams it through the backend to a Realtime transcription session. - Server-side VAD detects utterance boundaries automatically - Word-level transcript deltas stream to the client in real-time - On utterance end, gpt-5.4-nano generates a response - TTS streams back via with_streaming_response - Total pipeline: PCM16 → Realtime WS → LLM → streaming TTS	2026-06-04 14:26:19 -04:00
hobokenchicken	9cd183a83b	fix: streaming TTS via with_streaming_response Replaced synchronous TTS (waiting for full audio at 5.9s) with streaming TTS that sends audio chunks as they arrive. Frontend now accumulates chunks in audioBufferRef and plays the complete stream on speaking_end. Reduces TTS latency from ~6s to ~1s first byte.	2026-06-04 14:17:54 -04:00
hobokenchicken	c5cc4dd480	fix: replace PCM16 capture with MediaRecorder (Opus/webm) PCM16 capture via AudioContext was streaming raw audio continuously, causing massive accumulated buffers that took ~20s to transcribe. Replaced with MediaRecorder which records compressed Opus/webm and sends a single blob on release — much smaller, faster to transcribe. Also removed all unused PCM16/WAV helper functions from both frontend and backend.	2026-06-04 14:04:44 -04:00
hobokenchicken	537ddcd841	fix: play Opus TTS audio directly instead of WAV-converting it The backend sends Opus-encoded audio from OpenAI TTS (tts-1 with response_format=opus). The frontend was treating it as raw PCM16 and wrapping it in a WAV container, which corrupted the audio into static. Now plays the Opus data directly as audio/ogg.	2026-06-04 13:59:04 -04:00
hobokenchicken	f2a5416408	feat: cheapest pipeline — gpt-4o-mini-transcribe + gpt-5.4-nano + TTS Simple 3-step chat completions pipeline at ~$0.019/min total. Streams PCM16 audio from frontend, transcribes on release, generates response via gpt-5.4-nano, speaks via OpenAI TTS. Cost breakdown: gpt-4o-mini-transcribe: $0.003/min; gpt-5.4-nano: ~$0.001/min; OpenAI TTS (nova): $0.015/min; Total: ~$0.019/min (~$0.57/day at 30min)	2026-06-04 13:51:35 -04:00
hobokenchicken	274d04ea10	feat: hybrid pipeline — gpt-realtime-whisper + gpt-5.4-nano + TTS Hybrid approach gives streaming STT at ~$0.017/min + cheap brain at ~$0.001/min + TTS at ~$0.015/min = ~$0.033/min total. - gpt-realtime-whisper handles streaming transcription with VAD - gpt-5.4-nano handles response generation (chat completions) - OpenAI TTS (nova) for voice output - Server VAD detects utterance boundaries - Honcho memory context injected into system prompt - Removed old full Realtime relay service	2026-06-04 13:48:06 -04:00
hobokenchicken	e2332af8d0	feat: OpenAI Realtime API pipeline Replaced the 3-step sequential pipeline (Whisper STT → DeepSeek LLM → OpenAI TTS) with a single OpenAI Realtime API WebSocket using gpt-4o-mini-realtime-preview. - ~300-800ms latency vs 1-3s - Server VAD for automatic turn detection - Streaming audio chunks during playback - Interruptions: user can speak over Kira mid-response - Honcho memory still injected into session instructions - Frontend captures PCM16 mono 24kHz via AudioContext - Backend relays client ↔ OpenAI Realtime API - Supports both voice (PCM16) and text input	2026-06-04 13:32:39 -04:00
hobokenchicken	e64698b0ab	fix: graceful mic-unavailable handling over HTTP navigator.mediaDevices.getUserMedia() requires a secure context (HTTPS or localhost). When accessed over plain HTTP, the API is undefined. Now shows a friendly chat message instead of a cryptic TypeError in the console.	2026-06-04 12:12:07 -04:00
hobokenchicken	895fb9ac0b	fix: Live2D Ticker registration + outfit texture swap path - Registered pixi Ticker via (Live2DModel as any).registerTicker() to fix 'No Ticker registered' warning and animation issues - Fixed outfit texture swap: textures live on model.textures[] not model.internalModel.textures[]	2026-06-04 12:10:20 -04:00
hobokenchicken	3d3df64d7c	fix: missing mic toggle in Live2D view + YouTube autoplay KiraAvatar: Added Talk mic button to Live2D view (was only in AnimatedAvatar fallback). Includes listening-pulse animation. MusicPlayer: Replaced hidden YouTube iframe with proper IFrame Player API. Now starts on explicit user click (Start Lo-Fi button), complying with browser autoplay policies. Supports station switching and volume control after playback starts.	2026-06-04 12:06:16 -04:00
hobokenchicken	bee428ae0c	fix: outfit texture swap via internalModel.textures array model.internalModel.coreModel.setTexture() expects a raw WebGL texture, not a PixiJS Texture. Instead, set the new PixiJS Texture directly on the model's internalModel.textures[2] array. The render loop's bindTexture() call extracts the WebGL handle from the PixiJS BaseTexture and passes it to the Cubism core. This eliminates the cascade of try-catch fallbacks and the 'coreModel.setTexture is not a function' TypeError.	2026-06-04 12:02:48 -04:00
hobokenchicken	0a6946b580	fix: pixi v7 isInteractive TypeError + outfit texture swap - Added isInteractive() stub on Live2DModel to prevent 'e.isInteractive is not a function' errors in pixi v7 - Swapped outfit texture loading to use Assets.load() with cascading fallbacks (model.setTexture -> internalModel -> coreModel) - Removed unused Texture import	2026-06-04 11:57:33 -04:00
hobokenchicken	d519258942	fix: TypeScript build errors in Docker - useRef(null) initial values for Strict TS 6.0 - Removed deprecated transparent option in pixi v7 - Cast Live2DModel to any for addChild type compat - Cast coreModel for getTexture access - Fixed types/index.ts import path to scenes - Added vite-env.d.ts with CSS module declarations - Added null-coalesce for clearInterval calls	2026-06-04 11:46:58 -04:00
hobokenchicken	b7edf6a82d	feat: Live2D outfit textures + expression system + canvas tweaks - Generated 5 outfit texture variants via HSL recolor (saved skin tones) - Dynamic texture_02 swapping when outfit changes - Expression buttons (Normal, Smile, Sad, Angry, Surprised, Blushing) - Random idle expression changes every 8-15s - Responsive canvas sizing with devicePixelRatio support - Outfit generation script in scripts/gen_outfits.py - Smoother lip-sync with phase-based mouth animation	2026-06-04 11:40:10 -04:00
hobokenchicken	9653f80abd	feat: Live2D model integration with pixi-live2d-display - Added Epsilon Live2D model (Cubism 4) with full motion/expression set - KiraAvatar now loads Live2D via PixiJS + cubism4 renderer - Idle animation auto-plays on load - Lip-sync: PARAM_MOUTH_OPEN_Y driven by speaking state - 8 expressions (Normal, Smile, Sad, Angry, Surprised, Blushing, f01, f02) - 15 motion files including idle, tap, flick, shake - Physics, eye blink, and LipSync parameter groups configured - Falls back to animated SVG placeholder if model isn't available	2026-06-04 11:34:59 -04:00
hobokenchicken	78ea059f08	feat: user personalization with Honcho-backed preferences - WelcomeScreen: first-time name entry with cute onboarding - identify WS message: sets user_id, loads saved prefs from Honcho - set_preference WS message: saves scene/outfit/accessory to Honcho metadata - Preferences auto-load on return visits via localStorage + Honcho peer meta - Kira uses the user's name in greeting and prompts - Backend: get/set preference methods in KiraMemory service - Frontend: optimistic preference updates, synced to backend on change	2026-06-04 11:00:58 -04:00
hobokenchicken	97424cb98f	init: Kira — AI body double with Honcho memory Full voice pipeline (Whisper STT -> DeepSeek LLM -> OpenAI TTS), animated SVG avatar (Live2D-ready), girly-pop UI, lofi music, timer/notes/pets/wardrobe widgets, 10 background scenes with particle effects, Honcho cross-session memory.	2026-06-04 10:51:38 -04:00


74 Commits