Commit Graph

59 Commits

Author SHA1 Message Date
hobokenchicken 83a990e838 feat(audio): Gemini Live API replaces Whisper+GPT+ElevenLabs
Single WebSocket proxy: frontend PCM16 16kHz → backend → Gemini Live API
Gemini returns PCM16 24kHz audio + text. Playback via Web Audio API queue.
Removed OpenAI/DeepSeek deps. Model: gemini-3.1-flash-live-preview.
Voice: Aoede. Streaming bidirectional audio with silence gating.
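
The capture side described above (Float32 Web Audio samples converted to PCM16, with quiet frames dropped) might look roughly like the sketch below. It is an illustration, not the code from this commit: the function names and the 0.01 RMS threshold are assumptions.

```typescript
// Hypothetical helpers; names and the default threshold are illustrative only.

/** Convert Float32 samples in [-1, 1] to 16-bit signed PCM. */
function floatToPcm16(samples: Float32Array): Int16Array {
  return Int16Array.from(samples, (v) => {
    // Clamp first so out-of-range input cannot wrap around.
    const s = Math.max(-1, Math.min(1, v));
    return s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  });
}

/** Root-mean-square level of one frame. */
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  const sumSquares = samples.reduce((acc, v) => acc + v * v, 0);
  return Math.sqrt(sumSquares / samples.length);
}

/** Silence gate: PCM16 for frames worth sending, null for quiet ones. */
function gateFrame(samples: Float32Array, threshold = 0.01): Int16Array | null {
  return rms(samples) < threshold ? null : floatToPcm16(samples);
}
```

A caller would send only the non-null frames over the WebSocket, so silence costs no upstream bandwidth.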
2026-06-05 23:36:29 -04:00
hobokenchicken d2bde65645 fix(outfit): PIXI BaseTexture.from() + GL cache invalidation
Uses PIXI.BaseTexture.from(url) for proper pipeline.
Deletes old texture's _glTextures entry before swap.
Deletes new texture's _glTextures after swap to force re-upload.
Render loop (line 4960) detects missing GL entry and re-binds.
2026-06-05 16:11:04 -04:00
hobokenchicken 235f049405 fix(outfit): direct GL texture injection into Cubism renderer
Bypasses PIXI texture pipeline entirely. Loads outfit PNG as Image,
creates raw WebGL texture, and calls cubismRenderer.bindTexture() directly.

Also updated lo-fi video IDs to confirmed working non-stream videos:
- 7ccH8u8fj8Y: Lofi Girl best of 2025
- HFQibg2OJkU: Chillhop Spring 2025
- udGvUx70Q3U: Lofi Chilled Beats 12hr
2026-06-05 16:07:42 -04:00
hobokenchicken 705792a4cb fix(lofi): update to working YouTube video IDs
Old Lofi Girl streams were taken down. Updated to active streams:
- 7NOSDKb0HlU: lofi hip hop radio
- MCkTebktHVc: Chill lofi (College Music)
- KMXZF-K2mus: 24/7 beats
2026-06-05 15:48:33 -04:00
hobokenchicken 45a1de936a fix(outfit+lofi): proper texture swap + YT Player API
Outfit swap: replace entire Texture object + invalidate GL cache
Lo-fi: visible 80px player with YT Player API, proper playVideo() on user gesture
2026-06-05 15:45:15 -04:00
hobokenchicken a3b5477524 feat(wardrobe): swap Live2D outfit textures via wardrobe buttons
Epsilon model has 3 texture sheets: body (00), hair (01), clothes (02).
Outfit PNGs in /outfits/ replace texture_02 at runtime via PIXI BaseTexture swap.
Live2DStage now accepts outfit prop and swaps on change.
App passes currentOutfit to Live2DStage.
2026-06-05 15:29:20 -04:00
hobokenchicken 8a50fef24b fix(live2d): canvas z-50 so cat renders above frosted glass sidebar 2026-06-05 15:23:40 -04:00
hobokenchicken ff6bf46724 fix(lofi): replace YT IFrame API with direct iframe embed
The YT Player API silently fails to autoplay in hidden iframes.
Replaced with a direct <iframe> embed with allow='autoplay; encrypted-media'.
- On 'Start Lo-Fi': creates iframe with autoplay=1
- Station change: remounts iframe with new videoId
- Volume: postMessage API to YT iframe (best effort)
- Much simpler, no external script dependency
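
The "best effort" volume control can be sketched as a small message builder. The JSON shape below mirrors what the IFrame API script is commonly observed to send; it is not a documented contract, and it only takes effect when the embed URL carries enablejsapi=1 (an assumption about this setup). The function name is hypothetical.

```typescript
/** Build the command string that sets the volume (0-100) on a YouTube embed. */
function ytVolumeCommand(volume: number): string {
  const clamped = Math.max(0, Math.min(100, Math.round(volume)));
  return JSON.stringify({ event: "command", func: "setVolume", args: [clamped] });
}

// In the browser, with `iframe` being the mounted <iframe> element:
// iframe.contentWindow?.postMessage(ytVolumeCommand(40), "https://www.youtube.com");
```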
2026-06-05 15:20:00 -04:00
hobokenchicken 5dbe30b43c feat(audio): Haus ambient sounds + YouTube player fix
WhiteNoise replaced with full ambient mixer:
- 6 categories: Noise, Rain, Nature, Places, Things, Animals
- 29 real ambient sounds from Haus (MIT licensed)
- Howler.js for playback with looping and per-sound volume
- Mix multiple sounds simultaneously
- Category tab navigation
- 300ms fade in/out for smooth toggling

MusicPlayer fixes:
- Removed origin param (causes issues behind reverse proxy)
- Added onError handler for YouTube errors
- Added onStateChange to track playing state
- Player container 1x1 opacity:0 instead of offscreen positioning
2026-06-05 15:12:51 -04:00
hobokenchicken 5131eb729f fix(audio): proper noise synthesis + YouTube player init
WhiteNoise:
- Paul Kellet pink noise filter (industry standard)
- Brownian motion for brown noise (proper integration)
- Rain: layered brown rumble + random droplet impulse pings
- Cafe: low hum + scattered clatter transients
- Stereo buffers for spatial depth
- Crossfade start/end for seamless loop (50ms fade)
- Proper cleanup on unmount
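
For reference, the two synthesis techniques named above can be sketched as follows. The pink-noise coefficients are Paul Kellet's published refined filter; the output gains (0.11 and 3.5) and the leaky-integrator constants are common choices and are assumptions here, not values taken from this commit.

```typescript
type Rng = () => number; // returns a value in [0, 1)

/** Pink noise: Paul Kellet's refined filter applied to white noise. */
function pinkNoise(length: number, rng: Rng = Math.random): Float32Array {
  const out = new Float32Array(length);
  let b0 = 0, b1 = 0, b2 = 0, b3 = 0, b4 = 0, b5 = 0, b6 = 0;
  for (let i = 0; i < length; i++) {
    const white = rng() * 2 - 1;
    b0 = 0.99886 * b0 + white * 0.0555179;
    b1 = 0.99332 * b1 + white * 0.0750759;
    b2 = 0.969 * b2 + white * 0.153852;
    b3 = 0.8665 * b3 + white * 0.3104856;
    b4 = 0.55 * b4 + white * 0.5329522;
    b5 = -0.7616 * b5 - white * 0.016898;
    // 0.11 is an assumed gain that keeps typical output near [-1, 1].
    out[i] = (b0 + b1 + b2 + b3 + b4 + b5 + b6 + white * 0.5362) * 0.11;
    b6 = white * 0.115926;
  }
  return out;
}

/** Brown noise: leaky integration of white noise (a bounded random walk). */
function brownNoise(length: number, rng: Rng = Math.random): Float32Array {
  const out = new Float32Array(length);
  let last = 0;
  for (let i = 0; i < length; i++) {
    const white = rng() * 2 - 1;
    last = (last + 0.02 * white) / 1.02; // the leak keeps |last| below 1
    out[i] = last * 3.5; // assumed make-up gain
  }
  return out;
}
```

Either array can be copied into one channel of an AudioBuffer; generating each channel separately gives the decorrelated stereo effect mentioned in the list.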

MusicPlayer:
- YouTube player container uses offscreen positioning instead of
  display:none (hidden divs prevent iframe API initialization)
- Ref-based closure fix: activeId and volume use refs so the
  onYouTubeIframeAPIReady callback reads current values
- Added playerReady state to guard loadVideoById calls
2026-06-05 15:03:02 -04:00
hobokenchicken 8543461195 style: frosted glass sidebars (backdrop-blur-xl, bg-white/40)
Both sidebars get glassmorphism treatment so they stand out against
the scene background images.
2026-06-05 14:57:12 -04:00
hobokenchicken 08932068fd feat(scenes): add 5 illustrated background scenes
Generated 5 scene illustrations via gpt-image-2:
- Cozy Room: golden hour bedroom workspace
- Rooftop Cafe: twilight cafe with city skyline
- Garden: indoor botanical reading nook
- Rainy Window: night rain with bokeh lights
- Stargazing: hilltop under stars and cherry blossoms

Scene interface now supports optional 'image' field.
Background shows full-bleed image when available, gradient as fallback.
Rainy window gets rain particles, stargazing gets stars.
2026-06-05 14:53:38 -04:00
hobokenchicken 37c06db6be fix(expressions): use full expression names with .exp3.json suffix
Epsilon model registers expressions as 'Smile.exp3.json' not 'Smile'.
Added EXPR_MAP to map friendly names to full registered names.
Fixes expression buttons and idle cycling.
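
A lookup of this kind might be sketched as below. Only the Smile entry comes from this message; the second entry and the fallback behaviour (appending the suffix for unmapped names) are assumptions for illustration.

```typescript
// Friendly name -> name as registered by the model.
const EXPR_MAP = new Map<string, string>([
  ["Smile", "Smile.exp3.json"],
  ["Angry", "Angry.exp3.json"], // hypothetical entry
]);

/** Resolve a friendly expression name to its registered name. */
function resolveExpression(friendly: string): string {
  // Assumed fallback: unmapped names get the suffix appended.
  return EXPR_MAP.get(friendly) ?? `${friendly}.exp3.json`;
}
```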
2026-06-05 14:19:59 -04:00
hobokenchicken 73fe77f9aa fix(pets): dynamically position cat at PetZone DOM element
Cat position is now measured from the [data-petzone] DOM element's
getBoundingClientRect(), so it always aligns with the PetZone section
regardless of window size or sidebar content height.
Removed Live2DCat import from PetZone (cat renders on shared stage).
2026-06-05 13:51:50 -04:00
hobokenchicken 04ad706de6 fix(layout): pin PetZone to bottom of right sidebar, separate from scrollable content 2026-06-05 13:48:53 -04:00
hobokenchicken 7f11ff83f0 fix(cat): position cat at bottom-right above status bar, not overlapping wardrobe 2026-06-05 13:47:15 -04:00
hobokenchicken f76ae3faec fix(cat): use getLocalBounds for accurate scale calculation
Cat was rendering huge because scale was computed from current model.width
which included previous scale transforms. Now resets to scale(1) first,
reads natural bounds, then computes target scale for 100px rendered height.
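
The reset-then-measure order described above can be sketched against a minimal structural interface, so the snippet runs without PIXI. The interface and function name are hypothetical; a PIXI display object exposes the same two members used here.

```typescript
// Minimal shape of the model this sketch needs (assumed, not the real type).
interface Scalable {
  scale: { set(x: number, y?: number): void };
  getLocalBounds(): { width: number; height: number };
}

/** Scale a model so it renders at targetHeightPx; returns the scale applied. */
function fitToHeight(model: Scalable, targetHeightPx: number): number {
  model.scale.set(1); // drop any previous scale before measuring
  const { height } = model.getLocalBounds(); // natural, unscaled height
  const s = height > 0 ? targetHeightPx / height : 1;
  model.scale.set(s);
  return s;
}
```

Calling fitToHeight(model, 100) repeatedly yields the same scale each time, which is the property the original bug violated.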
2026-06-05 13:45:32 -04:00
hobokenchicken 9d2ba052f4 fix(live2d): precise cat positioning and sizing
- Extract layout constants matching Tailwind config (PAD, LEFT_W, GAP, RIGHT_W)
- positionModels() helper computes exact pixel positions from layout
- Kira: centered in center panel at 78% of available space
- Mochi: 120px tall, centered in right sidebar, above status bar
- Both models reposition on window resize
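
A layout helper along these lines could look like the sketch below. LEFT_W and RIGHT_W follow the sidebar widths named elsewhere in this log (288px and 256px); PAD, GAP and STATUS_H are assumed values, and the way the 78% figure is applied (to both axes of the center panel) is one possible reading, not the confirmed implementation.

```typescript
interface Layout { PAD: number; LEFT_W: number; GAP: number; RIGHT_W: number; STATUS_H: number }
interface Placement { x: number; y: number; maxWidth: number; maxHeight: number }

// PAD, GAP and STATUS_H are illustrative assumptions.
const DEFAULT_LAYOUT: Layout = { PAD: 16, LEFT_W: 288, GAP: 16, RIGHT_W: 256, STATUS_H: 32 };

/** Compute anchor points (model centers) for both models from the viewport size. */
function positionModels(
  viewW: number,
  viewH: number,
  l: Layout = DEFAULT_LAYOUT,
): { kira: Placement; mochi: Placement } {
  const centerLeft = l.PAD + l.LEFT_W + l.GAP;
  const centerRight = viewW - l.PAD - l.RIGHT_W - l.GAP;
  const centerW = Math.max(0, centerRight - centerLeft);
  const usableH = Math.max(0, viewH - 2 * l.PAD);
  const mochiHeight = 120;
  return {
    // Kira: middle of the center panel, capped at 78% of the available box.
    kira: { x: centerLeft + centerW / 2, y: viewH / 2, maxWidth: centerW * 0.78, maxHeight: usableH * 0.78 },
    // Mochi: horizontally centered in the right sidebar, sitting above the status bar.
    mochi: {
      x: viewW - l.PAD - l.RIGHT_W / 2,
      y: viewH - l.PAD - l.STATUS_H - mochiHeight / 2,
      maxWidth: l.RIGHT_W,
      maxHeight: mochiHeight,
    },
  };
}
```

Re-running the helper from a window resize listener covers the last bullet.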
2026-06-05 13:42:21 -04:00
hobokenchicken 5f5127f4fa refactor(live2d): single shared stage for both Kira and Mochi
Live2DStage creates ONE full-viewport transparent canvas (z-0, pointer-events:none).
Both Kira and Mochi cat models render on the same Pixi stage and WebGL context.
KiraAvatar is now UI-only (no canvas), receives model ref from stage.
PetZone is label-only. Eliminates all WebGL context conflict errors.
2026-06-05 13:34:51 -04:00
hobokenchicken 43a392e5f5 fix(pets): use preferWebGLVersion:1 for cat canvas to avoid context conflicts
Cat gets its own canvas with WebGL1 context (Kira uses WebGL2 by default).
Different GL versions don't share buffers, so no bindBuffer spam.
Cat now renders in the PetZone section of the right sidebar where it belongs.
Removed all shared-context onAppReady plumbing.
2026-06-05 13:25:10 -04:00
hobokenchicken 1f8bcf6b4f fix(pets): cat renders on shared KiraAvatar canvas via onAppReady callback
Single WebGL context, no bindBuffer spam. Cat model loads onto
KiraAvatar's stage and positions itself at bottom-right corner.
PetZone passes app prop through to Live2DCat.
2026-06-05 13:19:32 -04:00
hobokenchicken 37f8bf59a0 fix(webgl): use forceCanvas for Live2DCat to avoid dual WebGL context conflicts
Reverts shared-context approach. Live2DCat gets its own canvas with
forceCanvas:true (Canvas2D renderer), which avoids the WebGL bindBuffer
spam entirely. Cleaned up onAppReady prop from KiraAvatar.
2026-06-05 13:08:18 -04:00
hobokenchicken be1e51cc9a fix(webgl): share single Pixi context between KiraAvatar and Live2DCat
Eliminates WebGL bindBuffer/bindTexture spam from dual Application contexts.
Cat model now loads onto KiraAvatar's shared stage via onAppReady callback.
2026-06-05 13:04:43 -04:00
hobokenchicken 017c81cffa feat(pets): replace static cats with Live2D LittleCat model (black texture)
- Copied LittleCat model files to frontend/public/live2d/models/little-cat/
- Using the black alternate texture as default
- Created Live2DCat component that renders the model in a small canvas
- PetZone now shows a single Live2D cat instead of two SVG cats
2026-06-05 12:55:24 -04:00
hobokenchicken 15199dfdee feat(layout): move avatar to center hero position; timer+notes+chat to left sidebar 2026-06-05 12:44:17 -04:00
hobokenchicken 95f97fa897 fix(avatar): declarative canvas element in JSX; remove manual DOM append
React was potentially clearing the canvas on re-render because we
appended it manually to a div. Now using a <canvas ref={canvasRef}>
element directly in JSX that React manages. Pixi app uses .
Scale set to 82% of container.
2026-06-05 10:10:03 -04:00
hobokenchicken e00dc37e68 fix(avatar): use Pixi resizeTo for native canvas sizing; remove all manual CSS/ResizeObserver
Previous approach set CSS width:100% on a low-res canvas, causing the browser
to stretch/pixelate the model. Now using Pixi's built-in resizeTo so the
canvas internal resolution always matches the container. Model scaled to 90%
of container with centered anchor.
2026-06-05 09:57:54 -04:00
hobokenchicken 3a6a1cd6c3 fix(avatar): reduce model margin to 45% to prevent clipping in narrow sidebar 2026-06-05 09:51:39 -04:00
hobokenchicken 13dbcdb7f5 fix(avatar): re-apply CSS 100% after Pixi resize(); use fitModel helper; 65% margin
Pixi renderer.resize() overwrites canvas inline width/height styles,
locking the canvas to the initial size and leaving empty space below.
Now we re-apply width:100%;height:100% after every resize so the canvas
always fills its container. Removed unused appRef.
2026-06-05 09:47:51 -04:00
hobokenchicken f2ff91730b fix(avatar): use ResizeObserver for accurate container sizing; force canvas CSS 100%; reduce margin to 68%
Problem: flex layout wasn't ready on first paint, so clientWidth fell back
to 400px. Canvas was 400px wide but parent was only 288px, causing the
avatar to be clipped on the right.

Fix: ResizeObserver measures real laid-out size before init. Canvas forced
to width/height 100% via CSS so it never overflows. Model scaled to 68%
with centered anchor. Resize handled dynamically.
2026-06-05 09:43:12 -04:00
hobokenchicken dc2cb3bbb3 fix(avatar): reduce model scale to 72% (from 85%) and tighten anchor to prevent right-side clipping in narrow sidebar 2026-06-05 09:36:37 -04:00
hobokenchicken dfd014ac82 feat(ui): complete layout redesign — three-panel desk layout
Replaced the hero + scrollable grid with a fixed-height three-column
workspace:
- Left (fixed 288px): Kira avatar + compact chat + text input
- Center (flex): Large focus timer + notes
- Right (fixed 256px): Music, white noise, wardrobe, pets

Thin top bar: scene selector dots + clock
Thin bottom bar: status + connection indicator

No cards, no scrollable grid, no wasted space. Clean, modern,
everything visible at once. Avatar fills full sidebar height.
2026-06-05 09:33:42 -04:00
hobokenchicken db23034e36 feat(ui): ditch all glass-card containers — flat, modern, card-free layout
All 15+ glass-card instances removed across every component (Timer, Music,
Notes, WhiteNoise, PetZone, Clock, ChatBubble, Wardrobe, Toolbar, KiraAvatar,
BackgroundScene, WelcomeScreen, App text input + bottom bar).

New design: widgets sit directly on the gradient background with only padding,
no frosted-glass backgrounds, borders, or shadows. Cleaner, more modern look.
2026-06-05 09:26:51 -04:00
hobokenchicken f5930d6190 fix(avatar): center Live2D model in card, overlay controls on canvas; scale model to 85% of container; remove card padding; clean template literals to avoid TS parsing issues 2026-06-05 09:16:16 -04:00
hobokenchicken baaa89756f feat(ui): center avatar as hero, ~1/3 viewport height; tools grid below
- Avatar now centered in its own row above the tools grid (was crammed in column 1)
- KiraAvatar container: min-height 33vh, canvas up to 500px wide
- Tools reorganized into 4 columns below: Chat, Timer+Music, Notes+Noise, Clock+Pets+Wardrobe
- WelcomeScreen restored to full (not compact) for first-time users
2026-06-05 09:03:32 -04:00
hobokenchicken 4641d74536 fix(welcome): make WelcomeScreen support isCompact prop to prevent full-screen CSS clash when rendering inside saved-ID wrapper card in App.tsx
Per PLAN item 8. Saved users now get a clean compact welcome prompt without double min-h-screen divs.
2026-06-04 16:04:14 -04:00
hobokenchicken eb5952adc6 fix(deprecations): remove dead ScriptProcessorNode PCM code (eliminates console warning); improve YouTube playerVars with origin/modestbranding (reduce postMessage spam); fix Timer stopwatch (now properly counts UP, clean display + interval)
Per PLAN item 7.
2026-06-04 15:59:02 -04:00
hobokenchicken 59b72aa184 feat(white-noise): add Web Audio generated white/pink/brown/rain/cafe noise player
Separate from lofi music per original spec. Toggleable, volume control, always available in focus column.
Finishes item 5.
2026-06-04 15:49:18 -04:00
hobokenchicken 3f1497174d feat(ui): integrate Notes component into main grid (was dead import)
Per original spec and AUDIT/PLAN item 4. Notes now visible alongside Timer/Music.
2026-06-04 15:38:52 -04:00
hobokenchicken 771c00830a feat(ui): display livePartial / transcript_delta in ChatBubble as 'Hearing:' indicator
- REST STT now sends delta (full text) so UI lights up immediately with what was heard.
- Works as 'live' for the final transcript (true partials would stream words if Realtime was available).
- Per PLAN item 3.
2026-06-04 15:34:37 -04:00
hobokenchicken 77cbd91b93 fix(tts): play Opus chunks immediately as they arrive instead of buffering until speaking_end
This makes the voice start playing the first words while the rest of the response is still generating (big win for perceived latency).
Per PLAN item 2.
2026-06-04 15:28:40 -04:00
hobokenchicken 0e74a16b40 fix(stt): revert to reliable REST gpt-4o-transcribe + MediaRecorder full-blob (Realtime WS not accessible on key)
- Backend: added transcribe_audio (gpt-4o-transcribe), switched audio handler to full blob -> REST -> LLM -> streaming TTS
- Frontend: MediaRecorder (webm/opus) full recording sent on stop (one blob per utterance)
- Removed dead WhisperStream callbacks and pending_transcript/lock
- This unblocks voice per AUDIT item 1 (Option B fallback). Deltas will come in later item.
- Also preps for deprecation fix (MediaRecorder is the good path).
2026-06-04 15:23:57 -04:00
hobokenchicken 7502f201c7 feat: Realtime WebSocket STT via gpt-realtime-whisper
Replaces REST-based transcription (gpt-4o-transcribe) with WebSocket
streaming via gpt-realtime-whisper. Frontend captures PCM16 audio and
streams it through the backend to a Realtime transcription session.

- Server-side VAD detects utterance boundaries automatically
- Word-level transcript deltas stream to the client in real-time
- On utterance end, gpt-5.4-nano generates a response
- TTS streams back via with_streaming_response
- Total pipeline: PCM16 → Realtime WS → LLM → streaming TTS
2026-06-04 14:26:19 -04:00
hobokenchicken 9cd183a83b fix: streaming TTS via with_streaming_response
Replaced synchronous TTS (waiting for full audio at 5.9s) with
streaming TTS that sends audio chunks as they arrive. Frontend now
accumulates chunks in audioBufferRef and plays the complete stream
on speaking_end. Reduces TTS latency from ~6s to ~1s first byte.
2026-06-04 14:17:54 -04:00
hobokenchicken c5cc4dd480 fix: replace PCM16 capture with MediaRecorder (Opus/webm)
PCM16 capture via AudioContext was streaming raw audio continuously,
causing massive accumulated buffers that took ~20s to transcribe.
Replaced with MediaRecorder which records compressed Opus/webm and
sends a single blob on release — much smaller, faster to transcribe.

Also removed all unused PCM16/WAV helper functions from both frontend
and backend.
2026-06-04 14:04:44 -04:00
hobokenchicken 537ddcd841 fix: play Opus TTS audio directly instead of WAV-converting it
The backend sends Opus-encoded audio from OpenAI TTS (tts-1 with
response_format=opus). The frontend was treating it as raw PCM16
and wrapping it in a WAV container, which corrupted the audio into
static. Now plays the Opus data directly as audio/ogg.
2026-06-04 13:59:04 -04:00
hobokenchicken f2a5416408 feat: cheapest pipeline — gpt-4o-mini-transcribe + gpt-5.4-nano + TTS
Simple 3-step chat completions pipeline at ~$0.019/min total.
Streams PCM16 audio from frontend, transcribes on release,
generates response via gpt-5.4-nano, speaks via OpenAI TTS.

Cost breakdown:
  gpt-4o-mini-transcribe: $0.003/min
  gpt-5.4-nano:          ~$0.001/min
  OpenAI TTS (nova):     $0.015/min
  Total:                 ~$0.019/min (~$0.57/day at 30min)
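
The arithmetic behind the total can be checked with a few lines, reading the figures in the breakdown as US dollars per minute of use. The variable and function names are illustrative.

```typescript
// Per-minute costs as listed in the breakdown (USD).
const STEPS = [
  { name: "gpt-4o-mini-transcribe", usdPerMin: 0.003 },
  { name: "gpt-5.4-nano", usdPerMin: 0.001 },
  { name: "OpenAI TTS (nova)", usdPerMin: 0.015 },
];

/** Sum the per-minute cost of every pipeline step. */
function totalPerMinute(steps: { usdPerMin: number }[]): number {
  return steps.reduce((sum, step) => sum + step.usdPerMin, 0);
}

const perMinute = totalPerMinute(STEPS);
console.log(perMinute.toFixed(3));        // "0.019"
console.log((perMinute * 30).toFixed(2)); // "0.57" for 30 minutes a day
```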
2026-06-04 13:51:35 -04:00
hobokenchicken 274d04ea10 feat: hybrid pipeline — gpt-realtime-whisper + gpt-5.4-nano + TTS
Hybrid approach gives streaming STT at ~$0.017/min + cheap brain
at ~$0.001/min + TTS at ~$0.015/min = ~$0.033/min total.

- gpt-realtime-whisper handles streaming transcription with VAD
- gpt-5.4-nano handles response generation (chat completions)
- OpenAI TTS (nova) for voice output
- Server VAD detects utterance boundaries
- Honcho memory context injected into system prompt
- Removed old full Realtime relay service
2026-06-04 13:48:06 -04:00
hobokenchicken e2332af8d0 feat: OpenAI Realtime API pipeline
Replaced the 3-step sequential pipeline (Whisper STT → DeepSeek LLM
→ OpenAI TTS) with a single OpenAI Realtime API WebSocket using
gpt-4o-mini-realtime-preview.

- ~300-800ms latency vs 1-3s
- Server VAD for automatic turn detection
- Streaming audio chunks during playback
- Interruptions: user can speak over Kira mid-response
- Honcho memory still injected into session instructions
- Frontend captures PCM16 mono 24kHz via AudioContext
- Backend relays client ↔ OpenAI Realtime API
- Supports both voice (PCM16) and text input
2026-06-04 13:32:39 -04:00
hobokenchicken e64698b0ab fix: graceful mic-unavailable handling over HTTP
navigator.mediaDevices.getUserMedia() requires a secure context
(HTTPS or localhost). When accessed over plain HTTP, the API
is undefined. Now shows a friendly chat message instead of a
cryptic TypeError in the console.
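
A guard of this kind might be sketched as below. It takes a navigator-like object so it can be exercised outside a browser; the function name and the message strings are hypothetical, not the ones shipped in this commit.

```typescript
// Structural stand-in for the parts of `navigator` this check touches.
interface NavigatorLike {
  mediaDevices?: { getUserMedia?: unknown };
}

/** Return a user-facing message when the mic cannot be used, or null when it can. */
function micUnavailableMessage(nav: NavigatorLike, isSecureContext: boolean): string | null {
  if (typeof nav.mediaDevices?.getUserMedia === "function") return null;
  return isSecureContext
    ? "Microphone access is not supported in this browser."
    : "Microphone access needs HTTPS or localhost. You can still type to chat.";
}

// Browser usage: micUnavailableMessage(navigator, window.isSecureContext)
```

Checking before calling getUserMedia() avoids the TypeError entirely instead of catching it afterwards.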
2026-06-04 12:12:07 -04:00