Files
kira/backend
hobokenchicken 7875b5d12a fix: cache Honcho memory context per-session (not per-turn)
The memory context was being rebuilt on every conversation turn via
build_system_prompt(), which calls Honcho's dialectic reasoning API
twice (get_user_context + get_kira_context). Each call takes 5-15s.

Now the memory suffix is computed ONCE during identify and cached in
a memory_suffix variable for the session duration. Per-turn latency
drops from ~37s to ~3s.

Also removed duplicated _pcm16_to_wav and cleaned up orphaned code.
2026-06-04 14:11:14 -04:00
..