kira/backend at 7875b5d12ac0d9429dc4526188aa334a2bd71979 - kira

Files

T

hobokenchicken 7875b5d12a fix: cache Honcho memory context per-session (not per-turn)

The memory context was being rebuilt on every conversation turn via
build_system_prompt(), which calls Honcho's dialectic reasoning API
twice (get_user_context + get_kira_context). Each call takes 5-15s.

Now the memory suffix is computed ONCE during identify and cached in
a memory_suffix variable for the session duration. Per-turn latency
drops from ~37s to ~3s.

Also removed duplicated _pcm16_to_wav and cleaned up orphaned code.

2026-06-04 14:11:14 -04:00

models

init: Kira — AI body double with Honcho memory

2026-06-04 10:51:38 -04:00

routers

init: Kira — AI body double with Honcho memory

2026-06-04 10:51:38 -04:00

services

feat: cheapest pipeline — gpt-4o-mini-transcribe + gpt-5.4-nano + TTS

2026-06-04 13:51:35 -04:00

__init__.py

init: Kira — AI body double with Honcho memory