Add new tech article: brewing up the stack (LLM proxies, Caddy migration, Grok-bot)
- New post: brewing up the stack covering GopherGate LLM proxy, Grok-bot, Dissertation Path updates, pi.dev migration, Caddy reverse-proxy switch, and Gemma4 experiments
- Add author tone & voice profile for consistent writing reference
- Add .gitignore for Pi-lens cache and build artifacts

---
layout: post
title: "brewing up the stack: llm proxies, cheeky bots, and smaller models"
---
the kettle's on and the terminal is warm. it's been a season of percolating projects across the homelab, some bubbling to the surface, others still steeping. grab a cup and let's walk through what's been brewing in my corner of the tech world.

## gophergate: my custom llm proxy (because off-the-shelf wasn't accurate enough)
i've been juggling multiple llm api subscriptions lately: xai (grok), google, openai, deepseek, moonshot... the list grows. tracking usage and spend across each provider was turning into a spreadsheet nightmare. i tried a popular open-source llm proxy, but the usage metrics were often inaccurate, like a coffee scale that drifts after each pour.

so i built **gophergate**. it's a lightweight go service that sits between my apps and all those llm endpoints. here's what it does:

- **unified proxy:** one endpoint that routes requests to the right provider based on model name.
- **spend tracking:** logs every token, calculates costs per provider, and gives me a real-time dashboard of where my credits are going.
- **usage quotas:** i can set soft limits per project or user, so no surprise bills.
- **fallback routing:** if one api is rate-limited or down, it automatically fails over to a backup.

it's not fancy, but it's accurate. and in the world of llm budgets, accuracy is everything. think of it as a precision gooseneck kettle for pouring api credits: no spillage.
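
one way to keep spend numbers from drifting is to count money in integers rather than floats. the sketch below shows that idea alongside a soft per-project limit. everything in it is hypothetical: the `Price` and `Ledger` types, the nano-usd unit, and the prices are placeholders, not gophergate's real schema or any provider's real rates.

```go
package main

import "fmt"

// Price is a hypothetical rate in nano-USD per token
// (1 USD per million tokens equals 1000 nano-USD per token).
type Price struct {
	InputNanoUSD  int64
	OutputNanoUSD int64
}

// Ledger tracks spend per project in integer nano-USD, so sums stay exact
// instead of accumulating floating-point rounding error.
type Ledger struct {
	spent map[string]int64
	limit map[string]int64
}

// NewLedger returns an empty ledger with no limits set.
func NewLedger() *Ledger {
	return &Ledger{spent: map[string]int64{}, limit: map[string]int64{}}
}

// SetLimit sets a soft limit for a project, in nano-USD.
func (l *Ledger) SetLimit(project string, nanoUSD int64) { l.limit[project] = nanoUSD }

// Record adds one request's cost to the project's total and reports whether
// the project is now over its soft limit. It only reports; it never blocks.
func (l *Ledger) Record(project string, p Price, inTok, outTok int64) (cost int64, over bool) {
	cost = inTok*p.InputNanoUSD + outTok*p.OutputNanoUSD
	l.spent[project] += cost
	lim, ok := l.limit[project]
	return cost, ok && l.spent[project] > lim
}

// USD formats a non-negative nano-USD amount as dollars with six decimals,
// truncating anything smaller than a micro-dollar.
func USD(nanoUSD int64) string {
	return fmt.Sprintf("$%d.%06d", nanoUSD/1_000_000_000, nanoUSD%1_000_000_000/1_000)
}

func main() {
	ledger := NewLedger()
	ledger.SetLimit("blog", 10_000_000) // $0.01 soft limit, made up

	// Made-up prices: $3 per million input tokens, $15 per million output tokens.
	price := Price{InputNanoUSD: 3_000, OutputNanoUSD: 15_000}

	for i := 1; i <= 2; i++ {
		cost, over := ledger.Record("blog", price, 1200, 300)
		fmt.Printf("request %d: %s, over soft limit: %t\n", i, USD(cost), over)
	}
}
```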

## grok-bot: the (kinda rude) nextcloud assistant
while i was wiring up gophergate, i also wanted to bring a little ai personality into my nextcloud instance. enter **grok-bot**.

this is a nextcloud talk bot that hooks into grok-4-1-fast-reasoning. it joins group chats, answers questions, fetches files, and (by design) has a bit of an attitude. i trained its personality on scraped linkedin profiles (public ones, naturally) and previous conversation history, so it can toggle between professional snark and outright sarcasm depending on who it's talking to.

why? because sometimes you want an assistant that says "sure, i'll fetch that report... but you do know you could've searched for it yourself, right?" it keeps things lively. and because it lives inside nextcloud, it has access to all my files, calendars, and contacts, which makes it genuinely useful beneath the cheek.

## dissertation path: a phd gantt chart that doesn't overwhelm
my academic-facing project [dissertationpath.com](https://dissertationpath.com) got a significant upgrade. the core idea remains: break the monumental phd journey into a clear, customizable gantt chart. but now it supports google auth, so students can save their timelines, share them with advisors, and track progress across devices.

the feedback from early users has been incredibly rewarding. it's one thing to build a tool for yourself; it's another to see it actually reduce someone else's anxiety. if you know a grad student drowning in chapter deadlines, send them this way.

## shifting from opencode to pi.dev (pi agent)
i've been a vocal fan of the opencode philosophy, but recently i've started migrating my agent-based workflows to **[pi.dev](https://pi.dev)**. the experience has been... smoother. the agent feels more responsive, the tooling is tighter, and the overall developer flow matches how i think.

this isn't a wholesale abandonment (opencode's ideas still influence how i structure projects), but for day-to-day agent-assisted coding, pi is where i'm planting my flag right now. sometimes you find a tool that just fits your grip better.

## swapping the kettle: from nginx proxy manager to caddy
while i was reworking the software layer, i also revisited my reverse-proxy setup. for years i used nginx proxy manager. it worked, but managing certificates and configs felt like calibrating a fussy espresso machine: doable, but never quite effortless.

enter **caddy**. its automatic https (via let's encrypt) and dead-simple config file won me over. migrating my services was straightforward, and the performance uplift was noticeable. plus, caddy's native support for http/3 and quic meant some of my services (like plex) could now be reached over faster, lower-latency transports. it's like switching from a pour-over that requires constant attention to an automated brewer that just... works.

## experimenting with gemma4: smaller models, bigger potential
my homelab hardware isn't a monster by today's standards (hello, aging r720xd), so i've been keeping an eye on the smaller-model frontier. the recent **gemma4** releases have been particularly impressive.

i've spent a few weekends fine-tuning gemma4-it (the instruction-tuned variant) on my own documentation and code snippets. the results are promising: fast, locally hosted reasoning that can handle basic code review, documentation generation, and even light planning tasks, all without hitting an external api.

it's not going to replace grok or gpt-4 for complex reasoning, but for quick, private, and free iterations, it's like having a reliable pour-over setup right on your desk. the hardware barrier is lowering, and that's exciting.

---
## what's next on the brew schedule?
- **open-sourcing gophergate:** after a bit more polish and documentation, i'll drop it on github.
- **giving grok-bot a "polite mode"** for when the sarcasm isn't welcome.
- **expanding dissertation path** with templated timelines for different disciplines.
- **continuing the local-model experiments:** maybe a tiny model that runs entirely on a pi 5.

the stack keeps evolving, and that's the fun part. thanks for reading. what's bubbling in your lab lately? drop a comment or reach out; i'd love to hear what you're building.

-dustin