Running models — and the apps that use them — on hardware I own. The case for local AI is simple: no per-token bill, no data leaving the house, no service getting deprecated out from under a working project.
The setup:
- A home GPU server: a Linux box that serves web apps and runs inference for the projects on this site. It sits behind a tunnel, so the apps are reachable without exposing the machine itself.
- Local models: LLMs and purpose-built vision models running on-box. The birdfeeder’s species ID runs entirely here; no cloud vision API is involved.
- Edge hardware: Jetson-class and microcontroller devices for the work that belongs out at the sensor instead of back at the server.
The theme across all of it: keep inference close to where the data is, own the whole stack, and keep it boring enough to stay running. Notes on the hardware, the trade-offs, and what’s actually worth self-hosting land here.