Running models — and the apps that use them — on hardware I own. The case for local AI is simple: no per-token bill, no data leaving the house, no service getting deprecated out from under a working project.

The setup:

  • A home GPU server — a Linux box that serves self-hosted web apps and runs inference for the projects on this site. It sits behind a tunnel, so the apps are reachable without exposing the machine itself.
  • Local models — LLMs and purpose-built vision models running on-box. The birdfeeder’s species ID runs entirely here; no cloud vision API is involved.
  • Edge hardware — Jetson-class and microcontroller devices for the work that belongs out at the sensor instead of back at the server.

The theme across all of it: keep inference close to where the data is, own the whole stack, and keep it boring enough to stay running. Notes on the hardware, the trade-offs, and what’s actually worth self-hosting land here.