We're all building this.
Field notes from people tinkering through the AI transition - built by hand, in their own backyards, on the projects they actually care about. Nobody knows where this lands, but the people running into the spikes in their own work are figuring it out fastest. So we're writing it down.
Approachable and curious. Real receipts, hobby-scale. Not doom commentary, not a productivity pitch.
Themes
Every post belongs to at least one. Each theme has a living summary that updates as posts feed into it.
Local models
What works, what doesn't, and what setups produce useful results on hobbyist hardware. The "you don't need a data centre" beat.
Agent harnesses
Loops, state, tool use, and the patterns for getting sustained work out of LLMs.
Auto-research
Agents exploring solution spaces: parallel sub-agents, improvement loops, and the metric that lets them climb.
The weird future
Capability trajectory and weirdness markers, written from inside the work, not as prediction. Includes a versioned AGI / capability timeline.
Recent posts
- Tuning llama-server for agent workloads: a week of receipts 2026-05-13 · local-models, agent-harnesses
A 4090 can run a 35B MoE at agent-useful speeds. But the difference between the default config and a tuned one is 80x, and the difference between two GGUFs of the same model is 200x.
- Borrowed from everywhere: seeding an LLM with strange domains 2026-05-13 · agent-harnesses, auto-research
Innovation has always been recombination: patterns lifted from one field and dropped into another. An LLM already contains every field. The trick is making it actually reach.
- The Coop Was Always an Excuse 2026-04-11 · weird-future
I thought I was building a chicken coop monitor. Turns out I'm running a long-term experiment in applied AI, and the chickens are the case study.
- The Three Layers of AI Progress (And the One You Can Actually Work On) 2026-03-29 · agent-harnesses, weird-future
Hardware, model training, and application: three layers of AI improvement. Two are locked behind billion-dollar labs. The third is wide open, and casual vibe coders are doing some of the most interesting work.
Experiments
Hobby projects feeding evidence into the themes. Each keeps its own site; this is the index.
An experiment in having multiple agents independently read a draft document and write their own "takes" on it, then surfacing those takes side-by-side in a viewer for human review. Aimed at the question of what onboarding documentation looks like when its readers are agents.
A Playwright MCP server built around a "deterministic scripts with LLM fallback" pattern. Fast scripted steps run normally, pause on failure, hand control to an agent to inspect and fix, then resume. Selectors and patterns live in a shared SQLite registry that broken-selector reports prune over time.
An autonomous AI agent that watches a backyard chicken coop via Pi camera, cares for the flock, runs a weekly supplies budget, and publishes its observations as a static site (the "Coop Chronicle"). The first experiment.
A vision-LLM agent that critiques plots, charts, and data displays for correctness and common misleading tricks (truncated y-axis, dual-axis sleight-of-hand, cherry-picked time windows, misleading area scaling, 3D distortion), then redraws an honest version from the data it can infer from the image.
A workbench for designing and validating 3D-printed enclosures, brackets, and robot parts through a Claude-driven OpenSCAD + Godot pipeline. Components live in a shared inventory; assemblies are render-reviewed, collision-checked, and simulated before printing.
A hands-free voice interface to Claude Code over local wifi. Phone in pocket, screen face-down, walking the garden, fleshing out an idea into files in the project_ideas repo with full read/write access on the other end.
A PCVR sword fighting game where opponents are "ghosts" replaying recorded human motion. Records 60Hz pose data, tags outcomes, evolves a population of sequences via Elo and tournament selection, and serves the strongest as the next opponent.
Natural-language to print-ready wargaming terrain. The user describes a building ("L-shaped townhall, three floors, gothic, balcony on the second floor"); the system selects pieces from modular STL kits, generates a layout, gets human approval, and produces split-for-printer STLs with alignment features preserved.
Multiple Claude Code instances, each embodying a distinct philosophical tradition, holding free-form dialogue about life and ideas. Personas evolve through self-authored reflections; a real-time dashboard lets a human pose topics and inject thoughts mid-conversation.
A long-form story generation system built around an MCP server for story arc, state, characters, locations, and chapter review. Chapters are produced sequentially by Task subagents that read only the prior chapter plus shared state, targeting novel-length output without a single mega-context.
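For the ghost-opponent experiment above, "Elo and tournament selection" can be sketched in a few lines. This is a minimal illustration, not that project's code: the function names, the K-factor of 32, and the ghost IDs are assumptions made up for the example. It uses the standard Elo expected-score formula and plain tournament selection (sample a few candidates, keep the best-rated).

```python
import random


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both ratings after one bout.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k=32 is an assumed K-factor, not a value taken from the project.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def tournament_select(ratings: dict, size: int, rng: random.Random) -> str:
    """Sample `size` candidates at random and return the highest-rated one."""
    contenders = rng.sample(sorted(ratings), size)
    return max(contenders, key=lambda name: ratings[name])


if __name__ == "__main__":
    # Hypothetical ghost IDs and starting ratings, for illustration only.
    ratings = {"ghost_a": 1500.0, "ghost_b": 1500.0, "ghost_c": 1450.0}
    ratings["ghost_a"], ratings["ghost_b"] = update_elo(
        ratings["ghost_a"], ratings["ghost_b"], score_a=1.0
    )
    print(ratings)
    print(tournament_select(ratings, size=2, rng=random.Random()))
```

A smaller tournament size keeps weaker sequences in play for longer; a larger one converges faster on the current strongest opponent.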
AGI / capability timeline
100,000 years of intelligence getting cheaper to move around, scroll-compressed, ending in dated forecasts revised as evidence comes in. One data file in git; the commit log is the changelog. Read it →