The math that makes ChatGPT work doesn’t care whether it runs on an H100, a pile of punch cards, or literal ripples in a tub — and sitting with that fact long enough will quietly dissolve most of your assumptions about what “AI” actually is.
I’m not being metaphorical. There are working demonstrations of physical systems (ripples in a tank of water used as a reservoir computer, logic gates built from dominoes, chemical reaction-diffusion networks) performing the same kinds of operations a transformer is built from: logic, weighted sums, nonlinear transformations. Nobody has run a full transformer this way, but the building blocks are not merely analogous. They are the same mathematics; only the substrate differs.
The Physics Doesn’t Care
At its core, a language model is matrix multiplication, repeated many times. Floating point numbers get combined according to weights, passed through a nonlinearity, and the whole process repeats. That’s it. Stripped to the bone, it’s an enormous pile of multiply-and-add operations.
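To make "multiply-and-add" concrete, here is a minimal sketch of one dense layer written with nothing but loops, multiplication, and addition. The names `matvec`, `relu`, and `layer` are mine, and ReLU is a stand-in for whatever nonlinearity a real model uses (GPT-style models typically use GELU). A real transformer adds attention, normalization, and a great many more of these layers, but the primitive operations are the same.

```python
def matvec(weights, x):
    """Multiply a matrix (a list of rows) by a vector using only multiply and add."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            acc += w * xi  # one multiply-and-add
        out.append(acc)
    return out


def relu(v):
    """Illustrative nonlinearity: pass positives through, clamp negatives to zero."""
    return max(0.0, v)


def layer(weights, bias, x):
    """One dense layer: weighted sum, plus bias, then the nonlinearity."""
    return [relu(a + b) for a, b in zip(matvec(weights, x), bias)]


# Toy values chosen only so the arithmetic is easy to follow by hand.
W = [[1.0, -2.0],
     [0.5, 0.5]]
b = [0.0, 1.0]
x = [1.0, 3.0]

y = layer(W, b, x)
print(y)  # first unit: 1*1 + (-2)*3 = -5, clamped to 0; second unit: 0.5 + 1.5 + 1 = 3
```

Nothing here needs a GPU, or even a computer. Every line is something you could do with pencil and paper, just not quickly.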
There’s nothing in those operations that demands silicon. You could implement them with water flowing through channels, with light bouncing through carefully arranged glass, with dominoes — given enough dominoes and enough patience. The physical world is full of systems that can be coerced into computing, and “matrix multiplication” is not a privileged operation that only GPUs are allowed to perform.
Alan Turing understood this. The Church-Turing thesis says that anything effectively computable can be computed by a Turing machine, and universality gives the corollary that matters here: any system capable of universal computation can compute it too, whatever that system is made of. We’ve known this since the 1930s. We just forgot to apply it to the thing everyone is suddenly scared of.
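One way to see substrate-independence in miniature is the sketch below. It builds a binary adder out of a single primitive, NAND, and never asks how that primitive is implemented. The two "substrates" are hypothetical stand-ins I made up for illustration (one does arithmetic, one reads a lookup table); a row of dominoes or a pneumatic valve that behaves like NAND could be dropped in the same way.

```python
def make_adder(nand):
    """Build an n-bit adder from any callable nand(a, b) that returns 0 or 1."""
    def not_(a):
        return nand(a, a)

    def and_(a, b):
        return not_(nand(a, b))

    def or_(a, b):
        return nand(not_(a), not_(b))

    def xor(a, b):
        n = nand(a, b)
        return nand(nand(a, n), nand(b, n))

    def full_adder(a, b, carry):
        partial = xor(a, b)
        return xor(partial, carry), or_(and_(a, b), and_(partial, carry))

    def add(x, y, bits=8):
        """Ripple-carry addition; the result wraps around modulo 2**bits."""
        carry, total = 0, 0
        for i in range(bits):
            s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
            total |= s << i
        return total

    return add


# Two hypothetical "substrates" that both happen to behave like a NAND gate.
def nand_arithmetic(a, b):
    return 1 - a * b


NAND_TABLE = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def nand_lookup(a, b):
    return NAND_TABLE[(a, b)]


add_a = make_adder(nand_arithmetic)
add_b = make_adder(nand_lookup)
print(add_a(23, 42), add_b(23, 42))  # the adder cannot tell the substrates apart
```

The adder has no idea what is underneath it. That indifference, scaled up by many orders of magnitude, is the whole point.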
The Real Barrier Is Economics, Not Physics
What separates “industrial AI” from “garage AI” right now is not some fundamental physical wall. It’s money and time.
An H100 runs a forward pass through a large model in milliseconds because it can do hundreds of trillions of floating point operations per second. A mechanical computer made of water wheels and gears could run the same forward pass in, say, a geological epoch. The math is the same. The throughput is not.
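The gap is easy to put rough numbers on. The sketch below is back-of-envelope arithmetic, and every constant in it is an assumption of mine rather than a measurement: a 7-billion-parameter model, the common rule of thumb of about two operations per parameter per generated token, a GPU sustaining 1e14 operations per second, and a mechanical machine managing one operation per second. It also ignores memory bandwidth, which is what actually limits GPU inference in practice.

```python
# All constants below are illustrative assumptions, not measurements.
PARAMS = 7e9                    # hypothetical 7-billion-parameter model
OPS_PER_PARAM = 2               # rule of thumb: one multiply and one add per weight per token
GPU_OPS_PER_SEC = 1e14          # assumed sustained rate for a data-center GPU
MECHANICAL_OPS_PER_SEC = 1.0    # assumed rate for a water-wheel-and-gear machine
REPLY_TOKENS = 500              # assumed length of one reply

SECONDS_PER_YEAR = 365.25 * 24 * 3600

ops_per_token = PARAMS * OPS_PER_PARAM
gpu_seconds_per_token = ops_per_token / GPU_OPS_PER_SEC
mechanical_years_per_token = ops_per_token / MECHANICAL_OPS_PER_SEC / SECONDS_PER_YEAR
mechanical_years_per_reply = mechanical_years_per_token * REPLY_TOKENS

print(f"GPU: {gpu_seconds_per_token * 1000:.2f} ms of pure compute per token")
print(f"Mechanical: {mechanical_years_per_token:,.0f} years per token")
print(f"Mechanical: {mechanical_years_per_reply:,.0f} years per {REPLY_TOKENS}-token reply")
```

Under these assumptions a single token takes the mechanical machine a few centuries, and a short reply takes a couple of hundred thousand years. Change the assumptions and the numbers move, but the shape of the comparison does not: same operations, wildly different clock.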
But here’s what I’ve been thinking about: that gap is not fixed. And the direction of movement is not symmetric.
On one side, models keep getting cheaper to run. Quantization, which stores model weights as 4-bit integers or even ternary values of roughly 1.58 bits instead of 16- or 32-bit floats, sharply cuts the memory and bandwidth a model needs, and with suitable kernels the compute as well, usually without catastrophically hurting quality. Models are getting smaller and faster at inference. The minimum hardware needed to run a useful language model has dropped sharply in just a few years: small quantized models now run on a laptop.
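For a feel of what quantization does, here is a minimal sketch of symmetric 4-bit quantization over a handful of made-up weights. It is the simplest textbook scheme (one scale for the whole tensor, integers in the range -7 to 7), not the method used by any particular model or library, and the function names and the 7-billion-parameter figure are my own illustrative choices.

```python
def quantize_int4(weights):
    """Symmetric per-tensor quantization to integers in [-7, 7].

    Returns the integer codes and the single float scale needed to undo them.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Made-up weights, purely for illustration.
weights = [0.70, -0.33, 0.12, 0.02, -0.70]

codes, scale = quantize_int4(weights)
approx = dequantize(codes, scale)
max_error = max(abs(w - a) for w, a in zip(weights, approx))

print(codes)      # small integers that fit in 4 bits each
print(max_error)  # rounding error, bounded by half a quantization step

# Storage for a hypothetical 7-billion-parameter model at different widths.
params = 7e9
gigabytes_fp32 = params * 32 / 8 / 1e9
gigabytes_int4 = params * 4 / 8 / 1e9
print(gigabytes_fp32, gigabytes_int4)
```

The weights lose some precision, but the model shrinks by a factor of eight relative to 32-bit floats, which is the difference between needing a data-center card and fitting in ordinary memory.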
On the other side, novel substrates are getting more capable. Optical computing, analog computing, neuromorphic chips — these are real research areas with real funding and real results. Some of them can perform matrix operations at fractions of the energy cost of digital silicon. The competitive moat that Nvidia currently holds is real, but it’s a moat around one particular way of doing the computation, not around the computation itself.
Why This Should Bother You (In a Good Way)
When I say this “should bother you,” I don’t mean it should scare you. I mean it should interrupt the comfortable narrative that AI is this special, contained thing that only exists inside giant data centers controlled by a handful of companies.
The idea that you need a cluster of $30,000 GPUs to run intelligence is load-bearing for a lot of current thinking about AI governance, AI access, and who gets to build with AI. If inference gets cheap enough (really cheap, in the way that storage got cheap, in the way that networking got cheap), most of those arguments dissolve. Not all of them. But most.
In my experience, the things that look like fundamental limits in technology are usually economic limits wearing physics clothes. “You can’t do X without Y” almost always means “doing X without Y is currently impractical at the price point and timescale people expect.” That’s a very different claim.
The Bucket of Water, Seriously
There’s a specific flavor of discomfort that comes from realizing an idea you thought was exotic and high-tech is actually substrate-neutral. Nuclear weapons feel special until you understand that the physics is just ordinary particle interactions at scale. Consciousness feels special until you sit with the possibility that it might be what certain kinds of information processing feel like from the inside.
LLMs feel special — centralized, powerful, dangerous, expensive — until you realize that the math they’re running is, in principle, implementable in a bucket of water with the right wave patterns.
That doesn’t make them less interesting. It makes them more interesting. It means the thing we’re actually building isn’t a piece of industrial hardware. It’s a form of computation that the physical world has always been capable of. We just found an efficient way to ask it.
I don’t know exactly what follows from that. But I think it’s worth sitting with before deciding you understand what AI is, where it can exist, or who gets to have it.