The C64 Runs a Transformer and You're Still Paying $25/Million Tokens

Thu, 04 Jun 2026 00:26:53 +0000

A 1MHz computer from 1982 just generated a token using the same math as ChatGPT. It took 60 seconds. It has no FPU, no GPU, and no cloud contract, just a MOS 6510 processor, 64KB of RAM, and a floppy disk. And somehow, that one slow, broken token is the most clarifying thing I’ve seen in AI this year.

The project is called Soul Player C64. It runs a real 2-layer decoder-only transformer on a Commodore 64 — multi-head causal self-attention, RMSNorm, feed-forward residuals, all of it, hand-written in 6502 assembly. The weights are INT8. The vocabulary is 128 tokens. The context window is 20 tokens. The output is, charitably, linguistically broken.
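To make those building blocks concrete, here is a minimal Python sketch of three of them: RMSNorm, symmetric INT8 weight quantization, and single-head causal self-attention. It is not the project's 6502 code. The function names (`rmsnorm`, `quantize_int8`, `causal_attention`) and the per-row quantization scheme are my own assumptions for illustration, and it uses floating point where the C64 has to get by with integer arithmetic.

```python
import math

def rmsnorm(x, gain, eps=1e-5):
    """Divide x by its root-mean-square, then apply a learned per-element gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for v, g in zip(x, gain)]

def quantize_int8(row):
    """Symmetric INT8 quantization (assumed scheme): ints in [-127, 127] plus a scale."""
    scale = max(abs(v) for v in row) / 127 or 1.0  # fall back to 1.0 for an all-zero row
    return [round(v / scale) for v in row], scale

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(q, k, v):
    """Single-head causal self-attention over lists of vectors."""
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Causal mask: position t only sees positions 0..t.
        scores = [
            sum(a * b for a, b in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        out.append([
            sum(weights[s] * v[s][j] for s in range(t + 1))
            for j in range(len(v[0]))
        ])
    return out
```

A multi-head version would run `causal_attention` once per head on slices of the embedding and concatenate the results. With a 20-token context, the inner loops never cover more than 20 positions, which is part of why this fits on such a small machine at all.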

