<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Scaling-Laws on BRYSGO</title><link>https://www.brysgo.com/tags/scaling-laws/</link><description>Recent content in Scaling-Laws on BRYSGO</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Thu, 04 Jun 2026 00:26:53 +0000</lastBuildDate><atom:link href="https://www.brysgo.com/tags/scaling-laws/index.xml" rel="self" type="application/rss+xml"/><item><title>The C64 Runs a Transformer and You're Still Paying $25/Million Tokens</title><link>https://www.brysgo.com/post/2026-06-04-the-c64-runs-a-transformer-and-you-re-still-paying-25-million-tokens/</link><pubDate>Thu, 04 Jun 2026 00:26:53 +0000</pubDate><guid>https://www.brysgo.com/post/2026-06-04-the-c64-runs-a-transformer-and-you-re-still-paying-25-million-tokens/</guid><description>&lt;p&gt;A 1MHz computer from 1982 just generated a token using the same math as ChatGPT. It took 60 seconds. It has no FPU, no GPU, and no cloud contract: just a MOS 6510 processor, 64KB of RAM, and a floppy disk. And somehow, that one slow, broken token is the most clarifying thing I&amp;rsquo;ve seen in AI this year.&lt;/p&gt;
&lt;p&gt;The project is called Soul Player C64. It runs a real 2-layer decoder-only transformer on a Commodore 64 — multi-head causal self-attention, RMSNorm, feed-forward residuals, all of it, hand-written in 6502 assembly. The weights are INT8. The vocabulary is 128 tokens. The context window is 20 tokens. The output is, charitably, linguistically broken.&lt;/p&gt;</description></item></channel></rss>