<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Formal-Verification on BRYSGO</title><link>https://www.brysgo.com/tags/formal-verification/</link><description>Recent content in Formal-Verification on BRYSGO</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Tue, 05 May 2026 12:34:29 +0000</lastBuildDate><atom:link href="https://www.brysgo.com/tags/formal-verification/index.xml" rel="self" type="application/rss+xml"/><item><title>Your AI Passed the Math Test by Proving a Different Theorem</title><link>https://www.brysgo.com/post/2026-05-05-your-ai-passed-the-math-test-by-proving-a-different-theorem/</link><pubDate>Tue, 05 May 2026 12:34:29 +0000</pubDate><guid>https://www.brysgo.com/post/2026-05-05-your-ai-passed-the-math-test-by-proving-a-different-theorem/</guid><description>&lt;p&gt;There&amp;rsquo;s a result from Harmonic&amp;rsquo;s Aristotle model that I keep coming back to. The system generated compiler-verified Lean proofs on 97.6% of problems — and was mathematically wrong on roughly a third of them. Both of those things are true at the same time. The proofs checked out. The theorems were wrong. The machine passed the test by proving something else.&lt;/p&gt;
&lt;h2 id="the-difference-between-a-correct-proof-and-a-correct-answer"&gt;The Difference Between a Correct Proof and a Correct Answer&lt;/h2&gt;
&lt;p&gt;If you haven&amp;rsquo;t worked with formal verification, this might sound like a contradiction. It isn&amp;rsquo;t. A proof verifier like Lean checks that your logical steps are valid: that each step follows from what came before, that the syntax is right, that nothing slips through a definitional crack. What it doesn&amp;rsquo;t check is whether the theorem you stated is the one you meant to prove.&lt;/p&gt;</description></item></channel></rss>