<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Reinforcement-Learning on Beyond the Prior</title><link>https://beyondtheprior.com/tags/reinforcement-learning/</link><description>Recent content in Reinforcement-Learning on Beyond the Prior</description><generator>Hugo</generator><language>en-us</language><copyright>Copyright © 2025 Matthew Dangerfield</copyright><lastBuildDate>Tue, 09 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://beyondtheprior.com/tags/reinforcement-learning/index.xml" rel="self" type="application/rss+xml"/><item><title>Why LLMs (still) lack taste</title><link>https://beyondtheprior.com/post/why-llms-lack-taste/</link><pubDate>Tue, 09 Jun 2026 00:00:00 +0000</pubDate><guid>https://beyondtheprior.com/post/why-llms-lack-taste/</guid><description>&lt;p>Frontier LLMs are really smart, and they&amp;rsquo;re &lt;a href="https://metr.org/time-horizons/">becoming particularly good at software development&lt;/a>. It feels like &lt;a href="https://artificialanalysis.ai/trends">every week&lt;/a> there&amp;rsquo;s a new model release that achieves SOTA scores on a handful of benchmarks. I use LLMs to build software every day; they&amp;rsquo;re incredibly useful and getting better. But I&amp;rsquo;m still frequently surprised by the &lt;em>types&lt;/em> of mistakes they make.&lt;/p>
&lt;p>I don&amp;rsquo;t expect LLMs to be perfect. Even smart humans make mistakes! But LLMs often make errors that a human with a similar depth of knowledge would &lt;em>never&lt;/em> make. Their capabilities feel jagged: they&amp;rsquo;ll brilliantly pull together thousands of error logs into a coherent analysis that would&amp;rsquo;ve taken me hours, but then use blatantly flawed reasoning to derive the root cause. So why does &amp;ldquo;PhD-level intelligence&amp;rdquo; make these kinds of mistakes?&lt;/p></description></item></channel></rss>