Claude-Hallucinates-Bugs

What's the difference between a hallucination and a bug?

Claude Code hallucinates 😵‍💫. They just cleverly renamed it. They call it bugs! 🪲

 

No, it is not solved yet! Despite all the inescapable buzz around Claude Code being the answer to everything, and despite what Dario is saying about code being a solved problem, Claude Code does hallucinate! So, why don’t we hear about it?

 

We do not hear about these hallucinations because they have been renamed! Yes, they have been renamed BUGS! And bugs do not scare anyone, do they? But the reality is, Claude is an LLM. LLMs are stochastic machines that produce the most likely next token. And “most likely” is a probability, not a guarantee. Therefore, Claude sometimes hallucinates tokens that are just plain wrong but look plain convincing.

 

What happens, though, is that the world of coding offers something that other domains do not: compilation. There is a formal mechanism, I would even call it a “Symbolic AI mechanism”, that can be used as a guardrail against the code hallucination problem. How convenient!
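To make the guardrail idea concrete, here is a minimal sketch in Python. It is my own illustration, not how Claude Code works internally: the function name `passes_syntax_check` and both snippets are made up for the example. It asks Python's built-in `compile()` whether a generated snippet even parses, which is the cheapest formal check available.

```python
def passes_syntax_check(source: str) -> bool:
    """Return True if `source` parses as Python code, False otherwise.

    Hypothetical helper for illustration: it only checks syntax,
    it says nothing about whether the code does the right thing.
    """
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


# Two made-up "generated" snippets: one valid, one with a missing colon.
valid_snippet = "def add(a, b):\n    return a + b\n"
broken_snippet = "def add(a, b)\n    return a + b\n"

print(passes_syntax_check(valid_snippet))   # True
print(passes_syntax_check(broken_snippet))  # False
```

The check is purely symbolic: it does not care how confident the model sounded, only whether the output follows the grammar.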

 

Just last week, I experienced this first-hand with Claude. I had two situations: one easy to fix, and one so nasty that I am still angry at Claude.

 

#1 – I was working on a GraphDB project to add a symbolic AI guardrail to a project. I was working on the ontology and, after reaching a certain point, I asked Claude to create the Neo4j Cypher schema. What did I get? A beautiful SQL schema. Where did that come from? Well, Claude is trained to produce what is most likely, and SQL is more likely than Neo4j. I got the average answer. Claude was extremely confident and extremely wrong. But that was easy to spot. The “compiler” identified the problem fast: it just does not work. This would be named a bug, right?

 

#2 – I was doing SEO work on our website. Claude suggested a good idea to improve Google visibility for the page I was working on. I asked where to put the code. It told me to put it at the root. Wrong. Completely wrong. That advice would have progressively killed the SEO ranking. Out of Google in 6 months. And if you ever do find the problem, it would take another 6 months to recover the ranking. Junior-level mistake. Fully confident. Fully toxic. Business-damaging level! Hallucination? Bug? Same thing.

 

My point is: Claude is a great tool and we use it, yes. But contrary to what the world would have us think, Claude Code hallucinates just the same as any LLM. It just happens to work in a controlled environment named “Compiler”. But then, Claude also creates hallucinations/bugs that are not caught by the compiler. And those ones?!… 🤯
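Here is a small, made-up illustration of that second kind, sketched in Python. The function `percent_change` and its bug are invented for the example, not taken from a real Claude session. The snippet compiles without complaint, yet it divides by the wrong value, so only a check against a known answer exposes it.

```python
# A made-up "generated" function: syntactically perfect, logically wrong.
# Percent change should divide by the old value, not the new one.
generated_source = """
def percent_change(old, new):
    return (new - old) / new * 100
"""

# The "compiler" guardrail is satisfied: this raises no SyntaxError.
code = compile(generated_source, "<generated>", "exec")

# Run the definition in an isolated namespace (fine for a demo,
# not something to do with untrusted code).
namespace = {}
exec(code, namespace)

# Going from 100 to 150 is a 50% increase.
result = namespace["percent_change"](100, 150)
expected = 50.0
is_correct = abs(result - expected) < 1e-9

print(is_correct)  # False
```

Nothing formal flagged this one. It takes someone who knows what the right answer looks like to write the check in the first place.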

 

Maybe we should bring back the word “hallucination”, even for code. Because that is what these bugs are, aren’t they? Calling them bugs makes them sound more human, more acceptable. Nice trick! And also, maybe we should keep giving Claude Code to skilled engineers to make sure toxic rubbish does not enter production. And no, guardrail .md files are not the silver bullet either, but I’ll talk about that in another post.

 

Do you think Claude Code is different from any other LLM?

LinkedIn post…
