Elon Musk’s Grok records lowest hallucination rate in AI reliability study

Max Headroom 2025.12.25 2 min read

A December 2025 study by casino games aggregator Relum has identified Elon Musk’s Grok as one of the most reliable AI chatbots for workplace use, boasting the lowest hallucination rate at just 8% among the 10 major models tested.

In comparison, market leader ChatGPT registered one of the highest hallucination rates at 35%, just behind Google’s Gemini, which registered a high hallucination rate of 38%. The findings highlight Grok’s factual prowess despite the AI model’s lower market visibility.

Grok tops hallucination metric

The research evaluated chatbots on hallucination rate, customer ratings, response consistency, and downtime rate. The chatbots were then assigned a reliability risk score from 0 to 99, with higher scores indicating bigger problems.

Grok achieved an 8% hallucination rate, 4.5 customer rating, 3.5 consistency, and 0.07% downtime, resulting in an overall risk score of just 6. DeepSeek followed closely with 14% hallucinations and zero downtime for a stellar risk score of 4. ChatGPT’s high hallucination and downtime rates gave it the top risk score of 99, followed by Claude and Meta AI, which earned reliability risk scores of 75 and 70, respectively.

Why low hallucinations matter

Relum Chief Product Officer Razvan-Lucian Haiduc shared his thoughts about the study’s findings. “About 65% of US companies now use AI chatbots in their daily work, and nearly 45% of employees admit they’ve shared sensitive company information with these tools. These numbers show well how important chatbots have become in everyday work.

“Dependence on AI tools will likely increase even more, so companies should choose their chatbots based on how reliable and fit they are for their specific business needs. A chatbot that everyone uses isn’t necessarily the one that works best for your industry or gives accurate answers for your tasks.”

In a way, the study reveals a notable gap between AI chatbots’ popularity and performance, with Grok’s low hallucination rate positioning it as a strong choice for accuracy-critical applications. This was despite the fact that Grok is not used as much by users, at least compared to more mainstream AI applications such as ChatGPT.

The post Elon Musk’s Grok records lowest hallucination rate in AI reliability study appeared first on TESLARATI.

Elon Musk’s Grok records lowest hallucination rate in AI reliability study

Elon Musk, News, Featured, xAI

TESLARATI

[crypto-donation-box type=”tabular” show-coin=”all”]