‘Grok, verify’: Why AI chatbots shouldn’t be considered reliable fact-checkers

At the height of the recent India-Pakistan conflict, a parallel battle unfolded online – a battle of narratives. While independent fact-checkers and the government-run Press Information Bureau scrambled to debunk fake news, unsubstantiated claims, and AI-generated misinformation, many users turned to AI chatbots like Grok and Ask Perplexity to verify claims circulating on X.
Here is an example: On May 10, India and Pakistan agreed to cease all military activity — on land, air and sea — at 5 PM. Responding to user queries the next day, Grok called it a “US-brokered ceasefire”. However, on May 10 itself, when a user asked about Donald Trump’s role in mediating the ceasefire, Grok had added the missing context, saying, “Indian officials assert the ceasefire was negotiated directly between the two countries’ military heads. Pakistan acknowledges US efforts alongside others,” presenting a more rounded version of events.
Such inconsistencies point to a deeper problem with AI responses. Experts warned that while AI chatbots can provide accurate information, they are far from reliable “fact-checkers”. They can respond in real time, but in fast-evolving situations they often add to the chaos rather than cut through it.
Prateek Waghre, an independent tech policy researcher, attributed this to the “non-deterministic” nature of AI models. “The same question won’t always give you the same answer,” he said. “It depends on a setting called ‘temperature’.”
Large language models (LLMs) work by assigning probabilities to possible next words and picking one. The “temperature” setting controls how much the model’s responses can vary. At a lower temperature, the model sticks close to the most probable next word, producing less variable, more predictable responses; at a higher temperature, it is freer to pick less likely words, producing more varied, creative and less predictable output.
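To make the idea concrete, here is a minimal, illustrative sketch of temperature-scaled sampling. It is not code from Grok, Perplexity or any real chatbot; the word scores and the helper function are invented for the example. The model’s scores for candidate next words are divided by the temperature before being turned into probabilities, so a low temperature concentrates the choice on the likeliest word while a high temperature flattens the odds.

```python
import math
import random

def sample_next_word(logits, temperature=1.0):
    """Pick the next word from model scores ("logits"), scaled by temperature.

    Hypothetical helper for illustration: a low temperature nearly always
    returns the top-scoring word; a high temperature gives more varied picks.
    """
    words = list(logits.keys())
    # Divide each score by the temperature before applying softmax.
    scaled = [logits[w] / temperature for w in words]
    max_s = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - max_s) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(words, weights=probs, k=1)[0]

# Toy scores for the next word after "The ceasefire was ..."
logits = {"announced": 3.0, "brokered": 2.5, "violated": 1.0, "celebrated": 0.5}

print(sample_next_word(logits, temperature=0.1))  # nearly always "announced"
print(sample_next_word(logits, temperature=1.5))  # noticeably more variable
```

Run at a low temperature, the toy example returns the same word almost every time; at a higher temperature, repeated runs can land on different words — the same variability that means two users asking a chatbot the same question may get different answers.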
According to Waghre, what makes the use of AI bots for fact-checking claims more worrisome is that “they are not objectively bad.” “They are not outright terrible. On some occasions, they do give you accurate responses, which means that people tend to have a greater amount of belief in their capability than is warranted,” he said.
What makes AI chatbots unreliable?
1. Hallucinations
The term “hallucination” describes cases in which an AI chatbot generates false or fabricated information and presents it as fact.
Alex Mahadevan, director of MediaWise, said AI chatbots like Grok and Ask Perplexity “hallucinate facts, reflect online biases and tend to agree with whatever the user seems to want,” and hence, “are not reliable fact-checkers.”
“They don’t vet sources or apply any editorial standard,” Mahadevan said. MediaWise, a digital literacy programme run by Poynter, a US-based non-profit journalism school, helps people spot misinformation online.
xAI admits to this in the “terms of service” available on its website. “Artificial intelligence is rapidly evolving and is probabilistic in nature; it may therefore sometimes result in Output that contains “hallucinations,” may be offensive, may not accurately reflect real people, places or facts, may be objectionable, inappropriate, or otherwise not be suitable for your intended purpose,” the company states.
Perplexity’s terms of service, too, carry a similar disclaimer: “You acknowledge that the Services may generate Output containing incorrect, biased, or incomplete information.”
2. Bias and lack of transparency
Mahadevan flagged another risk with AI chatbots — inherent bias.
“They are built and beholden to whoever spent the money to create them. For example, just yesterday (May 14), X’s Grok was caught spreading misleading statements about ‘white genocide’, which many attribute to Elon Musk’s views on the racist falsehood,” he wrote in an e-mail response to indianexpress.com.
The “white genocide” claims gained traction after US President Donald Trump granted asylum to 54 white South Africans earlier this year, citing genocide and violence against white farmers. The South African government has strongly denied these allegations.
Waghre said that users assume AI is objective because it’s not human, and that is misleading. “We don’t know to what extent or what sources of data were used for training them,” he said.
Both xAI and Perplexity say their tools rely on real-time internet searches; Grok also taps into public posts on X. But it is unclear how they assess credibility or filter misinformation. Indianexpress.com reached out to both firms to understand this better, but had not received a response at the time of publishing.
3. Scale and speed
Perhaps the most concerning issue is the scale at which these chatbots operate.
With Grok embedded directly into X, AI-generated errors can be amplified instantly to millions. “We’re not using these tools to assist trained fact-checkers,” Waghre said. “They’re operating at population scale – so their mistakes are too.”
Waghre also said these AI chatbots are likely to learn and improve from their mistakes, but “you have situations where they are putting out incorrect answers, and those are then being used as further evidence for things.”
What AI firms should change
Mahadevan questioned the “design choice” that AI firms employ. “These bots are built to sound confident even when they’re wrong. Users feel they are talking to an all-knowing assistant. That illusion is dangerous,” he said.
He recommended stronger accuracy safeguards – chatbots should refuse to answer if they can’t cite credible sources, or flag “low-quality and speculative responses”.
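As a rough illustration of the kind of safeguard Mahadevan describes, here is a minimal sketch assuming a hypothetical pipeline in which retrieved sources arrive with a credibility score. The `Source` class, the `answer_or_refuse` function and the threshold are all invented for the example, not drawn from any real product.

```python
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    credibility: float  # hypothetical 0-1 score from an upstream vetting step

CREDIBILITY_THRESHOLD = 0.7  # illustrative cutoff, not from any real chatbot

def answer_or_refuse(draft_answer: str, sources: list[Source]) -> str:
    """Return the draft answer only if it can cite at least one credible source;
    otherwise refuse rather than state the claim as fact."""
    credible = [s for s in sources if s.credibility >= CREDIBILITY_THRESHOLD]
    if not credible:
        return ("I can't verify this claim against credible sources, "
                "so I won't state it as fact.")
    cited = ", ".join(s.url for s in credible)
    return f"{draft_answer}\nSources: {cited}"

# With only a low-credibility source, the sketch refuses instead of answering.
print(answer_or_refuse(
    "The ceasefire took effect at 5 PM on May 10.",
    [Source("https://example.org/report", 0.4)],
))
```

The point of the sketch is the design choice, not the specifics: the bot declines, or flags its answer as speculative, instead of sounding confident without support.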
Vibhav Mithal, a lawyer specialising in AI and intellectual property, has a different take. He insisted there is no need to write off AI chatbots entirely since their reliability as fact-checkers depends largely on context, and more importantly, on the quality of data they’ve been trained on. But responsibility, in his opinion, lies squarely with the companies building these tools. “AI firms must identify the risks in their products and seek proper advice to fix them,” Mithal said.
What can users do?
Mithal stressed that this isn’t about AI versus human fact-checkers. “AI can assist human efforts, it’s not an either/or scenario,” he said. Concurring, Mahadevan listed two simple steps users can take to protect themselves:
Always double-check: If something sounds surprising, political or too good to be true, verify it through other sources.
Ask for sources: If the chatbot can’t point to a credible source or just name-drops vague websites, be skeptical.
According to Mahadevan, users should treat AI chatbots like overconfident interns: useful, fast, but not always right. “Use them to gather context, not confirm truth. Treat their answers as leads, not conclusions,” he said.