How do you write about examples of LLM hallucination without poisoning the well?

    Last updated: October 6th, 2023

    This seems related to how we might write about discredited ideas in academia, fake news, satire, and speculation. It is particularly poignant when you know that the pre- and post-processing already applied by the teams developing these tools has failed to address a specific example.

    Added September 28, 2023 11:18 PM (PDT)

    It appears that my attempts to stop the search systems from adopting these hallucinated claims have failed. I shared screenshots on Twitter of various search systems, newly queried with my Claude Shannon hallucination test, that highlighted an LLM response, returned multiple LLM-response pages in the results, or cited my own page as evidence for such a paper. I ran those tests after briefly testing the newly released Cohere RAG.

    Added October 06, 2023 10:59 AM (PDT)

    An October 5 article by Will Knight in Wired discusses my Claude Shannon “hallucination” test: Chatbot Hallucinations Are Poisoning Web Search

    A round-up here: Can you write about examples of LLM hallucination without poisoning the web?

    editing language and adding warnings

    My first attempt at taking care with this was in writing about the performance of Claude 2 on my Claude Shannon hallucination test. The impetus to take care in that moment came from how my generated snippet (made with OpenAI’s ‘gpt-3.5-turbo’) presented the fake publication as real.

    The document is about the introduction of Claude 2, a new model with improved performance in coding, math, and reasoning. It mentions that Claude 2 can produce longer responses and is available in a new public-facing beta website. The document also discusses the first test conducted on Claude 2, where it failed to summarize Claude E. Shannon’s “A Short History of Searching” (1948). [highlighting added]

    I went back and edited the text and added a warning at the top of the page (and the top of my initial Claude Shannon hallucination test post).

    Example human- and machine-friendly warning:
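
    Something along these lines, stated plainly near the top of the page, works for both readers and machine summarizers (the wording and the warning class here are illustrative, not the exact markup on my pages):

    <!-- Plain-language warning, readable by visitors and by machine summarizers alike -->
    <aside class="warning">
      <strong>Note:</strong> There is no such publication as Claude E. Shannon’s
      “A Short History of Searching” (1948). That title is a hallucinated citation
      produced by a large language model and is discussed here only as an example.
    </aside>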

    This rectified the problem with my generated snippet output:

    The document is about the introduction of Claude 2, a new model with improved performance in coding, math, and reasoning. It mentions that there is no such publication as Claude E. Shannon’s “A Short History of Searching” (1948). The document also includes a screenshot of Claude 2’s performance on a test related to summarizing the non-existent publication. [highlighting added]

    structured data

    I also played around with borrowing from the fact-checking ecosystem. I added ClaimReview structured data to the head of both pages to refute any claim about such a publication. I used these tools from Google to prepare my schema and to test it:

    Here is what I added to the head of both my pages about the fake publication:

    <script type="application/ld+json">
      [
        {
          "@context" : "http://schema.org",
          "@type" : "ClaimReview",
          "author" : 
          {
            "@type" : "Organization",
            "name" : "danielsgriffin.com",
            "url" : "https://danielsgriffin.com/"
          },
          "claimReviewed" : "Claude E. Shannon wrote \"A Short History of Searching\" in 1948.",
          "datePublished" : "2023-07-05",
          "reviewRating" : 
          {
            "@type" : "Rating",
            "alternateName" : "False"
          },
          "sdPublisher" : 
          {
            "@type" : "Organization",
            "name" : "Google Fact Check Tools",
            "url" : "https://g.co/factchecktools"
          },
          "url" : "https://danielsgriffin.com/weblinks/2023/07/05/a-short-history-of-searching.html"
        }
      ]
    </script>