Can you write about examples of LLM “hallucination” without poisoning the web? How do you share about misinformation without spreading it? How do you link to the outputs of chatbots and generative search engines without deceiving folks?
My research on the lack of diligence from major search engines has gone from examining Google’s unmarked top results from white supremacist Holocaust-denier websites for Carole Cadwalladr’s [did the holocaust happen] query (Mulligan & Griffin 2018) to now looking at how Bing, Google, and others fail to mark the top rankings they give to results from chatbots and generative search engines (from melting eggs to imaginary Claude Shannons).
Note: I do not think search engines should de-index this sort of content. I think people can learn from seeing what others do with these new tools, whether that is showing what to do, what not to do, or why these tools should be refused or restricted in certain situations. But perhaps search engines should provide more notice/labeling/marking?
Round-up:
Here is my Sep 28 thread that prompted the article:
Oops. It looks like my links to chat results for my Claude Shannon hallucination test have poisoned @bing.
Here is a short thread of mine sharing the article:
“It gives no indication to the user that several of these results are actually sending you straight to conversations people have with LLMs,” Griffin says.
I keep thinking about this part.
While I wrote about Bard no longer failing the test on Sep 28, @nunohipolito shared a worse failure on Oct 05: Bard inserted Shannon’s name into a query about the fake publication.
rel attribute: bot-ixn: Highlighting LLM-Generated Content in Search Results: Do We Need a New Link Attribute?
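To make the idea concrete, here is a minimal sketch in Python, assuming a hypothetical rel value of bot-ixn along the lines the post floats: a small parser that collects links a page has labeled as pointing to chatbot conversations, which an indexer or browser extension could then mark. The URLs, class name, and the exact attribute handling are illustrative, not an existing standard.

```python
from html.parser import HTMLParser

class BotInteractionLinkFinder(HTMLParser):
    """Collect hrefs of <a> tags whose rel attribute includes the
    hypothetical "bot-ixn" token, i.e. links a publisher has labeled
    as pointing to an LLM/chatbot conversation."""

    def __init__(self):
        super().__init__()
        self.flagged_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_dict = dict(attrs)
        rel_values = (attr_dict.get("rel") or "").split()
        if "bot-ixn" in rel_values:
            self.flagged_links.append(attr_dict.get("href"))

# Example page: one ordinary link and one link labeled as a shared chat.
sample_html = """
<p>See <a href="https://example.com/article">the article</a> and
<a rel="nofollow bot-ixn" href="https://example.com/shared-chat">this shared chat</a>.</p>
"""

finder = BotInteractionLinkFinder()
finder.feed(sample_html)
print(finder.flagged_links)  # ['https://example.com/shared-chat']
```

A crawler that respected such a label could then decide to annotate the result in the search interface rather than de-index it, which is the distinction the note above is trying to draw.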