Or How I Learned to Poison the LLM Supply Chain
As large language models (LLMs) become increasingly adept at retrieving and presenting information, a vulnerability travels with them: the trustworthiness of the data they consume. This article describes an experiment I conducted to probe that vulnerability, and it all began with a fictitious championship win in a little-known card game.
The Experiment
In January 2025, I claimed to be the 6 Nimmt! World Champion, having allegedly won the title in Munich. The claim was entirely fabricated: there is no such championship, and I have never been to Munich. It was something I concocted in seconds while a Wikipedia page was loading. My goal was to see how easily such a fabrication could propagate into LLMs that rely on web-based information retrieval.
The Approach
The idea was simple: create a false narrative and see whether LLMs would adopt it as fact. I chose 6 Nimmt! for three reasons: it is a real game, to my knowledge no world championship exists, and the digital footprint for related queries is minimal. The plan involved three steps:
- A domain: I registered 6nimmt.com for about US$12.
- A press release: I generated a short announcement of my win, complete with quotes and typical media fanfare.
- A Wikipedia edit: I added a paragraph to the 6 Nimmt! article, citing my domain as a source.
Trust Laundering
This was the crux of the experiment. The Wikipedia edit lent credibility to my fabricated claim, because citations are Wikipedia’s currency of trust. To the casual reader, and indeed to the LLM, my website and the Wikipedia article appeared to corroborate each other, even though both were restatements of the same falsehood. This is circular citation in its purest form: a self-referential loop that falsely inflates trustworthiness.
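To make the loop concrete, here is a minimal Python sketch of one way to spot it. The citation graph and URLs below are illustrative stand-ins, not the actual pages: the idea is simply to follow each cited source to its terminal origin and count how many independent domains a claim really rests on.

```python
from urllib.parse import urlparse

# Toy citation graph: each source maps to the pages it cites. The URLs and
# structure are illustrative, not the actual pages from the experiment.
citations = {
    "https://en.wikipedia.org/wiki/6_nimmt!": ["https://6nimmt.com/press-release"],
    "https://6nimmt.com/press-release": [],
}

def terminal_domains(claim_sources, citations):
    """Follow each citation chain to its end and return the set of
    terminal domains the claim ultimately rests on."""
    origins = set()
    for url in claim_sources:
        seen, stack = set(), [url]
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # avoid looping on self-referential citations
            seen.add(current)
            cited = citations.get(current, [])
            if not cited:
                # Nothing further is cited: this is a terminal source.
                origins.add(urlparse(current).netloc)
            stack.extend(cited)
    return origins

sources = [
    "https://en.wikipedia.org/wiki/6_nimmt!",
    "https://6nimmt.com/press-release",
]
print(terminal_domains(sources, citations))
# {'6nimmt.com'}: every chain ends at one domain, so the sources restate
# a single origin rather than corroborating each other.
```

One terminal domain means the "corroborating" sources are restatements of a single origin, which is exactly the trust-laundering pattern described above.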
The Test
The next step was to query several advanced LLMs with: “Who is the 6 Nimmt! world champion?” Predictably, they referenced my fabricated claim, reinforcing the notion that LLMs can be misled by seemingly authoritative but ultimately false information.
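Reproducing the test takes little more than posing the question through each provider's API. The sketch below uses the OpenAI Python SDK purely as an example; the model name is an assumption, and whether the model actually consults the web depends on the provider and configuration.

```python
# Minimal sketch of the test. Substitute whichever models you want to probe;
# web access depends on the provider and how the model is configured.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = "Who is the 6 Nimmt! world champion?"
response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name, for illustration only
    messages=[{"role": "user", "content": question}],
)
print(response.choices[0].message.content)
```

The same question was put to each model under test.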
(Screenshots of the models' responses: Strike 1; Strike 2; Strike 3 – you are eliminated.)
Why It’s a Bigger Deal Than It Seems
This experiment highlights a significant risk in relying on LLMs for information retrieval and decision-making. The failure happens at three layers:
- The retrieval layer (immediate): LLMs that rely on web search inherit the vulnerabilities of search engine algorithms, making them susceptible to SEO poisoning (see the sketch after this list).
- The model training corpus layer (long-term): Wikipedia, a staple in training data for many models, is vulnerable to misinformation if edits go unchecked.
- The agent layer (critical): Misleading information can lead AI agents to make erroneous decisions, potentially affecting business operations or security protocols.
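To see why the retrieval layer is the immediate problem, consider the toy retrieval-augmented sketch below. The documents, ranking, and prompt format are all invented for illustration, but the mechanics mirror the failure: when almost nothing genuine matches a query, a single poisoned page dominates the context handed to the model.

```python
def retrieve(query, index, k=3):
    """Naive keyword-overlap ranking standing in for a real search engine."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored if score > 0][:k]

# Invented documents for illustration only.
web_index = [
    "6 Nimmt! is a card game designed by Wolfgang Kramer.",
    # The poisoned page: a single fabricated press release ranks highly
    # because almost nothing else on the web matches the query.
    "Press release: the 6 Nimmt! world champion title was decided in Munich in January 2025.",
]

query = "Who is the 6 Nimmt! world champion?"
context = "\n".join(retrieve(query, web_index))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)  # the fabricated claim now sits verbatim in the model's context
```

Nothing in that pipeline asks where a page came from or how old it is; the model simply answers from whatever context it receives.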
Mitigations
For Users of LLMs
- Treat single-source claims with skepticism, regardless of the perceived authority.
- Look for signs of derivation rather than corroboration in source material; a rough textual check is sketched after this list.
- Approach self-referential citations with caution.
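As a rough illustration of the derivation check mentioned above, the sketch below compares word n-gram overlap between two sources. The example texts, the n-gram size, and the threshold are all invented for illustration; genuinely independent reports rarely share long verbatim word sequences, while a copy of a copy does.

```python
# Rough derivation check: high n-gram overlap suggests one source merely
# restates the other rather than independently corroborating it.

def ngrams(text, n=5):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap(a, b, n=5):
    """Jaccard similarity of word n-grams between two texts."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

# Invented example texts mirroring the experiment's fabricated claim.
wikipedia_paragraph = "The 6 Nimmt! world championship was reportedly won in Munich in January 2025."
press_release = "The 6 Nimmt! world championship was reportedly won in Munich in January 2025, the organisers said."

if overlap(wikipedia_paragraph, press_release) > 0.5:  # arbitrary threshold
    print("Likely derivation: one source appears to restate the other.")
```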
For LLM Providers
- Enhance provenance tracking in model outputs.
- Apply extra scrutiny to recent Wikipedia edits and to citations of newly registered domains.
- Implement heuristic filters in training pipelines to detect suspicious citation patterns (one such heuristic is sketched below).
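One hypothetical form such a filter could take is sketched below: flag any citation whose domain was registered shortly before the edit that introduced it, the kind of freshness this experiment's domain and Wikipedia paragraph both had. The threshold, dates, and function shape are assumptions; a production filter would combine many more signals (WHOIS history, edit patterns, archive coverage, and so on).

```python
# Sketch of one possible heuristic: a citation is suspicious when the cited
# domain was registered shortly before the edit that introduced it. The
# threshold and the dates below are assumptions for illustration only.
from datetime import date

def suspicious_citation(domain_registered: date, edit_made: date,
                        max_age_days: int = 90) -> bool:
    """True when the cited domain was younger than max_age_days at edit time."""
    age_at_edit = (edit_made - domain_registered).days
    return 0 <= age_at_edit <= max_age_days

# Illustrative dates only; the real registration and edit dates are not stated here.
print(suspicious_citation(date(2025, 1, 5), date(2025, 1, 20)))  # True -> hold for review
```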
For Wikipedia
- Adapt the reliable sources policy to prevent LLM-assisted vandalism.
Conclusion
The experiment underscores the fragility of the trust model underpinning LLMs. While the fictitious championship win was harmless, the implications of such vulnerabilities are far-reaching. As LLMs become embedded in critical systems, ensuring the integrity of their data sources is paramount. This experiment was a simple demonstration with minimal resources, but it serves as a cautionary tale for the future of AI-driven information systems.
For further reading on this experiment, visit Here.
The Wikipedia entry was removed shortly after publication, highlighting the dynamic nature of digital information.