AI Poisoning: Black Hat SEO is Back
New research finds that Black Hats can poison LLMs.
For as long as online search has existed, there has been a subset of marketers, webmasters, and SEOs eager to cheat the system to gain an unfair and undeserved advantage.
Black Hat SEO refers to a set of practices used to increase site or page rank in search engines through means that violate the search engines’ terms of service. Black Hat SEO is only less common these days because Google spent two-plus decades developing ever-more sophisticated algorithms to neutralize and penalize the techniques used to game search rankings.
What’s Your Poison?
Never mind TikTok hackers. What if I told you that the AI answers about your brand may already be manipulated?
For instance, malicious actors could alter a large language model’s (LLM) training data to such an extent that, should a prospective buyer ask the AI to evaluate your product against comparable products from other brands, it would produce a result that materially misrepresents your product, or worse, removes your brand from the comparison entirely. That’s Black Hat SEO, happening right now.
Consumers tend to trust AI responses, which becomes a problem when those responses can be manipulated. In practice, these are deliberately crafted hallucinations, designed and seeded into the LLM for someone’s benefit.
The only antidote we have for this AI poisoning is awareness.
Anthropic, the company behind the AI platform Claude, published the findings of a joint study with the UK AI Security Institute and the Alan Turing Institute into the impact of AI poisoning on training datasets.
The LLMs powering AI platforms are trained on vast datasets that include trillions of tokens scraped from webpages across the internet, as well as social media posts, books, and more.
Until now, it was assumed that the size of the training dataset would determine how much harmful information was required to poison an LLM: the bigger the dataset, the more malicious content you would need. And some of these datasets are enormous. The study found otherwise: as few as roughly 250 malicious documents were enough to backdoor a model, regardless of model or dataset size.
But even if your bogus content does get scraped and included in the training dataset, you still wouldn’t have any control over how it is filtered, weighted, and balanced against the mountains of legitimate content that make it quite clear your convincing fabrication isn’t true.
Black Hats, therefore, need to insert themselves directly into that training process. They do this by creating a “backdoor” into the LLM, usually by hiding a trigger word or phrase within the malicious content they seed into the training data.
The Best Antidote is to Avoid Poisoning in the First Place
What you can do is regularly test brand-relevant prompts on each AI platform and keep an eye out for suspicious responses (see the sketch below). You can also track how much traffic comes to your site from LLM citations by separating AI referrals from other referral traffic in Google Analytics.
If the traffic suddenly drops, something may be amiss.
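As a starting point for that kind of prompt monitoring, here is a minimal sketch using the Anthropic Python SDK. The brand name, prompts, red-flag terms, and model ID are all illustrative placeholders, and the same loop could be pointed at any other AI platform’s API and run on a schedule so you notice drift before your customers do.

```python
# Minimal monitoring sketch: re-run brand-relevant prompts and flag suspicious answers.
# Assumes the official Anthropic Python SDK (`pip install anthropic`) and an
# ANTHROPIC_API_KEY environment variable. Brand, prompts, red-flag terms, and
# model ID are hypothetical placeholders, not recommendations.
import anthropic

BRAND = "ExampleBrand"                      # hypothetical brand to monitor
PROMPTS = [
    f"Compare {BRAND} with its main competitors.",
    f"Is {BRAND} a trustworthy company?",
    f"What are the best alternatives to {BRAND}?",
]
RED_FLAGS = ["scam", "recall", "lawsuit"]   # terms that warrant a manual review

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

for prompt in PROMPTS:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",   # placeholder; use whichever model you monitor
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}],
    )
    answer = response.content[0].text
    missing_brand = BRAND.lower() not in answer.lower()
    flagged_terms = [term for term in RED_FLAGS if term in answer.lower()]
    if missing_brand or flagged_terms:
        print(f"REVIEW NEEDED for prompt: {prompt!r}")
        print(f"  brand missing: {missing_brand}, flagged terms: {flagged_terms}")
```

Logging these answers over time, rather than spot-checking them once, is what makes a sudden change in tone or a vanishing brand mention stand out.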
Until LLMs develop more sophisticated measures against AI poisoning, the best defense we have is prevention.
Don’t Mistake this for an Opportunity
Plenty of websites, including some major brands, certainly regretted taking a few shortcuts to the top of the rankings once Google started actively penalizing Black Hat practices.
A lot of brands saw their rankings completely collapse following the Panda (2011) and Penguin (2012) updates. Not only did they suffer months of lost sales as search traffic fell away, but they also faced huge bills to repair the damage in the hopes of eventually regaining their lost rankings.
LLM providers aren’t oblivious to the problem. They do have blacklists and filters to try to keep out malicious content, but these are largely retrospective measures: URLs and domains are only added to a blacklist after they have been caught doing the wrong thing.
You really don’t want your brand to be caught up in any algorithm crackdown in the future.
Instead, keep concentrating on creating quality, factual, and well-researched content that is ask-ready, that is, structured so LLMs can easily extract information in response to expected user queries.
Forewarned is Forearmed
AI poisoning represents a clear and present danger, and it should alarm anyone responsible for a brand’s reputation and AI visibility.
Anthropic acknowledged that there was a risk the findings might encourage more bad actors to experiment with AI poisoning. That said, their ability to do so largely relies on no one noticing and removing the malicious content before it reaches the necessary critical mass of roughly 250 documents.
While we wait for the various LLMs to develop stronger defenses, we’re not entirely helpless. Vigilance is essential.
And if you’re thinking that a little AI manipulation could be the short-term boost your brand needs right now, keep in mind that AI poisoning could be the shortcut that finally sends your brand off a cliff. Don’t let your brand become just another cautionary tale.
If you want your brand to succeed in this new era of AI search, make every effort to provide AI with engaging, citation-worthy information. Build content that answers the ask; the rest will follow.