Remember when the biggest threat online was a computer virus? Those were simpler times. Today, we face a far more insidious digital danger: AI-powered social media bots. A study by researchers from the University of Washington and Xi’an Jiaotong University reveals both the immense potential and concerning risks of using large language models (LLMs) like ChatGPT in the detection and creation of these deceptive fake profiles.
Social media bots — automated accounts that can mimic human behavior — have long been a thorn in the side of platform operators and users alike. These artificial accounts can spread misinformation, interfere with elections, and even promote extremist ideologies. Until now, the fight against bots has been a constant game of cat and mouse, with researchers developing increasingly sophisticated detection methods, only for bot creators to find new ways to evade them.
Enter the era of large language models. These AI marvels, capable of understanding and generating human-like text, have shown promise in various fields. But could they be the secret weapon in the war against social media bots? Or might they instead become a powerful tool for creating even more convincing fake accounts?
The research team, led by Shangbin Feng, set out to answer these questions by putting LLMs to the test in both bot detection and bot creation scenarios. Their findings paint a picture of both hope and caution for the future of social media integrity.
“There’s always been an arms race between bot operators and the researchers trying to stop them,” says Feng, a doctoral student in the University of Washington’s Paul G. Allen School of Computer Science & Engineering, in a university release. “Each advance in bot detection is often met with an advance in bot sophistication, so we explored the opportunities and the risks that large language models present in this arms race.”
On the detection front, the news is encouraging. The researchers developed a novel approach using LLMs to analyze various aspects of user accounts, including metadata (like follower counts and account age), the text of posts, and the network of connections between users. By combining these different streams of information, their LLM-based system outperformed existing bot detection methods by up to 9.1% on standard datasets.
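To make that concrete, here is a minimal sketch of how such a multi-signal input might be assembled. The `Account` class, `build_detection_prompt` helper, and the `call_llm` callable are all illustrative stand-ins rather than the team’s actual code; the point is simply that metadata, post text, and network context get serialized into a single input for the language model.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Account:
    """Hypothetical container for the three signal types the study combines."""
    username: str
    followers: int
    following: int
    account_age_days: int
    recent_posts: List[str]
    neighbor_usernames: List[str]  # accounts this user interacts with

def build_detection_prompt(account: Account) -> str:
    """Fold metadata, post text, and network context into one prompt."""
    posts = "\n".join(f"- {p}" for p in account.recent_posts[:5])
    neighbors = ", ".join(account.neighbor_usernames[:10])
    return (
        "You are a social media bot detector. Classify the account below "
        "as 'bot' or 'human' and briefly justify your answer.\n\n"
        f"Metadata: {account.followers} followers, "
        f"{account.following} following, "
        f"account is {account.account_age_days} days old.\n"
        f"Recent posts:\n{posts}\n"
        f"Interacts with: {neighbors}\n"
    )

def classify(account: Account, call_llm) -> str:
    # call_llm is a stand-in for whatever chat-completion API you use;
    # it is not part of the paper's released code.
    return call_llm(build_detection_prompt(account))
```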
What’s particularly exciting about this approach is its efficiency. While traditional bot detection models require extensive training on large datasets of labeled accounts, the LLM-based method achieved its superior results after being fine-tuned on just 1,000 examples. This could be a game-changer in a field where high-quality, annotated data is often scarce and expensive to obtain.
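For readers who want a feel for what a 1,000-example fine-tune looks like in practice, here is a rough sketch using the Hugging Face Trainer. It substitutes a small DistilBERT classifier for the paper’s LLM, and the texts and labels are placeholders; only the scale of the training data is meant to match the study.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Placeholder data: in reality these would be 1,000 serialized account
# descriptions (metadata + posts + network context) with human/bot labels.
texts = ["serialized account description goes here"] * 1000
labels = [0] * 1000  # 0 = human, 1 = bot

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

ds = Dataset.from_dict({"text": texts, "label": labels})
ds = ds.map(lambda ex: tok(ex["text"], truncation=True,
                           padding="max_length", max_length=256))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bot-detector",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=ds,
)
trainer.train()
```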
However, the study’s findings weren’t all rosy. The researchers also explored how LLMs might be used by those on the other side of the battle — the bot creators themselves. By leveraging the language generation capabilities of these AI models, they were able to develop strategies for manipulating bot accounts to evade detection.
These LLM-guided evasion tactics proved alarmingly effective. Applied to known bot accounts, they reduced the detection rate of existing bot-hunting algorithms by up to 29.6%. The manipulations ranged from subtle rewrites of bot-generated text, making it read more like a human’s, to strategic changes in which accounts a bot follows or unfollows.
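Conceptually, the text-rewriting side of these tactics amounts to a loop that defenders now have to anticipate: rewrite a post, re-score it against a detector, and repeat until the detector is fooled. The sketch below shows only that shape; `call_llm` and `detector_score` are hypothetical callables, and no real model or detector API is assumed.

```python
def evade_text_detector(post: str, call_llm, detector_score,
                        threshold: float = 0.5, max_rounds: int = 5) -> str:
    """Abstract rewrite-and-rescore loop of the kind the study describes.

    detector_score is assumed to return a bot probability in [0, 1];
    call_llm is assumed to return a paraphrased post.
    """
    for _ in range(max_rounds):
        if detector_score(post) < threshold:
            break  # detector now rates the post as likely human
        post = call_llm(
            "Rewrite the following post so it reads like a casual human "
            f"author, preserving its meaning:\n{post}"
        )
    return post
```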
Perhaps most concerning is the potential for LLMs to create bots that are not just evasive but truly convincing. The study demonstrated that LLMs could generate user profiles and posts that capture nuanced human behaviors, making them far more difficult to distinguish from genuine accounts.
This dual-use potential of LLMs in the realm of social media integrity presents a challenge for platform operators, researchers, and policymakers alike. On one hand, these powerful AI tools could revolutionize our ability to identify and remove malicious bot accounts at scale. On the other, they risk becoming a sophisticated weapon in the arsenal of those seeking to manipulate online discourse.