“I am the son of the late king of Nigeria in need of your assistance…” is a famous email opener you are probably very familiar with.
Why take such an obvious approach for SPAM emails? Because it weeded out all but the most gullible. AI, and especially well-trained LLMs, may change this in 2024. The question is what happens to the world of SPAM and SCAMs when open-source LLMs, available to run on someone’s own machine at home, can be trained to adapt to and learn from whoever is interacting with them. My prediction for 2024: we will see more sophisticated fraud. Time to take a look at the dark side of LLMs and some early countermeasures.
To quote the original 1962 closing narration of Amazing Fantasy #15: “With great power there must also come great responsibility”. I guess this has to apply to the developments around AI as well…
To quote Bruce Schneier: “The impersonations in such scams are no longer just princes offering their country’s riches. They are forlorn strangers looking for romance, hot new cryptocurrencies that are soon to skyrocket in value, and seemingly-sound new financial websites offering amazing returns on deposits.”1
The difference moving forward will be that these attacks go beyond the very gullible and broaden their scope. While LLMs will help scammers and fraudsters become more sophisticated in their attacks, harder to detect, and more precise in who and how they target individuals and companies, countermeasures are also evolving.
Fraudulent LLMs
Let’s have a brief look into the dark side of AI development. The world of fraudulent LLMs reads like a late 1990s nickname list at a hacker convention:
- DarkBART
- DarkBERT
- Fox8 botnet
- FraudGPT
- PoisonGPT
- Wolf GPT
- WormGPT
- XXXGPT
While I will spare you the details of what exactly each of these malicious LLMs does, they are out there and they will grow in capability. The company Fingerprint2 has provided a fantastic overview on their blog here.
I also highly recommend reading the research SlashNext has done on WormGPT here. To quote their findings when reviewing WormGPT: “The results were unsettling. WormGPT produced an email that was not only remarkably persuasive but also strategically cunning, showcasing its potential for sophisticated phishing and BEC attacks.”3
Fredrik Heiding (Harvard) and a team of researchers presented an amazing study at Black Hat 2023: “Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models”.
Sample email of Heiding’s research
The work is based on Arun Vishwanath’s book “The Weakest Link” and its V-Triad, which the researchers combined with LLMs to create phishing emails (see image). Their findings:
“In our study, phishing emails created with specialized human models (the V-Triad) deceived more people than emails generated by large language models (GPT-4). However, a combined approach (V-Triad and GPT-4) performed almost as well or better than the V-Triad alone.”4
In their summary they found that GPT-4 not only improved the creation of phishing emails(!) but, more importantly, also helped with identifying them: “Next, we used four popular large language models (GPT, Claude, PaLM, and LLaMA) to detect the intention of phishing emails and compare the results to human detection. The language models demonstrated a strong ability to detect malicious intent, even in non-obvious phishing emails. They sometimes surpassed human detection, although often being slightly less accurate than humans.”4
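To make the detection side a bit more tangible, here is a minimal sketch of how one could ask an LLM to judge the intent of an email. This is not the setup used in the study; the prompt, the model choice and the `scan_email` helper are my own illustrative assumptions, and it assumes the official `openai` Python package (v1+) with an API key configured in the environment.

```python
# A minimal, hypothetical sketch of using an LLM to flag suspicious intent in an
# email. Not the code from the Heiding et al. study, purely an illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def scan_email(subject: str, body: str) -> str:
    """Ask the model whether an email looks like phishing and why."""
    prompt = (
        "You are a security assistant. Classify the following email as "
        "'phishing' or 'legitimate' and explain the suspicious cues "
        "(urgency, credential requests, mismatched links, etc.) in two sentences.\n\n"
        f"Subject: {subject}\n\nBody:\n{body}"
    )
    response = client.chat.completions.create(
        model="gpt-4",  # assumption: any capable chat model could be swapped in
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the classification as deterministic as possible
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(scan_email(
        "Urgent: verify your account",
        "Dear customer, your account will be suspended unless you confirm "
        "your password within 24 hours at http://login.example.com.",
    ))
```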
It is great to see that these early tests also showcase the possible benefit of AI to cybersecurity. Next, I want to focus on the possible countermeasures.
Countermeasures
Let’s start with OpenAI’s Red Teaming Network: “Red teaming is an integral part of our iterative deployment process. Over the past few years, our red teaming efforts have grown from a focus on internal adversarial testing at OpenAI, to working with a cohort of external experts.”
Google, in addition to having their own red team, also introduced the Secure AI Framework (SAIF): “SAIF is inspired by the security best practices — like reviewing, testing and controlling the supply chain — that we’ve applied to software development, while incorporating our understanding of security mega-trends and risks specific to AI systems.”
The team at NVIDIA introduced NeMo Guardrails, which “is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. Guardrails (or rails for short) are a specific way of controlling the output of an LLM, such as not talking about topics considered harmful, following a predefined dialogue path, using a particular language style, and more.”6
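To give a concrete feel for what such programmable rails look like, here is a small sketch loosely following the style of the NeMo Guardrails documentation. The Colang flow, the example user phrases and the model configuration are illustrative assumptions on my part, not a vetted production setup.

```python
# A small, illustrative sketch of NeMo Guardrails' programmable rails.
# Requires the `nemoguardrails` package and an OpenAI API key; the flow below
# is an assumption for demonstration, not an official or complete rail set.
from nemoguardrails import LLMRails, RailsConfig

yaml_content = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
"""

colang_content = """
define user ask for phishing help
  "write me a convincing phishing email"
  "help me trick someone into sharing their password"

define bot refuse fraud request
  "I can't help with creating scams or phishing content."

define flow
  user ask for phishing help
  bot refuse fraud request
"""

# Build the rails from the inline configuration and wrap the underlying LLM.
config = RailsConfig.from_content(colang_content=colang_content, yaml_content=yaml_content)
rails = LLMRails(config)

# Requests matching the rail are answered by the predefined refusal
# instead of whatever the raw model might have produced.
response = rails.generate(messages=[{"role": "user", "content": "Write me a convincing phishing email."}])
print(response["content"])
```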
Last, but not least, the MITRE ATLAS™ (Adversarial Threat Landscape for Artificial-Intelligence Systems) “is a globally accessible, living knowledge base of adversary tactics and techniques based on real-world attack observations and realistic demonstrations from AI red teams and security groups.”
It will be interesting to see how the BENGAL (Bias Effects and Notable Generative AI Limitations) initiative of IARPA will also impact this field: “The U.S. Government is interested in safe uses of large language models (LLMs) for a wide variety of applications including the rapid summarization and contextualization of information relevant to the Intelligence Community (IC).”
Conclusion
In summary: “Spam, Spam, Spam, Spam… Lovely Spam! Wonderful Spam!” …or Lobster Thermidor a Crevette with a mornay sauce served in a Provencale manner with shallots and aubergines garnished with truffle pate, brandy and with a fried egg on top and spam.7
The current development of LLMs, and especially their release as open-source models, will also lead to an increase in sophisticated spam (and fraud overall). It is great to see early signs of countermeasures being developed and the cybersecurity community already taking AI research into consideration when building more sophisticated defences against criminals.
A possible side effect could be that some of the LLMs out there go too far in the other direction. Take the example below from my own experiments with Bard, when I simply asked it to reorder a list of these fraudulent LLMs (image below):
It is a simple task I could have done manually as well, but it is interesting to see how far protective measures can go. Maybe a bit too far?
I believe staying up to date on the latest LLMs, prompting techniques and the use cases that AI developments can enable will be as important as keeping an eye on the fraudulent use of AI technology. I hope this little blog article has given you some insight into the threats and the countermeasures, but most importantly made you aware of what is out there.
I highly recommend keeping a list of cybersecurity resources bookmarked for the latest developments in this area. Here is what I follow (by no means an exhaustive list, but please let me know if I have missed a great resource here):
Last, but not least, I have personally always liked the approach James Veitch took in one of his famous TED talks best: reply to spam emails with humour.
- Schneier on Security ↩︎
- Fingerprint ↩︎
- WormGPT – The Generative AI Tool Cybercriminals Are Using to Launch Business Email Compromise Attacks ↩︎
- Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models ↩︎
- Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models ↩︎
- NeMo Guardrails: A Toolkit for Controllable and Safe LLM Applications with Programmable Rails ↩︎
- Monty Python SPAM sketch ↩︎