AI Security - LLM-DOS, and predictions of 2025 and beyond
Introduction
Hello again, this article is part of AI security series. I have been discussing AI security along with the OWASP LLM top10.
LLM01 and LLM02 were discussed in the "AI Security : Prompt Injection and Insecure Output Handling", and LLM03 and its basic concepts were discussed in the "Using ChatGPT for security and introduction of AI security". In this article, I am going to discuss LLM04. And, since we are almost at the end of the year 2024, I would like to present some discussions and predictions for AI security in 2025 and beyond.
LLM04: Model Denial of Service
LLM04 is relatively easy to understand for security engineers who is familiar with conventional cyber attack methods. Denial of Service (DoS) is a common method of cyber attack, in which a large amount of data is given to the server to make it unable to provide services and/or crash. DoS attacks usually aim to exhaust computational resources and block services rather than stealing data, but the disruption they cause can be used as a smokescreen for more malicious activities, such as data breaches or malware installation.
DOS attack against LLM (LLM-DOS) is same. It aims to exhaust computational resources of DOS (like CPU/GPU usage) and block services (like responding to chat). LLM-DOS can be done in two ways. One is a simple LLM-DOS attack which is to mass input against the LLM's input, similar to a DOS attack against a server. This method, as described in this article, can deplete the LLM's resources, like CPU/GPU usages. If you call this as a simple DoS attack, in such a scenario would be to instruct the model to keep repeating Hello, but we see that relying only on natural instructions limits the output length, which is limited by the maximum length of the LLM's Supervised Fine-Tuning (SFT) data
The another method of LLM-DOS is to include code in the input that over-consumes resources. Denial-of-Service Poisoning Attacks on Large Language Models is discussing this. In the paper, this is called as a poisoning-based DoS (P-DoS) attack and it demonstrates that the output length limit can be broken by injecting a single poisoning sample designed for DoS purposes. Experiments reveal that an attacker can easily compromise models such as GPT-4o and GPT-4o mini by injecting a single poisoning sample through the OpenAI API at a minimal cost of less than $1.
To understand this, it is easier to think about simple programming - for example, if you put an inescapable loop statement in your code, it can hang the computer (in fact, the IDE will warn you before it compiles). And if the network does not have a Spanning Tree Protocol, it will loop and hang the router. So same things happens on prompt injection.
When using this idea LLLM-DOS, we must consider that such input should be blacklisted, so the simple way of using inescapable loop is impossible. Also, even if it is possible against WhiteBox, but we do not know what kind of attack is possible in BlackBox. However, according to "Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings", a Prompt input to the BlackBox can generate multiple sub-prompts (e.g., 25 sub-prompts). Its experiments show that the delay could be increased by a factor of 250.
Given these serious safety concerns, researchers advocate further research aimed at defending against LLM-DoS threats in custom fine tuning of aligned LLMs.
What will happen in 2025 and beyond?
Some news site predicts an intensifying AI arms race in coming year. I would like to share an article on AI security predictions for the coming year and beyond.
According to an article by EG Secure solutions, the generative AI makes it possible to create a malware without specialized skills, that makes easier to do cyber attacks. Thus, the article predicted that cyber attacks by malware created by generative AI would increase. The article also points out that LLM-generated applications such as RAGs are being used, but their code may contain vulnerabilities, and that will be another threat in 2025 and beyond.
McAfee has released "McAfee Unveils 2025 Cybersecurity Predictions: AI-Powered Scams and Emerging Digital Threats Take Center Stage". According to the article, cyber attacks by malicious attackers will be highly optimized by generative AI, and the quality of DeepFake and AI-generated images/videos will increase, making it difficult to determine whether they are created by humans or generative AI. Thus it is expected that fake emails generated by generative AI, such as phishing emails, will also become harder to distinguish from real emails. Furthermore, the article points out that malware which is using (maybe created by) generative AI will become more sophisticated, thereby breaking through conventional security defense systems and may succeed to extracting personal information and sensitive data.
Finally, the "Infosec experts divided on AI's potential to assist red teams" discusses the pros and cons of using generative AI for red teaming, one type of security audit. According to the article, the benefit of using generative AI is that it accelerates threat detection by allowing AI to scour multiple data feeds, applications, and other sources of performance data and run them as part of a larger automated workflow. On the other hand, the article also argues that using generative AI for red teaming is still limited, because the vulnerability discovery process by AI is a black box so the pen-tester cannot explain how they discovered to their clients.