Skip to main content
Cybersecurity

Cisco shows LLMs get worn down by ‘multi-turn’ prompt attacks

It’s “death by a thousand prompts,” the vendor writes in a report released this week.

4 min read

Billy Hurley has been a reporter with IT Brew since 2022. He writes stories about cybersecurity threats, AI developments, and IT strategies.

If at first you don’t succeed, prompt, prompt again.

In a Nov. 5 report, Cisco showed that open-weight large language models—those with their trained parameters publicly available—were especially susceptible to a chain of malicious prompts known as a multi-turn attack. Cisco used its “AI Defense” assessment tool to determine that multi-turn scenarios were two to 10 times more successful than single-turn ones at achieving a cyberattacker’s aims.

Tested threats included nefarious tasks like malicious code generation and sensitive information disclosure.

The models studied models in the research included Alibaba’s Qwen3-32B, DeepSeek’s v3.1, Google’s Gemma 3-1B-IT, Meta’s Llama 3.3-70B-Instruct, Microsoft’s Phi-4, Mistral’s Large-2, OpenAI’s GPT-OSS-20b, and Zhipu AI’s GLM 4.5-Air.

Here’s what else the report found:

  • Craft singles. To test the effectiveness of a single input to “jailbreak” an LLM, the group sent out 1,024 prompts. Single-turn attack success rates (ASR) averaged 13.11%, “as models can more readily detect and reject isolated adversarial inputs.”
  • Roll doubles. The group’s multi-turn attack set featured 96 pre-defined malicious intents, with strategies like increasingly intense requests (known as “crescendo”); asking the model to perform personas; and rephrasing rejected prompts. Cisco’s team said it conducted 499 conversations across all models, and each exchange lasted an average of 5 to 10 turns.
  • Success! But not the good kind! According to Cisco, all models demonstrated “high susceptibility” to multi-turn attacks, with success rates (meaning vulnerability) ranging from 25.86% (Google Gemma-3-1B-IT) to 92.78% (Mistral Large-2). The average: 64.21%.

The findings, the study writers claim, expose a “dominant and unsolved pattern in AI security.” Successful prompt injections could lead to sensitive data exfiltration, fast-spreading content manipulation, and operational disruption.

On the ’rails. AI-heavy companies like Meta (with its Llama Guard), Nvidia (NeMo Guardrails), and OpenAI (OSS-guard) offer mechanisms for evaluating inputs, outputs, and model behavior.

Top insights for IT pros

From cybersecurity and big data to cloud computing, IT Brew covers the latest trends shaping business tech in our 4x weekly newsletter, virtual events with industry experts, and digital guides.

While Joseph Perry, senior manager at cybersecurity consultancy MorganFranklin Cyber, still sees protection against multi-turn attacks as an unsolved problem, he considers projects like Llama Guard as a promising way forward.

In an email to IT Brew, Perry wrote that adversarial simulation, also known as red teaming, will be a helpful way to reveal risk—“denylisting,” or blocking malicious actions one by one, won’t work, he warned.

“In general, the solution will need to center on context awareness. That could mean deploying a monitoring model specifically trained to detect multi-turn attack patterns, incorporating more complex model cost-function analysis, or even taking a more novel or experimental approach,” he wrote.

More prompt-related vulnerabilities have been discovered recently by other researchers, including ones reportedly impacting AI browsers, ChatGPT, and Anthropic’s Claude.

In its report, Cisco recommended best practices for developers, including the creation of a strong system prompt, or ​​a set of instructions and contextual information provided to AI models before they engage with user queries. (Developers must also ensure users cannot override the system prompt, Cisco added.) The company also recommended monitoring “worst-case operating conditions,” taking into consideration the objectives of threat actors.

“It is still worthwhile for us to develop an open community around this, to call out the vulnerabilities and susceptibilities of these models so that it informs downstream development and later versions of models to have that in mind,” Amy Chang, AI threat research and security lead, told IT Brew. “There is an appetite out there for people to have very strong, strong baselines for safety and security in their models.”

Top insights for IT pros

From cybersecurity and big data to cloud computing, IT Brew covers the latest trends shaping business tech in our 4x weekly newsletter, virtual events with industry experts, and digital guides.