Prompt injection and jailbreaking
Adversarial inputs (prompt injection, jailbreaks, goal hijacking, prompt leaking) bypass instructions or safety controls.
- Risk family
- Security & adversarial
- MIT domain
- 2. Privacy & Security
- MIT subdomain
- 2.2 > AI system security vulnerabilities and attacks
- AI type
- GPAI, Agentic
- Scope
- System
- Source standard
- MIT AI Risk Repository v4
Provenance
11 source framework citation keys
Framework crosswalk
Every framework item mapped to this risk. Items marked partial overlap only in part; definitions appear on hover where the source licence permits.
- A.11 ISO/IEC 23894 Annex A A.11
- A.6.2.4 ISO/IEC 42001 Annex A A.6.2.4
- A.6.2.6 ISO/IEC 42001 Annex A A.6.2.6
Expanded into this risk’s technique sub-risks.
- ibm-context-overload-attack Context overload attack
- ibm-direct-instructions-attack Direct instructions attack
- ibm-encoded-interactions-attack Encoded interactions attack
- ibm-indirect-instructions-attack Indirect instructions attack
- ibm-jailbreaking Jailbreaking
- ibm-prompt-injection-attack Prompt injection attack
- ibm-prompt-priming Prompt priming
- ibm-social-hacking-attack Social hacking attack
- ibm-specialized-tokens-attack Specialized tokens attack
- AISubtech-1.1.1 Instruction Manipulation (Direct Prompt Injection)
- AISubtech-1.1.2 Obfuscation (Direct Prompt Injection)
- AISubtech-1.1.3 Multi-Agent Prompt Injection
- AISubtech-1.2.1 Instruction Manipulation (Indirect Prompt Injection)
- AISubtech-1.2.2 Obfuscation (Indirect Prompt Injection)
- AISubtech-1.2.3 Multi-Agent (Indirect Prompt Injection)
- AISubtech-1.4.1 Image-Text Injection
- AISubtech-1.4.2 Image Manipulation
- AISubtech-1.4.3 Audio Command Injection
- AISubtech-1.4.4 Video Overlay Manipulation
- AISubtech-19.1.1 Contradictory Inputs Attack partial
- AISubtech-19.1.2 Modality Skewing partial
- AISubtech-19.2.1 Convergence Payload Injection partial
- AISubtech-19.2.2 Chained Payload Execution partial
- AISubtech-2.1.1 Context Manipulation (Jailbreak)
- AISubtech-2.1.2 Obfuscation (Jailbreak)
- AISubtech-2.1.3 Semantic Manipulation (Jailbreak)
- AISubtech-2.1.4 Token Exploitation (Jailbreak)
- AISubtech-2.1.5 Multi-Agent Jailbreak Collaboration
- NISTAML.015 Indirect Prompt Injection
- NISTAML.018 Prompt Injection
- NISTAML.02 Integrity Violations
- NISTAML.027 Misaligned Outputs
- NISTAML.04 Misuse Violations
- LLM01:2025 Prompt Injection
- LLM08:2025 Vector and Embedding Weaknesses partial
- ASI01 Agent Goal Hijack
- ASI06 Memory and Context Poisoning
Sub-risks (10)
Technique-level decompositions of this risk, each anchored to the MITRE ATLAS technique it derives from.
Malicious instructions in user input or retrieved content cause the LLM to ignore its intended task and act on the attacker's instructions.
Crafted inputs make the model ignore, circumvent, or override its safety restrictions.
A prompt-injection payload is crafted to copy itself onward, spreading across messages, documents, or agents.
Prompts cause the model to manipulate citations, links, or UI components users trust, masking malicious content.
Injected instructions are encoded or hidden so they evade input and content filters.
Malicious content is injected into the knowledge base a RAG system retrieves from, steering answers and actions.
Fabricated entries are introduced into the retrieval store so the model surfaces attacker-controlled information.
An attacker alters the conversation history the model relies on to cover tracks or steer behavior.
Malicious prompts are planted in content the system ingests (web pages, documents, tickets) and execute when processed.
Injected instructions lie dormant and execute on a later trigger or future interaction.
Part of the Deployer AI Risk Register, an open-source resource developed by MindXO. Version 1.0, 3 July 2026. Derived from the MIT AI Risk Repository (V4, December 2025) under CC BY 4.0; an independent derivative work, not endorsed by or affiliated with MIT. Sub-risk decomposition references MITRE ATLAS™ v5.6.0 (© 2021-2026 The MITRE Corporation, reproduced and distributed with permission). ISO/IEC and EU AI Act references are by number only. License: CC BY 4.0. Full attribution and licensing.