Deployer AI Risk Register (DARR)
MR-012 | Security & adversarial | System scope

Adversarial examples and evasion attacks

Crafted input perturbations, often too small for a human reviewer to notice, cause a deployed model to misclassify or otherwise behave incorrectly at inference time (evasion / adversarial-example attacks). Unlike poisoning, the attacker needs no access to the training data or pipeline: white-box variants use the model's parameters or gradients, black-box variants only its observable outputs.

Risk family: Security & adversarial
MIT domain: 2. Privacy & Security
MIT subdomain: 2.2 > AI system security vulnerabilities and attacks
AI type: Classical_ML, GPAI
Scope: System
Source standard: MIT AI Risk Repository v4

Provenance

Source standard: MIT AI Risk Repository v4
Source frameworks (9 citation keys): Cui2024, Everitt2018, Gipiškis2024, IBM2025, Liu2024, Marchal2024, TC2602024, Uuk2025, Zhang2022
ISO/IEC references: 23894 obj A.11, A.9; src 6, 7; mech B.5 | 42001 ctrl A.6.2.4

Framework crosswalk

Every framework item mapped to this risk. Items marked (partial) overlap with it only in part.

Sources (frameworks that contributed to the register)
ISO 23894 (2)
  • A.11: ISO/IEC 23894 Annex A, A.11
  • A.9: ISO/IEC 23894 Annex A, A.9
ISO 42001 (1)
  • A.6.2.4: ISO/IEC 42001 Annex A, A.6.2.4
MITRE ATLAS (9)

Expanded into this risk’s technique sub-risks.

Cross-checks (frameworks mapped in to test coverage)
IBM (1)
  • ibm-evasion-attack: Evasion attack
Cisco (7)
  • AISubtech-11.1.1: Agent-Specific Evasion
  • AISubtech-11.1.2: Tool-Scoped Evasion
  • AISubtech-11.1.3: Environment-Scoped Payloads
  • AISubtech-11.1.4: Defense-Aware Payloads
  • AISubtech-11.2.1: Targeted Model Fingerprinting
  • AISubtech-11.2.2: Conditional Attack Execution
  • AISubtech-17.1.1: Sensor Spoofing: Action Signals (audio, visual) (partial)
NIST AML (3)
  • NISTAML.02: Integrity Violations
  • NISTAML.022: Evasion
  • NISTAML.025: Black-box Evasion

Sub-risks (4)

Technique-level decompositions of this risk, each anchored to the MITRE ATLAS technique it derives from.

MR-012.1 Model evasion via adversarial input
Inputs are crafted so the model misclassifies or fails to detect what it should, defeating its purpose.
MITRE ATLAS technique: AML.T0015 Evade AI Model

MR-012.2 Integrity erosion via adversarial inputs
A stream of adversarial inputs degrades the model's accuracy and trustworthiness over time.
MITRE ATLAS technique: AML.T0031 Erode AI Model Integrity

MR-012.3 Physical-world adversarial manipulation
Physical artifacts (markings, objects, signals) are altered to fool perception models in the real world.
MITRE ATLAS technique: AML.T0041 Physical Environment Access

MR-012.4 Crafted adversarial data
Perturbed inputs are engineered to induce incorrect or attacker-chosen model outputs.
MITRE ATLAS technique: AML.T0043 Craft Adversarial Data
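The following worked example is added here to make MR-012.1 and MR-012.4 concrete, together with the NIST AML "Black-box Evasion" cross-check; it is not code from the register, from MIT, from MITRE ATLAS or from any of the cited frameworks. It trains a two-feature logistic-regression classifier on a constructed, linearly separable dataset (TRAIN_X / TRAIN_Y), picks a constructed benign input CLEAN that sits close to the decision boundary, and then crafts adversarial data two ways: a white-box fast-gradient-sign step (fgsm) that uses the model's weights, and a score-based black-box step (blackbox_fgsm) that sees only the confidence score a deployed API would return and estimates the gradient by finite differences. All names, the perturbation budgets (eps = 0.10 and 0.25) and the training hyperparameters are assumptions chosen for the demonstration; the script needs only the Python standard library.

```python
"""
Added example (not part of the Deployer AI Risk Register or MITRE ATLAS): white-box
and score-based black-box evasion of a tiny logistic-regression classifier.
The data, the model, the target input and the budgets are all constructed here.
"""
import math

# Constructed, linearly separable training data: class 1 in the upper-right
# quadrant, class 0 mirrored into the lower-left. The mirror symmetry makes the
# learned decision boundary x1 + x2 = 0, which keeps the outcome predictable.
TRAIN_X = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [2.0, 2.0],
           [-1.0, -1.0], [-1.0, -2.0], [-2.0, -1.0], [-2.0, -2.0]]
TRAIN_Y = [1, 1, 1, 1, 0, 0, 0, 0]

# Constructed benign input: correctly classified as class 1, but near the boundary.
CLEAN = [0.3, 0.1]


def sigmoid(z):
    z = max(min(z, 500.0), -500.0)  # guard math.exp against overflow
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(xs, ys, lr=0.5, epochs=300):
    """Batch gradient descent on binary cross-entropy. Returns (weights, bias)."""
    n = len(xs[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(n):
                gw[i] += err * x[i]
            gb += err
        w = [wi - lr * gi / len(xs) for wi, gi in zip(w, gw)]
        b -= lr * gb / len(xs)
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def predict_label(model, x):
    return 1 if predict_proba(model, x) > 0.5 else 0


def bce_loss(p, y):
    return -math.log(p) if y == 1 else -math.log(1.0 - p)


def sign(g):
    return 1.0 if g > 0 else -1.0 if g < 0 else 0.0


def fgsm(model, x, y, eps):
    """White-box FGSM: one step of size eps per feature along sign(dLoss/dx).
    For logistic regression dLoss/dx = (p - y) * w, so the attacker needs the weights."""
    w, _ = model
    p = predict_proba(model, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def blackbox_fgsm(query, x, y, eps, delta=1e-3):
    """Score-based black-box variant: the attacker only sees the confidence score
    returned by `query` and estimates each partial derivative by finite differences
    (one extra query per input feature)."""
    base_loss = bce_loss(query(x), y)
    adv = []
    for i, xi in enumerate(x):
        probe = list(x)
        probe[i] += delta
        grad_i = (bce_loss(query(probe), y) - base_loss) / delta
        adv.append(xi + eps * sign(grad_i))
    return adv


def linf_distance(a, b):
    """Perturbation budget check: the largest per-feature change."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))


def run_demo():
    model = train_logreg(TRAIN_X, TRAIN_Y)
    query = lambda x: predict_proba(model, x)      # what a deployed scoring API exposes
    adv_small = fgsm(model, CLEAN, 1, eps=0.10)    # stays inside the model's margin
    adv_big = fgsm(model, CLEAN, 1, eps=0.25)      # enough to cross the boundary
    adv_bb = blackbox_fgsm(query, CLEAN, 1, eps=0.25)
    return {
        "clean": predict_label(model, CLEAN),                  # 1
        "fgsm_eps_0.10": predict_label(model, adv_small),      # 1, attack fails
        "fgsm_eps_0.25": predict_label(model, adv_big),        # 0, evasion succeeds
        "blackbox_eps_0.25": predict_label(model, adv_bb),     # 0, no weights needed
        "linf_budget_ok": linf_distance(CLEAN, adv_big) <= 0.25 + 1e-12,
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value}")
```

Two points a deployer should take from the run. The gap between the failing eps = 0.10 and the succeeding eps = 0.25 is the model's robustness margin at that input, which is what an adversarial-robustness evaluation measures before release. And the black-box variant reaches the same misclassification without any access to the weights, purely from the confidence score, so returning calibrated scores from a public endpoint is itself part of the attack surface (label-only attacks exist too, but cost the attacker more queries).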


Part of the Deployer AI Risk Register, an open-source resource developed by MindXO. Version 1.0, 3 July 2026. Derived from the MIT AI Risk Repository (V4, December 2025) under CC BY 4.0; an independent derivative work, not endorsed by or affiliated with MIT. Sub-risk decomposition references MITRE ATLAS™ v5.6.0 (© 2021-2026 The MITRE Corporation, reproduced and distributed with permission). ISO/IEC and EU AI Act references are by number only. License: CC BY 4.0. Full attribution and licensing.