MR-012 Security & adversarial System scope

Adversarial examples and evasion attacks

Crafted input perturbations cause the model to misclassify or behave incorrectly (evasion/adversarial-example attacks).

Risk family: Security & adversarial
MIT domain: 2. Privacy & Security
MIT subdomain: 2.2 AI system security vulnerabilities and attacks
AI type: Classical_ML, GPAI
Scope: System
Source standard: MIT AI Risk Repository v4

Provenance

Source standard

MIT AI Risk Repository v4

Source frameworks

9 source framework citation keys

Cui2024, Everitt2018, Gipiškis2024, IBM2025, Liu2024, Marchal2024, TC2602024, Uuk2025, Zhang2022

ISO/IEC references

23894 obj A.11, A.9; src 6, 7; mech B.5 | 42001 ctrl A.6.2.4

Framework crosswalk

Every framework item mapped to this risk is listed below. Items marked "partial" overlap with it only in part.

Sources: frameworks that contributed to the register

ISO 23894 (2)

A.11 ISO/IEC 23894 Annex A A.11
A.9 ISO/IEC 23894 Annex A A.9

ISO 42001 (1)

A.6.2.4 ISO/IEC 42001 Annex A A.6.2.4

MITRE ATLAS (9)

Expanded into this risk’s technique sub-risks.

Cross-checks: frameworks mapped in to test coverage

IBM (1)

ibm-evasion-attack Evasion attack

Cisco (7)

AISubtech-11.1.1 Agent-Specific Evasion
AISubtech-11.1.2 Tool-Scoped Evasion
AISubtech-11.1.3 Environment-Scoped Payloads
AISubtech-11.1.4 Defense-Aware Payloads
AISubtech-11.2.1 Targeted Model Fingerprinting
AISubtech-11.2.2 Conditional Attack Execution
AISubtech-17.1.1 Sensor Spoofing: Action Signals (audio, visual) (partial)

NIST AML (3)

NISTAML.02 Integrity Violations
NISTAML.022 Evasion
NISTAML.025 Black-box Evasion

Sub-risks (4)

Technique-level decompositions of this risk, each anchored to the MITRE ATLAS technique it derives from.

MR-012.1

Model evasion via adversarial input

Inputs are crafted so the model misclassifies or fails to detect what it should, defeating its purpose.

MITRE ATLAS technique: AML.T0015 Evade AI Model
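
As a rough illustration of the evasion described in this sub-risk, the sketch below applies a single sign-gradient step (in the style of the fast gradient sign method) to a toy logistic-regression classifier. The model, its weights, the input values and the function names `predict_proba` and `fgsm_perturb` are all hypothetical and chosen only for illustration; the register does not prescribe any particular attack or model.

```python
import math

def predict_proba(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm_perturb(weights, bias, x, true_label, epsilon):
    """Shift every feature by at most epsilon in the direction that raises the loss."""
    p = predict_proba(weights, bias, x)
    # For logistic regression, the cross-entropy gradient w.r.t. the input is (p - y) * w.
    grad = [(p - true_label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model parameters and input (assumptions, not taken from the register).
weights = [2.0, -1.5, 0.5]
bias = -0.2
x_clean = [0.6, 0.2, 0.4]

x_adv = fgsm_perturb(weights, bias, x_clean, true_label=1, epsilon=0.3)
clean_score = predict_proba(weights, bias, x_clean)  # above 0.5: classified as class 1
adv_score = predict_proba(weights, bias, x_adv)      # below 0.5: classified as class 0
print(clean_score, adv_score)
```

In this toy setting, a change of at most 0.3 per feature is enough to flip the predicted class. Real attacks target far larger models, but the principle of following the loss gradient with respect to the input is the same.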

MR-012.2

Integrity erosion via adversarial inputs

A stream of adversarial inputs degrades the model's accuracy and trustworthiness over time.

MITRE ATLAS technique: AML.T0031 Erode AI Model Integrity
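
One way to make the "over time" aspect of this sub-risk concrete is to track accuracy over a sliding window of recent labelled predictions. The sketch below is a minimal, assumed design: the class name `RollingAccuracyMonitor`, the window size and the alert threshold are hypothetical and do not come from the register or from MITRE ATLAS.

```python
from collections import deque

class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent `window` labelled predictions."""

    def __init__(self, window, alert_threshold):
        self.outcomes = deque(maxlen=window)  # True where prediction matched the label
        self.alert_threshold = alert_threshold

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self):
        # Only raise an alert once the window is full, to avoid noisy early readings.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.alert_threshold

# Hypothetical stream: ten correct predictions, then five induced errors.
monitor = RollingAccuracyMonitor(window=10, alert_threshold=0.8)
for _ in range(10):
    monitor.record(predicted="benign", actual="benign")
healthy_before = not monitor.degraded()

for _ in range(5):
    monitor.record(predicted="benign", actual="malicious")
degraded_after = monitor.degraded()
print(healthy_before, degraded_after, monitor.accuracy())
```

A monitor like this depends on ground-truth labels arriving for at least a sample of traffic, which is an assumption that may not hold in every deployment.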

MR-012.3

Physical-world adversarial manipulation

Physical artifacts (markings, objects, signals) are altered to fool perception models in the real world.

MITRE ATLAS technique: AML.T0041 Physical Environment Access

MR-012.4

Crafted adversarial data

Perturbed inputs are engineered to induce incorrect or attacker-chosen model outputs.

MITRE ATLAS technique: AML.T0043 Craft Adversarial Data
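
The "attacker-chosen" case can be sketched as a targeted, iterative variant of the sign-gradient attack: the input is nudged step by step toward a chosen class while staying inside a per-feature budget. The three-class linear model, the values of `W`, `b` and `x_clean`, and the function name `targeted_attack` below are hypothetical and serve only to illustrate the idea.

```python
import math

def logits(W, b, x):
    """Class scores of a toy linear model with one weight row per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(W, b, x):
    s = logits(W, b, x)
    return max(range(len(s)), key=s.__getitem__)

def sign(v):
    return (v > 0) - (v < 0)

def targeted_attack(W, b, x, target, epsilon, step, max_iters):
    """Move x toward `target`, keeping each feature within epsilon of its original value."""
    x_adv = list(x)
    for _ in range(max_iters):
        if predict(W, b, x_adv) == target:
            break
        p = softmax(logits(W, b, x_adv))
        # Gradient of cross-entropy toward the target class w.r.t. each input feature.
        grad = [
            sum((p[k] - (1.0 if k == target else 0.0)) * W[k][j] for k in range(len(W)))
            for j in range(len(x))
        ]
        for j, g in enumerate(grad):
            stepped = x_adv[j] - step * sign(g)  # descend the targeted loss
            x_adv[j] = min(max(stepped, x[j] - epsilon), x[j] + epsilon)  # stay in budget
    return x_adv

# Hypothetical three-class model over two features (assumed values).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x_clean = [0.6, 0.4]

x_adv = targeted_attack(W, b, x_clean, target=1, epsilon=0.2, step=0.05, max_iters=20)
print(predict(W, b, x_clean), predict(W, b, x_adv))
```

Here the clean input is assigned class 0, and the perturbed input is assigned the attacker's target, class 1, without any feature moving by more than the 0.2 budget.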

More in Security & adversarial

MR-010 MR-014 MR-015 MR-016 MR-020 MR-071

Part of the Deployer AI Risk Register, an open-source resource developed by MindXO. Version 1.0, 3 July 2026. Derived from the MIT AI Risk Repository (V4, December 2025) under CC BY 4.0; an independent derivative work, not endorsed by or affiliated with MIT. Sub-risk decomposition references MITRE ATLAS™ v5.6.0 (© 2021-2026 The MITRE Corporation, reproduced and distributed with permission). ISO/IEC and EU AI Act references are by number only. License: CC BY 4.0. Full attribution and licensing.