Poor data quality and representativeness
Training/operational data is inaccurate, unrepresentative, mislabeled, contaminated, or poorly curated, undermining reliability.
- Risk family: Model & system behaviour
- MIT domain: 7. AI System Safety, Failures, & Limitations
- MIT subdomain: X.1 > Excluded
- AI type: GPAI, Classical_ML
- Scope: Both
- Source standard: MIT AI Risk Repository v4
Provenance
Framework crosswalk
The list below contains every framework item mapped to this risk. Items marked "partial" overlap with this risk only in part; definitions appear on hover where the source licence permits.
- ISO/IEC 23894 Annex A, A.4
- ISO/IEC 42001 Annex A, A.7.4
- ISO/IEC 42001 Annex A, A.7.6
- EU AI Act, Art. 10
- EU AI Act, Art. 26(4)
- ibm-data-contamination: Data contamination
- ibm-improper-data-curation: Improper data curation
- ibm-introduce-data-bias: Introduce data bias (partial)
- ibm-temporal-gap: Temporal gap (partial)
- ibm-unrepresentative-data: Unrepresentative data
Part of the Deployer AI Risk Register, an open-source resource developed by MindXO. Version 1.0, 3 July 2026. Derived from the MIT AI Risk Repository (V4, December 2025) under CC BY 4.0; an independent derivative work, not endorsed by or affiliated with MIT. Sub-risk decomposition references MITRE ATLAS™ v5.6.0 (© 2021-2026 The MITRE Corporation, reproduced and distributed with permission). ISO/IEC and EU AI Act references are by number only. License: CC BY 4.0.