Data Integrity Check
Specification
Regularly validate the consistency and conformity of training, fine-tuning or augmentation data. Implement dataset versioning to ensure traceability and enforce restrictions to prevent unauthorized changes.
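As an illustration of the specification above, the following is a minimal sketch of a conformity check combined with a content-derived dataset version. It is not a prescribed implementation. The schema in `EXPECTED_SCHEMA`, the record layout, and the function names `validate_records` and `dataset_version` are hypothetical and chosen only for this example.

```python
import hashlib
import json
from typing import Any

# Hypothetical schema for illustration: field name -> required Python type.
EXPECTED_SCHEMA: dict[str, type] = {"id": int, "text": str, "label": str}


def validate_records(records: list[dict[str, Any]]) -> list[str]:
    """Return a list of conformity problems; an empty list means none were found."""
    problems: list[str] = []
    seen_ids: set[int] = set()
    for index, record in enumerate(records):
        # Conformity: every record must carry exactly the expected fields.
        if set(record) != set(EXPECTED_SCHEMA):
            problems.append(f"record {index}: unexpected or missing fields")
            continue
        for field, expected_type in EXPECTED_SCHEMA.items():
            if not isinstance(record[field], expected_type):
                problems.append(
                    f"record {index}: {field} is not {expected_type.__name__}"
                )
        # Consistency: identifiers must be unique across the dataset.
        record_id = record["id"]
        if isinstance(record_id, int):
            if record_id in seen_ids:
                problems.append(f"record {index}: duplicate id {record_id}")
            seen_ids.add(record_id)
    return problems


def dataset_version(records: list[dict[str, Any]]) -> str:
    """Derive a version identifier (SHA-256 hex digest) from the dataset content."""
    # Canonical JSON so that the same content always yields the same digest.
    canonical = json.dumps(
        records, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Recording the digest alongside each training or fine-tuning run gives a traceable link between a model and the exact data it was built from. Any change to the data, authorized or not, produces a different identifier.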
Threat coverage
Architectural relevance
Lifecycle
Data curation, Data storage
Training, Guardrails
Validation/Red Teaming
Orchestration, AI Services supply chain
Operations, Continuous monitoring
Data deletion, Archiving
Ownership / SSRM
PI
Owned by the Customer (AIC)
The Customer (AIC) is responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies services or products they consume.
Model
Owned by the Customer (AIC)
The Customer (AIC) is responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies services or products they consume.
Orchestrated
Owned by the Customer (AIC)
The Customer (AIC) is responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies services or products they consume.
Application
Owned by the Customer (AIC)
The Customer (AIC) is responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies services or products they consume.
Implementation guidelines
Auditing guidelines
1. Verify that all data sources handled by the infrastructure services are identified and traceable.
2. Verify that logging systems track all changes or updates to data processed or stored on infrastructure platforms.
3. Verify that automated integrity monitoring tools are implemented at the infrastructure layer to detect anomalies.
4. Verify that infrastructure access controls prevent unauthorized data modifications.
5. Verify that encryption is enforced for sensitive data at rest and in transit within infrastructure systems.
6. Verify that version control tracks changes to datasets and AI models managed by the infrastructure.
7. Verify that infrastructure staff are trained on data integrity best practices and system controls.
8. Verify that documented procedures address data integrity incidents occurring within infrastructure services.
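The automated integrity monitoring and version tracking named in the auditing guidelines could be approximated by comparing stored datasets against a recorded hash manifest. The following is a sketch under stated assumptions, not a reference to any specific monitoring tool. The function names `file_sha256`, `build_manifest`, and `diff_manifest` are hypothetical, as is the assumption that datasets are stored as files under a single directory.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so that large dataset files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (relative POSIX path) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifest(
    recorded: dict[str, str], current: dict[str, str]
) -> dict[str, list[str]]:
    """Report files added, removed, or modified since the manifest was recorded."""
    shared = recorded.keys() & current.keys()
    return {
        "added": sorted(current.keys() - recorded.keys()),
        "removed": sorted(recorded.keys() - current.keys()),
        "modified": sorted(name for name in shared if recorded[name] != current[name]),
    }
```

In practice the recorded manifest would be kept somewhere the data writers cannot modify, so that a non-empty diff outside an approved change points to a possible integrity incident.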
Standards mappings
42001: A.5.2 AI system impact assessment process
42001: A.6.1.3 Processes for responsible design and development of AI systems
42001: 6.3 Planning of changes
42001: A.7.2 Data for development and enhancement of AI system
42001: A.7.3 Acquisition of data
42001: 7.5.3 Control of documented information
Addendum
N/A
Article 10 (2)
Article 11 (1)
Article 15
Recital 67
Addendum
N/A
GV-6.1-008
MS-2.5-005
MS-2.8-003
MG-4.1-006
Addendum
N/A
DQ-03
Addendum
N/A
AI-CAIQ questions (2)
Is the consistency and conformity of training, fine-tuning or augmentation data regularly validated?
Is dataset versioning implemented to ensure traceability, and are restrictions enforced to prevent unauthorized changes?