Model Failure
Specification
Perform a risk-based evaluation of the model and model serving infrastructure for model failure. Define and implement measures to mitigate model and model serving infrastructure failures, and regularly evaluate them throughout the AI system's lifecycle.
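One way to make the risk-based evaluation concrete is a small failure-mode register scored by likelihood and impact. The sketch below is illustrative only: the 1 to 5 scales, the threshold of 12, and the example failure modes are assumptions, not values prescribed by this control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One entry in a hypothetical failure-mode register."""
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        # Simple likelihood x impact scoring; other schemes are equally valid.
        return self.likelihood * self.impact


def prioritize(modes, threshold=12):
    """Return failure modes whose score meets the threshold, highest first."""
    flagged = [m for m in modes if m.score >= threshold]
    return sorted(flagged, key=lambda m: m.score, reverse=True)


# Illustrative entries only; a real register comes from the organization's own assessment.
REGISTER = [
    FailureMode("serving node outage", likelihood=3, impact=5),
    FailureMode("model output drift", likelihood=4, impact=3),
    FailureMode("stale model artifact", likelihood=2, impact=2),
]

if __name__ == "__main__":
    for mode in prioritize(REGISTER):
        print(f"{mode.name}: {mode.score}")
```

Re-running the scoring at each lifecycle phase, with updated likelihood and impact estimates, is one simple way to show that the evaluation is repeated rather than performed once.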
Threat coverage
Architectural relevance
Lifecycle
Data storage, Resource provisioning
Design, Training
Evaluation, Validation/Red Teaming, Re-evaluation
Orchestration, AI applications
Operations, Maintenance, Continuous monitoring, Continuous improvement
Archiving, Model disposal
Ownership / SSRM
PI
Shared Cloud Service Provider-Model Provider (Shared CSP-MP)
The CSP and MP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.
Model
Owned by the Model Provider (MP)
The model provider (MP) designs, develops, and implements the control as part of their services or products to mitigate security, privacy, or compliance risks associated with the Large Language Model (LLM). Model Providers are entities that develop, train, and distribute foundational and fine-tuned AI models for various applications. They create the underlying AI capabilities that other actors build upon. Model Providers are responsible for model architecture, training methodologies, performance characteristics, and documentation of capabilities and limitations. They operate at the foundation layer of the AI stack and may provide direct API access to their models. Examples: OpenAI (GPT, DALL-E, Whisper), Anthropic (Claude), Google (Gemini), Meta (Llama), as well as any customized model.
Orchestrated
Shared Model Provider-Orchestrated Service Provider (Shared MP-OSP)
The MP and OSP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.
Application
Shared Orchestrated Service Provider-Application Provider (Shared OSP-AP)
The OSP and AP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.
Implementation guidelines
Auditing guidelines
1. Review CSP's infrastructure resilience and high-availability measures for hosting AI models.
2. Assess failover mechanisms that ensure model availability during infrastructure failures.
3. Verify documentation of redundancy architecture and recovery procedures.
4. Confirm that redundancy implementation aligns with service level agreements and business continuity requirements.
5. Verify that redundant implementations don't contribute to data poisoning or model theft.
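The failover mechanism mentioned in item 2 of the auditing guidelines can be exercised with a small test harness. The following is a minimal sketch, assuming each redundant model endpoint is exposed as a plain callable; `call_with_failover`, `AllEndpointsFailed`, and the two stand-in endpoints are hypothetical names introduced for illustration, not part of any provider's API.

```python
from typing import Callable, Sequence


class AllEndpointsFailed(RuntimeError):
    """Raised when every redundant endpoint has failed."""


def call_with_failover(endpoints: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each endpoint in priority order and return the first successful reply."""
    errors = []
    for endpoint in endpoints:
        try:
            return endpoint(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(exc)
    raise AllEndpointsFailed(f"{len(errors)} endpoint(s) failed")


# Hypothetical stand-ins for a primary and a secondary serving region.
def primary_region(prompt: str) -> str:
    raise ConnectionError("primary region unavailable")


def secondary_region(prompt: str) -> str:
    return f"secondary handled: {prompt}"


if __name__ == "__main__":
    print(call_with_failover([primary_region, secondary_region], "health check"))
```

An auditor could use a harness of this shape to simulate an outage of the primary endpoint and confirm that requests are still served, and that a total outage surfaces as an explicit error rather than a silent failure.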
Standards mappings
ISO 42001 A.4.5 - System and computing resources
ISO 42001 B.4.5 - System and computing resources
ISO 27001 6.1.2 - Information security risk assessment
ISO 27001 A.8.13 - Information backup
ISO 27001 A.8.14 - Redundancy of information processing facilities
Comment on A.8.14: disagree. I would expect the use of the cloud and therefore ensuring that redundant availability zones/regions are selected.
Addendum
N/A
Article 15 (4)
Addendum
N/A
No Mapping
Addendum
No NIST AI 600-1 controls address risk-based evaluation of the model-serving infrastructure.
C4 BC-04, C4 SR-01, C4 SR-02, C4 SR-06, C5 BC-03, C5 BC-04, C5 OPS-05, C5 OPS-18
Addendum
N/A
AI-CAIQ questions (2)
Is a risk-based evaluation of the model and model serving infrastructure for model failure performed?
Are measures defined and implemented to mitigate model and model serving infrastructure failures, and are they regularly evaluated throughout the AI system's lifecycle?