CCC · Change Control and Configuration Management

CCC-08 · Cloud & AI Related

Exception Management

Specification

Implement a procedure for the management of exceptions, including emergencies, in the change and configuration process. Align the procedure with the requirements of GRC-04: Policy Exception Process.

Threat coverage

Model manipulation

Data poisoning

Sensitive data disclosure

Model theft

Model/Service Failure

Insecure supply chain

Insecure apps/plugins

Denial of Service

Loss of governance

Architectural relevance

Physical infrastructure

Network

Compute

Storage

Application

Data

Lifecycle

Preparation

Data collection, Data curation, Data storage, Resource provisioning, Team and expertise

Development

Design, Training, Guardrails

Evaluation

Evaluation, Validation/Red Teaming, Re-evaluation

Deployment

Orchestration, AI Services supply chain, AI applications

Delivery

Operations, Maintenance, Continuous monitoring, Continuous improvement

Retirement

Archiving, Data deletion, Model disposal

Ownership / SSRM

Shared across the supply chain

Shared control ownership refers to responsibilities and activities related to LLM security that are distributed across multiple stakeholders within the AI supply chain, including the Cloud Service Provider (CSP), Model Provider (MP), Orchestrated Service Provider (OSP), Application Provider (AP), and Customer (AIC). These controls require coordinated actions, communication, and governance across all involved parties to ensure their effectiveness.

Model

Owned by the Model Provider (MP)

The Model Provider (MP) designs, develops, and implements the control as part of their services or products to mitigate security, privacy, or compliance risks associated with the Large Language Model (LLM). Model Providers are entities that develop, train, and distribute foundational and fine-tuned AI models for various applications. They create the underlying AI capabilities that other actors build upon. Model Providers are responsible for model architecture, training methodologies, performance characteristics, and documentation of capabilities and limitations. They operate at the foundation layer of the AI stack and may provide direct API access to their models. Examples: OpenAI (GPT, DALL-E, Whisper), Anthropic (Claude), Google (Gemini), Meta (Llama), as well as any customized model.

Orchestrated

Shared Model Provider-Orchestrated Service Provider (Shared MP-OSP)

The MP and OSP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.

Application

Shared Orchestrated Service Provider-Application Provider (Shared OSP-AP)

The OSP and AP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.

Implementation guidelines

[All Actors]
1. Define a formal exception workflow that specifies: eligibility (regular vs. emergency), required risk assessment, approvers, maximum duration and/or compensating controls.

2. Embed the workflow in change- and configuration-management tooling (e.g., CI/CD gates, IaC pull requests) so unauthorized changes are blocked or tagged until the exception is approved.

3. Log every exception with full metadata (requester, reason, scope, approvals, expiry date); store in the same repository used for normal change records.

4. Implement an emergency fast-track that allows rapid change when service impact or security risk is imminent; require retrospective approval and review within an organization-defined SLA.

5. Review the open-exception register on a regular, risk-based cadence and either close, renew with justification, or convert to standard policy updates.

6. Align all steps with GRC-04 (policy-exception requirements) to ensure single governance across policies, risk assessments, and stakeholder notifications.
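
The guidelines above can be illustrated with a minimal sketch of an exception record and the gate decision a CI/CD check might derive from it. This is not a prescribed implementation: the class name, field names, and the `MAX_DURATION` and `RETRO_APPROVAL_SLA` values are hypothetical placeholders, and each organization would substitute its own policy values and tooling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical policy values; each organization defines its own.
MAX_DURATION = timedelta(days=90)
RETRO_APPROVAL_SLA = timedelta(hours=72)


@dataclass
class ChangeException:
    """One entry in the exception register (metadata from guideline 3)."""
    requester: str
    reason: str
    scope: str                      # e.g. a repository path or resource ID
    risk_assessment: str
    created_at: datetime
    expires_at: datetime
    emergency: bool = False
    approvals: list[str] = field(default_factory=list)
    compensating_controls: list[str] = field(default_factory=list)

    def validate(self) -> None:
        """Reject records that break the workflow rules from guideline 1."""
        if not self.risk_assessment:
            raise ValueError("a risk assessment is required")
        if self.expires_at - self.created_at > MAX_DURATION:
            raise ValueError("exception exceeds the maximum duration")

    def gate_decision(self, now: datetime) -> str:
        """Return 'allow', 'allow-pending-review', or 'block' for a change."""
        if now >= self.expires_at:
            return "block"                      # expired exceptions never pass
        if self.approvals:
            return "allow"                      # approved and still valid
        if self.emergency and now - self.created_at <= RETRO_APPROVAL_SLA:
            return "allow-pending-review"       # emergency fast-track (guideline 4)
        return "block"                          # unapproved, or SLA elapsed
```

In this sketch an emergency change passes the gate immediately but is blocked again if no retrospective approval is recorded before the SLA elapses, which keeps the fast-track from becoming a permanent bypass.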

Auditing guidelines

1. Inquiry with Control Owners

1.1 Understand Infrastructure Exception Handling Practices: Interview infrastructure operations leaders, hardware engineers, and data center managers responsible for exception handling. Review documented exception policies covering: emergency hardware/firmware updates and infrastructure maintenance, expedited capacity expansion and resource reallocations (e.g., GPU/TPU), disruption response (e.g., network, storage, power, environmental), and post-incident review and documentation requirements. Verify that exception criteria are clearly defined for: emergencies requiring immediate changes (e.g., failures, vulnerabilities), expedited patches or reallocations due to performance constraints, and authorization levels needed based on severity and customer impact.

1.2 Review Exception Process Documentation: Examine procedures and artifacts that detail: exception request templates and approval workflows, risk assessment steps for infrastructure-related exceptions, temporary approval and escalation pathways, required documentation and post-change validation, and exception tracking and status monitoring.

1.3 Assess Emergency Response Protocols: Evaluate documented procedures for handling critical infrastructure events: hardware failures and firmware vulnerabilities, storage/data integrity issues or cache corruption, network fabric disruptions or latency spikes, power, cooling, or physical plant failures, and emergency resource quota adjustments.

1.4 Evaluate Governance and Oversight Structures: Confirm existence of: designated approval authorities and escalation paths, on-call emergency response teams per infrastructure domain, exception review boards and governance charters, executive oversight and GRC-04 alignment, and integration with enterprise risk and incident management.

2. Define and Verify Population of Exception Records

2.1 Complete Exception Inventory: Obtain a full inventory of exception records, including: emergency hardware/firmware updates, capacity expansion approvals, resource reallocation (e.g., accelerator pooling), and unplanned maintenance and retroactive exceptions.

2.2 Cross-Verify for Completeness: Ensure population accuracy by cross-referencing monitoring alerts and change tickets, incident and escalation records, service status reports and customer impact notifications, post-incident reviews, and risk registers.
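
One way to approach the completeness check in 2.2 is a simple set comparison between change tickets and the exception register. The sketch below assumes both sources can be exported as lists of dictionaries; the keys `id`, `standard_process`, and `change_id` are hypothetical and would need to be mapped to whatever the ticketing system actually provides.

```python
def find_unregistered_changes(change_tickets, exception_register):
    """Return IDs of out-of-process changes with no matching exception record.

    change_tickets: iterable of dicts with hypothetical keys 'id' and
        'standard_process' (True if the normal workflow was followed).
    exception_register: iterable of dicts with a hypothetical 'change_id' key.
    """
    registered = {rec["change_id"] for rec in exception_register}
    return sorted(
        ticket["id"]
        for ticket in change_tickets
        if not ticket["standard_process"] and ticket["id"] not in registered
    )


def find_orphan_exceptions(change_tickets, exception_register):
    """Return exception records that point at no known change ticket."""
    known = {ticket["id"] for ticket in change_tickets}
    return sorted(
        rec["change_id"]
        for rec in exception_register
        if rec["change_id"] not in known
    )
```

A non-empty result from either function suggests the population is incomplete or inconsistent and should be investigated before sampling.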

3. Exception Sample Selection and Testing

3.1 Select Representative Exceptions: Choose samples that vary by: type (e.g., hardware update, network fix, quota increase), affected infrastructure (compute, storage, network), customer impact (high, medium, low), approval level and timeframe, and justification category (performance, failure, security).

3.2 Evaluate Lifecycle of Each Exception: Review the following categories.
Justification: clear rationale and urgency documented, evidence from monitoring or capacity thresholds, risk assessment and consideration of alternatives, and fit within defined exception criteria.
Approval: approval by appropriate authority (or retroactively for emergencies), conditions/time limitations documented and followed.
Implementation: verified through logs or infrastructure management tools, confined to approved scope and components, monitoring and mitigation applied during exception, stakeholder communication documented (e.g., customer alerts).
Closure and Follow-up: timely closure and rollback (if applicable), validation tests conducted, lessons captured and documented, reintegration into standard processes completed.
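
The sample selection in 3.1 could be supported by a small stratified-sampling helper, sketched below. The record fields used as strata (`type`, `impact`) are hypothetical examples of the attributes listed in 3.1, and the fixed seed is only there to make the draw reproducible for audit evidence.

```python
import random
from collections import defaultdict


def stratified_sample(records, keys, per_stratum=1, seed=0):
    """Pick up to `per_stratum` records from each combination of `keys`.

    records: list of dicts describing exceptions (field names are assumptions).
    keys: record fields that define a stratum, e.g. ["type", "impact"].
    """
    rng = random.Random(seed)           # seeded so the sample can be reproduced
    strata = defaultdict(list)
    for rec in records:
        strata[tuple(rec[k] for k in keys)].append(rec)

    sample = []
    for stratum in sorted(strata):      # stable order across runs
        group = strata[stratum]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample
```

Sampling per stratum rather than from the whole population helps ensure that rare but important combinations, such as high-impact security exceptions, are not missed.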

4. Exception Tracking, Governance, and Continuous Improvement

4.1 Assess Tracking and Oversight: Verify centralized tracking of infrastructure exceptions, expiration tracking for temporary approvals, governance reporting and executive visibility, trend analysis and identification of recurring issues, and integration with customer impact and risk reporting.

4.2 Evaluate Improvement Mechanisms: Assess maturity of the CSP’s improvement processes: regular exception pattern reviews, incident-driven process refinements, reductions in emergency change frequency, improved emergency response calibration, updates to exception criteria as operations evolve, and infrastructure architecture adaptations to minimize exceptions.
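
Expiration tracking and recurring-issue analysis from 4.1 and 4.2 might be checked with queries like the following sketch. It assumes the register can be exported as a list of dictionaries; the keys `id`, `status`, `expires`, and `category`, and the default threshold of 2, are hypothetical.

```python
from collections import Counter
from datetime import date


def overdue_exceptions(register, today):
    """Return IDs of exceptions still open after their expiry date."""
    return [
        rec["id"]
        for rec in register
        if rec["status"] == "open" and rec["expires"] < today
    ]


def recurring_categories(register, threshold=2):
    """Return justification categories seen at least `threshold` times."""
    counts = Counter(rec["category"] for rec in register)
    return {cat: n for cat, n in counts.items() if n >= threshold}
```

Open exceptions past expiry indicate a gap in closure or renewal, while categories that recur often are candidates for a standard policy update rather than repeated exceptions.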

From CCM:

1. Verify that the organization establishes and documents mandatory configuration settings for information technology products employed within the information system, as determined by adoption of the latest suitable security configuration baselines.
2. Confirm that the process identifies, documents, and approves exceptions from the mandatory established configuration settings for individual components based on explicit operational requirements.
3. Determine that the organization monitors and controls changes to the configuration settings in accordance with organizational policy and procedures.
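
The three CCM checks above amount to comparing actual settings against the mandatory baseline while allowing for approved exceptions. A minimal sketch follows, under the assumption that baseline and actual settings are available as flat key-value mappings; the function name and data shapes are illustrative only.

```python
def unapproved_deviations(baseline, actual, approved_exceptions):
    """Return settings that differ from the baseline without an approved exception.

    baseline: dict of setting name -> mandatory value.
    actual: dict of setting name -> value observed on the component.
    approved_exceptions: set of setting names with a documented, approved exception.
    Returns a dict of setting name -> (required value, observed value).
    """
    deviations = {}
    for setting, required in baseline.items():
        current = actual.get(setting)   # None if the setting is missing entirely
        if current != required and setting not in approved_exceptions:
            deviations[setting] = (required, current)
    return deviations
```

For example, with a baseline of `{"tls_min_version": "1.2", "ssh_root_login": "no"}`, an observed `ssh_root_login` of `"yes"`, and no approved exception for that setting, the function reports `{"ssh_root_login": ("no", "yes")}`.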

Standards mappings

ISO 42001 · No Gap

42001: Clause 6.3 Planning of changes
42001: Clause 8.1 Operational planning and control
42001: Clause 10.2 Nonconformity and corrective action

Addendum

N/A

EU AI Act · Full Gap

No Mapping

Addendum

The EU AI Act does not cover the CCC-08 topic, "Implement a procedure for the management of exceptions, including emergencies, in the change and configuration process. Align the procedure with the requirements of GRC-04: Policy Exception Process," for any of the AI structures defined within the EU AI Act.

NIST AI 600-1 · Partial Gap

GV-1.3-007
GV-6.2-003
GV-2.3-001
MG-2.4-002
MG-4.3-001
GV-1.5-002
GV-2.1-002
GV-6.2-006

Addendum

NIST AI 600-1 does not cover "emergency change management process requirements."

BSI AIC4 · No Gap

DEV-03
SIM-01
DEV-08
SP-03

Addendum

N/A

AI-CAIQ questions (1)

CCC-08.1

Is a procedure implemented (aligning with the requirements of GRC-04: Policy Exception Process) for the management of exceptions, including emergencies, in the change and configuration process?