Quality Testing
Specification
Establish, maintain and implement a defined quality change control, approval and testing process incorporating baselines, testing, and release standards.
Threat coverage
Architectural relevance
Lifecycle
Data collection, Data curation, Data storage, Resource provisioning, Team and expertise
Training, Design, Guardrails, Supply Chain
Evaluation, Validation/Red Teaming, Re-evaluation
Orchestration, AI Services supply chain, AI applications
Continuous monitoring, Continuous improvement, Maintenance, Operations
Archiving, Data deletion, Model disposal
Ownership / SSRM
PI
Shared across the supply chain
Shared control ownership refers to responsibilities and activities related to LLM security that are distributed across multiple stakeholders within the AI supply chain, including the Cloud Service Provider (CSP), Model Provider (MP), Orchestrated Service Provider (OSP), Application Provider (AP), and Customer (AIC). These controls require coordinated actions, communication, and governance across all involved parties to ensure their effectiveness.
Model
Owned by the Model Provider (MP)
The Model Provider (MP) designs, develops, and implements the control as part of their services or products to mitigate security, privacy, or compliance risks associated with the Large Language Model (LLM). Model Providers are entities that develop, train, and distribute foundational and fine-tuned AI models for various applications. They create the underlying AI capabilities that other actors build upon. Model Providers are responsible for model architecture, training methodologies, performance characteristics, and documentation of capabilities and limitations. They operate at the foundation layer of the AI stack and may provide direct API access to their models. Examples: OpenAI (GPT, DALL-E, Whisper), Anthropic (Claude), Google (Gemini), Meta (Llama), as well as any customized model.
Orchestrated
Owned by the Orchestrated Service Provider (OSP)
The Orchestrated Service Provider (OSP) is responsible for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer. The OSP is responsible and accountable for the implementation of the control within its own infrastructure/environment. If the control has downstream implications for users/customers, the OSP is responsible for enabling the customer and/or upstream partner in the implementation/configuration of the control within their risk management approach. The OSP is accountable for ensuring that its upstream providers (e.g., MPs) implement the control as it relates to the service/product developed and offered by the OSP. This refers to entities that create the technical building blocks and management tools that enable AI implementation. This can include platforms, frameworks, and tools that facilitate the integration, deployment, and management of AI models within enterprise workflows. These providers focus on model orchestration and offer services like API access, automated scaling, prompt management, workflow automation, monitoring, and governance rather than end-user functionality or raw infrastructure. They help businesses implement AI in a structured and efficient manner. Examples: AWS, Azure, GCP, OpenAI, Anthropic, LangChain (for AI workflow orchestration), Anyscale (Ray for distributed AI workloads), Databricks (MLflow), IBM Watson Orchestrate, and developer platforms like Google AI Studio.
Application
Owned by the Application Provider (AP)
The Application Provider (AP) is responsible for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer. The AP is responsible and accountable for the implementation of the control within its own infrastructure/environment. If the control has downstream implications for users/customers, the AP is responsible for enabling the customer and/or upstream partner in the implementation/configuration of the control within their risk management approach. The AP is accountable for carrying out due diligence on its upstream providers (e.g., MPs, Orchestrated Service Providers) to verify that they implement the control as it relates to the service/product developed and offered by the AP. These providers build and offer end-user applications that leverage generative AI models for specific tasks such as content creation, chatbots, code generation, and enterprise automation. These applications are often delivered as software-as-a-service (SaaS) solutions. These providers focus on user interfaces, application logic, domain-specific functionality, and overall user experience rather than underlying model development. Examples: OpenAI (GPTs, Assistants), Zapier, CustomGPT, Microsoft Copilot (integrated into Office products), Jasper (AI-driven content generation), Notion AI (AI-enhanced productivity tools), Adobe Firefly (AI-generated media), and AI-powered customer service solutions like Amazon Rufus, as well as any organization that develops its AI-based application internally.
Implementation guidelines
Auditing guidelines
1. Inquiring with Control Owners
1.1 Interview infrastructure engineering leads and examine operations documentation to understand:
- Hardware and Platform Management: AI-optimized hardware procurement and qualification processes; hardware platform versioning and baseline configurations; accelerator (GPU/TPU) driver and firmware management with update qualification procedures; compute cluster architecture and configuration baselines; high-performance storage system configuration management with allocation and tiering protocols; hardware-software compatibility validation procedures; hardware refresh and deprecation procedures.
- Software and Framework Governance: infrastructure-as-code template governance; AI framework optimization and library management; driver and firmware update qualification procedures; infrastructure component version control and tracking.
- Deployment and Change Management: staged deployment approaches and validation testing before infrastructure changes; compatibility verification across hardware and software stacks; performance regression testing after changes and infrastructure performance monitoring frameworks; customer impact assessment for infrastructure modifications; rollback procedures for problematic deployments.
- Resource Management and Operations: resource allocation and scheduling policies; compute quota management and GPU/TPU/accelerator provisioning with access control; network bandwidth reservation, quality of service, and networking fabric optimization; capacity management and scaling processes; multi-tenancy isolation controls and resource monitoring with utilization optimization; infrastructure redundancy and failover configurations; cost optimization and resource efficiency mechanisms.
2. Obtaining and Verifying the Population of Records
2.1 Collect a complete population of infrastructure change records from independent sources, including infrastructure-as-code repositories and deployment logs, hardware inventory management systems, driver and firmware update records, cluster management platform logs, configuration management databases (CMDBs), network configuration repositories, storage system setup and configuration records, and container image registries for infrastructure components. Select a sample of these deployments and trace them forward to the change management record system, confirming a corresponding record exists for each, thus ensuring all deployments are captured in the population (see the first sketch after these guidelines).
3. Inspecting Records and Documents
3.1 Select Representative Sample: Choose a balanced sample of infrastructure changes including major hardware platform introductions, accelerator (GPU/TPU) updates and driver changes, storage system configuration modifications, network fabric and interconnect upgrades, resource scheduling algorithm changes, infrastructure scaling implementations, and performance optimization changes.
3.2 Infrastructure Change Validation, Deployment, and Performance Assurance: For each sampled infrastructure change, confirm comprehensive validation, deployment adherence, and performance documentation including:
- Validation and Testing: hardware-software compatibility testing and integration testing with AI frameworks and libraries; performance benchmark evaluation with AI workloads and scalability testing under various load conditions; reliability verification through stress testing and security assessment of configuration changes; resource isolation validation in multi-tenant environments.
- Deployment Procedures: pre-deployment environment validation and progressive rollout strategies across availability zones or regions; deployment window compliance and customer communication for service-impacting changes; concurrent monitoring during deployment and success criteria verification after implementation; rollback readiness and contingency planning.
- Performance Documentation and Metrics: compute throughput benchmarks for AI workloads and storage I/O performance metrics; network bandwidth and latency measurements; resource utilization efficiency metrics and scaling characteristics under load; performance consistency across identical resources; comparison with previous infrastructure generations.
3.3 Confirm Stakeholder Approvals for Changes: For each sampled infrastructure change, verify approvals from relevant stakeholders including infrastructure engineering leadership, platform reliability engineers, security teams for infrastructure configuration changes, cost management teams for resource allocation changes, performance engineering teams for optimization changes, customer support teams for user-impacting changes, and procurement teams for hardware introductions.
3.4 Assess Configuration Reproducibility: For each sampled infrastructure change, verify documentation that enables infrastructure reproducibility: complete infrastructure-as-code templates, driver and firmware version specifications, hardware configuration parameters, networking topology and configuration details, storage system setup parameters, resource allocation and scheduling policies, and monitoring and alerting thresholds (see the second sketch after these guidelines).
3.5 Evaluate Infrastructure Documentation: For each sampled infrastructure change, confirm the quality and completeness of documentation including resource specifications and capabilities, compatible AI framework versions, known limitations and constraints, recommended configuration practices, performance optimization guidelines, resource utilization best practices, and maintenance window schedules and procedures.
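The population completeness check in step 2.1 can be supported with simple tooling. The following is a minimal, illustrative Python sketch, assuming hypothetical CSV exports named deployments.csv (from deployment logs or IaC pipelines) and change_records.csv (from the change management system) with the column names shown in the comments; it is not a prescribed format, and real evidence exports will differ.

```python
import csv
import random

# Illustrative only: file names and column names are assumptions, not a
# prescribed format. Adapt to the actual exports from deployment logs
# (IaC pipelines, cluster managers) and the change management system.
DEPLOYMENT_EXPORT = "deployments.csv"        # columns: deployment_id, change_ref, deployed_at
CHANGE_RECORD_EXPORT = "change_records.csv"  # columns: change_id, status, approved_by
SAMPLE_SIZE = 25

def load_rows(path):
    """Load a CSV export as a list of dictionaries keyed by column name."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

deployments = load_rows(DEPLOYMENT_EXPORT)
change_ids = {row["change_id"] for row in load_rows(CHANGE_RECORD_EXPORT)}

# Trace a random sample of deployments forward to the change record system
# and flag any deployment without a corresponding change record.
sample = random.sample(deployments, min(SAMPLE_SIZE, len(deployments)))
unmatched = [d for d in sample if d.get("change_ref") not in change_ids]

print(f"Sampled {len(sample)} deployments; {len(unmatched)} lack a change record.")
for d in unmatched:
    print(f"  EXCEPTION: deployment {d['deployment_id']} (ref: {d.get('change_ref')!r})")
```

Any exception reported by such a cross-check would be followed up manually with the control owner; the script only identifies candidate gaps in the population.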
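Similarly, the reproducibility evidence listed in step 3.4 can be checked mechanically against each sampled change. The sketch below assumes a hypothetical per-change baseline document (baseline.json) and simply flags missing fields; the field names are illustrative assumptions, not mandated by the control.

```python
import json

# Fields an auditor might expect in a per-change infrastructure baseline.
# The names are illustrative; map them to the evidence the provider
# actually produces (IaC templates, firmware manifests, topology docs, etc.).
REQUIRED_FIELDS = [
    "iac_template_ref",          # commit/tag of the infrastructure-as-code template
    "driver_firmware_versions",  # accelerator driver and firmware versions
    "hardware_configuration",    # node/cluster hardware parameters
    "network_topology",          # fabric and interconnect configuration
    "storage_configuration",     # storage system setup parameters
    "scheduling_policies",       # resource allocation and scheduling policies
    "monitoring_thresholds",     # alerting and monitoring thresholds
]

def reproducibility_gaps(baseline_path):
    """Return the list of expected baseline fields that are missing or empty."""
    with open(baseline_path) as f:
        baseline = json.load(f)
    return [field for field in REQUIRED_FIELDS if not baseline.get(field)]

if __name__ == "__main__":
    gaps = reproducibility_gaps("baseline.json")
    if gaps:
        print("Reproducibility evidence incomplete; missing:", ", ".join(gaps))
    else:
        print("All expected baseline fields are present.")
```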
Standards mappings
42001: A.6.2.4 - AI system verification and validation
42001: A.6.2.3 - Documentation of AI system design and development
42001: A.6.2.7 - AI system technical documentation
42001: A.6.2.5 - AI system deployment
42001: B.6.1.3 - Processes for responsible design and development of AI systems
Addendum
N/A
Article 17
Addendum
The EU AI Act does not cover the CCC-02 topic for General-Purpose AI Models or General-Purpose AI Models with Systemic Risk.
GV-1.3-002 MP-2.3-005 MS-2.3-003 MG-3.2-002 MS-2.7-008 MP-2.3-001 GV-1.5-003 GV-1.3-003 MP-2.1-002
Addendum
The clear requirement to “have and follow a defined quality change control, approval, and testing process with established baselines, testing, and release standards” is missing in NIST AI 600-1.
C4: PF-06; C5: DEV-03, DEV-05, DEV-06, DEV-07, DEV-08, DEV-09, DEV-10
Addendum
N/A
AI-CAIQ questions (1)
Is a defined quality change control, approval and testing process incorporating baselines, testing and release standards established, maintained and implemented?