Quality Testing
Specification
Establish, maintain and implement a defined quality change control, approval and testing process incorporating baselines, testing, and release standards.
Threat coverage
Architectural relevance
Lifecycle
Data collection, Data curation, Data storage, Resource provisioning, Team and expertise
Training, Design, Guardrails, Supply Chain
Evaluation, Validation/Red Teaming, Re-evaluation
Orchestration, AI Services supply chain, AI applications
Continuous monitoring, Continuous improvement, Maintenance, Operations
Archiving, Data deletion, Model disposal
Ownership / SSRM
PI
Shared across the supply chain
Shared control ownership refers to responsibilities and activities related to LLM security that are distributed across multiple stakeholders within the AI supply chain, including the Cloud Service Provider (CSP), Model Provider (MP), Orchestrated Service Provider (OSP), Application Provider (AP), and Customer (AIC). These controls require coordinated actions, communication, and governance across all involved parties to ensure their effectiveness.
Model
Owned by the Model Provider (MP)
The Model Provider (MP) designs, develops, and implements the control as part of their services or products to mitigate security, privacy, or compliance risks associated with the Large Language Model (LLM). Model Providers are entities that develop, train, and distribute foundational and fine-tuned AI models for various applications. They create the underlying AI capabilities that other actors build upon. Model Providers are responsible for model architecture, training methodologies, performance characteristics, and documentation of capabilities and limitations. They operate at the foundation layer of the AI stack and may provide direct API access to their models. Examples: OpenAI (GPT, DALL-E, Whisper), Anthropic (Claude), Google (Gemini), Meta (Llama), as well as any customized model.
Orchestrated
Owned by the Orchestrated Service Provider (OSP)
The Orchestrated Service Provider (OSP) is responsible for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer. The OSP is responsible and accountable for the implementation of the control within its own infrastructure/environment. If the control has downstream implications for users/customers, the OSP is responsible for enabling the customer and/or upstream partner in the implementation/configuration of the control within their risk management approach. The OSP is accountable for ensuring that its upstream providers (e.g., MPs) implement the control as it relates to the service/product developed and offered by the OSP. This refers to entities that create the technical building blocks and management tools that enable AI implementation. This can include platforms, frameworks, and tools that facilitate the integration, deployment, and management of AI models within enterprise workflows. These providers focus on model orchestration and offer services like API access, automated scaling, prompt management, workflow automation, monitoring, and governance rather than end-user functionality or raw infrastructure. They help businesses implement AI in a structured and efficient manner. Examples: AWS, Azure, GCP, OpenAI, Anthropic, LangChain (for AI workflow orchestration), Anyscale (Ray for distributed AI workloads), Databricks (MLflow), IBM Watson Orchestrate, and developer platforms like Google AI Studio.
Application
Owned by the Application Provider (AP)
The Application Provider (AP) is responsible for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer. The AP is responsible and accountable for the implementation of the control within its own infrastructure/environment. If the control has downstream implications for users/customers, the AP is responsible for enabling the customer and/or upstream partner in the implementation/configuration of the control within their risk management approach. The AP is accountable for carrying out due diligence on its upstream providers (e.g., MPs, Orchestrated Service Providers) to verify that they implement the control as it relates to the service/product developed and offered by the AP. These providers build and offer end-user applications that leverage generative AI models for specific tasks such as content creation, chatbots, code generation, and enterprise automation. These applications are often delivered as software-as-a-service (SaaS) solutions. These providers focus on user interfaces, application logic, domain-specific functionality, and overall user experience rather than underlying model development. Examples: OpenAI (GPTs, Assistants), Zapier, CustomGPT, Microsoft Copilot (integrated into Office products), Jasper (AI-driven content generation), Notion AI (AI-enhanced productivity tools), Adobe Firefly (AI-generated media), and AI-powered customer service solutions like Amazon Rufus, as well as any organization that develops its AI-based application internally.
Implementation guidelines
Auditing guidelines
1. Inquiring with Control Owners
1.1 Interview infrastructure engineering leads and examine operations documentation to understand:
- Hardware and Platform Management: AI-optimized hardware procurement and qualification processes; hardware platform versioning and baseline configurations; accelerator (GPU/TPU) driver and firmware management with update qualification procedures; compute cluster architecture and configuration baselines; high-performance storage system configuration management with allocation and tiering protocols; hardware-software compatibility validation procedures; hardware refresh and deprecation procedures.
- Software and Framework Governance: infrastructure-as-code template governance; AI framework optimization and library management; driver and firmware update qualification procedures; infrastructure component version control and tracking.
- Deployment and Change Management: staged deployment approaches and validation testing before infrastructure changes; compatibility verification across hardware and software stacks; performance regression testing after changes and infrastructure performance monitoring frameworks; customer impact assessment for infrastructure modifications; rollback procedures for problematic deployments.
- Resource Management and Operations: resource allocation and scheduling policies; compute quota management and GPU/TPU/accelerator provisioning with access control; network bandwidth reservation, quality of service, and networking fabric optimization; capacity management and scaling processes; multi-tenancy isolation controls and resource monitoring with utilization optimization; infrastructure redundancy and failover configurations; cost optimization and resource efficiency mechanisms.
2. Obtaining and Verifying the Population of Records
2.1 Collect a complete population of infrastructure change records from independent sources, including infrastructure-as-code repositories and deployment logs, hardware inventory management systems, driver and firmware update records, cluster management platform logs, configuration management databases (CMDBs), network configuration repositories, storage system setup and configuration records, and container image registries for infrastructure components. Select a sample of these deployments and trace them forward to the change management record system, confirming a corresponding record exists for each, thus ensuring all deployments are captured in the population (see the first sketch after these guidelines).
3. Inspecting Records and Documents
3.1 Select Representative Sample: Choose a balanced sample of infrastructure changes including major hardware platform introductions, accelerator (GPU/TPU) updates and driver changes, storage system configuration modifications, network fabric and interconnect upgrades, resource scheduling algorithm changes, infrastructure scaling implementations, and performance optimization changes.
3.2 Infrastructure Change Validation, Deployment, and Performance Assurance: For each sampled infrastructure change, confirm comprehensive validation, deployment adherence, and performance documentation including:
- Validation and Testing: hardware-software compatibility testing and integration testing with AI frameworks and libraries; performance benchmark evaluation with AI workloads and scalability testing under various load conditions; reliability verification through stress testing and security assessment of configuration changes; resource isolation validation in multi-tenant environments.
- Deployment Procedures: pre-deployment environment validation and progressive rollout strategies across availability zones or regions; deployment window compliance and customer communication for service-impacting changes; concurrent monitoring during deployment and success criteria verification after implementation; rollback readiness and contingency planning.
- Performance Documentation and Metrics: compute throughput benchmarks for AI workloads and storage I/O performance metrics; network bandwidth and latency measurements; resource utilization efficiency metrics and scaling characteristics under load; performance consistency across identical resources; comparison with previous infrastructure generations.
3.3 Confirm Stakeholder Approvals for Changes: For each sampled infrastructure change, verify approvals from relevant stakeholders including infrastructure engineering leadership, platform reliability engineers, security teams for infrastructure configuration changes, cost management teams for resource allocation changes, performance engineering teams for optimization changes, customer support teams for user-impacting changes, and procurement teams for hardware introductions.
3.4 Assess Configuration Reproducibility: For each sampled infrastructure change, verify documentation that enables infrastructure reproducibility: complete infrastructure-as-code templates, driver and firmware version specifications, hardware configuration parameters, networking topology and configuration details, storage system setup parameters, resource allocation and scheduling policies, and monitoring and alerting thresholds (see the second sketch after these guidelines).
3.5 Evaluate Infrastructure Documentation: For each sampled infrastructure change, confirm the quality and completeness of documentation including resource specifications and capabilities, compatible AI framework versions, known limitations and constraints, recommended configuration practices, performance optimization guidelines, resource utilization best practices, and maintenance window schedules and procedures.
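The population completeness check in step 2.1 can be supported with simple tooling. The following is a minimal, illustrative Python sketch, assuming hypothetical CSV exports named deployments.csv (from deployment logs or IaC pipelines) and change_records.csv (from the change management system) with the column names shown in the comments; it is not a prescribed format, and real evidence exports will differ.

```python
import csv
import random

# Illustrative only: file names and column names are assumptions, not a
# prescribed format. Adapt to the actual exports from deployment logs
# (IaC pipelines, cluster managers) and the change management system.
DEPLOYMENT_EXPORT = "deployments.csv"        # columns: deployment_id, change_ref, deployed_at
CHANGE_RECORD_EXPORT = "change_records.csv"  # columns: change_id, status, approved_by
SAMPLE_SIZE = 25

def load_rows(path):
    """Load a CSV export as a list of dictionaries keyed by column name."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

deployments = load_rows(DEPLOYMENT_EXPORT)
change_ids = {row["change_id"] for row in load_rows(CHANGE_RECORD_EXPORT)}

# Trace a random sample of deployments forward to the change record system
# and flag any deployment without a corresponding change record.
sample = random.sample(deployments, min(SAMPLE_SIZE, len(deployments)))
unmatched = [d for d in sample if d.get("change_ref") not in change_ids]

print(f"Sampled {len(sample)} deployments; {len(unmatched)} lack a change record.")
for d in unmatched:
    print(f"  EXCEPTION: deployment {d['deployment_id']} (ref: {d.get('change_ref')!r})")
```

Any exception reported by such a cross-check would be followed up manually with the control owner; the script only identifies candidate gaps in the population.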
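Similarly, the reproducibility evidence listed in step 3.4 can be checked mechanically against each sampled change. The sketch below assumes a hypothetical per-change baseline document (baseline.json) and simply flags missing fields; the field names are illustrative assumptions, not mandated by the control.

```python
import json

# Fields an auditor might expect in a per-change infrastructure baseline.
# The names are illustrative; map them to the evidence the provider
# actually produces (IaC templates, firmware manifests, topology docs, etc.).
REQUIRED_FIELDS = [
    "iac_template_ref",          # commit/tag of the infrastructure-as-code template
    "driver_firmware_versions",  # accelerator driver and firmware versions
    "hardware_configuration",    # node/cluster hardware parameters
    "network_topology",          # fabric and interconnect configuration
    "storage_configuration",     # storage system setup parameters
    "scheduling_policies",       # resource allocation and scheduling policies
    "monitoring_thresholds",     # alerting and monitoring thresholds
]

def reproducibility_gaps(baseline_path):
    """Return the list of expected baseline fields that are missing or empty."""
    with open(baseline_path) as f:
        baseline = json.load(f)
    return [field for field in REQUIRED_FIELDS if not baseline.get(field)]

if __name__ == "__main__":
    gaps = reproducibility_gaps("baseline.json")
    if gaps:
        print("Reproducibility evidence incomplete; missing:", ", ".join(gaps))
    else:
        print("All expected baseline fields are present.")
```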
Standards mappings
42001: A.6.2.4 - AI system verification and validation
42001: A.6.2.3 - Documentation of AI system design and development
42001: A.6.2.7 - AI system technical documentation
42001: A.6.2.5 - AI system deployment
42001: B.6.1.3 - Processes for responsible design and development of AI systems
Addendum
N/A
Article 17
Addendum
The EU AI Act does not cover the CCC-02 topic for General-Purpose AI Models or General-Purpose AI Models with Systemic Risk.
GV-1.3-002 MP-2.3-005 MS-2.3-003 MG-3.2-002 MS-2.7-008 MP-2.3-001 GV-1.5-003 GV-1.3-003 MP-2.1-002
Addendum
The clear requirement to “have and follow a defined quality change control, approval, and testing process with established baselines, testing, and release standards” is missing in NIST AI 600-1.
C4: PF-06; C5: DEV-03, DEV-05, DEV-06, DEV-07, DEV-08, DEV-09, DEV-10
Addendum
N/A
AI-CAIQ questions (1)
Is a defined quality change control, approval and testing process incorporating baselines, testing and release standards established, maintained and implemented?