Training Pipeline Security
Specification
Define, implement, and evaluate policies, procedures, and technical measures that ensure the security of the Training Pipeline. Regularly review and update policies, procedures and technical measures to address new security threats and best practices.
Threat coverage
Architectural relevance
Lifecycle
Data collection, Data curation, Data storage, Resource provisioning
Training, Supply Chain, Design
Validation/Red Teaming
Orchestration
Operations, Maintenance
Not applicable
Ownership / SSRM
PI
Owned by the Cloud Service Provider (CSP)
The Cloud Service Provider (CSP) is responsible for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with cloud computing (processing, storage, and networking) technologies in the context of the services or products they develop and offer. The CSP is responsible and accountable for implementing the control within its own infrastructure/environment. The CSP is responsible for enabling the customer and/or upstream partner to implement/configure the control within their risk management approach. The CSP is accountable for ensuring that its upstream providers implement the control as it relates to the service/product developed and offered by the CSP.
Model
Owned by the Model Provider (MP)
The Model Provider (MP) designs, develops, and implements the control as part of their services or products to mitigate security, privacy, or compliance risks associated with the Large Language Model (LLM). Model Providers are entities that develop, train, and distribute foundational and fine-tuned AI models for various applications. They create the underlying AI capabilities that other actors build upon. Model Providers are responsible for model architecture, training methodologies, performance characteristics, and documentation of capabilities and limitations. They operate at the foundation layer of the AI stack and may provide direct API access to their models. Examples: OpenAI (GPT, DALL-E, Whisper), Anthropic (Claude), Google (Gemini), Meta (Llama), as well as any customized model.
Orchestrated
Shared Model Provider-Orchestrated Service Provider (Shared MP-OSP)
The MP and OSP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.
Application
Owned by the Application Provider (AP)
The Application Provider (AP) is responsible for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer. The AP is responsible and accountable for the implementation of the control within its own infrastructure/environment. If the control has downstream implications on the users/customers, the AP is responsible for enabling the customer and/or upstream partner in the implementation/configuration of the control within their risk management approach. The AP is accountable for carrying out due diligence on its upstream providers (e.g., MPs, Orchestrated Services) to verify that they implement the control as it relates to the service/product developed and offered by the AP. These providers build and offer end-user applications that leverage generative AI models for specific tasks such as content creation, chatbots, code generation, and enterprise automation. These applications are often delivered as software-as-a-service (SaaS) solutions. These providers focus on user interfaces, application logic, domain-specific functionality, and overall user experience rather than underlying model development. Examples: OpenAI (GPTs, Assistants), Zapier, CustomGPT, Microsoft Copilot (integrated into Office products), Jasper (AI-driven content generation), Notion AI (AI-enhanced productivity tools), Adobe Firefly (AI-generated media), and AI-powered customer service solutions like Amazon Rufus, as well as any organization that develops its AI-based application internally.
Implementation guidelines
Auditing guidelines
1. Review security measures implemented to protect the CSP infrastructure used for AI model training pipelines.
2. Verify controls around data storage, access, and transit used in training. Assess the configuration of network security, including firewalls and intrusion detection/prevention systems protecting training environments.
3. Evaluate the physical security and environmental controls for the facilities where training infrastructure is housed. Verify the incident response procedures related to the training pipeline infrastructure. Evaluate how access control is maintained in the training environment.
4. Confirm regular reviews and updates of security measures and procedures.
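As an illustration only, parts of the auditing steps above could be supported by an automated check over a snapshot of the training environment's settings. The sketch below is a minimal example under stated assumptions: the `TrainingEnvConfig` fields, the `audit_training_env` function, and the 365-day review interval are all hypothetical and are not defined by this control. Physical and environmental controls are not covered, since they typically require on-site or documentary evidence.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class TrainingEnvConfig:
    """Hypothetical snapshot of a training environment's security settings."""
    storage_encrypted_at_rest: bool
    tls_enforced_in_transit: bool
    firewall_enabled: bool
    ids_ips_enabled: bool
    access_control_enforced: bool
    incident_response_documented: bool
    last_security_review: date


def audit_training_env(cfg, today, max_review_age_days=365):
    """Return a list of findings; an empty list means every check passed.

    The 365-day default is an assumed review interval, not a requirement
    stated by the control.
    """
    checks = [
        (cfg.storage_encrypted_at_rest, "Training data storage is not encrypted at rest"),
        (cfg.tls_enforced_in_transit, "Training data is not protected in transit"),
        (cfg.firewall_enabled, "No firewall protects the training environment"),
        (cfg.ids_ips_enabled, "No intrusion detection/prevention system is enabled"),
        (cfg.access_control_enforced, "Access control is not enforced"),
        (cfg.incident_response_documented, "Incident response procedures are not documented"),
    ]
    findings = [message for passed, message in checks if not passed]

    # Audit step 4: confirm that security measures are reviewed regularly.
    if today - cfg.last_security_review > timedelta(days=max_review_age_days):
        findings.append("Security measures were not reviewed within the allowed interval")
    return findings


if __name__ == "__main__":
    example = TrainingEnvConfig(
        storage_encrypted_at_rest=True,
        tls_enforced_in_transit=True,
        firewall_enabled=True,
        ids_ips_enabled=False,
        access_control_enforced=True,
        incident_response_documented=True,
        last_security_review=date(2023, 1, 1),
    )
    for finding in audit_training_env(example, today=date(2025, 1, 1)):
        print(finding)
```

A check like this only produces evidence for an auditor to examine; it does not replace review of the underlying policies, procedures, and facilities.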
Standards mappings
ISO 42001 A.6.1.2 - Objectives for responsible development of AI system
ISO 42001 A.6.1.3 - Processes for responsible AI system design and development
ISO 42001 B.6.1.2 - Objectives for responsible development of AI system
ISO 42001 B.6.1.3 - Processes for responsible design and development of AI system
ISO 42001 B.6.2.3 - Documentation of AI system design and development
Addendum
N/A
Article 15 (1) Article 15 (5)
Addendum
N/A
MP-4.1-004 MS-1.1-007 MS-2.5-005 MS-2.10-003 MS-2.11-005
Addendum
N/A
C4 PC-01 C4 SR-04 C4 SR-05 C5 SP-01 C5 SP-02
Addendum
N/A
AI-CAIQ questions (2)
Are processes, procedures, and technical measures defined, implemented, and evaluated to ensure the security of the Training Pipeline?
Are policies, procedures, and technical measures regularly reviewed and updated to address new security threats and best practices?