Output Validation
Specification
Validate, filter, modify, or block output, as necessary, against adversarial patterns, failure patterns, and unwanted behaviour, in accordance with organisational policies and applicable laws and regulations.
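As an illustrative sketch only (not part of the control specification), the validate/filter/modify/block decision flow described above can be expressed as a rule table mapping unwanted output patterns to actions. The rule patterns, actions, and the `validate_output` function name below are hypothetical placeholders, not an organisation's actual policy.

```python
import re

# Hypothetical policy rules: each entry maps a compiled pattern of
# unwanted output to an action ("block" or "modify"). These patterns
# are illustrative placeholders, not a real organisational policy.
RULES = [
    # Block outputs embedding remote markdown images, a common
    # conversational-exfiltration vector.
    (re.compile(r"!\[[^\]]*\]\(https?://[^)\s]*\)"), "block"),
    # Strip raw <script> tags before the output reaches any renderer.
    (re.compile(r"<script\b[^>]*>.*?</script>", re.IGNORECASE | re.DOTALL), "modify"),
]

def validate_output(text: str) -> tuple[str, str]:
    """Apply policy rules; return (verdict, text) where verdict is
    'allow', 'modify' (text sanitised), or 'block' (text withheld)."""
    verdict = "allow"
    for pattern, action in RULES:
        if not pattern.search(text):
            continue
        if action == "block":
            return "block", ""
        text = pattern.sub("", text)  # modify: remove the offending span
        verdict = "modify"
    return verdict, text
```

In practice the rule set would also include moderation-model calls, URL allow-lists, and policy-specific checks; the point of the sketch is the decision flow, not the particular patterns.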
Threat coverage
Not applicable
Architectural relevance
Guardrails, Orchestration, AI applications
Lifecycle
Design, Validation/Red Teaming, Operations, Continuous monitoring
Ownership / SSRM

PI: Shared Cloud Service Provider-Model Provider (Shared CSP-MP)
The CSP and MP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.

Model: Shared Cloud Service Provider-Model Provider (Shared CSP-MP)
As for the PI layer, the CSP and MP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control.

Orchestrated: Shared Model Provider-Orchestrated Service Provider (Shared MP-OSP)
The MP and OSP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with LLM/GenAI technologies in the context of the services or products they develop and offer.

Application: Owned by the Application Provider (AP)
The Application Provider (AP) is responsible for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with LLM/GenAI technologies in the context of the services or products it develops and offers. The AP is responsible and accountable for the implementation of the control within its own infrastructure/environment. If the control has downstream implications for users/customers, the AP is responsible for enabling the customer and/or upstream partner to implement/configure the control within their risk management approach. The AP is accountable for carrying out due diligence on its upstream providers (e.g., MPs, orchestrated services) to verify that they implement the control as it relates to the services/products developed and offered by the AP.
Application Providers build and offer end-user applications that leverage generative AI models for specific tasks such as content creation, chatbots, code generation, and enterprise automation. These applications are often delivered as software-as-a-service (SaaS) solutions, and these providers focus on user interfaces, application logic, domain-specific functionality, and overall user experience rather than underlying model development. Examples: OpenAI (GPTs, Assistants), Zapier, CustomGPT, Microsoft Copilot (integrated into Office products), Jasper (AI-driven content generation), Notion AI (AI-enhanced productivity tools), Adobe Firefly (AI-generated media), AI-powered customer service solutions such as Amazon Rufus, and any organization that develops its AI-based application internally.
Implementation guidelines
Auditing guidelines
1. Verify that the CSP has clearly defined and documented processes, procedures, and technical measures to regularly perform security tests of AI output validation (AI Red Teaming), specifically addressing risks such as unsafe outputs, OWASP insecure output handling, excessive agency attacks, and adversarial outputs (including adversarial prompts and unsafe multimodal outputs). The documentation should clearly outline the testing scope, objectives, roles, responsibilities, and the frequency with which those tests are conducted.
2. Confirm that these validation measures align with relevant regulatory frameworks, industry best practices, and the OWASP Top 10 for Large Language Model Applications (including protection against output-driven security risks).
3. Verify that the defined processes specifically test for and mitigate AI-generated outputs that may pose security, privacy, reputational, or compliance risks. Tests should explicitly cover unsafe outputs (e.g., harmful, malicious, or biased content), OWASP insecure output handling vulnerabilities (e.g., complex markdown injections, conversational exfiltration through malicious formatting or link parameters), and excessive agency attacks (outputs that prompt autonomous unsafe actions or responses from downstream AI agents or users).
4. Validate that the regularly conducted AI Red Teaming exercises encompass realistic adversarial scenarios using linguistic logic manipulation, encoded malicious code snippets, adversarial tokens, multimodal prompt injections, and multilingual attack vectors.
5. Verify that security testing and AI Red Teaming findings are systematically reviewed, documented, and translated into actionable mitigation and improvement measures for AI services, infrastructure, and controls.
6. Confirm that the CSP has implemented metrics or indicators to continuously monitor the effectiveness and efficiency of output validation measures, ensuring rapid identification and remediation of emerging adversarial output patterns.
7. Inspect whether the CSP regularly reviews, updates, and adapts its output validation controls and procedures in response to the rapidly evolving threat landscape, and that these updates address both newly identified AI vulnerabilities and regulatory requirements.
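The conversational-exfiltration pattern named in the auditing guidelines (sensitive conversation data smuggled into markdown link or image URL parameters) can be checked mechanically as part of a red-team harness. The sketch below is illustrative only; `finds_exfiltration` and the sensitive-term list are hypothetical names, and a real harness would use richer detectors than a single regex.

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches markdown links and images, capturing the URL.
MD_LINK = re.compile(r"!?\[[^\]]*\]\((https?://[^)\s]+)\)")

def finds_exfiltration(output: str, sensitive_terms: list[str]) -> bool:
    """Flag model output whose markdown link/image URLs carry
    sensitive conversation data in their query parameters."""
    for url in MD_LINK.findall(output):
        query = parse_qs(urlparse(url).query)
        for values in query.values():
            if any(term.lower() in v.lower()
                   for v in values for term in sensitive_terms):
                return True
    return False
```

A red-team run would feed adversarial prompts to the system under test and apply checks like this one to every response, turning guideline 3's requirement into a repeatable pass/fail signal.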
Standards mappings
42001: A.6.2.4 / B.6.2.4 – AI system verification and validation
42001: A.6.2.6 – AI system operation and monitoring
42001: B.6.2.6 – Monitoring and evaluation
42001: B.6.2.1 – Technical robustness and security
42001: A.7.4 / B.7.4 – Quality of data for AI systems
27001: A.8.11 – Data masking
27001: A.8.26 – Application security requirements
27001: A.8.29 – Security testing
Addendum
N/A
Article 15(4), Article 15(5), Article 53, Article 55(1)
Addendum
N/A
MP-2.3-001 MP-2.3-003 MP-2.3-005 MS-2.6-004 MS-2.6-005 MS-2.6-006 MS-4.2-001 MG-2.2-001 MG-2.2-005 MG-4.1-004
Addendum
N/A
SR-05 SR-06 RE-04
Addendum
N/A
AI-CAIQ questions (1)
Is output validated, filtered, modified, or blocked, as necessary, against adversarial patterns, failure patterns, and unwanted behaviour, according to organisational policies and applicable laws and regulations?