Output Validation
Specification
Validate, filter, modify, or block output, as necessary, against adversarial patterns, failure patterns, and unwanted behaviour, in accordance with organisational policies and applicable laws and regulations.
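As an illustrative sketch only (not part of the control specification), the validate/filter/modify/block decision flow described above can be expressed as a rule table mapping unwanted output patterns to actions. The rule patterns, actions, and the `validate_output` function name below are hypothetical placeholders, not an organisation's actual policy.

```python
import re

# Hypothetical policy rules: each entry maps a compiled pattern of
# unwanted output to an action ("block" or "modify"). These patterns
# are illustrative placeholders, not a real organisational policy.
RULES = [
    # Block outputs embedding remote markdown images, a common
    # conversational-exfiltration vector.
    (re.compile(r"!\[[^\]]*\]\(https?://[^)\s]*\)"), "block"),
    # Strip raw <script> tags before the output reaches any renderer.
    (re.compile(r"<script\b[^>]*>.*?</script>", re.IGNORECASE | re.DOTALL), "modify"),
]

def validate_output(text: str) -> tuple[str, str]:
    """Apply policy rules; return (verdict, text) where verdict is
    'allow', 'modify' (text sanitised), or 'block' (text withheld)."""
    verdict = "allow"
    for pattern, action in RULES:
        if not pattern.search(text):
            continue
        if action == "block":
            return "block", ""
        text = pattern.sub("", text)  # modify: remove the offending span
        verdict = "modify"
    return verdict, text
```

In practice the rule set would also include moderation-model calls, URL allow-lists, and policy-specific checks; the point of the sketch is the decision flow, not the particular patterns.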
Threat coverage
Not applicable
Architectural relevance
Guardrails, Orchestration, AI applications
Lifecycle
Design, Validation/Red Teaming, Operations, Continuous monitoring
Ownership / SSRM

PI: Shared Cloud Service Provider-Model Provider (Shared CSP-MP)
The CSP and MP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.

Model: Shared Cloud Service Provider-Model Provider (Shared CSP-MP)
As for the PI layer, the CSP and MP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control.

Orchestrated: Shared Model Provider-Orchestrated Service Provider (Shared MP-OSP)
The MP and OSP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with LLM/GenAI technologies in the context of the services or products they develop and offer.

Application: Owned by the Application Provider (AP)
The Application Provider (AP) is responsible for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with LLM/GenAI technologies in the context of the services or products it develops and offers. The AP is responsible and accountable for the implementation of the control within its own infrastructure/environment. If the control has downstream implications for users/customers, the AP is responsible for enabling the customer and/or upstream partner to implement/configure the control within their risk management approach. The AP is accountable for carrying out due diligence on its upstream providers (e.g., MPs, orchestrated services) to verify that they implement the control as it relates to the services/products developed and offered by the AP.
Application Providers build and offer end-user applications that leverage generative AI models for specific tasks such as content creation, chatbots, code generation, and enterprise automation. These applications are often delivered as software-as-a-service (SaaS) solutions, and these providers focus on user interfaces, application logic, domain-specific functionality, and overall user experience rather than underlying model development. Examples: OpenAI (GPTs, Assistants), Zapier, CustomGPT, Microsoft Copilot (integrated into Office products), Jasper (AI-driven content generation), Notion AI (AI-enhanced productivity tools), Adobe Firefly (AI-generated media), AI-powered customer service solutions such as Amazon Rufus, and any organization that develops its AI-based application internally.
Implementation guidelines
Auditing guidelines
1. Verify that the CSP has clearly defined and documented processes, procedures, and technical measures to regularly perform security tests of AI output validation (AI Red Teaming), specifically addressing risks such as unsafe outputs, OWASP insecure output handling, excessive agency attacks, and adversarial outputs (including adversarial prompts and unsafe multimodal outputs). The documentation should clearly outline the testing scope, objectives, roles, responsibilities, and the frequency with which those tests are conducted.
2. Confirm that these validation measures align with relevant regulatory frameworks, industry best practices, and the OWASP Top 10 for Large Language Model Applications (including protection against output-driven security risks).
3. Verify that the defined processes specifically test for and mitigate AI-generated outputs that may pose security, privacy, reputational, or compliance risks. Tests should explicitly cover unsafe outputs (e.g., harmful, malicious, or biased content), OWASP insecure output handling vulnerabilities (e.g., complex markdown injections, conversational exfiltration through malicious formatting or link parameters), and excessive agency attacks (outputs that prompt autonomous unsafe actions or responses from downstream AI agents or users).
4. Validate that the regularly conducted AI Red Teaming exercises encompass realistic adversarial scenarios using linguistic logic manipulation, encoded malicious code snippets, adversarial tokens, multimodal prompt injections, and multilingual attack vectors.
5. Verify that security testing and AI Red Teaming findings are systematically reviewed, documented, and translated into actionable mitigation and improvement measures for AI services, infrastructure, and controls.
6. Confirm that the CSP has implemented metrics or indicators to continuously monitor the effectiveness and efficiency of output validation measures, ensuring rapid identification and remediation of emerging adversarial output patterns.
7. Inspect whether the CSP regularly reviews, updates, and adapts its output validation controls and procedures in response to the rapidly evolving threat landscape, and that these updates address both newly identified AI vulnerabilities and regulatory requirements.
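The conversational-exfiltration pattern named in the auditing guidelines (sensitive conversation data smuggled into markdown link or image URL parameters) can be checked mechanically as part of a red-team harness. The sketch below is illustrative only; `finds_exfiltration` and the sensitive-term list are hypothetical names, and a real harness would use richer detectors than a single regex.

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches markdown links and images, capturing the URL.
MD_LINK = re.compile(r"!?\[[^\]]*\]\((https?://[^)\s]+)\)")

def finds_exfiltration(output: str, sensitive_terms: list[str]) -> bool:
    """Flag model output whose markdown link/image URLs carry
    sensitive conversation data in their query parameters."""
    for url in MD_LINK.findall(output):
        query = parse_qs(urlparse(url).query)
        for values in query.values():
            if any(term.lower() in v.lower()
                   for v in values for term in sensitive_terms):
                return True
    return False
```

A red-team run would feed adversarial prompts to the system under test and apply checks like this one to every response, turning guideline 3's requirement into a repeatable pass/fail signal.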
Standards mappings
42001: A.6.2.4 / B.6.2.4 – AI system verification and validation
42001: A.6.2.6 – AI system operation and monitoring
42001: B.6.2.6 – Monitoring and evaluation
42001: B.6.2.1 – Technical robustness and security
42001: A.7.4 / B.7.4 – Quality of data for AI systems
27001: A.8.11 – Data masking
27001: A.8.26 – Application security requirements
27001: A.8.29 – Security testing
Addendum
N/A
Article 15(4), Article 15(5), Article 53, Article 55(1)
Addendum
N/A
MP-2.3-001 MP-2.3-003 MP-2.3-005 MS-2.6-004 MS-2.6-005 MS-2.6-006 MS-4.2-001 MG-2.2-001 MG-2.2-005 MG-4.1-004
Addendum
N/A
SR-05 SR-06 RE-04
Addendum
N/A
AI-CAIQ questions (1)
Is output validated, filtered, modified, or blocked, as necessary, against adversarial patterns, failure patterns, and unwanted behaviour, according to organisational policies and applicable laws and regulations?