AI Sandboxing
Specification
Implement sandboxing techniques to execute AI tools and plugins in isolated environments to prevent unintended interactions with critical systems or data and limit the possibility of lateral movement.
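The specification above can be illustrated with a minimal sketch. The example below runs untrusted plugin code in a separate OS process with memory and CPU caps and a scrubbed environment; it assumes a Linux/Unix host, and the function name and limit values are illustrative, not part of the control. Production sandboxes layer namespace, filesystem, and network isolation (containers, gVisor, microVMs) on top of limits like these.

```python
import resource
import subprocess
import sys

def run_plugin_sandboxed(plugin_code: str, timeout_s: int = 5,
                         mem_bytes: int = 512 * 1024 * 1024):
    """Execute untrusted plugin code in a resource-capped child process.

    Illustrative sketch only: real deployments add namespace, filesystem,
    and network isolation on top of these per-process limits.
    """
    def apply_limits():
        # Runs in the child before exec: cap address space and CPU seconds.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))

    return subprocess.run(
        [sys.executable, "-c", plugin_code],
        preexec_fn=apply_limits,
        env={},                      # scrubbed environment: no host secrets
        capture_output=True, text=True,
        timeout=timeout_s + 1,       # hard wall-clock backstop
    )
```

A well-behaved plugin exits normally, while one that tries to allocate several gigabytes is stopped by the address-space limit instead of affecting the host.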
Threat coverage
Architectural relevance
Lifecycle
Resource provisioning, Team and expertise
Design, Guardrails, Training
Evaluation, Validation/Red Teaming, Re-evaluation
Orchestration, AI applications
Operations, Continuous monitoring
Archiving, Data deletion, Model disposal
Ownership / SSRM
PI
Shared Cloud Service Provider-Model Provider (Shared CSP-MP)
The CSP and MP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.
Model
Owned by the Model Provider (MP)
The Model Provider (MP) designs, develops, and implements the control as part of their services or products to mitigate security, privacy, or compliance risks associated with the Large Language Model (LLM). Model Providers are entities that develop, train, and distribute foundational and fine-tuned AI models for various applications. They create the underlying AI capabilities that other actors build upon. Model Providers are responsible for model architecture, training methodologies, performance characteristics, and documentation of capabilities and limitations. They operate at the foundation layer of the AI stack and may provide direct API access to their models. Examples: OpenAI (GPT, DALL-E, Whisper), Anthropic (Claude), Google (Gemini), Meta (Llama), as well as any customized model.
Orchestrated
Shared Orchestrated Service Provider-Application Provider (Shared OSP-AP)
The OSP and AP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.
Application
Shared Application Provider-AI Customer (Shared AP-AIC)
The AP and AIC both share responsibility and accountability for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they offer and consume.
Implementation guidelines
Auditing guidelines
Focus: The Cloud Service Provider/AI Processing Infrastructure Provider has implemented effective sandboxing techniques to execute AI workloads in isolated environments, preventing unintended interactions with critical systems or data and limiting the possibility of lateral movement.
1. Inquiry with Control Owners
1.1 Interview Infrastructure Security Leadership: Interview cloud infrastructure security architects, data center operations managers, and AI platform engineers responsible for implementing sandboxing in AI-optimized computing environments. Obtain and review the organization's infrastructure isolation policies covering accelerator (GPU/TPU) resource allocation, tenant isolation, virtualization boundaries, storage access controls, and network segmentation for AI workloads. Verify documented security requirements exist for multi-tenant AI infrastructure, high-performance computing environments, specialized accelerator access, distributed training isolation, and shared storage protection.
1.2 Review Sandboxing Technical Implementation: Examine documentation describing the technical implementation of hardware virtualization for AI accelerators, hypervisor isolation, container security for AI workloads, resource allocation enforcement, and network microsegmentation between tenants. Assess how storage isolation, memory protection, driver access controls, and hardware-level security features are implemented across different AI computing environments.
1.3 Assess AI Resource Allocation Security: Review mechanisms implementing security for specialized AI hardware resources, including accelerator allocation and isolation, memory partitioning on GPUs/TPUs, hardware queuing systems, power and thermal management isolation, and multi-instance GPU (MIG) implementations. Evaluate how resource quotas, fair scheduling, and priority systems are secured across tenant boundaries.
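The quota and fair-scheduling review in 1.3 above can be grounded in simple per-tenant accounting. The sketch below is a hypothetical accounting structure (the class and field names are ours, not from any particular platform) showing the deny-by-default behavior an auditor would expect quota enforcement to exhibit.

```python
from dataclasses import dataclass, field

@dataclass
class AcceleratorQuota:
    """Per-tenant GPU-hour quota accounting; an illustrative sketch."""
    limits: dict                          # tenant -> allowed GPU-hours
    used: dict = field(default_factory=dict)

    def request(self, tenant: str, gpu_hours: float) -> bool:
        """Grant the request only if it stays inside the tenant's quota."""
        if tenant not in self.limits:
            return False                  # unknown tenants get nothing
        spent = self.used.get(tenant, 0.0)
        if spent + gpu_hours > self.limits[tenant]:
            return False                  # over quota: deny, do not queue
        self.used[tenant] = spent + gpu_hours
        return True
```

The key property under audit is that requests from unknown tenants or beyond the configured limit are rejected rather than silently over-allocated.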
1.4 Evaluate Data Storage and Transfer Security: Review procedures for securing high-performance storage systems used for AI workloads, including parallel file system isolation, storage traffic separation, cache isolation between tenants, temporary storage cleanup, and data transfer security across infrastructure components. Assess protection mechanisms for training datasets, model weights, and checkpoints in shared infrastructure.
2. Obtaining and Verifying the Population of Records
2.1 Define the Complete Population of Infrastructure Components: Obtain a comprehensive inventory of AI-optimized infrastructure components, including GPU/TPU clusters, high-performance computing resources, specialized AI accelerators, high-throughput storage systems, low-latency networking fabrics, resource schedulers, virtualization platforms, and container orchestration systems. Include hardware management interfaces, driver and firmware components, and infrastructure monitoring systems in this inventory.
2.2 Verify Population Completeness: Cross-reference the inventory against hardware asset management systems, data center capacity documentation, virtualization management platforms, network configuration databases, and infrastructure deployment automation tools. Ensure the inventory aligns with procurement records, customer-facing service catalogs, and resource allocation databases to confirm completeness.
2.3 Categorize Components by Risk Level: Segment the infrastructure component population based on resource sharing levels, hardware specialization, customer data exposure, access to specialized accelerators, network connectivity, resource costs, utilization patterns, and deployment environments. This risk-based categorization should guide the depth and frequency of security assessment for each infrastructure component.
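The temporary-storage cleanup reviewed in 1.4 above can be sketched as an overwrite-then-delete pass over an ephemeral scratch area. This is an illustrative helper of our own naming, and hedged accordingly: file-level overwriting does not defeat SSD wear-leveling, so real platforms typically rely on encrypted ephemeral volumes or block-level scrubbing instead.

```python
import os

def sanitized_scratch_cleanup(scratch_dir: str) -> None:
    """Overwrite and remove every file in an ephemeral scratch area.

    Illustrative sketch: production systems prefer encrypted ephemeral
    volumes, since overwriting files does not reach SSD spare blocks.
    """
    # Walk bottom-up so directories are already empty when removed.
    for root, dirs, files in os.walk(scratch_dir, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            with open(path, "r+b") as f:   # zero the contents in place
                f.write(b"\x00" * size)
                f.flush()
                os.fsync(f.fileno())
            os.remove(path)
        for d in dirs:
            os.rmdir(os.path.join(root, d))
    os.rmdir(scratch_dir)
```

After the call, no tenant data (datasets, weights, checkpoints) remains in the scratch path for the next workload to observe.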
3. Inspection of Evidence
3.1 Sandbox Implementation Review: Select a representative sample of AI infrastructure components based on risk levels and verify the implementation of isolation mechanisms, resource access controls, and security boundaries. For isolation, examine hardware virtualization configurations, hypervisor security settings, container security policies, and network isolation implementations. For resource controls, review accelerator allocation mechanisms, memory partitioning approaches, quota enforcement, and time-sharing protections. For security boundaries, evaluate authentication systems, permission enforcement, escalation prevention, and tenant boundary enforcement.
3.2 Multi-Tenant Isolation Testing: Review evidence of security testing, including hypervisor boundary testing, container escape prevention evaluation, GPU/TPU memory isolation verification, network segmentation validation, and storage access control assessment. Evaluate penetration testing results for hardware resource isolation, virtualization boundaries, container security, and accelerator access controls in multi-tenant environments.
3.3 Runtime Monitoring and Security Controls: Verify implementation of infrastructure monitoring systems, anomalous resource usage detection, hardware access pattern analysis, privileged operation logging, and tenant boundary enforcement monitoring. Assess the effectiveness of accelerator usage auditing, network traffic analysis between tenant boundaries, storage access monitoring, and hardware resource contention detection in identifying potential security issues.
3.4 Data Protection within Infrastructure: Assess controls for data protection, including storage isolation between tenants, prevention of data leakage through shared hardware resources, ephemeral storage sanitization, cached data isolation, and memory clearance between workloads. Evaluate how GPU/TPU memory is protected against side-channel attacks, how shared cache systems are secured, and how storage traffic is isolated across tenant boundaries.
3.5 Hardware Resource Security: Examine the implementation of hardware resource allocation, accelerator virtualization, specialized instruction access controls, driver isolation, and firmware integrity protection. Assess secure boot implementations, hardware initialization procedures, privileged instruction limitations, and driver security boundaries between tenant workloads.
3.6 Security Incident Response: Review documentation and evidence of infrastructure isolation breach procedures, hardware resource contention incident playbooks, and customer notification processes. Evaluate how affected infrastructure components are identified, potentially compromised workloads are isolated, and security incidents are investigated. Assess recovery procedures and post-incident security enhancement processes.
3.7 Cloud Infrastructure Compliance: Verify the adequacy of infrastructure compliance controls, hardware security certifications, virtualization security standards alignment, and tenant isolation governance. Evaluate compliance with industry security standards for infrastructure providers, implementation of defense-in-depth for cloud environments, and regular security assessment procedures for AI infrastructure.
4. Evaluation and Reporting
4.1 Sandbox Effectiveness Assessment: Evaluate how well infrastructure isolation implementations prevent unauthorized resource access, maintain tenant boundaries, control access to specialized hardware, limit infrastructure visibility, maintain appropriate resource quotas, and withstand multi-tenant attack scenarios. Assess the overall effectiveness in preventing unintended system interactions while delivering high-performance AI computing capabilities.
4.2 Isolation Strategy Assessment: Assess the effectiveness of infrastructure isolation strategies based on hardware capabilities, virtualization techniques, network architecture, storage isolation methods, and tenant boundary enforcement. Evaluate whether isolation approaches appropriately balance performance requirements with security boundaries, particularly for specialized AI accelerator hardware.
4.3 Documentation and Process Adequacy: Evaluate the quality of infrastructure security documentation, including clarity of isolation architecture, completeness of hardware access controls, definition of tenant boundaries, and security requirements for AI workloads. Assess whether documentation is maintained as infrastructure evolves and new accelerator technologies are introduced.
4.4 Continuous Improvement Mechanisms: Evaluate processes for improving infrastructure isolation through regular boundary testing, incorporation of lessons learned, adaptation to new hardware technologies, security architecture reviews, and vulnerability management. Assess whether the organization demonstrates a commitment to continuously enhancing isolation controls as new AI accelerator technologies and attack vectors emerge.
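The recurring boundary testing called for in 4.4 lends itself to automation. Below is one tiny, hypothetical regression check (the helper name is ours) verifying that a launcher which scrubs the environment does not leak host secrets into a sandboxed child; a real suite would probe filesystem, network, and accelerator boundaries in the same repeatable fashion.

```python
import os
import subprocess
import sys

def secret_not_visible_in_sandbox(secret_name: str = "HOST_API_KEY") -> bool:
    """Regression check: an env-scrubbed child must not see host secrets.

    One small example of the automated boundary tests an auditor should
    expect to find; illustrative only.
    """
    probe = f"import os; print(os.environ.get('{secret_name}', 'ABSENT'))"
    child = subprocess.run(
        [sys.executable, "-c", probe],
        env={},                      # the sandbox launcher scrubs the environment
        capture_output=True, text=True,
    )
    return child.stdout.strip() == "ABSENT"
```

Checks like this run on every infrastructure change, turning the tenant-boundary requirements above into continuously enforced tests rather than point-in-time audit evidence.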
Standards mappings
42001: 5.2 - AI policy
42001: A.6.2.2
42001: A.6.2.3
42001: B.6.2.5 - AI system deployment
42001: B.6.2.6 - AI system operation and monitoring
27001: A.8.9 - Configuration management
27001: A.5.15 - Access control
27001: A.8.8 - Management of technical vulnerabilities
27001: A.8.15 - Logging
27001: A.8.16 - Monitoring activities
27001: A.8.31 - Separation of development, test and production environments
Addendum
Although ISO 42001 includes controls that support creating and maintaining the sandboxed deployment environment described in AIS-13, sandboxing is not specifically called out. ISO 27001 also provides a supporting control (A.8.31) for separating development, test, and production environments.
No Mapping
Addendum
The EU AI Act does not provide requirements related to AIS-13: Implement sandboxing techniques to execute AI tools and plugins.
No Mapping
Addendum
NIST AI 600-1 does not include a specific requirement to perform sandboxing for agentic tools and plugins.
SR-06
Addendum
N/A
AI-CAIQ questions (1)
Are sandboxing techniques implemented to execute AI tools and plugins in isolated environments to prevent unintended interactions with critical systems or data and to limit the possibility of lateral movement?