AI Sandboxing
Specification
Implement sandboxing techniques to execute AI tools and plugins in isolated environments to prevent unintended interactions with critical systems or data and limit the possibility of lateral movement.
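The specification above can be illustrated with a minimal sketch. The example below runs untrusted plugin code in a separate OS process with memory and CPU caps and a scrubbed environment; it assumes a Linux/Unix host, and the function name and limit values are illustrative, not part of the control. Production sandboxes layer namespace, filesystem, and network isolation (containers, gVisor, microVMs) on top of limits like these.

```python
import resource
import subprocess
import sys

def run_plugin_sandboxed(plugin_code: str, timeout_s: int = 5,
                         mem_bytes: int = 512 * 1024 * 1024):
    """Execute untrusted plugin code in a resource-capped child process.

    Illustrative sketch only: real deployments add namespace, filesystem,
    and network isolation on top of these per-process limits.
    """
    def apply_limits():
        # Runs in the child before exec: cap address space and CPU seconds.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))

    return subprocess.run(
        [sys.executable, "-c", plugin_code],
        preexec_fn=apply_limits,
        env={},                      # scrubbed environment: no host secrets
        capture_output=True, text=True,
        timeout=timeout_s + 1,       # hard wall-clock backstop
    )
```

A well-behaved plugin exits normally, while one that tries to allocate several gigabytes is stopped by the address-space limit instead of affecting the host.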
Threat coverage
Architectural relevance
Lifecycle
Resource provisioning, Team and expertise
Design, Guardrails, Training
Evaluation, Validation/Red Teaming, Re-evaluation
Orchestration, AI applications
Operations, Continuous monitoring
Archiving, Data deletion, Model disposal
Ownership / SSRM
PI
Shared Cloud Service Provider-Model Provider (Shared CSP-MP)
The CSP and MP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.
Model
Owned by the Model Provider (MP)
The Model Provider (MP) designs, develops, and implements the control as part of their services or products to mitigate security, privacy, or compliance risks associated with the Large Language Model (LLM). Model Providers are entities that develop, train, and distribute foundational and fine-tuned AI models for various applications. They create the underlying AI capabilities that other actors build upon. Model Providers are responsible for model architecture, training methodologies, performance characteristics, and documentation of capabilities and limitations. They operate at the foundation layer of the AI stack and may provide direct API access to their models. Examples: OpenAI (GPT, DALL-E, Whisper), Anthropic (Claude), Google (Gemini), Meta (Llama), as well as any customized model.
Orchestrated
Shared Orchestrated Service Provider-Application Provider (Shared OSP-AP)
The OSP and AP are jointly responsible and accountable for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they develop and offer.
Application
Shared Application Provider-AI Customer (Shared AP-AIC)
The AP and AIC both share responsibility and accountability for the design, development, implementation, and enforcement of the control to mitigate security, privacy, or compliance risks associated with Large Language Model (LLM)/GenAI technologies in the context of the services or products they offer and consume.
Implementation guidelines
Auditing guidelines
Focus: The Cloud Service Provider/AI Processing Infrastructure Provider has implemented effective sandboxing techniques to execute AI workloads in isolated environments, preventing unintended interactions with critical systems or data and limiting the possibility of lateral movement.
1. Inquiry with Control Owners
1.1 Interview Infrastructure Security Leadership: Interview cloud infrastructure security architects, data center operations managers, and AI platform engineers responsible for implementing sandboxing in AI-optimized computing environments. Obtain and review the organization's infrastructure isolation policies covering accelerator (GPU/TPU) resource allocation, tenant isolation, virtualization boundaries, storage access controls, and network segmentation for AI workloads. Verify documented security requirements exist for multi-tenant AI infrastructure, high-performance computing environments, specialized accelerator access, distributed training isolation, and shared storage protection.
1.2 Review Sandboxing Technical Implementation: Examine documentation describing the technical implementation of hardware virtualization for AI accelerators, hypervisor isolation, container security for AI workloads, resource allocation enforcement, and network microsegmentation between tenants. Assess how storage isolation, memory protection, driver access controls, and hardware-level security features are implemented across different AI computing environments.
1.3 Assess AI Resource Allocation Security: Review mechanisms implementing security for specialized AI hardware resources, including accelerator allocation and isolation, memory partitioning on GPUs/TPUs, hardware queuing systems, power and thermal management isolation, and multi-instance GPU (MIG) implementations. Evaluate how resource quotas, fair scheduling, and priority systems are secured across tenant boundaries.
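The quota and fair-scheduling review in 1.3 above can be grounded in simple per-tenant accounting. The sketch below is a hypothetical accounting structure (the class and field names are ours, not from any particular platform) showing the deny-by-default behavior an auditor would expect quota enforcement to exhibit.

```python
from dataclasses import dataclass, field

@dataclass
class AcceleratorQuota:
    """Per-tenant GPU-hour quota accounting; an illustrative sketch."""
    limits: dict                          # tenant -> allowed GPU-hours
    used: dict = field(default_factory=dict)

    def request(self, tenant: str, gpu_hours: float) -> bool:
        """Grant the request only if it stays inside the tenant's quota."""
        if tenant not in self.limits:
            return False                  # unknown tenants get nothing
        spent = self.used.get(tenant, 0.0)
        if spent + gpu_hours > self.limits[tenant]:
            return False                  # over quota: deny, do not queue
        self.used[tenant] = spent + gpu_hours
        return True
```

The key property under audit is that requests from unknown tenants or beyond the configured limit are rejected rather than silently over-allocated.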
1.4 Evaluate Data Storage and Transfer Security: Review procedures for securing high-performance storage systems used for AI workloads, including parallel file system isolation, storage traffic separation, cache isolation between tenants, temporary storage cleanup, and data transfer security across infrastructure components. Assess protection mechanisms for training datasets, model weights, and checkpoints in shared infrastructure.
2. Obtaining and Verifying the Population of Records
2.1 Define the Complete Population of Infrastructure Components: Obtain a comprehensive inventory of AI-optimized infrastructure components, including GPU/TPU clusters, high-performance computing resources, specialized AI accelerators, high-throughput storage systems, low-latency networking fabrics, resource schedulers, virtualization platforms, and container orchestration systems. Include hardware management interfaces, driver and firmware components, and infrastructure monitoring systems in this inventory.
2.2 Verify Population Completeness: Cross-reference the inventory against hardware asset management systems, data center capacity documentation, virtualization management platforms, network configuration databases, and infrastructure deployment automation tools. Ensure the inventory aligns with procurement records, customer-facing service catalogs, and resource allocation databases to confirm completeness.
2.3 Categorize Components by Risk Level: Segment the infrastructure component population based on resource sharing levels, hardware specialization, customer data exposure, access to specialized accelerators, network connectivity, resource costs, utilization patterns, and deployment environments. This risk-based categorization should guide the depth and frequency of security assessment for each infrastructure component.
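The temporary-storage cleanup reviewed in 1.4 above can be sketched as an overwrite-then-delete pass over an ephemeral scratch area. This is an illustrative helper of our own naming, and hedged accordingly: file-level overwriting does not defeat SSD wear-leveling, so real platforms typically rely on encrypted ephemeral volumes or block-level scrubbing instead.

```python
import os

def sanitized_scratch_cleanup(scratch_dir: str) -> None:
    """Overwrite and remove every file in an ephemeral scratch area.

    Illustrative sketch: production systems prefer encrypted ephemeral
    volumes, since overwriting files does not reach SSD spare blocks.
    """
    # Walk bottom-up so directories are already empty when removed.
    for root, dirs, files in os.walk(scratch_dir, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            with open(path, "r+b") as f:   # zero the contents in place
                f.write(b"\x00" * size)
                f.flush()
                os.fsync(f.fileno())
            os.remove(path)
        for d in dirs:
            os.rmdir(os.path.join(root, d))
    os.rmdir(scratch_dir)
```

After the call, no tenant data (datasets, weights, checkpoints) remains in the scratch path for the next workload to observe.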
3. Inspection of Evidence
3.1 Sandbox Implementation Review: Select a representative sample of AI infrastructure components based on risk levels and verify the implementation of isolation mechanisms, resource access controls, and security boundaries. For isolation, examine hardware virtualization configurations, hypervisor security settings, container security policies, and network isolation implementations. For resource controls, review accelerator allocation mechanisms, memory partitioning approaches, quota enforcement, and time-sharing protections. For security boundaries, evaluate authentication systems, permission enforcement, escalation prevention, and tenant boundary enforcement.
3.2 Multi-Tenant Isolation Testing: Review evidence of security testing, including hypervisor boundary testing, container escape prevention evaluation, GPU/TPU memory isolation verification, network segmentation validation, and storage access control assessment. Evaluate penetration testing results for hardware resource isolation, virtualization boundaries, container security, and accelerator access controls in multi-tenant environments.
3.3 Runtime Monitoring and Security Controls: Verify implementation of infrastructure monitoring systems, anomalous resource usage detection, hardware access pattern analysis, privileged operation logging, and tenant boundary enforcement monitoring. Assess the effectiveness of accelerator usage auditing, network traffic analysis between tenant boundaries, storage access monitoring, and hardware resource contention detection in identifying potential security issues.
3.4 Data Protection within Infrastructure: Assess controls for data protection, including storage isolation between tenants, prevention of data leakage through shared hardware resources, ephemeral storage sanitization, cached data isolation, and memory clearance between workloads. Evaluate how GPU/TPU memory is protected against side-channel attacks, how shared cache systems are secured, and how storage traffic is isolated across tenant boundaries.
3.5 Hardware Resource Security: Examine the implementation of hardware resource allocation, accelerator virtualization, specialized instruction access controls, driver isolation, and firmware integrity protection. Assess secure boot implementations, hardware initialization procedures, privileged instruction limitations, and driver security boundaries between tenant workloads.
3.6 Security Incident Response: Review documentation and evidence of infrastructure isolation breach procedures, hardware resource contention incident playbooks, and customer notification processes. Evaluate how affected infrastructure components are identified, potentially compromised workloads are isolated, and security incidents are investigated. Assess recovery procedures and post-incident security enhancement processes.
3.7 Cloud Infrastructure Compliance: Verify the adequacy of infrastructure compliance controls, hardware security certifications, virtualization security standards alignment, and tenant isolation governance. Evaluate compliance with industry security standards for infrastructure providers, implementation of defense-in-depth for cloud environments, and regular security assessment procedures for AI infrastructure.
4. Evaluation and Reporting
4.1 Sandbox Effectiveness Assessment: Evaluate how well infrastructure isolation implementations prevent unauthorized resource access, maintain tenant boundaries, control access to specialized hardware, limit infrastructure visibility, maintain appropriate resource quotas, and withstand multi-tenant attack scenarios. Assess the overall effectiveness in preventing unintended system interactions while delivering high-performance AI computing capabilities.
4.2 Isolation Strategy Assessment: Assess the effectiveness of infrastructure isolation strategies based on hardware capabilities, virtualization techniques, network architecture, storage isolation methods, and tenant boundary enforcement. Evaluate whether isolation approaches appropriately balance performance requirements with security boundaries, particularly for specialized AI accelerator hardware.
4.3 Documentation and Process Adequacy: Evaluate the quality of infrastructure security documentation, including clarity of isolation architecture, completeness of hardware access controls, definition of tenant boundaries, and security requirements for AI workloads. Assess whether documentation is maintained as infrastructure evolves and new accelerator technologies are introduced.
4.4 Continuous Improvement Mechanisms: Evaluate processes for improving infrastructure isolation through regular boundary testing, incorporation of lessons learned, adaptation to new hardware technologies, security architecture reviews, and vulnerability management. Assess whether the organization demonstrates a commitment to continuously enhancing isolation controls as new AI accelerator technologies and attack vectors emerge.
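The recurring boundary testing called for in 4.4 lends itself to automation. Below is one tiny, hypothetical regression check (the helper name is ours) verifying that a launcher which scrubs the environment does not leak host secrets into a sandboxed child; a real suite would probe filesystem, network, and accelerator boundaries in the same repeatable fashion.

```python
import os
import subprocess
import sys

def secret_not_visible_in_sandbox(secret_name: str = "HOST_API_KEY") -> bool:
    """Regression check: an env-scrubbed child must not see host secrets.

    One small example of the automated boundary tests an auditor should
    expect to find; illustrative only.
    """
    probe = f"import os; print(os.environ.get('{secret_name}', 'ABSENT'))"
    child = subprocess.run(
        [sys.executable, "-c", probe],
        env={},                      # the sandbox launcher scrubs the environment
        capture_output=True, text=True,
    )
    return child.stdout.strip() == "ABSENT"
```

Checks like this run on every infrastructure change, turning the tenant-boundary requirements above into continuously enforced tests rather than point-in-time audit evidence.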
Standards mappings
42001: 5.2 - AI policy
42001: A.6.2.2
42001: A.6.2.3
42001: B.6.2.5 - AI system deployment
42001: B.6.2.6 - AI system operation and monitoring
27001: A.8.9 - Configuration management
27001: A.5.15 - Access control
27001: A.8.8 - Management of technical vulnerabilities
27001: A.8.15 - Logging
27001: A.8.16 - Monitoring activities
27001: A.8.31 - Separation of development, test and production environments
Addendum
Although ISO 42001 includes controls that support creating and maintaining the sandboxed deployment environment described in AIS-13, sandboxing is not specifically called out. ISO 27001 also provides a supporting control (A.8.31) for separating development, test, and production environments.
No Mapping
Addendum
The EU AI Act does not provide requirements related to AIS-13: Implement sandboxing techniques to execute AI tools and plugins.
No Mapping
Addendum
NIST AI 600-1 does not include a specific requirement to perform sandboxing for agentic tools and plugins.
SR-06
Addendum
N/A
AI-CAIQ questions (1)
Are sandboxing techniques implemented to execute AI tools and plugins in isolated environments to prevent unintended interactions with critical systems or data and to limit the possibility of lateral movement?