How to Govern Autonomous Agents in Enterprise AI Factories
Mirrored from NVIDIA Developer Blog for archival readability. Support the source by reading on the original site.
AI-Generated Summary
- The NVIDIA Secure Agent Workspace Reference Design separates the presentation layer (user's device) from the execution layer (managed workspace), enforcing secure agent operation through controlled identity, network, and policy management.
- Implementation involves provisioning dedicated, company-managed virtual machines per user, enforcing single sign-on authentication, blocking unapproved network access, requiring human approval for significant actions, and centralizing logging for monitoring and audit.
- Further security is achieved by active agent sandboxing, centrally signed security policies, strict credential protection via proxies, continuous verification of rules, and leveraging GitOps and enterprise identity management to ensure repeatable, auditable, and isolated agent operations both on-premises and in the cloud.
AI agents are quickly moving beyond chat. They inspect code, run tests, read documents, search knowledge bases, query internal systems, and operate for hours on behalf of a user. This unlocks productivity, but can also give agents access to sensitive enterprise data and the ability to complete tasks and take action across business systems, making a secure, governed environment essential.
The NVIDIA Secure Agent Workspace Reference Design introduces a clear architectural shift: the user’s laptop, browser, integrated development environment (IDE), or terminal serves as the presentation layer, not the execution layer. Agent execution occurs in a managed workspace where identity, network access, credentials, runtime policy, audit, and human review can be enforced consistently.
As the AI factory industrializes AI for enterprise, this reference design outlines how to build a secure environment for autonomous agents to operate at an organizational scale.
This post outlines the steps to implement the Secure Agent Workspace Reference Design so enterprises can give their entire employee base access to autonomous, always-on AI agents. The architecture creates a more secure environment that governs agent behavior and network access. Employees can accomplish more advanced, complex tasks with AI that works for longer and uses more enterprise tools.
Getting started with the secure agent workspace
- Preparation
Identify the agent workflow owners and stakeholders. This will inform resource requirements and access policies. To govern an agent, you need to define the range of expected behaviors and draw boundaries that prevent unexpected access.
Implementations for phases I and II sit on top of the standard enterprise managed-VM baseline, which includes configuration management, patch and vulnerability management, image governance, SOC telemetry, and rebuild / revocation features.
- Secure the perimeter outside the virtual machine
The first phase of implementing the secure agent workspace controls the perimeter around it: who is allowed to enter, how they enter, what workspace they receive, and which services that workspace can reach. At this stage, the VM acts as the primary isolation boundary, and the goal is to make agent activity observable, bounded, and revocable before introducing deeper runtime controls.
- Provision managed workspaces: Give every user their own secure, company-managed virtual machine (VM) for their tasks.
- Enforce login gates: Use your company’s single sign-on (SSO) to control access; no one should be able to open a workspace without authenticated permission.
- Lock down the network: Block all internet traffic by default. Only allow connections to specific, pre-approved internal and external services.
- Require human approval: Ensure that any agent action that changes a system (such as merging code or updating tickets) is approved by a human, not just by the agent.
- Centralize logging: Send all logs about workspace activity to a single place so security teams can monitor for suspicious behavior.
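To make two of the phase I controls concrete, here is a minimal Python sketch of a default-deny egress check combined with a human approval gate. The host names, action names, and the `evaluate` helper are all hypothetical and are not part of the reference design; in a real deployment the network rule is enforced by the platform, not by application code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical allowlist of pre-approved destinations. Anything else is denied.
ALLOWED_HOSTS = frozenset({"git.internal.example", "tickets.internal.example"})

# Hypothetical set of actions that change a system and need human sign-off.
MUTATING_ACTIONS = frozenset({"merge_code", "update_ticket"})


@dataclass(frozen=True)
class ActionRequest:
    action: str                        # what the agent wants to do
    host: str                          # service the action would reach
    approved_by: Optional[str] = None  # human approver, if one has signed off


def evaluate(request: ActionRequest) -> Tuple[bool, str]:
    """Return (allowed, reason) for a requested agent action."""
    # Default deny: only destinations on the allowlist are reachable.
    if request.host not in ALLOWED_HOSTS:
        return False, "egress denied: host not on allowlist"
    # System-changing actions wait for a human, regardless of what the agent says.
    if request.action in MUTATING_ACTIONS and not request.approved_by:
        return False, "pending: human approval required"
    return True, "allowed"
```

Returning a reason string alongside the decision gives the centralized logging pipeline something to record for every allow and deny.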
- Add runtime security inside the virtual machine
In the second phase of the implementation, add controls inside the workspace to govern the agent’s actual behavior. This shifts protection closer to the tool-call boundary: what files the agent can read, what commands it can run, and which services it can access. Secrets stay behind a proxy, policy stays centrally controlled, and the agent cannot silently expand its own permissions.
- Active sandboxing: Run the agent inside a dedicated runtime (such as NVIDIA OpenShell) that watches every action in real-time.
- Signed security policies: Use a central system to define exactly what the agent is allowed to do (e.g., which files it can read) and send these rules as signed, secure bundles to the workspace.
- Credential protection: Don’t store passwords or secret keys directly in the workspace. Use a secure proxy that handles those keys behind the scenes so the agent never sees the raw secrets.
- Continuous verification: Automatically check that the security rules are active and working before every single action the agent performs.
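The signed-policy and continuous-verification ideas can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how NVIDIA OpenShell or the reference design implements them: it uses a shared-key HMAC from the standard library as a stand-in for a real signature scheme (a production system would more likely use asymmetric signatures), and the bundle layout and the `readable_roots` field are hypothetical.

```python
import hashlib
import hmac
import json
import posixpath


def sign_bundle(policy: dict, key: bytes) -> dict:
    """Central side: serialize the policy and attach an HMAC-SHA256 tag."""
    payload = json.dumps(policy, sort_keys=True)
    signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def load_verified_policy(bundle: dict, key: bytes) -> dict:
    """Workspace side: refuse to use a bundle whose tag does not match."""
    expected = hmac.new(key, bundle["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, bundle["signature"]):
        raise ValueError("policy bundle signature mismatch")
    return json.loads(bundle["payload"])


def is_read_allowed(bundle: dict, key: bytes, path: str) -> bool:
    """Re-verify the bundle on every call, then check the path against it."""
    policy = load_verified_policy(bundle, key)
    # Normalize so that "../" segments cannot escape an allowed root.
    normalized = posixpath.normpath(path)
    return any(
        normalized == root or normalized.startswith(root + "/")
        for root in policy["readable_roots"]  # hypothetical policy field
    )
```

Verifying the bundle inside `is_read_allowed`, instead of once at startup, means a bundle that is tampered with mid-session is rejected on the next action.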
Set up agent blueprints for the agent workspace
Blueprints are repeatable workflow templates that run on top of the workspace. Each blueprint is configured with its goal, required tools, allowed services, data scope, write permissions, review gates, and logging expectations.
Blueprints ship with the broadest set of tools for their target use case and exemplify its best practices. Starting from a blueprint, agent developers make minimal modifications to narrow the behavior to their needs.
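As an illustration of what such a template might look like, the following Python sketch models a blueprint as an immutable record with the fields listed above, plus a helper that derives a narrower variant. The `Blueprint` class, its field names, and the `narrow` helper are assumptions made for this example and do not represent an NVIDIA schema.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Blueprint:
    goal: str
    required_tools: FrozenSet[str]
    allowed_services: FrozenSet[str]
    data_scope: FrozenSet[str]
    write_permissions: FrozenSet[str]
    review_gates: FrozenSet[str]  # actions that need human review
    log_format: str               # logging expectation, for example "OCSF"


def narrow(base: Blueprint, *, goal: str,
           tools: Iterable[str], services: Iterable[str]) -> Blueprint:
    """Derive a blueprint that uses a subset of the base tools and services."""
    tool_set, service_set = frozenset(tools), frozenset(services)
    # A derived blueprint may remove capabilities but never add them.
    if not tool_set <= base.required_tools:
        raise ValueError("derived blueprint requests tools the base does not have")
    if not service_set <= base.allowed_services:
        raise ValueError("derived blueprint requests services the base does not allow")
    return replace(base, goal=goal, required_tools=tool_set, allowed_services=service_set)
```

This sketch only narrows tools and services. The same subset check would apply to data scope and write permissions.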
Blueprints must integrate into the secure agent workspace environment with the following steps:
- Define agent identity: Register the agent with a logical identity that ties back to the user or sponsor through SSO. Use a delegation record to define exactly what the agent is allowed to do.
- Handle secrets: Never hardcode secrets. Use a credential proxy so your agent works with short-lived capability tokens instead of raw API keys or passwords.
- Configure inference: Use a gateway layer to manage quotas, role-based access control (RBAC), and dynamic rate limiting so that the inference service stays secure and scalable.
- Lock down governance: Set up “blast radius” controls. Define which actions (like merging code or changing ticket status) require human review before execution, and make sure all logs are exported in Open Cybersecurity Schema Framework (OCSF) format so they’re ready for audit.
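The secrets-handling step can be illustrated with a small in-memory credential proxy. This is a hedged sketch, not the reference design's implementation: the `CredentialProxy` class, its method names, and the five-minute default lifetime are all hypothetical. It shows the agent holding only an opaque, expiring, scope-bound token while the raw secret stays inside the proxy.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Grant:
    agent_id: str
    scope: str         # the one upstream service this token may be used for
    expires_at: float  # absolute expiry, seconds since the epoch


class CredentialProxy:
    """Holds raw secrets and hands agents short-lived capability tokens."""

    def __init__(self, raw_secrets: Dict[str, str], ttl_seconds: float = 300.0):
        self._raw_secrets = raw_secrets  # never exposed to callers
        self._ttl = ttl_seconds
        self._grants: Dict[str, Grant] = {}

    def issue_token(self, agent_id: str, scope: str, now: Optional[float] = None) -> str:
        if scope not in self._raw_secrets:
            raise KeyError(f"no credential registered for scope {scope!r}")
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)  # opaque value, unrelated to the secret
        self._grants[token] = Grant(agent_id, scope, now + self._ttl)
        return token

    def call_upstream(self, token: str, scope: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        grant = self._grants.get(token)
        if grant is None or grant.scope != scope or now >= grant.expires_at:
            raise PermissionError("invalid, out-of-scope, or expired token")
        # A real proxy would attach this secret to the outbound request here.
        _upstream_secret = self._raw_secrets[scope]
        # Placeholder result: the response never contains the raw secret.
        return f"called {scope} on behalf of {grant.agent_id}"
```

The optional `now` parameter exists only to make expiry easy to exercise in tests.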
Deploy the secure agent workspace on-prem or in the cloud
Setting up the workspace starts with choosing Red Hat OpenShift Virtualization for on-premises environments, or Microsoft Azure for cloud-native deployments. The core pattern is the same for both. Each user receives a dedicated virtual machine, and the local endpoint only attaches to that workspace. Agent execution remains within a managed boundary with centralized policy, access control, and auditing.
These are the steps for deployment:
1. Provision one workspace VM per user: Create a dedicated Linux or Windows VM for each user.
2. Establish the access path: Put a trusted access broker in front of the workspace. Users should connect through enterprise SSO and short-lived, auditable sessions. The endpoint should act only as a presentation surface, with no autonomous agent work running locally.
3. Define the network boundary: Start with default-deny egress and allow only approved destinations. On OpenShift, use primitives such as `NetworkPolicy`, `EgressFirewall`, routes, and approved ingress paths. On Azure, route outbound traffic through Azure Firewall Premium, disable BGP route propagation, deny corporate CIDR access, and avoid any public inbound path.
4. Manage images and VM profiles centrally: Use approved VM images only. OpenShift environments should manage VM profiles and platform state through GitOps. Azure environments should build golden images with Packer and publish them through Azure Compute Gallery.
5. Use GitOps for policy intent: Store VM profiles, network rules, policy metadata, and release information in Git. GitOps should reconcile the desired platform state, while signed runtime policy bundles are distributed through a controlled release channel.
6. Protect secrets and identity flows: Keep raw secrets out of the agent process wherever possible. Azure deployments should use Workload Identity Federation for secretless provisioning, managed identities for VM runtime access, Azure Key Vault over Private Endpoints, and a narrow runtime identity before agent code starts.
7. Centralize audit and observability: Capture workspace lifecycle events, broker sessions, policy releases, network allow/deny activity, and runtime/tool events. Send logs to the enterprise SIEM or platform logging stack, such as Azure Monitor, Log Analytics, Microsoft Sentinel, or an OCSF-compatible audit path.
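For the network boundary in step 3, one way to keep default-deny egress rules in Git is to generate the manifests from code. The Python sketch below emits standard Kubernetes `NetworkPolicy` objects as JSON, which `kubectl` and `oc` accept alongside YAML. The namespace, policy names, and CIDR are placeholders, and the helper functions are hypothetical; this is not the reference design's actual configuration.

```python
import json
from typing import Dict, List


def default_deny_egress(namespace: str) -> Dict:
    """Select every pod in the namespace and allow no egress at all."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-egress", "namespace": namespace},
        # Listing "Egress" with no egress rules denies all outbound traffic.
        "spec": {"podSelector": {}, "policyTypes": ["Egress"]},
    }


def allow_egress(namespace: str, name: str, cidrs: List[str], port: int) -> Dict:
    """Allow TCP egress to approved CIDR blocks on a single port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    "to": [{"ipBlock": {"cidr": cidr}} for cidr in cidrs],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder namespace and CIDR, for illustration only.
    manifests = [
        default_deny_egress("agent-workspaces"),
        allow_egress("agent-workspaces", "allow-internal-git", ["10.20.0.0/24"], 443),
    ]
    print(json.dumps(manifests, indent=2))
```

Because network policies are additive, the allow rule opens only the listed destinations on top of the default deny. A default-deny egress policy also blocks DNS, so a real rule set needs an explicit allowance for name resolution.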
The end state is a practical Secure Agent Workspace pattern: single-user VMs provide isolation, GitOps provides repeatable operations, enterprise identity controls access, network policy limits reachability, and runtime enforcement adds a deeper policy layer for autonomous-agent safety.
Get started implementing the Secure Agent Workspace Reference Design in your enterprise AI factory.
About the Authors
Alex Sandu is a principal product marketing manager at NVIDIA. With a keen eye for market drivers and a deep understanding of the AI landscape, Alex works on defining the value proposition, benefits and positioning for NVIDIA software for enterprise AI. Alex collaborates closely with partners to showcase the transformative impact of deploying NVIDIA software, driving agentic AI adoption and innovation across the ecosystem. Prior to NVIDIA, Alex’s experience includes Product Marketing, Product Management and Corporate Strategy roles in the browser and web technologies space.
Sergey Marunich is a senior solutions architect at NVIDIA focused on AI infrastructure networking, AI factories and enterprise AI platforms. He helps customers design, architect, test and deploy AI networking solutions across cloud, HPC AI and enterprise data center environments, and works with NVIDIA IT teams on enterprise AI factory initiatives. Prior to NVIDIA, he led enterprise architecture and customer success initiatives spanning infrastructure networking, service mesh, APIs and infrastructure modernization, helping scale teams and support Fortune 500 customers.
Jon Fernandez is director of Enterprise AI Compute, focused on secure agent workspaces for NVIDIA enterprise. He leads teams focused on engineering the NVIDIA agent compute platform, configuration management, and unified communications. Before NVIDIA, Jon spent nearly 20 years fine-tuning employee productivity solutions across enterprises ranging from startups to megacap companies.
Ashwin Jha is a senior director of Enterprise Productivity at NVIDIA, where he leads global teams across AI, cloud, software, enterprise applications, and data platforms. He focuses on building intelligent systems that make enterprise predictive, reliable, and deeply integrated with business outcomes.
Nic Borensztein is a distinguished solutions architect at NVIDIA focused on agentic AI and enterprise AI factories. He is the lead architect for the NVIDIA IT-managed AI factory, serving inference and agentic apps to 40,000 employees internally. Previously, he was the founder/CTO of CrowdAI, a no-code computer-vision platform.