GOVERNANCE
Your AI Agents Need to Pass an Audit. Here's What SOC 2 and ISO 27001 Actually Require.
A compliance auditor walks into your office and asks a simple question: “How does your AI agent decide what to do, and how do you prove it?”
If you’re building with autonomous AI agents, this question is no longer hypothetical. SOC 2 Type II is now a deal requirement for B2B contracts over $50,000 [1]. ISO 27001 certification is showing up in every enterprise vendor questionnaire. And the EU AI Act entered enforcement in 2025, with industry-specific regulators like FINRA, the OCC, and the FDA issuing their own AI-specific guidance [2].
The problem is that both frameworks were designed for a world where humans make decisions. SOC 2’s five Trust Service Criteria assume a human is accountable for every privileged action [3]. ISO 27001’s 93 Annex A controls assume that actions are attributable to an identifiable individual, not an autonomous agent running at machine speed [4]. AI agents break both assumptions — and the gap between what these frameworks require and what most teams can actually demonstrate is growing wider every quarter.
This post is a practical guide to that gap. Not a product walkthrough — a compliance education piece. We’ll cover what SOC 2 and ISO 27001 actually require, where AI agents create control failures, what governance architecture closes those failures, and what the industry data says about how prepared (or unprepared) most organizations are.
A Quick Primer: What SOC 2 and ISO 27001 Actually Require
Before discussing where AI agents create problems, it helps to understand what these frameworks demand in the first place. Most engineers encounter SOC 2 and ISO 27001 as checkboxes on a vendor questionnaire. But the underlying requirements are more nuanced — and more relevant to AI agent architecture — than most teams realize.
SOC 2: The Five Trust Service Criteria
SOC 2 is a US auditing standard developed by the AICPA. It evaluates how organizations protect customer data and operate securely across five criteria [3]:
| Criterion | What It Requires | Key Control Families |
|---|---|---|
| Security (mandatory) | Protect systems from unauthorized access | CC6 — Logical access, encryption, role-based permissions |
| Availability | Systems available for operation as committed | A1 — Uptime, failover, graceful degradation |
| Processing Integrity | Data processed completely, accurately, and in a timely manner | PI1 — Output validation, error detection, monitoring |
| Confidentiality | Sensitive information protected from disclosure | C1 — Encryption, access controls, data classification |
| Privacy | Personal data handled lawfully | P series — PII handling, consent management, data retention |
The critical detail: SOC 2 is not a checklist. It’s an audit of operating effectiveness. Auditors don’t just ask “do you have a policy?” They ask “show me evidence that this control worked, continuously, over the audit period.” This distinction matters enormously for AI agents, because continuous evidence is exactly what most agent architectures fail to produce.
ISO 27001: The Annex A Controls
ISO 27001 is an international standard for information security management systems. The 2022 revision reorganized its controls into four themes and added 11 new controls, several of which are directly relevant to AI agent deployments [4] [5]:
| Control | What It Requires | AI Relevance |
|---|---|---|
| A.5.1 — Information Security Policies | Documented, approved, communicated policies | Must explicitly cover AI agent behavior and boundaries |
| A.5.7 — Threat Intelligence | Collect and analyze threat information | AI-specific threats: prompt injection, jailbreaks, data poisoning |
| A.8.5 — Secure Authentication | Verify identity before granting access | Agent identity: how do you authenticate a non-human actor? |
| A.8.9 — Configuration Management | Manage configurations across the lifecycle | Agent configurations: model versions, tool permissions, thresholds |
| A.8.15 — Logging | Record events for investigation and monitoring | Every LLM call, tool invocation, and decision must be logged |
| A.8.16 — Monitoring Activities | Detect anomalous behavior in real time | Continuous behavioral monitoring of agent actions |
| A.8.28 — Secure Coding | Apply secure development principles | Agents that generate or execute code need security scanning |
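To make A.8.15 concrete, here is a minimal sketch of what per-call audit logging might look like inside an agent runtime. It is illustrative rather than prescriptive: the `log_tool_call` helper and its field names are hypothetical, and a real deployment would write to an append-only log sink rather than stdout.

```python
import hashlib
import json
import time
import uuid


def log_tool_call(agent_id: str, principal: str, tool: str,
                  args: dict, outcome: str) -> dict:
    """Emit one structured audit record per tool invocation (A.8.15-style)."""
    record = {
        "event_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "agent_id": agent_id,
        "principal": principal,          # the human or system that authorized this
        "tool": tool,
        "args_sha256": hashlib.sha256(   # hash instead of raw args to avoid logging PII
            json.dumps(args, sort_keys=True).encode()
        ).hexdigest(),
        "outcome": outcome,              # e.g. "allowed", "blocked", "error"
    }
    print(json.dumps(record))            # stand-in for a real append-only sink
    return record
```

The detail that matters most is the `principal` field: every record is tied back to an accountable identity, which is exactly where most agent deployments fall short, as the next section shows.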
ISO has also published a companion standard, ISO/IEC 42001 (2023), the first international standard specifically for AI management systems. It covers the full AI lifecycle from concept through deployment and operation, and is designed to complement ISO 27001 for organizations deploying AI [2]. While ISO 42001 certification is still emerging, auditors are increasingly referencing its principles during ISO 27001 audits that include AI systems.
Where AI Agents Break These Frameworks
The fundamental problem is not that SOC 2 and ISO 27001 don’t apply to AI agents. They do. The problem is that AI agents violate the assumptions these frameworks were built on. As IBM’s 2026 analysis put it: “Most enterprises still govern access by using identity models designed in the past two decades — models built for human operators, not machines” [6].
Here are the four structural breaks:
1. The Accountability Gap
SOC 2’s CC6 (Logical Access) requires that privileged actions are attributable to an accountable individual. When a human operator runs a database query, the audit log shows who ran it, when, and from where. When an AI agent runs the same query autonomously, the log shows “agent-service-account” — and the auditor asks: who authorized this?
According to Teleport’s 2026 analysis of AI agents and SOC 2, auditors treat “no human request” as a major accountability gap [7]. When logs show only a tool name or a shared service account, auditors question who initiated and approved the action. The Cloud Security Alliance’s survey found that only 28% of organizations can reliably trace agent actions to a human or system across all environments. Another 46% can do so only in some environments, and 9% cannot trace agent actions at all [8].
2. The Speed Problem
Traditional SOC 2 access reviews happen quarterly. A human operator might perform dozens of privileged actions per day, and a quarterly review can meaningfully assess whether those actions were appropriate. An AI agent can make thousands of decisions per hour. It can call APIs, access databases, generate code, and execute tool invocations at machine speed — generating audit events faster than any human can review [7].
This breaks the assumption behind CC8 (Change Management) and A.8.16 (Monitoring Activities). Periodic review is not a viable control when the system being reviewed operates continuously at scale. The control must be continuous too.
3. The Non-Determinism Problem
ISO 27001’s A.8.9 (Configuration Management) assumes that a system’s behavior is determined by its configuration. Change the configuration, change the behavior — and document both. But LLM-based agents are non-deterministic. The same input can produce different outputs. The same agent configuration can lead to different decisions depending on the model’s internal state, the conversation history, and stochastic sampling. As one compliance analyst put it: “Does ISO 27001 apply to models that change behavior over time? The answer is reassuring and challenging at the same time: SOC 2 and ISO 27001 still apply; what changes is how risk behaves once AI becomes part of the system” [9].
This means that configuration management alone is insufficient. You need behavioral monitoring — continuous observation of what the agent actually does, not just what it’s configured to do.
4. The Delegation Problem
Multi-agent systems introduce a new class of accountability challenge that neither SOC 2 nor ISO 27001 anticipated. When Agent A delegates to Agent B, which delegates to Agent C, and the final output is wrong — who is responsible? IBM identifies this as “invisible delegation”: “Many agents reuse user tokens instead of receiving delegated authority. This method erases audit separation and shifts liability onto individuals who never approved the action” [6].
The CSA survey confirms this is widespread. Only 21% of organizations maintain a real-time registry of their agents. Another 32% rely on non-real-time records, and 8% have no registry at all [8]. If you don’t know what agents you have, you certainly can’t trace their delegation chains.
The Industry Is Not Ready
The data on organizational preparedness is sobering. Vanta’s 2026 State of Trust report, surveying 2,500 business and IT leaders globally, found that 65% of organizations say their use of agentic AI outpaces their understanding of it, and only 48% have a framework for granting or limiting autonomy in AI systems [10]. Meanwhile, 79% are already using or actively planning to use agentic AI this year.
The compliance gap is not theoretical. It’s a measurable distance between what organizations are deploying and what they can demonstrate to an auditor:
| Capability | What Auditors Expect | Industry Reality |
|---|---|---|
| Agent inventory | Real-time registry of all agents | Only 21% have one [8] |
| Action traceability | Link every action to authorization | Only 28% can do this reliably [8] |
| Autonomy governance | Framework for limiting agent authority | Only 48% have one [10] |
| AI risk assessment | Regular, documented assessments | Only 45% conduct them [10] |
| AI policy | Documented, approved, communicated | Only 44% have one [10] |
Gartner projects that by 2028, 33% of enterprise software applications will include agentic AI, up from less than 1% in 2024 [1]. The compliance gap will only widen unless teams build governance into their agent architecture from the start.
What a Governance Layer for AI Agents Needs to Look Like
Given the four structural breaks described above, a governance layer for AI agents needs to address four corresponding requirements. These are not product features — they’re architectural requirements that any compliant agent system must satisfy.
Requirement 1: Financial and Resource Monitoring (Processing Integrity)
SOC 2’s Processing Integrity criteria (the PI1 series) require that system processing is complete, valid, accurate, timely, and authorized. For AI agents, this means tracking the cost and resource consumption of every operation. An agent that enters an infinite reasoning loop and burns through $10,000 in API calls is a processing integrity failure: the processing was neither authorized nor controlled.
A governance layer needs to track per-session and per-project cost trajectories, detect deviations from established baselines, and enforce hard budget limits. This is the equivalent of the spending controls that financial systems have had for decades — but applied to AI compute.
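As a sketch of what this could look like, here is a minimal per-session budget guard in Python. The `BudgetGuard` name and the projection horizon are assumptions, and a production system would meter actual provider billing rather than self-reported costs, but the core logic (track spend, project the trajectory, halt before the limit) fits in a few lines:

```python
import time


class BudgetGuard:
    """Hard per-session spend limit with trajectory projection (a Processing Integrity control)."""

    def __init__(self, limit_usd: float, horizon_s: float = 15.0):
        self.limit = limit_usd
        self.horizon = horizon_s            # how far ahead to project spending
        self.spent = 0.0
        self.start = time.monotonic()

    def charge(self, cost_usd: float) -> None:
        """Record the cost of one operation; halt if the budget is, or soon will be, exceeded."""
        self.spent += cost_usd
        elapsed = max(time.monotonic() - self.start, 1e-6)
        burn_rate = self.spent / elapsed    # dollars per second so far
        projected = self.spent + burn_rate * self.horizon
        if self.spent >= self.limit or projected >= self.limit:
            # The exception itself becomes the audit event: budget, spend, projection.
            raise RuntimeError(
                f"halted: spent=${self.spent:.2f}, projected=${projected:.2f}, "
                f"limit=${self.limit:.2f}"
            )


# Usage: guard = BudgetGuard(limit_usd=100.0), then guard.charge(cost)
# after every metered LLM or API call.
```

Halting on the projection rather than the limit itself is the design choice that matters: by the time an agent has actually crossed the budget, the money is already spent.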
Requirement 2: Access Boundary Enforcement (Logical Access)
SOC 2’s CC6 (Logical Access) requires that systems enforce access controls based on the principle of least privilege. For AI agents, this means defining explicit boundaries around what tools an agent can call, what data it can access, and what operations it can perform. An agent that can access production databases should never be able to call the “delete all records” endpoint without explicit authorization.
A governance layer needs to maintain a registry of agent permissions, enforce those permissions at runtime, and generate audit evidence that the boundaries were maintained throughout the agent’s execution.
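Here is a minimal sketch of what runtime enforcement might look like, assuming a hypothetical `AgentPermissions` wrapper that sits between the agent and its tools. Every decision, allow or deny, is appended to an audit list that becomes the evidence trail:

```python
from dataclasses import dataclass, field


@dataclass
class AgentPermissions:
    """Least-privilege tool allowlist for one agent (a CC6-style control)."""
    agent_id: str
    allowed_tools: frozenset[str]
    audit: list = field(default_factory=list)

    def invoke(self, tool: str, call, *args, **kwargs):
        """Run `call` only if `tool` is on the allowlist; log the decision either way."""
        allowed = tool in self.allowed_tools
        self.audit.append({"agent": self.agent_id, "tool": tool,
                           "decision": "allow" if allowed else "deny"})
        if not allowed:
            raise PermissionError(f"{self.agent_id} is not authorized to call {tool}")
        return call(*args, **kwargs)


# perms = AgentPermissions("support-agent", frozenset({"list_customers", "send_email"}))
# perms.invoke("delete_customer", delete_customer, 42)  # raises PermissionError, and
#                                                       # the denied attempt is in perms.audit
```

Note that denied attempts are logged, not just blocked: the attempt itself is audit evidence that the boundary held.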
Requirement 3: Continuous Behavioral Monitoring (Availability & Monitoring)
SOC 2’s Availability criteria (the A1 series) and ISO 27001’s A.8.16 (Monitoring Activities) require continuous observation of system behavior. For AI agents, this means monitoring every decision, every tool invocation, and every API call in real time. The monitoring needs to be continuous because agents operate at machine speed: quarterly reviews are too slow to catch problems.
A governance layer needs to track agent behavior patterns, detect anomalies, and trigger alerts when agents deviate from expected behavior. This is not just for compliance — it’s essential for reliability. An agent that suddenly starts making 10x more API calls than usual might be in an infinite loop. An agent that suddenly starts accessing data it never accessed before might be compromised.
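One illustrative building block is a rolling-window rate check against a per-agent baseline, which catches exactly the "10x more API calls" case. The `RateAnomalyDetector` class and its threshold are assumptions; a production system would track richer signals (endpoints touched, data volumes, error rates) alongside raw call rates:

```python
import time
from collections import deque


class RateAnomalyDetector:
    """Flag when an agent's call rate far exceeds its baseline (an A.8.16-style control)."""

    def __init__(self, baseline_calls_per_min: float, factor: float = 10.0):
        self.baseline = baseline_calls_per_min
        self.factor = factor                  # alert when rate exceeds factor x baseline
        self.window: deque[float] = deque()   # timestamps of recent calls

    def record_call(self) -> bool:
        """Record one tool/API call; return True if the current rate is anomalous."""
        now = time.monotonic()
        self.window.append(now)
        while self.window and now - self.window[0] > 60.0:
            self.window.popleft()             # keep a rolling 60-second window
        return len(self.window) > self.baseline * self.factor


# detector = RateAnomalyDetector(baseline_calls_per_min=10)
# if detector.record_call():
#     ...  # raise an alert: possible retry loop or compromised agent
```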
Requirement 4: Multi-Agent Traceability (Accountability)
When agents delegate to other agents, the governance layer needs to maintain an unbroken chain of accountability. Agent A delegates to Agent B with explicit authority constraints. Agent B delegates to Agent C with further constraints. The audit trail must show who authorized each delegation, what constraints were applied, and whether those constraints were respected.
This is the hardest requirement to implement because it requires coordination across multiple agents and potentially multiple organizations. But it’s also the most important for multi-agent systems. Without it, you have “invisible delegation” — actions that appear authorized at the surface but are actually unauthorized when you trace them back through the delegation chain.
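One way to make the chain enforceable is to model each delegation as an explicit record whose scopes must be a subset of its parent's, so authority can only narrow as it flows down the chain. The sketch below is deliberately simplified (the names are assumptions, and cross-organization chains would need signed tokens rather than in-process objects), but it shows the invariant:

```python
from dataclasses import dataclass
from typing import Optional
import uuid


@dataclass(frozen=True)
class Delegation:
    """One link in an auditable delegation chain."""
    delegation_id: str
    parent: Optional["Delegation"]
    delegator: str                 # who granted this authority
    delegatee: str                 # the agent receiving it
    scopes: frozenset[str]         # explicit constraints on the delegatee

    @staticmethod
    def grant(parent: Optional["Delegation"], delegator: str,
              delegatee: str, scopes: frozenset[str]) -> "Delegation":
        # A child may never hold broader authority than its parent.
        if parent is not None and not scopes <= parent.scopes:
            raise PermissionError("delegation would widen authority")
        return Delegation(str(uuid.uuid4()), parent, delegator, delegatee, scopes)

    def chain(self) -> list[str]:
        """Walk back to the root: this is the evidence an auditor asks for."""
        link = f"{self.delegator} -> {self.delegatee} {sorted(self.scopes)}"
        return ([] if self.parent is None else self.parent.chain()) + [link]


# root = Delegation.grant(None, "user:alice", "agent-a", frozenset({"read", "email"}))
# child = Delegation.grant(root, "agent-a", "agent-b", frozenset({"read"}))
# child.chain()  # ["user:alice -> agent-a ['email', 'read']", "agent-a -> agent-b ['read']"]
```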
What This Means in Practice
These four requirements are not theoretical. They’re the difference between passing an audit and failing one. Here’s what they look like when implemented:
Financial Monitoring: An agent is configured with a $100/session budget. It starts a task that enters a reasoning loop, and after 30 seconds it has spent $87, a burn rate of nearly $3 per second. The governance layer projects that the remaining $13 will be gone within five seconds, so it halts the agent and generates an alert. The audit log shows the budget, the spending trajectory, the projection, and the halt decision. The auditor sees that processing was controlled and authorized.
Access Boundaries: An agent is authorized to call the “list customers” endpoint and the “send email” endpoint, but not the “delete customer” endpoint. It attempts to call the delete endpoint. The governance layer blocks the call and logs the attempt. The audit trail shows that the agent exceeded its authority and was stopped.
Behavioral Monitoring: An agent normally makes 5-10 API calls per task. Today it has made 500 calls in the last hour. The governance layer detects this anomaly, compares it to the baseline, and generates an alert. An engineer investigates and discovers that the agent is in a retry loop due to a rate-limiting issue. The governance layer has caught the problem before it cascades.
Multi-Agent Traceability: Agent A is asked to process a customer request. It delegates to Agent B for data retrieval and Agent C for email sending. Each delegation includes explicit constraints: Agent B can only read customer data, Agent C can only send emails to the customer’s registered address. The audit trail shows all three agents, all delegations, all constraints, and all actions taken. If anything goes wrong, the auditor can trace it back through the entire chain.
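Serialized, the audit trail for that last scenario might look something like the sketch below. The schema is hypothetical; what matters is that it captures one accountable principal, the explicit delegations with their constraints, and every action tied back to an authorized scope:

```python
# Illustrative only: every field name here is invented, but the shape
# (one principal, explicit delegations, actions tied to scopes) is the point.
trail = {
    "request_id": "req-7f3a",
    "principal": "user:alice",    # the accountable human behind the whole chain
    "delegations": [
        {"from": "agent-a", "to": "agent-b", "scopes": ["customers:read"]},
        {"from": "agent-a", "to": "agent-c", "scopes": ["email:send:registered-address"]},
    ],
    "actions": [
        {"agent": "agent-b", "tool": "list_customers", "decision": "allow"},
        {"agent": "agent-c", "tool": "send_email", "decision": "allow"},
    ],
}
```

If the email had gone to an unregistered address, the mismatch between the action and Agent C's delegated scope would be visible in this one record, which is precisely the traceability the CSA data says most organizations lack [8].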
The Compliance Paradox
Here’s the paradox that most teams miss: the governance architecture that makes your agents auditable is the same architecture that makes them reliable. The financial monitoring that satisfies Processing Integrity also prevents surprise cloud bills. The access boundary enforcement that satisfies CC6 also prevents agents from accidentally calling production endpoints during testing. The behavioral monitoring that satisfies A.8.16 also catches infinite loops and runaway costs before they become disasters.
Compliance is not a separate concern from reliability. They’re the same thing. The teams that build governance into their agent architecture from the start won’t just pass audits — they’ll ship faster, debug faster, and win enterprise deals that competitors can’t close because they can’t answer the auditor’s question.
Two caveats are worth stating. The first is that non-determinism resists complete coverage. No governance layer can anticipate every possible agent behavior. The goal is not to prevent all failures; it’s to detect failures quickly, halt cascading damage, and produce the audit evidence that demonstrates the control was operating effectively. This is the same standard applied to human-operated systems: not perfection, but reasonable assurance.
The second is that multi-agent governance is immature. The CSA data shows that most organizations are in what it calls a “Time-to-Trust” phase: building the visibility, auditability, and control mechanisms necessary before granting full autonomy [8]. The tooling for multi-agent traceability is still emerging, and best practices are being established in real time.
What This Means for Teams Building AI Agents
If you’re building AI agents that will touch enterprise data, process financial transactions, or operate in regulated industries, here is the practical takeaway: governance is not a feature you add before the audit. It’s an architectural decision you make before writing the first line of agent code.
Make that decision early and the compliance paradox works in your favor: the same financial monitoring, access boundaries, and circuit breakers that satisfy Processing Integrity, Logical Access, and Availability also prevent surprise cloud bills, block accidental calls to production endpoints during testing, and stop agents from burning tokens at 3 AM when no one is watching.
The governance layer that makes your agents auditable is the same layer that makes them trustworthy.
At Prysm AI, this is what we’re building: the governance and observability layer that gives AI agent teams the monitoring, access controls, circuit breakers, and audit trails that SOC 2 and ISO 27001 require. Not because compliance is a checkbox, but because the teams building the most reliable agents are the ones who can prove what their agents did, why they did it, and that it was authorized.
References
1. MindStudio. (2026, February). AI Agent Compliance: GDPR, SOC 2, and Beyond. mindstudio.ai
2. ZTABS. (2026, March). AI Governance and Compliance: A Practical Guide for Production AI Systems. ztabs.co
3. PolicyLayer. (2025, November). SOC 2 Compliance for AI Agents: Audit Trails, Access Controls & Monitoring. policylayer.com
4. DataGuard. (2024). ISO 27001 Controls: Overview of All Measures from Annex A. dataguard.com
5. DataGuard. (2024). ISO 27001:2022 Annex A Controls — New Controls and Changes. dataguard.com
6. Slocum, B. (2026, March). The Accountability Gap in Autonomous AI. IBM Think. ibm.com
7. McGladrey, K. (2026, February). How AI Agents Impact SOC 2 Trust Services Criteria. Teleport. goteleport.com
8. Cloud Security Alliance. (2026, February). The Visibility Gap in Autonomous AI Agents. cloudsecurityalliance.org
9. Editor Rosemi. (2026, February). SOC 2, ISO 27001, and AI: What Changes When Companies Add AI? The AI Clarity Report, Medium. medium.com
10. Vanta. (2025, December). Top 6 AI Security Trends for 2026 — and How Companies Can Prepare. vanta.com