SOC 2 Compliant AI Inference: What Enterprise Teams Need to Know

A practical guide to SOC 2 compliance for AI inference: Type I vs. Type II, data residency, audit logging, and a checklist for evaluating inference providers.

Author: General Compute
Published: 2026-06-26
Tags: enterprise, security, compliance, soc2, inference

When an enterprise team decides to integrate AI inference into a product, the conversation usually starts with latency and cost. But if your company operates in healthcare, finance, legal, or any sector with meaningful data governance requirements, you hit a different question quickly: is this inference provider SOC 2 compliant?

This post walks through what SOC 2 compliance means in the context of AI inference specifically, where Type I and Type II differ, what data residency requirements look like in practice, and what your security or legal team will likely ask before signing off on any vendor.

---

## What SOC 2 Actually Is

SOC 2 (System and Organization Controls 2, formerly Service Organization Control 2) is an auditing framework developed by the American Institute of CPAs (AICPA). It assesses whether a service provider has adequate controls in place around five Trust Services Criteria: Security, Availability, Processing Integrity, Confidentiality, and Privacy.

Security is the one criterion included in every SOC 2 report. For most AI inference providers, Confidentiality is the other criterion that matters most. Availability matters too if your SLA requires it. Processing Integrity and Privacy apply in narrower circumstances.

SOC 2 is not a certification in the way ISO 27001 is. It is an audit report produced by a third-party CPA firm that attests to the provider's controls at a point in time (Type I) or over a period of time (Type II). The report itself is not public by default -- vendors share it under NDA as part of a security review.

---

## Type I vs. Type II: The Practical Difference

**Type I** reports assess whether controls are designed correctly as of a specific date. The auditor is essentially saying: "On this date, we examined the controls this company says they have, and they appear to be appropriately designed."

**Type II** reports cover a period of time, typically six to twelve months. The auditor is saying: "We examined whether these controls were actually operating effectively over this period." That requires evidence -- logs, access reviews, incident reports, change records -- not just documentation of what controls exist.

From an enterprise buyer's perspective, Type II is the meaningful one. Type I is a reasonable starting point for newer vendors, but it tells you very little about whether controls held up under real operations. A company can have excellent policies on paper and still have gaps in how they actually function day-to-day. Type II catches that gap.

When evaluating an inference provider, ask specifically: do you have a SOC 2 Type II report, and what period does it cover? A report whose audit period ended more than eighteen months ago should prompt additional questions. Infrastructure and controls change over time, and a report that has not been renewed says little about how the provider operates today.
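The eighteen-month rule of thumb is easy to encode in a vendor-tracking script. The following is a minimal sketch, not a standard: the function name and the 548-day approximation of eighteen months are assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: "eighteen months" is approximated as 548 days (about 1.5 years).
MAX_REPORT_AGE = timedelta(days=548)


def report_is_stale(period_end: date, today: date) -> bool:
    """Return True when the audit period ended more than ~18 months before `today`."""
    return today - period_end > MAX_REPORT_AGE


if __name__ == "__main__":
    # Hypothetical vendor whose Type II audit period ended on 2024-06-30.
    print(report_is_stale(date(2024, 6, 30), date(2026, 6, 26)))
```

A stale result is a prompt to ask the vendor for a newer report or a bridge letter, not a verdict on the vendor by itself.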

---

## The Five Trust Services Criteria Applied to AI Inference

Here is how each criterion maps to the specific concerns an enterprise has with an inference provider:

**Security (CC)**: This covers logical and physical access controls, network security, change management, and incident response. For an inference provider, this includes how they manage access to the infrastructure running your model requests, how they segment customer workloads, and how they detect and respond to breaches.

**Availability**: Uptime commitments and the controls that support them. If your use case is latency-sensitive (voice agents, real-time applications), you need to understand whether the provider's availability controls align with your SLA requirements. A SOC 2 report alone does not guarantee uptime -- it attests to whether the controls around availability are in place.

**Processing Integrity**: Ensures that system processing is complete, accurate, timely, and authorized. For inference, this matters if you need assurance that requests are processed without corruption or unauthorized modification.

**Confidentiality**: This is critical for AI inference. It covers whether data shared with the provider is protected from unauthorized disclosure. The key question is whether your prompt data and completion data are logged, retained, used for training, or visible to provider employees.

**Privacy**: Applies when personal data is involved. If users are sending prompts containing PII -- names, email addresses, health information, financial data -- this criterion becomes relevant. It overlaps with obligations under GDPR, CCPA, and similar frameworks, but it does not replace them.

---

## Data Residency in AI Inference

Data residency is separate from SOC 2 compliance but comes up in the same security review. It refers to where your data is physically stored and processed.

For AI inference, "data" has a few distinct components:

**Prompt data** is what you send in the request. This is often the most sensitive -- it may contain user-submitted content, proprietary documents, or PII depending on your application.

**Completion data** is what the model returns. In most inference architectures, completions are generated and streamed back without persistent storage on the provider's side.

**Logs and telemetry** are what providers retain for debugging, billing, and abuse detection. This is where residency gets complicated. Even if a provider does not store your prompts long-term, their logging infrastructure may capture metadata about requests.

When reviewing a provider for data residency compliance:

- Ask explicitly whether prompt data is logged, and if so, where and for how long.
- Confirm whether any data is transmitted outside your required region (EU, US, etc.).
- Understand whether the provider uses third-party sub-processors and whether those sub-processors have their own data residency commitments.
- Review the DPA (Data Processing Agreement) alongside the SOC 2 report -- the DPA is the contractual commitment; the SOC 2 is the attestation that controls back it up.

For EU-based companies working with GDPR requirements, make sure the DPA includes standard contractual clauses (SCCs) if data processing occurs in the US. GDPR compliance and SOC 2 compliance are not the same thing, and providers sometimes conflate them in marketing materials.

---

## Audit Logging: What You Need and What Providers Offer

Your security team will want to know that you have an audit trail for AI inference activity. This serves several purposes: debugging production issues, responding to security incidents, and demonstrating compliance in an audit.

The specific logging requirements depend on your industry. Here is what enterprise teams typically ask for:

**Request-level logs**: Timestamps, endpoint called, model used, token counts, response codes. This is the minimum. Most providers expose this through a usage API or dashboard.
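If you capture these fields yourself rather than relying on a provider dashboard, one structured record per request is usually enough. The sketch below shows one possible shape; the class name, field names, and example values are illustrative assumptions, not any provider's schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class InferenceAuditRecord:
    """One row per inference request: the minimum fields listed above."""
    timestamp: str          # ISO 8601, UTC
    endpoint: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    status_code: int


def make_record(endpoint: str, model: str, prompt_tokens: int,
                completion_tokens: int, status_code: int,
                now: Optional[datetime] = None) -> InferenceAuditRecord:
    """Build a record, stamping it with the current UTC time unless `now` is given."""
    now = now or datetime.now(timezone.utc)
    return InferenceAuditRecord(
        timestamp=now.isoformat(),
        endpoint=endpoint,
        model=model,
        prompt_tokens=prompt_tokens,
        completion_tokens=completion_tokens,
        status_code=status_code,
    )


def to_log_line(record: InferenceAuditRecord) -> str:
    """Serialize to a single JSON line, ready for a log shipper or SIEM."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    # Hypothetical endpoint path and model name.
    print(to_log_line(make_record("/v1/chat/completions", "example-model", 120, 48, 200)))
```

Note that the record deliberately holds no prompt or completion text, which keeps the audit trail itself out of scope for most data-sensitivity reviews.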

**Identity and attribution**: Which API key or user identity initiated each request. If you have multiple teams or applications sharing an inference account, you need to know which requests belong to which context. Use separate API keys per application or use organizational access controls if the provider supports them.
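With one key per application, attribution reduces to a lookup from key identifier to owner. A rough sketch follows, assuming usage records are dictionaries carrying a `key_id` and a `total_tokens` field; both field names and the key identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping

# Hypothetical mapping from non-secret key identifiers to owning applications.
KEY_OWNERS = {
    "key_prod_support": "support-bot",
    "key_prod_search": "search-assistant",
}


def tokens_by_application(records: Iterable[Mapping],
                          key_owners: Mapping[str, str]) -> Dict[str, int]:
    """Sum token usage per application; unknown keys land in 'unattributed'."""
    totals: Dict[str, int] = defaultdict(int)
    for record in records:
        owner = key_owners.get(record["key_id"], "unattributed")
        totals[owner] += record["total_tokens"]
    return dict(totals)


if __name__ == "__main__":
    usage = [
        {"key_id": "key_prod_support", "total_tokens": 300},
        {"key_id": "key_prod_search", "total_tokens": 120},
        {"key_id": "key_unknown", "total_tokens": 50},
    ]
    print(tokens_by_application(usage, KEY_OWNERS))
```

Any non-zero "unattributed" total is worth investigating, since it means a key is in use that nobody has claimed.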

**Data access logs**: Evidence that only authorized parties can access your account data, and that the provider's own employees access logs only under defined conditions (support requests, security investigations). SOC 2 Type II reports cover this for the provider side. For your side, your own SIEM or logging infrastructure should capture API activity.

**Retention policies**: How long do you need logs? HIPAA requires compliance documentation to be retained for six years, and many healthcare teams apply the same period to audit logs. Financial services may have different requirements. Confirm whether the provider's log retention aligns with your retention policy, or whether you need to export and store logs independently.
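If you export logs and store them yourself, the retention check is simple date arithmetic. This is a minimal sketch, assuming retention is expressed in whole years; the six-year value used in the example is the figure mentioned above, and your own policy should supply the real number.

```python
from datetime import date


def retention_cutoff(today: date, retention_years: int) -> date:
    """Earliest creation date that must still be kept as of `today`."""
    try:
        return today.replace(year=today.year - retention_years)
    except ValueError:
        # `today` is Feb 29 and the target year is not a leap year.
        return today.replace(year=today.year - retention_years, day=28)


def is_purgeable(created: date, today: date, retention_years: int) -> bool:
    """True when a log created on `created` has aged out of the retention window."""
    return created < retention_cutoff(today, retention_years)


if __name__ == "__main__":
    # Example: six-year retention evaluated on 2026-06-26.
    print(retention_cutoff(date(2026, 6, 26), 6))
    print(is_purgeable(date(2020, 6, 25), date(2026, 6, 26), 6))
```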

If your compliance requirements are strict, consider building a logging proxy in front of the inference API. This lets you capture and retain request/response data under your own policies without relying on what the provider exposes through their dashboard.
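The core of such a proxy is a wrapper that records every call before handing back the response. The sketch below is transport-agnostic: `send` stands in for whatever function actually calls your provider, and `sink` for wherever you store audit entries. Both are hypothetical placeholders, not part of any provider SDK.

```python
import time
import uuid
from typing import Callable


def make_logging_proxy(send: Callable[[dict], dict],
                       sink: Callable[[dict], None]) -> Callable[[dict], dict]:
    """Wrap `send` so every request, response, and failure is written to `sink`."""

    def proxied(request: dict) -> dict:
        started = time.time()
        entry = {
            "request_id": str(uuid.uuid4()),
            "started_at": started,
            "request": request,
        }
        try:
            response = send(request)
        except Exception as exc:
            # Record the failure, then let the caller handle it as usual.
            entry["error"] = repr(exc)
            entry["duration_s"] = time.time() - started
            sink(entry)
            raise
        entry["response"] = response
        entry["duration_s"] = time.time() - started
        sink(entry)
        return response

    return proxied


if __name__ == "__main__":
    audit_log = []
    fake_send = lambda req: {"text": "ok"}  # stand-in for a real provider call
    call = make_logging_proxy(fake_send, audit_log.append)
    call({"model": "example-model", "prompt": "hello"})
    print(len(audit_log))
```

Because this version stores full prompts and responses, the audit store inherits the same sensitivity as the data itself and needs matching access controls and retention rules.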

---

## What to Ask an Inference Provider Before Signing a Contract

Here is the practical checklist your security or legal team should work through:

**Compliance documentation**
- Do you have a current SOC 2 Type II report? From what period?
- Are you willing to share it under NDA for our security review?
- Do you have an ISO 27001 certification? (Often required in addition to SOC 2 for multinational enterprise buyers.)
- Do you have a penetration test report from the past twelve months?

**Data handling**
- Is prompt data retained? If so, for how long and in what form?
- Is prompt data used to train models? If yes, can we opt out?
- What happens to our data if we terminate the contract?
- Do you have a data breach notification process? What is the timeline for notifying customers?

**Infrastructure and access controls**
- Where are inference servers physically located?
- Can you commit to processing our data only in a specific region (US, EU, etc.)?
- Do your employees have access to customer prompt data? Under what conditions?
- How is customer workload isolation handled at the infrastructure level?

**Contracts and legal**
- Do you have a standard DPA? Does it include SCCs for EU data?
- Do you have a BAA (Business Associate Agreement) available for HIPAA use cases?
- What SLA do you offer, and what remedies exist for SLA breaches?
- What does your incident response process look like, and how are customers notified?

**Operational controls**
- How are access credentials to the production environment managed?
- What is your change management process for infrastructure changes?
- Do you have a formal vulnerability disclosure or bug bounty program?

---

## HIPAA and Other Regulatory Overlays

SOC 2 is often the floor, not the ceiling. Depending on your use case:

**Healthcare (HIPAA)**: If prompt data includes PHI (protected health information), you need a BAA with the inference provider. SOC 2 does not cover HIPAA -- a provider can hold a current SOC 2 Type II report and still be unable to sign a BAA if their infrastructure or controls are not set up to handle PHI appropriately.

**Financial services (SOX, PCI-DSS)**: If AI inference is used in a system that touches financial reporting or cardholder data, additional controls are required. SOC 2 may satisfy some requirements, but review your specific regulatory framework.

**Government (FedRAMP)**: For US federal government use cases, FedRAMP authorization is typically required. This is a substantially different and more demanding framework than SOC 2. Most commercial inference providers do not have FedRAMP authorization.

---

## Building an Enterprise-Ready AI Inference Setup

The infrastructure choices you make on your side matter as much as the provider's compliance posture. A few patterns that enterprise teams use:

**Dedicated API keys per environment**: Keep separate keys for production, staging, and development. Rotate keys regularly. Store them in a secrets manager (AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager), not in environment variables or source code.

**Request proxying for data control**: Route inference requests through your own service before they reach the provider. This lets you redact or mask sensitive fields in prompts, enforce rate limits per user, capture logs in your own system, and add your own authentication layer.
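As an illustration of the redaction step, the sketch below masks two common patterns before a prompt leaves your network. The regular expressions are deliberately simple assumptions (email addresses and US Social Security number formatting); they will miss many forms of PII, so treat this as a starting point rather than a complete filter.

```python
import re

# Simplified patterns for illustration only; real PII detection needs more coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(prompt: str) -> str:
    """Replace email addresses and SSN-shaped numbers with placeholder tokens."""
    prompt = EMAIL_RE.sub("[EMAIL]", prompt)
    return SSN_RE.sub("[SSN]", prompt)


if __name__ == "__main__":
    print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
```

Running redaction in your own proxy means the provider never receives the original values, which narrows what you have to verify about their logging.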

**Egress monitoring**: If you route through a proxy or API gateway, monitor for unusual patterns -- unexpected spikes in token usage, requests from unexpected IP ranges, or model access that does not match application behavior. These are early signals of a compromised key or unauthorized use.
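A simple way to flag token-usage spikes is to compare the current interval against recent history. The following sketch uses a z-score over a list of past per-interval token counts; the function name and the threshold of 3.0 are assumptions, and production monitoring would normally account for seasonality as well.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_usage_spike(history: Sequence[float], current: float,
                   z_threshold: float = 3.0) -> bool:
    """Flag `current` when it sits more than `z_threshold` std devs above the mean."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly flat history: any increase counts as unusual.
        return current > mu
    return (current - mu) / sigma > z_threshold


if __name__ == "__main__":
    hourly_tokens = [1000, 1100, 900, 1050, 950]  # hypothetical recent usage
    print(is_usage_spike(hourly_tokens, 5000))
```

Running the check per API key rather than per account makes a single compromised key much easier to spot.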

**Contractual review cadence**: SOC 2 reports have effective periods. Set a calendar reminder to request an updated report from your inference provider each year, and re-review the DPA if there are changes to data handling practices.

---

## Evaluating Providers for Enterprise Readiness

When you put a provider through a security review, you are assessing both their documentation and the quality of their responses. A provider that is genuinely prepared for enterprise sales will have these materials ready: the SOC 2 report, a DPA template, answers to a standard security questionnaire (many enterprises use HECVAT or the SIG Lite), and a clear data handling FAQ.

Providers that are not enterprise-ready will either lack these materials or provide vague answers. The willingness to be specific in a security review is itself a signal about how seriously they take operational controls.

General Compute maintains a SOC 2 Type II report and is happy to work through security reviews with enterprise teams. If you are evaluating inference providers and want to see our security documentation or discuss specific compliance requirements for your use case, reach out directly at [generalcompute.com](https://generalcompute.com) or through our enterprise sales contact.

Building AI infrastructure that your security and legal teams can actually approve is not a side project -- it is what makes the difference between a proof of concept and a production deployment.