Security & Trust — VantageAI

Architecture

How data flows

VantageAI sits beside your AI calls, not in front of them. Your code communicates directly with OpenAI, Anthropic, or any provider — VantageAI reads response metadata after the call completes.

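[Diagram: Your Code (Application, instrumented via SDK / OTel / CLI) makes a direct API call carrying the full request and response to the AI Provider (OpenAI · Anthropic · Google · Mistral · etc.). Separately, it sends metadata only (tokens · cost · latency) to the VantageAI Cost Observer over an async, non-blocking path.]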

Critical distinction: Your application calls the AI provider directly — VantageAI is never in the network path between your code and the provider. In SDK and OTel modes, the SDK reads token counts and cost from the response object in-process, then queues a metadata event asynchronously. Your production traffic is unaffected by VantageAI's availability.
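
The asynchronous, non-blocking hand-off described above can be sketched as a bounded in-process queue drained by a background thread. This is a minimal illustration, not the actual SDK: the class name `MetadataReporter`, the `send` callable, and the `max_pending` limit are all assumptions made for the example.

```python
import queue
import threading


class MetadataReporter:
    """Hypothetical stand-in for an SDK's background metadata reporter."""

    def __init__(self, send, max_pending=1000):
        self._send = send  # callable that delivers one event (assumed)
        self._events = queue.Queue(maxsize=max_pending)
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def record(self, event):
        """Queue an event without blocking; drop it if the queue is full."""
        try:
            self._events.put_nowait(event)
            return True
        except queue.Full:
            return False

    def _drain(self):
        while True:
            event = self._events.get()
            try:
                self._send(event)
            except Exception:
                pass  # a reporting failure never reaches the calling code
            finally:
                self._events.task_done()

    def flush(self):
        """Wait until every queued event has been handled (useful in tests)."""
        self._events.join()
```

Because `record` never waits on the network and delivery errors are swallowed in the worker, an outage on the reporting side cannot slow or fail the provider call in this sketch.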

Data Minimization

Exactly what we collect

The following JSON examples show the actual payload sent to VantageAI servers versus what stays on your machine. No policy interpretation — this is the real data structure.

Sent to VantageAI servers ✓ transmitted

{
  "model": "gpt-4o",
  "input_tokens": 1842,
  "output_tokens": 394,
  "latency_ms": 1203,
  "cost_usd": 0.00284,
  "timestamp": "2026-04-14T09:41:00Z",
  "org_id": "org_8f3a...",
  "developer_id": "dev_a1b2...",
  "tool": "claude-code"
}

Never leaves your machine ✗ never sent

{
  "messages": "<your prompt content>",
  "system": "<system prompt>",
  "response": "<model response text>",
  "api_key": "<provider API keys>",
  "function_args": "<tool call payloads>",
  "file_contents": "<files in context>",
  "user_data": "<any PII you process>"
}
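
One way to enforce the split between the two structures above is an allowlist: only named metadata fields are copied into the outgoing event, so anything not listed stays local by construction. The sketch below assumes a hypothetical `to_metadata_event` helper and a flat dictionary as input; the field names are taken from the transmitted payload shown above.

```python
# Fields copied from the "Sent to VantageAI servers" example above.
ALLOWED_FIELDS = (
    "model",
    "input_tokens",
    "output_tokens",
    "latency_ms",
    "cost_usd",
    "timestamp",
    "org_id",
    "developer_id",
    "tool",
)


def to_metadata_event(call_record):
    """Return a new dict holding only allowlisted fields (hypothetical helper)."""
    return {key: call_record[key] for key in ALLOWED_FIELDS if key in call_record}
```

An allowlist fails safe: a new field added to the call record is excluded until someone adds it to `ALLOWED_FIELDS` deliberately.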

Privacy Modes

Choose your disclosure level

VantageAI ships with three privacy modes. The default is Standard. Strict mode is designed for regulated industries — only aggregate counts leave your environment.

Strict

Aggregate counts only

VANTAGE_PRIVACY_MODE=strict

Total token counts (no per-call breakdown)
Estimated cost range
Model name suppressed
No per-request events
No developer attribution
No latency data

Standard

Metadata per call

VANTAGE_PRIVACY_MODE=standard

Per-call token counts
Model name and provider
Cost per call
Latency (ms)
Developer attribution (hashed ID)
No prompt or response content

Relaxed

Metadata + quality signals

VANTAGE_PRIVACY_MODE=relaxed

All Standard fields
Response quality score (0–100)
Prompt efficiency flags
Tool call count (no content)
No prompt or response content
No PII ever transmitted
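
The three modes can be read as progressively larger field sets, selected by `VANTAGE_PRIVACY_MODE`. The sketch below is an assumption about how such filtering might look, not the SDK's code: the field names and the helpers `fields_for_mode`, `current_mode`, and `strict_aggregate` are hypothetical, and the Strict cost range is omitted for brevity.

```python
import os

# Illustrative field names derived from the bullet lists above (assumed).
STANDARD_FIELDS = frozenset({
    "model", "provider", "input_tokens", "output_tokens",
    "cost_usd", "latency_ms", "developer_id",
})
RELAXED_EXTRA = frozenset({"quality_score", "efficiency_flags", "tool_call_count"})


def current_mode(env=os.environ):
    """Read the mode from the environment; Standard is the documented default."""
    return env.get("VANTAGE_PRIVACY_MODE", "standard")


def fields_for_mode(mode):
    """Return the per-call fields that may be sent in the given mode."""
    if mode == "strict":
        return set()  # no per-request events at all
    if mode == "standard":
        return set(STANDARD_FIELDS)
    if mode == "relaxed":
        return set(STANDARD_FIELDS | RELAXED_EXTRA)
    raise ValueError(f"unknown privacy mode: {mode!r}")


def strict_aggregate(events):
    """Collapse events into a single total, with no per-call breakdown."""
    total = sum(e["input_tokens"] + e["output_tokens"] for e in events)
    return {"total_tokens": total}
```

For example, `fields_for_mode(current_mode())` yields the Standard set when the variable is unset.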

Vendor Comparison

How we compare to proxy-based tools

The core architectural difference between VantageAI and Helicone or LangSmith is whether your API traffic passes through a third-party server. This table is intended for security reviewers and procurement teams.

Question	VantageAI	Helicone	LangSmith
API calls pass through vendor servers?	No — direct to provider	Yes — proxy required	Yes — proxy required
Prompts transmitted to vendor?	Never — metadata only	Yes — full request logged	Yes — full request logged
Production traffic affected by vendor outage?	No — async, non-blocking	Yes — in critical path	Yes — in critical path
Works without changing API endpoint?	Yes — OTel env var	No — endpoint change required	No — endpoint change required
API keys shared with vendor?	Never — keys stay local	Yes — forwarded through proxy	Yes — forwarded through proxy
Local deployment option?	Yes — local proxy mode	Partial — self-hosted OSS	Partial — self-hosted OSS
Strict privacy mode (no per-call data)?	Yes — aggregate only	No	No
Zero-code integration (OTel env var)?	Yes	No	No

Security Controls

What's implemented today

Controls currently live in production. Roadmap items are not listed here; they appear in the Compliance Roadmap section below.

🔑

API Key Hashing

All API keys are hashed with SHA-256 on arrival. The plaintext key is shown exactly once at creation and never stored. Recovery is possible only via a single-use, time-limited email token.

SHA-256 · one-time display · email recovery
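
A hash-on-arrival scheme of this kind can be sketched in a few lines. The example assumes keys are high-entropy random strings, which is what makes an unsalted SHA-256 digest reasonable here; the `vk_` prefix and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_key(plaintext):
    """Return the hex SHA-256 digest that would be stored instead of the key."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def create_api_key():
    """Generate a key; the plaintext is returned once and only the digest is kept."""
    plaintext = "vk_" + secrets.token_urlsafe(32)  # prefix is an assumption
    return plaintext, hash_key(plaintext)


def verify_key(presented, stored_digest):
    """Compare digests in constant time to avoid leaking match position."""
    return hmac.compare_digest(hash_key(presented), stored_digest)
```

Only the second element returned by `create_api_key` would be persisted; the first is displayed to the user and then discarded.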

🍪

Session Security

Session tokens carry 256 bits of entropy. Cookies are HTTP-only, so they cannot be read by JavaScript; Secure, so they are sent only over HTTPS; and SameSite=Strict, so they are not attached to cross-site requests. Sessions expire after 30 days. Account owners can rotate all sessions instantly to force re-authentication.

HTTP-only · Secure · SameSite=Strict
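
As an illustration of these cookie attributes, the sketch below builds a session cookie with Python's standard library. The cookie name `session`, the `Path=/` attribute, and the helper name are assumptions; the 256-bit token, the three flags, and the 30-day lifetime come from the text above.

```python
import secrets
from http.cookies import SimpleCookie

SESSION_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days


def new_session_cookie():
    """Return (token, Set-Cookie header value) for a fresh session."""
    token = secrets.token_hex(32)  # 32 random bytes = 256 bits of entropy
    cookie = SimpleCookie()
    cookie["session"] = token  # cookie name is an assumption
    cookie["session"]["httponly"] = True
    cookie["session"]["secure"] = True
    cookie["session"]["samesite"] = "Strict"
    cookie["session"]["max-age"] = SESSION_TTL_SECONDS
    cookie["session"]["path"] = "/"
    return token, cookie["session"].OutputString()
```

The returned string is the value of a `Set-Cookie` response header; the server would store only a reference to the token on its side.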

🛡️

HTTP Security Headers

All responses include: HSTS with one-year max-age, Content-Security-Policy with strict frame-ancestors, Cross-Origin-Opener-Policy, and Cross-Origin-Resource-Policy. Verified via securityheaders.com.

HSTS · CSP · COOP · CORP
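
A header set matching this description might look like the following. Only the one-year HSTS max-age is stated above; the specific `frame-ancestors`, COOP, and CORP values are assumptions chosen as common strict settings, and `with_security_headers` is a hypothetical helper.

```python
SECURITY_HEADERS = {
    # One year in seconds: 365 * 24 * 60 * 60.
    "Strict-Transport-Security": "max-age=31536000",
    # Assumed value: forbid embedding the page in any frame.
    "Content-Security-Policy": "frame-ancestors 'none'",
    # Assumed values for the two cross-origin policies.
    "Cross-Origin-Opener-Policy": "same-origin",
    "Cross-Origin-Resource-Policy": "same-origin",
}


def with_security_headers(headers):
    """Return a copy of the response headers with the security set applied."""
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)
    return merged
```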

🚧

Rate Limiting & Brute-Force Protection

Every authentication endpoint is protected by KV-backed per-IP rate limiting. Ten failed attempts within a five-minute window return a 429 response with a Retry-After header. If KV is unavailable, the limiter degrades gracefully and does not silently bypass the limit.

10 attempts / 5 min · per-IP · 429 + Retry-After
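
The limit above can be sketched as a fixed-window counter keyed by IP address. This is an assumption about the windowing strategy, and an in-memory dictionary stands in for Cloudflare KV; the class name `FailedLoginLimiter` and the injectable clock are hypothetical.

```python
import time

MAX_FAILURES = 10
WINDOW_SECONDS = 5 * 60


class FailedLoginLimiter:
    """Fixed-window failure counter; a dict stands in for the KV store."""

    def __init__(self, now=time.time):
        self._now = now
        self._windows = {}  # ip -> (window_start, failure_count)

    def _current(self, ip):
        now = self._now()
        start, count = self._windows.get(ip, (now, 0))
        if now - start >= WINDOW_SECONDS:
            return now, 0  # the previous window has expired
        return start, count

    def record_failure(self, ip):
        start, count = self._current(ip)
        self._windows[ip] = (start, count + 1)

    def check(self, ip):
        """Return (status, headers): 429 with Retry-After once the limit is hit."""
        start, count = self._current(ip)
        if count >= MAX_FAILURES:
            retry_after = max(1, int(start + WINDOW_SECONDS - self._now()))
            return 429, {"Retry-After": str(retry_after)}
        return 200, {}
```

Passing a fake clock as `now` makes the window behaviour easy to test without sleeping.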

📋

Audit Log

Every API action is recorded with actor, action type, timestamp, and IP address. Logs are append-only, export-ready, and available in the dashboard. Suitable for security incident investigation and compliance evidence packages.

actor · action · timestamp · exportable
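
A record with these four fields and an append-only interface can be sketched as follows. The class names and the JSON Lines export format are assumptions for illustration; the text above says only that logs are append-only and exportable.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditEntry:
    """One immutable audit record with the four documented fields."""
    actor: str
    action: str
    timestamp: str
    ip_address: str


class AuditLog:
    """Append-only log: entries can be added and read, never edited or removed."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(entry)

    def entries(self):
        return tuple(self._entries)  # read-only snapshot

    def export_jsonl(self):
        """Serialize one JSON object per line (export format is assumed)."""
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self._entries)
```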

🔒

Transport Encryption

All traffic between clients and VantageAI is encrypted with TLS 1.2 minimum. TLS 1.3 is preferred and negotiated by default. Cloudflare terminates TLS at the edge. No unencrypted endpoints exist.

TLS 1.3 · Cloudflare edge · HSTS enforced
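
On the client side, the same TLS 1.2 floor can be enforced explicitly. The sketch below uses Python's standard `ssl` module; the helper name is hypothetical, and whether TLS 1.3 is actually negotiated depends on the local OpenSSL build and the server.

```python
import ssl


def make_client_context():
    """Build a TLS context that verifies certificates and refuses anything below TLS 1.2."""
    context = ssl.create_default_context()  # certificate and hostname checks enabled
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

A context built this way can be passed to `http.client.HTTPSConnection` or `urllib.request.urlopen` through their `context` parameter.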

🔐

Secrets Management

Worker secrets (signing keys, internal tokens) are stored exclusively in Cloudflare Worker Secrets, never in environment variables, source code, or configuration files. After deployment, secrets can be used by the Worker but cannot be read back.

Cloudflare Worker Secrets · not in source

⚡

Budget Policy Enforcement

Budget policies are enforced server-side — they cannot be bypassed by SDK configuration. When a budget threshold is hit, ingest is blocked at the API layer and a Slack webhook alert is dispatched within one minute.

server-side enforcement · Slack alerts · 1-min latency
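
Server-side enforcement of this kind can be sketched as a check that runs before each event is accepted. The class below is an illustrative assumption: the real policy model, alert payload, and response codes are not described here, and `send_alert` stands in for the Slack webhook call.

```python
class BudgetPolicy:
    """Hypothetical server-side budget check applied at ingest time."""

    def __init__(self, threshold_usd, send_alert):
        self.threshold_usd = threshold_usd
        self.spent_usd = 0.0
        self._send_alert = send_alert  # stand-in for a Slack webhook call
        self._alerted = False

    def ingest(self, cost_usd):
        """Accept the event and return True, or return False once the budget is hit."""
        if self.spent_usd >= self.threshold_usd:
            return False  # blocked on the server; client settings cannot override
        self.spent_usd += cost_usd
        if self.spent_usd >= self.threshold_usd and not self._alerted:
            self._alerted = True
            self._send_alert(f"Budget threshold of ${self.threshold_usd:.2f} reached")
        return True
```

Because the running total and the threshold live on the server in this sketch, nothing in the SDK configuration can raise or skip the limit.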

Data Storage & Retention

What we store and for how long

All data is stored in Cloudflare D1 (SQLite on Cloudflare's global network). Data is encrypted at rest by default via Cloudflare's storage layer.

Data Type	What it contains	Free	Team	Enterprise
Usage events	Model, tokens, cost, latency, timestamp, developer ID (hashed)	30 days	1 year	Custom
Audit log	Actor, action, timestamp, IP address	30 days	1 year	Custom
Account data	Email, org name, API key hash	Until deletion	Until deletion	Until deletion
Budget policies	Threshold amounts, alert config (no usage data)	Until deletion	Until deletion	Until deletion
Prompt content	—	Never stored	Never stored	Never stored
Provider API keys	—	Never stored	Never stored	Never stored
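
The retention windows for usage events and audit logs in the table above translate into a cutoff timestamp per plan. The sketch below is an assumption about how a purge job might compute it: "1 year" is treated as 365 days, Enterprise takes a caller-supplied value, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Days of retention per plan; "1 year" is approximated as 365 days (assumed).
RETENTION_DAYS = {"free": 30, "team": 365}


def retention_cutoff(plan, now, custom_days=None):
    """Return the timestamp before which usage events and audit logs expire."""
    days = custom_days if plan == "enterprise" else RETENTION_DAYS[plan]
    if days is None:
        raise ValueError("enterprise plans need an explicit custom_days value")
    return now - timedelta(days=days)


if __name__ == "__main__":
    print(retention_cutoff("free", datetime.now(timezone.utc)))
```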

You may request deletion of your account and all associated data at any time by emailing [email protected]. Deletion is processed within 72 hours.

Infrastructure

Where it runs

VantageAI is built entirely on Cloudflare's platform. We inherit Cloudflare's security certifications for the infrastructure layer.

API Layer

Cloudflare Workers

All API endpoints run as Cloudflare Workers on the global edge network. No origin server. Code executes in isolated V8 contexts — no shared memory between requests.

SOC 2 Type II ↗

Database

Cloudflare D1

Usage events and account data are stored in Cloudflare D1 (SQLite). Data is encrypted at rest. Access is restricted to the Worker runtime — no public database endpoint exists.

Encrypted at rest ↗

Frontend

Cloudflare Pages

The dashboard and all static assets are served from Cloudflare Pages. No server-side rendering. No third-party analytics SDKs or tracking pixels are loaded.

No third-party trackers

Rate Limiting

Cloudflare KV

Per-IP rate limiting state is stored in Cloudflare KV. It backs brute-force protection on all authentication endpoints, with graceful degradation if KV is unavailable.

Global · low latency

Email

Resend

Transactional email (key recovery, team invites) is delivered via Resend. Only your email address is passed — no usage data, no API keys. Resend is not used for marketing.

Transactional only

Secrets

Cloudflare Worker Secrets

All internal signing keys and tokens are stored in Cloudflare Worker Secrets — encrypted, write-only after deployment. Not accessible from outside the Worker runtime.

Write-only · encrypted

Compliance Roadmap

Where we are and where we're going

We are honest about what's done and what's in progress. Nothing is listed here unless it's real.

Now · April 2026

✓ Live

Security controls in production

API key hashing, HTTP-only sessions, HSTS, CSP, rate limiting, audit log, TLS everywhere, secrets in Worker Secrets. All documented on this page.

Q2 2026 · Month 2–3

◐ In progress

SOC 2 Type I — readiness assessment

Beginning SOC 2 Type I audit preparation with Vanta. Covers all five Trust Service Criteria: Security, Availability, Processing Integrity, Confidentiality, and Privacy. Full report available to enterprise customers under NDA upon completion.

Q3 2026 · Month 7–8

Planned

SOC 2 Type I certification

Target completion of SOC 2 Type I audit. Report will be available to enterprise customers under NDA. This is the foundation for Type II (12-month observation period begins on Type I completion).

Q4 2026 · Month 9–10

Planned

Data Processing Agreement (DPA) template

Standardized DPA for enterprise customers. Covers data processor obligations, sub-processor list, breach notification timelines, and data subject rights. Required for GDPR-regulated customers in the EU.

2027 · Year 2

Planned

SOC 2 Type II & EU AI Act alignment

Twelve-month observation period completes. Type II report covers operational effectiveness over time. Simultaneously aligning audit trail and governance features with EU AI Act transparency requirements (Articles 13, 17, 26).

Security Contact

For security reviews & disclosures

We respond to security questionnaires, vendor assessments, and vulnerability disclosures. Use the appropriate channel below.

Vendor security review

Security questionnaires & assessments

Send your vendor security questionnaire (SIG Lite, CAIQ, custom form) to this address. We respond within 48 business hours. We can also schedule a security call for enterprise procurement reviews.

[email protected]

Response SLA: 48 business hours

Vulnerability disclosure

Responsible disclosure

If you have found a security vulnerability, please disclose it responsibly. We do not currently have a public bug bounty program, but we acknowledge and credit all valid reports. Please do not disclose publicly before we have had a chance to remediate.

[email protected]

Initial response: 24 hours · Remediation: 90 days

Your prompts never leave your infrastructure.