OpenAI and Anthropic API keys leak from vibe-coded SaaS apps in five predictable ways: they get bundled into client-side environment variables, committed to public GitHub history, posted to public help forums, exposed in client-side fetch headers, and logged by error-tracking services. Within hours of a leak, bots find the key and burn it on LLM calls. Typical billing damage runs USD 5,000 to USD 50,000 before founders catch it. This post walks through the five exposure paths, how Cybersecify pentests for them, and the three-layer rate limiting that contains damage when a key still leaks.
Key findings
- 5 exposure paths for OpenAI / Anthropic API keys in vibe-coded SaaS: client-side environment variables (NEXT_PUBLIC_*, VITE_* prefixes), public git history (committed .env or config files), public help forums (Discord, Slack, GitHub issues where founders paste configs), client-side fetch headers (DevTools-readable), retained error-tracking payloads (Sentry / LogRocket logs).
- Detection-to-abuse window is hours, sometimes minutes. GitHub Secret Scanning and OpenAI’s own key-leak detection catch most public exposures, but private-but-bundled-to-client leaks evade both.
- Typical financial damage when a key leaks: USD 5,000 to USD 50,000 in unauthorized LLM calls before the founder catches it. OpenAI may reverse charges if the founder detects the leak and rotates the key quickly; many founders do not.
- Server-side LLM proxy is mandatory. Calling OpenAI directly from the browser guarantees the key leaks. Proxy through Next.js API route, Supabase Edge Function, or Cloudflare Worker.
- Three-layer rate limiting contains damage when a key still leaks: per-user daily cap, per-IP rate cap, global burn-rate halt. All three layers; not one.
- 5-step pentest for AI API key exposure: bundle audit, GitHub history scan, DevTools inspection, error-tracking review, rate-limit testing.
- Pricing benchmark: Cybersecify Startup Pentest INR 74,999 covers AI API key exposure testing as part of the secret-exposure scan included in the 1-scope engagement.
Cybersecify is a founder-led penetration testing firm based in Bengaluru, India, serving AI-first and API-first SaaS startups. We see AI provider API key exposure in roughly 4 of 10 pentest engagements on vibe-coded SaaS. The patterns below are recurring, not theoretical. For the deliverable format that captures findings like these, see our pentest report sample.
Why leaked AI API keys are worse than other leaked keys
A leaked Stripe key is bad. A leaked database key is bad. A leaked OpenAI key is a different category of bad for three reasons.
1. LLM calls are expensive. A typical GPT-4 call costs USD 0.01 to USD 0.10. An abuse bot can run thousands of calls per minute. Within an hour of a key going public, the bot has spent more on your account than your monthly OpenAI budget.
2. LLM call abuse is highly automatable. There are specialized bot networks that exist to convert leaked LLM keys into throughput for content farms, image generation, and even crypto mining via LLM (an inefficient but possible attack pattern). The infrastructure to abuse a leaked key is mature and operating at scale.
3. The abuse pattern is invisible until the bill. Everything looks normal. Your app keeps working. Then the OpenAI billing email arrives at month end with USD 47,000 in usage. A leaked Stripe key, by contrast, has per-customer charge limits and triggers fraud alerts in the Stripe dashboard.
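To make the first point concrete, here is a rough back-of-the-envelope sketch. The helper name and the 1,000 calls per minute rate are illustrative assumptions; only the per-call price range comes from the text above.

```typescript
// Hypothetical helper: estimated spend in USD for a bot that sustains a
// given call rate. Inputs are illustrative, not measured values.
function estimatedBurnUsd(
  callsPerMinute: number,
  costPerCallUsd: number,
  minutes: number,
): number {
  return callsPerMinute * costPerCallUsd * minutes;
}

// 1,000 calls per minute for one hour, at both ends of the per-call range.
const lowEstimate = estimatedBurnUsd(1000, 0.01, 60); // about USD 600
const highEstimate = estimatedBurnUsd(1000, 0.1, 60); // about USD 6,000
```

At those assumed rates, a key that stays live overnight lands inside the USD 5,000 to USD 50,000 range quoted earlier.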
The 5 exposure paths
1. Client-bundled environment variables
The most common pattern in vibe-coded apps. The LLM (or the founder) names the AI provider key variable NEXT_PUBLIC_OPENAI_API_KEY or VITE_ANTHROPIC_KEY. The NEXT_PUBLIC_ and VITE_ prefixes are designed to expose variables to the client. They are how you pass safe public values (Stripe publishable key, Google Maps client API key, Supabase anon key) to the browser.
When you put a secret behind one of these prefixes, the build tooling embeds the secret directly into the JavaScript bundle that ships to every visitor. Anyone with DevTools open can read it. Bots that scrape minified bundles for known key patterns find it within hours.
How to test:
# Next.js: build the production bundle, then print only key-shaped matches
npm run build
grep -rEo "sk-(proj-|ant-)?[A-Za-z0-9_-]{20,}" .next/static/ | head
# Vite: same check against the dist/ output
npm run build
grep -rEo "sk-(proj-|ant-)?[A-Za-z0-9_-]{20,}" dist/ | head
If any line returns a key, you have client exposure.
How to fix: Remove the NEXT_PUBLIC_ or VITE_ prefix. Move the key to a server-only variable. Proxy LLM calls through a server-side route (Next.js API route, Vercel Edge Function, Supabase Edge Function, Cloudflare Worker).
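A pre-build guard can catch this mistake before it ships. The following is a minimal sketch, not a complete scanner: the function name, the name heuristic (OPENAI / ANTHROPIC), and the 20-character minimum in the key pattern are all assumptions made for illustration.

```typescript
// Prefixes that Next.js and Vite expose to browser code.
const PUBLIC_PREFIXES = ["NEXT_PUBLIC_", "VITE_"];

// Key shapes named in this post: sk-, sk-proj-, sk-ant-. The minimum of 20
// trailing characters is an assumption to cut false positives.
const PROVIDER_KEY_PATTERN = /\bsk-(?:proj-|ant-)?[A-Za-z0-9_-]{20,}/;

// Returns the names of client-exposed variables that look like AI provider secrets.
function findExposedSecrets(env: Record<string, string | undefined>): string[] {
  const risky: string[] = [];
  for (const [name, value] of Object.entries(env)) {
    const isPublic = PUBLIC_PREFIXES.some((prefix) => name.startsWith(prefix));
    if (!isPublic) continue;
    const nameLooksSecret = /OPENAI|ANTHROPIC/i.test(name);
    const valueLooksSecret = value !== undefined && PROVIDER_KEY_PATTERN.test(value);
    if (nameLooksSecret || valueLooksSecret) risky.push(name);
  }
  return risky;
}
```

In a build script you might pass the process environment to this function and fail the build when the returned list is non-empty. Safe public values such as a Stripe publishable key pass through because neither heuristic matches them.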
2. Public git history
The founder accidentally committed .env or config.js containing the key. They later removed it, but the file is still in commit history. Anyone who clones the repo can run git log -p and find it.
GitHub Secret Scanning catches commits with known key patterns and notifies the owner. OpenAI also receives the notification and may proactively revoke the key. But the scanning is not instant, and the window between push and detection is enough for bots to grab the key.
How to test:
# Install trufflehog
brew install trufflesecurity/trufflehog/trufflehog
# Scan the repo
trufflehog git file://. --only-verified
If trufflehog returns verified findings, the key is exposed.
How to fix: Rotate the key at OpenAI / Anthropic immediately (the old one is compromised). Remove the secret from git history with git filter-repo or bfg. Force-push to the remote. Configure GitHub Secret Scanning + Push Protection on the repo so future commits are blocked at push time.
3. Public help forums
The founder posts a config file or environment dump to a public Discord, Slack, or GitHub issue while asking for help debugging. The key is now in a searchable forum.
Search-engine indexed help forums are scraped continuously by abuse bots. A key posted in a public Discord channel is typically detected within an hour.
How to test: Search for your OpenAI org name or project name on Discord, Slack public channels, GitHub issues, Stack Overflow, and Reddit. If any post you authored contains a key, rotate immediately.
How to fix: Always redact keys before posting help requests. Use sk-redacted-12345 or XXX placeholders. If you have already posted, rotate the key.
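Redaction is easy to automate so it does not depend on remembering. A minimal sketch, assuming the key shapes named in this post; the helper name and the 20-character minimum are illustrative choices, and a real tool would cover more secret formats.

```typescript
// Matches the key shapes named in this post (sk-, sk-proj-, sk-ant-).
// The 20-character minimum is an assumption to avoid redacting ordinary words.
const KEY_SHAPE = /\bsk-(?:proj-|ant-)?[A-Za-z0-9_-]{20,}/g;

// Hypothetical helper: replace anything key-shaped before pasting a config publicly.
function redactKeys(text: string): string {
  return text.replace(KEY_SHAPE, "sk-redacted");
}

const pasted = "OPENAI_API_KEY=sk-proj-abcdefghijklmnopqrstuvwx\nPORT=3000";
const safeToPost = redactKeys(pasted);
// safeToPost is "OPENAI_API_KEY=sk-redacted\nPORT=3000"
```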
4. Client-side fetch headers
The LLM scaffolds a fetch call from the browser with the OpenAI key in the Authorization header. Even though this is “in code” rather than “in environment variables,” the key still travels through the browser and is visible in the DevTools Network tab.
How to test: Load the deployed app, open the DevTools Network tab, trigger an AI feature, and inspect the outgoing request headers. If the OpenAI key appears in any Authorization header sent from the browser, you have client exposure.
How to fix: Move the LLM call to a server-side route. The browser sends a request to your server (with your own auth cookie, not the OpenAI key), the server makes the OpenAI call with the key, the server returns the response to the browser.
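The request flow described above can be sketched in a framework-neutral way. Everything here is an assumption for illustration: the `ProxyRequest` and `ProxyResponse` shapes, the `makeChatProxy` name, and the injected `callProvider` function that stands in for the real server-to-provider call. In a Next.js API route or an edge function you would adapt the same logic to that platform's request and response types.

```typescript
// Hypothetical transport: performs the real provider call on the server.
type ProviderCall = (apiKey: string, prompt: string) => Promise<string>;

interface ProxyRequest {
  userId: string | null; // resolved from your own session cookie, not from the request body
  prompt: string;
}

interface ProxyResponse {
  status: number;
  body: { text?: string; error?: string };
}

// Builds a handler that keeps the provider key in server memory only.
function makeChatProxy(apiKey: string, callProvider: ProviderCall) {
  return async function handle(req: ProxyRequest): Promise<ProxyResponse> {
    if (req.userId === null) {
      return { status: 401, body: { error: "Sign in required" } };
    }
    if (req.prompt.trim().length === 0) {
      return { status: 400, body: { error: "Prompt is empty" } };
    }
    try {
      const text = await callProvider(apiKey, req.prompt);
      return { status: 200, body: { text } };
    } catch {
      // Return a generic error so provider error details stay server-side.
      return { status: 502, body: { error: "Upstream AI call failed" } };
    }
  };
}
```

The design point is that the key is captured once on the server and never appears in anything the handler returns, so nothing in the browser's Network tab can reveal it.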
5. Error-tracking service retained payloads
You wire up Sentry, LogRocket, or similar observability. The error-tracking service captures request and response bodies on errors. If a request body contained the key (e.g., a client-side fetch that included it, or a server-side error that logged the raw request), the key now lives in the error-tracking service indefinitely.
How to test: Review Sentry / LogRocket retention settings and search for known key patterns in retained payloads.
How to fix: Configure Sentry to scrub Authorization headers by default. Configure server-side logging to redact known secret patterns. Rotate keys that appeared in retained logs.
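A scrubbing step along these lines can run before an event leaves your process, for example from a hook such as Sentry's `beforeSend`. This is a sketch under assumptions: `TrackedEvent` is a simplified, hypothetical event shape rather than any SDK's real type, and the header list and key pattern are illustrative.

```typescript
// Simplified, hypothetical event shape; real SDK events carry more fields.
interface TrackedEvent {
  message?: string;
  request?: { headers?: Record<string, string>; data?: string };
}

const SECRET_SHAPE = /\bsk-(?:proj-|ant-)?[A-Za-z0-9_-]{20,}/g;
const SENSITIVE_HEADERS = ["authorization", "x-api-key", "cookie"];

function maskSecrets(text: string): string {
  return text.replace(SECRET_SHAPE, "[Filtered]");
}

// Returns a copy of the event with sensitive headers and key-shaped strings masked.
function scrubEvent(event: TrackedEvent): TrackedEvent {
  const scrubbed: TrackedEvent = {};
  if (event.message !== undefined) scrubbed.message = maskSecrets(event.message);
  if (event.request !== undefined) {
    const headers: Record<string, string> = {};
    for (const [name, value] of Object.entries(event.request.headers ?? {})) {
      headers[name] = SENSITIVE_HEADERS.includes(name.toLowerCase())
        ? "[Filtered]"
        : maskSecrets(value);
    }
    scrubbed.request = { headers };
    if (event.request.data !== undefined) {
      scrubbed.request.data = maskSecrets(event.request.data);
    }
  }
  return scrubbed;
}
```

Scrubbing only protects events captured after it is deployed. Keys that already appear in retained payloads still need to be rotated.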
How Cybersecify pentests for AI API key exposure
We run this 5-step pentest on every vibe-coded SaaS engagement that uses AI features:
- Bundle audit. Build the production bundle (npm run build, bun build, etc.) and grep for known patterns (sk-, sk-proj-, sk-ant-, OPENAI_API_KEY, ANTHROPIC_API_KEY, OPENAI_ORG_ID). Decompile if minified.
- GitHub history scan. Clone the repo and run trufflehog or git-secrets against full commit history. Verify findings against current key status.
- DevTools inspection. Load the deployed app, exercise every AI feature, inspect Network and Sources panes for raw key strings. Inspect the rendered HTML for <script> tags containing inlined config.
- Error-tracking review. Audit Sentry, LogRocket, Datadog, or whatever observability is wired. Review retention settings and search for retained payloads containing key patterns.
- Rate-limit testing. Send rapid sequential requests to AI endpoints from a single IP and from authenticated users. Confirm the server-side route enforces caps (per-user daily, per-IP rate, global burn-rate halt) before OpenAI rate limit or billing notifications fire.
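The rate-limit step can be sketched as a small probe. This is an illustration, not our actual tooling: `SendRequest` is a hypothetical function you would implement to hit an endpoint you are authorized to test and resolve with the HTTP status code, and treating 429 as the rejection signal is an assumption about how the target reports throttling.

```typescript
// Hypothetical sender: issues one request to the AI endpoint and resolves
// with the HTTP status code.
type SendRequest = () => Promise<number>;

interface ProbeResult {
  sent: number;
  accepted: number;
  firstRejectedAt: number | null; // 1-based index of the first 429, if any
}

// Sends requests back to back and records when throttling first kicks in.
async function probeRateLimit(send: SendRequest, attempts: number): Promise<ProbeResult> {
  let accepted = 0;
  let firstRejectedAt: number | null = null;
  for (let i = 1; i <= attempts; i++) {
    const status = await send();
    if (status === 429) {
      if (firstRejectedAt === null) firstRejectedAt = i;
    } else if (status >= 200 && status < 300) {
      accepted++;
    }
  }
  return { sent: attempts, accepted, firstRejectedAt };
}
```

A result where `firstRejectedAt` is still `null` after a large burst suggests the route has no effective cap.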
Three-layer rate limiting (mandatory)
Even when keys are perfectly contained server-side, abuse of the AI endpoint itself can drain billing. A hostile actor who creates 100 accounts and calls your AI feature aggressively is functionally the same as a leaked key. Rate limiting is the containment layer.
Layer 1: Per-user daily cap. Authenticated users get a daily cap on LLM calls. For a vibe-coded SaaS, 100 LLM calls per user per day is a reasonable starting point; tune based on legitimate usage patterns. Enforce in middleware or at the database layer.
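A minimal in-memory sketch of the per-user cap follows. It assumes a single server process and a UTC day boundary; the function name is hypothetical, and a real deployment would keep the counters in a shared store such as a database so they survive restarts and work across instances.

```typescript
// In-memory counter keyed by user and UTC day. Old keys are never evicted
// here, which a production version would need to handle.
function createDailyUserCap(maxCallsPerDay: number) {
  const counts = new Map<string, number>();

  // Returns true and records the call if the user is still under today's cap.
  return function tryConsume(userId: string, now: Date = new Date()): boolean {
    const day = now.toISOString().slice(0, 10); // UTC date, YYYY-MM-DD
    const key = `${userId}:${day}`;
    const used = counts.get(key) ?? 0;
    if (used >= maxCallsPerDay) return false;
    counts.set(key, used + 1);
    return true;
  };
}

// The starting point suggested above: 100 LLM calls per user per day.
const allowLlmCall = createDailyUserCap(100);
```

The server-side route would call `allowLlmCall(userId)` before each provider call and return an error response when it yields `false`.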
Layer 2: Per-IP rate cap. Unauthenticated or pre-signup AI endpoints (free-tier demos, chat-with-our-marketing-page features) need per-IP caps. 10 requests per minute from a single IP is reasonable. Use Cloudflare, AWS WAF, or middleware-based rate limiting.
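If you enforce this layer in middleware rather than at the edge, a sliding-window limiter is one way to do it. The sketch below is in-memory and single-process, with a hypothetical function name; it also assumes you already have a trustworthy client IP, which behind a proxy or CDN depends on how forwarded headers are configured.

```typescript
// Sliding-window limiter: allows at most maxRequests per IP in any windowMs span.
function createIpRateLimiter(maxRequests: number, windowMs: number) {
  const recent = new Map<string, number[]>(); // IP -> timestamps still inside the window

  return function allow(ip: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - windowMs;
    const timestamps = (recent.get(ip) ?? []).filter((t) => t > cutoff);
    if (timestamps.length >= maxRequests) {
      recent.set(ip, timestamps);
      return false;
    }
    timestamps.push(nowMs);
    recent.set(ip, timestamps);
    return true;
  };
}

// The starting point suggested above: 10 requests per minute per IP.
const allowIp = createIpRateLimiter(10, 60_000);
```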
Layer 3: Global burn-rate halt. A circuit breaker that monitors OpenAI billing and halts all LLM calls if spend exceeds threshold (e.g., USD 50 in the last hour). This is the last line of defense. Implement via OpenAI’s usage API + a cron job that checks every 5 minutes and flips a feature flag.
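The check that the cron job runs can be sketched as a pure function. Where the spend figures come from is left open here: `SpendSample` is a hypothetical record that you might fill from the provider's usage reporting or from your own per-request cost logging, and the function name is illustrative.

```typescript
// Hypothetical record of spend observed at a point in time.
interface SpendSample {
  atMs: number; // timestamp in milliseconds
  usd: number; // spend attributed to that moment
}

// Sums spend inside the trailing window and reports whether LLM calls should halt.
function shouldHaltLlmCalls(
  samples: SpendSample[],
  nowMs: number,
  windowMs: number,
  thresholdUsd: number,
): boolean {
  const cutoff = nowMs - windowMs;
  const recentUsd = samples
    .filter((s) => s.atMs > cutoff && s.atMs <= nowMs)
    .reduce((sum, s) => sum + s.usd, 0);
  return recentUsd > thresholdUsd;
}
```

The scheduled job would call this with a one-hour window and a USD 50 threshold, then set the feature flag that the server-side route checks before every provider call.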
All three layers, not one. A per-user cap does not stop signup abuse (attacker creates 1,000 accounts). A per-IP cap does not stop attackers who rotate IPs, and it can throttle legitimate users who share an office IP. A global burn-rate halt is too coarse to be the primary control. Defense in depth.
Decision matrix per stage
| Stage / Trigger | AI API key exposure pentest scope | Cybersecify plan |
|---|---|---|
| Pre-launch with AI features (LLM-backed SaaS) | Bundle audit + GitHub history + DevTools + rate-limit testing | Startup Pentest INR 74,999 (1 scope, 7 days, included) |
| Pre-launch, multiple AI features across web + API | Full 5-step + multi-surface coverage | Growth Pentest INR 1,79,999 (2 scopes, 10 days, included) |
| Post-launch, suspected breach | Forensic key-rotation audit + billing reconciliation | Custom engagement |
| Ongoing monitoring | Recurring monthly secret-exposure scan | Security Retainer INR 24,999/month |
Where to go from here
If you built an AI-backed SaaS with Cursor, Lovable, Bolt.new, v0, or Replit Agent and want to verify AI API key exposure posture before launch, book a free 30-min call. We will walk the 5 exposure paths, identify which ones apply to your stack, and quote a Startup or Growth Pentest in the same call.
For pricing, see Cybersecify Pentest Pricing. For the deliverable format, see our SOC 2 + ISO 27001 ready pentest report sample.
Related: Pentest Checklist for Vibe-Coded SaaS Apps, How to scope your first pentest, Pentest Pricing Tiers Explained, Vibe-coded app pre-launch security review.
Frequently asked questions
How do OpenAI and Anthropic API keys leak in vibe-coded SaaS apps?
Five patterns: the LLM scaffolds an endpoint that passes the key as a client-readable environment variable (NEXT_PUBLIC_OPENAI_API_KEY, VITE_ANTHROPIC_KEY) and the key lands in the browser bundle; the key gets committed to the public git repository in a config file or .env file; the key gets posted to a public Discord, Slack, or GitHub issue when the founder asks for help debugging; the key is passed in a client-side fetch header that any browser DevTools session reveals; the key is logged to a third-party error-tracking service that retains the request body.
What happens when an OpenAI API key leaks publicly?
Abuse starts within hours, sometimes minutes. Bots scrape GitHub, npm packages, browser bundles, and other public sources for known key patterns (sk-, sk-proj-, sk-ant-). When a key is found, it gets added to abuse pools and used immediately. Typical abuse pattern: the attacker uses the key for high-throughput LLM calls until the OpenAI rate limit or billing cap stops it. Founders typically see USD 5,000 to USD 50,000 in billing before they catch it.
Can I prevent AI API key leaks in a vibe-coded SaaS pre-launch?
Yes, with discipline at five layers. Never use a NEXT_PUBLIC_ or VITE_ prefix on the AI provider key. Always proxy LLM calls through a server-side route (Next.js API route, Supabase Edge Function, Cloudflare Worker) that holds the key. Implement rate limiting on the server-side route. Scan the client production bundle for known secret patterns before every deploy. Configure secret scanning on the GitHub repository.
What rate limits should vibe-coded SaaS apps enforce on AI endpoints?
Three layers: per-user daily cap (e.g., 100 LLM calls per authenticated user per day), per-IP rate cap on unauthenticated AI endpoints (e.g., 10 requests per minute from a single IP), and a global burn-rate cap (e.g., halt all LLM calls if the last hour spent more than USD 50 in OpenAI billing). All three layers should be implemented; relying on OpenAI’s own rate limits leaves the door open to billing abuse before the OpenAI limit triggers.
Why are LLM provider API keys more dangerous to leak than other SaaS API keys?
LLM calls are expensive (a leaked key can rack up thousands of USD in hours). LLM call abuse is highly automatable (bot networks exist specifically to convert leaked keys into LLM throughput). The abuse pattern is invisible from the inside (everything looks normal until the billing notification arrives). A leaked Stripe key has a per-customer charge limit; a leaked OpenAI key has only your account’s monthly cap.
Should I use a server-side LLM proxy or call OpenAI directly from the browser?
Always server-side proxy. Calling OpenAI directly from the browser requires the OpenAI key be present in the browser, which guarantees the key leaks. Server-side proxy holds the key on the server, applies rate limiting, applies authorization, applies content filtering, and logs usage for billing reconciliation.