Surviving OpenAI rate limits from a static frontend, no backend required

Hide an OpenAI key behind a SaltingIO Bridge, retry 429 throttling from the browser with a backoff loop, and stop cleanly on 402 quota errors.

SaltingIO Team | June 17, 2026 | 5 min read | API Security
Tags: openai, rate-limiting, api-security, bridge, backoff, frontend, serverless, 429

How do you keep an OpenAI key out of a static frontend and still survive a burst of traffic without a backend to absorb it? Putting the key in client code is the obvious mistake, and most teams catch it in review. The quieter problem shows up later, in production, when a spike of users pushes past a rate limit and the browser starts getting 429s with no logic to handle them. A single-page app with no server has nowhere to retry, queue, or back off.

A SaltingIO Bridge sits between your frontend and OpenAI. The key stays encrypted at rest, the browser calls one endpoint, and you get a single place to reason about throttling instead of scattering retry code across components. The rest is just deciding which failures are worth retrying.

Put the key behind a Bridge first

Create a Bridge whose destination is the OpenAI endpoint you call, store the API key as a header inside the Bridge config, and lock the allowed origins to your domain. The browser then calls the Bridge URL, identified by its UUID, and never touches OpenAI directly. The difference at the call site is small, which is the point.

// Before: key shipped to every visitor in the bundle
const res = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${OPENAI_KEY}`, // exposed
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ model: "gpt-4o-mini", messages }),
});

// After: call the Bridge, no key in the browser
const res = await fetch("https://api.salting.io/r/3f2a9c64-...", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "gpt-4o-mini", messages }),
});

The Authorization header carrying the OpenAI key lives in the Bridge and gets added to every forwarded request server-side. Nothing about it reaches the client.

Tell the three 429-shaped failures apart

Once traffic grows, you will see rejections, and they do not all mean the same thing. There are three sources, and treating them identically is the most common mistake here.

OpenAI's own rate limit comes back through the Bridge as the upstream response: a 429 carrying OpenAI's body and usually a Retry-After header. That is a transient "the model is busy, slow down" signal, and retrying after a wait is correct.

SaltingIO's per-plan rate limit is a separate thing with its own shape:

{"error":"Rate limit exceeded","message":"You have exceeded the rate limit for your plan. Please slow down.","retry_after_seconds":1}

That is also worth retrying, and it even hands you the wait time in retry_after_seconds. The monthly quota is the third case, and it is a 402, not a 429:

{"error":"Monthly quota exceeded","message":"...upgrade your plan or wait until next month."}

A backoff loop is right for the two 429s and wrong for the 402. Retrying a quota wall just burns more requests against a counter that will not refill until next month. So branch on the status code first, before you decide to wait.
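
The branching described above can be sketched as a small pure function. The name classifyFailure, the action labels, and the one-second default wait are illustrative assumptions, not part of any SaltingIO API; only the status codes and the retry_after_seconds field come from the responses quoted in this section.

```javascript
// Hypothetical helper: maps a failed Bridge response to an action.
// "retry" = transient throttling, "stop" = quota wall, "fail" = anything else.
function classifyFailure(status, body = {}) {
  if (status === 402) {
    // Monthly quota: waiting a few seconds will not help.
    return { action: "stop", waitSeconds: 0 };
  }
  if (status === 429) {
    // OpenAI or SaltingIO rate limit. Use the body hint when it is a positive number.
    const hinted = Number(body.retry_after_seconds);
    const waitSeconds = Number.isFinite(hinted) && hinted > 0 ? hinted : 1; // assumed default
    return { action: "retry", waitSeconds };
  }
  // Every other status is surfaced to the caller unchanged.
  return { action: "fail", waitSeconds: 0 };
}
```

Keeping the decision in one function means the retry loop only has to act on the result, and the rule "never back off on a 402" lives in exactly one place.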

A backoff loop that reads the right field

async function callBridge(body, attempt = 0) {
  const res = await fetch("https://api.salting.io/r/3f2a9c64-...", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });

  if (res.ok) return res.json();

  if (res.status === 402) {
    // Plan quota or billing. Not transient. Stop and surface it.
    throw new Error("SaltingIO quota or subscription issue");
  }

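  if (res.status === 429 && attempt < 4) {
    // Both 429 sources are transient. Read whichever wait hint is present.
    const data = await res.json().catch(() => ({}));
    // Cross-origin, Retry-After is readable only if the response exposes it via
    // Access-Control-Expose-Headers; otherwise get() returns null and this is 0.
    const headerWait = Number(res.headers.get("retry-after")) || 0;
    const bodyWait = Number(data.retry_after_seconds) || 0;
    // Wait for the longest hint, with exponential backoff (1s, 2s, 4s, 8s) as a floor.
    const waitMs = Math.max(headerWait, bodyWait, 2 ** attempt) * 1000;
    await new Promise((r) => setTimeout(r, waitMs));
    return callBridge(body, attempt + 1);
  }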

  throw new Error(`Bridge call failed: ${res.status}`);
}

The loop waits for the longest of three values: retry_after_seconds from SaltingIO, Retry-After from OpenAI, and an exponential floor of 2^attempt seconds, so it still backs off when neither hint is present. It caps retries at four so a stuck request does not hammer the endpoint forever. Crucially, the 402 branch sits above the 429 branch, so a quota failure exits immediately instead of looping.
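
One addition the loop above does not make is jitter. When many browsers hit the same limit at once, identical waits send them all back at the same moment. The sketch below is one possible way to compute the delay with a random spread and a ceiling on the exponential part. The function name backoffMs, the 30-second ceiling, and the 50% jitter range are assumptions chosen for illustration, not values from SaltingIO or OpenAI.

```javascript
// Hypothetical helper: delay in milliseconds before retry number `attempt` (0-based).
// hintSeconds is the larger of Retry-After and retry_after_seconds, or 0 if absent.
// `random` is injectable so the function can be tested deterministically.
function backoffMs(attempt, hintSeconds = 0, random = Math.random) {
  const CAP_SECONDS = 30; // assumed ceiling for the exponential part only
  const exponential = Math.min(2 ** attempt, CAP_SECONDS);
  // Never wait less than the server asked for.
  const base = Math.max(hintSeconds, exponential);
  // Add up to 50% extra so clients do not retry in lockstep.
  const jitter = random() * 0.5 * base;
  return Math.round((base + jitter) * 1000);
}
```

In the loop, this would replace the waitMs line, for example backoffMs(attempt, Math.max(headerWait, bodyWait)).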

Trim the response so retries stay cheap

Add ?select to pull only the field you render. A chat completion is a large object, and the browser usually wants one string. Reshaping the response upstream means a smaller payload and less parsing on every attempt.

const url =
  "https://api.salting.io/r/3f2a9c64-...?select=choices.0.message.content";

That returns the assistant text instead of the whole envelope. The saving matters more than usual under retry pressure, because every backoff attempt re-downloads the body. Three retries of a trimmed response beat three retries of a full one.
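
To make the path concrete, here is a small sketch. withSelect builds the URL, and pick is a local stand-in that only shows which field choices.0.message.content points at inside a chat completion. Both names are hypothetical, the Bridge URL is a placeholder, and pick is not the Bridge's implementation: the real reshaping happens server-side.

```javascript
// Hypothetical helper: appends a select path to a Bridge URL.
function withSelect(bridgeUrl, path) {
  const url = new URL(bridgeUrl);
  url.searchParams.set("select", path);
  return url.toString();
}

// Local illustration only: resolves a dot path against an object.
// Numeric segments such as "0" index into arrays.
function pick(obj, path) {
  return path
    .split(".")
    .reduce((node, key) => (node == null ? undefined : node[key]), obj);
}

// A cut-down chat completion envelope, just enough to show the path.
const sample = {
  choices: [{ message: { role: "assistant", content: "Hello!" } }],
};

console.log(pick(sample, "choices.0.message.content")); // prints Hello!
console.log(withSelect("https://api.salting.io/r/your-bridge-uuid", "choices.0.message.content"));
```

A path that does not exist resolves to undefined rather than throwing, which is a reasonable thing to guard for on the client as well when the trimmed response comes back.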

Common pitfalls

The 402-versus-429 mix-up is the big one. A naive if (!res.ok) retry loop will retry a monthly-quota 402 four times, then surface a confusing failure, all while you stare at the OpenAI dashboard wondering why it looks perfectly healthy. It looks healthy because the quota that ran out is your SaltingIO plan, not OpenAI's. Branch on the exact status code and the symptom disappears.

Origin lock bites during local development. If you set allowed origins to your production domain and then open the app on http://localhost:5173, the Bridge returns this:

{"error":"Origin not allowed","message":"Your origin is not authorized to access this endpoint."}

That is neither a key problem nor a rate-limit problem, and it is easy to misread as one. Add your localhost origin to the Bridge's allowed origins while developing, then remove it before launch.
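
During development it can help to translate the error body into a plain hint before logging it, so an origin rejection is not mistaken for throttling. The sketch below assumes only the three error strings quoted in this article; the function name explainBridgeError and the hint wording are made up for illustration.

```javascript
// Hypothetical helper: turns a Bridge error body into a hint for the console.
// The matched strings are the "error" values quoted in this article.
function explainBridgeError(body = {}) {
  switch (body.error) {
    case "Origin not allowed":
      return "Origin is not in the Bridge's allowed origins. Check localhost vs production.";
    case "Rate limit exceeded":
      return "SaltingIO plan rate limit. Transient: retry after the hinted wait.";
    case "Monthly quota exceeded":
      return "SaltingIO monthly quota. Not transient: do not retry.";
    default:
      // Anything else may be the upstream response passed through the Bridge.
      return "Unrecognized error body.";
  }
}
```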

The last trap applies when the Bridge is private rather than public. The X-API-Key for a private record is shown exactly once, on the success modal at creation, next to a Download .txt button. There is no screen that shows it again. Copy it immediately, or you will be recreating the record from scratch.

With that in place the frontend ships no key in its bundle, calls a single endpoint, distinguishes transient throttling from a real quota wall, and keeps retries light by trimming what comes back. None of it needed a server. Read the docs for the full Bridge configuration reference, including failover URLs for when OpenAI returns a 5xx instead of a 429.