TokenMix Research Lab · 2026-04-25

API Error Troubleshooting Directory: OpenAI, Anthropic and Cursor Fixes (2026)
This is the complete troubleshooting directory for the most common LLM API and tool errors in 2026. Click through to detailed fix guides for each specific error, organized by provider and category. Updated April 2026 with 50+ tracked error patterns across OpenAI, Anthropic, Cursor, Windsurf, Cline, and major aggregators.
How to Use This Directory
- Scan categories below for your error
- Click through to the detailed guide
- If your error isn't listed, use the general debug methodology at the bottom
- Rare or hard-to-resolve errors are flagged in the "Escalation Path" section
Category 1 — Authentication and API Key Errors
Most common first-time user errors. Usually fixable in minutes.
- API Key Not Found in Cookies — Cursor, Cline, Windsurf session cookie issues
- Failed to Generate API Key: Permission Denied — IAM escalation paths for OpenAI, Anthropic, AWS, Azure, GCP
- API Key Invalid — typically means expired, rotated, or wrong format. Regenerate key in provider console.
- Unauthorized (401) — auth header format wrong. OpenAI uses `Authorization: Bearer <key>`; Anthropic uses `x-api-key: <key>`.
- Forbidden (403) — key valid but lacks permission for this endpoint or model. Check tier/role.
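The 401 fix above comes down to sending the right header per provider. A minimal sketch (the `anthropic-version` value is the commonly documented one; verify against current provider docs):

```python
def auth_headers(provider: str, api_key: str) -> dict:
    """Return the auth headers each provider expects for raw HTTP calls."""
    if provider == "openai":
        # OpenAI uses a standard Bearer token in the Authorization header.
        return {"Authorization": f"Bearer {api_key}"}
    if provider == "anthropic":
        # Anthropic uses a custom x-api-key header plus a version header.
        return {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
    raise ValueError(f"unknown provider: {provider}")
```

Mixing these up (e.g. sending `Bearer` to Anthropic) is one of the most common causes of 401s when switching providers.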
Category 2 — Rate Limiting and Capacity Errors
The most common errors for anyone running production workloads.
- Anthropic Overloaded Error — HTTP 529 explained
- Claude API Error 529: Overload Strategy — 4-tier failover design
- HTTP 429 Too Many Requests — exponential backoff retry, consider upgrading tier
- Quota Exceeded — monthly token budget used up. Upgrade plan or wait for reset.
- Insufficient Quota (OpenAI) — account has billing issue or ran out of credits. Add payment method.
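For the 429/529 entries above, the standard mitigation is exponential backoff with jitter. A minimal sketch, assuming a `send()` callable that returns `(status_code, body)` (not any specific SDK):

```python
import random
import time

RETRYABLE = {429, 500, 502, 503, 529}

def call_with_backoff(send, max_retries=5, base=1.0, cap=30.0):
    """Call send() and retry on retryable HTTP status codes.

    Exponential backoff with full jitter: sleep a random interval in
    [0, min(cap, base * 2**attempt)] between attempts.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt == max_retries:
            break
        time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    return status, body
```

Full jitter (rather than a fixed doubling schedule) matters under 529 overload: it spreads retries out so a fleet of clients doesn't re-stampede the provider in sync.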
Category 3 — Tool Use and Function Calling Errors
Errors specific to agent and function-calling workflows.
- Model Failed to Call the Tool with Correct Arguments — 8 root causes, schema validation
- Last Message Was Not an Assistant Message — Agent loop sequence bugs
- Tool Use ID Mismatch — `tool_result.tool_use_id` doesn't match any `tool_use.id` in the prior assistant message
- Invalid Tool Schema — your JSON Schema definition is malformed. Validate via json-schema.org.
- Too Many Tools — exceeds the provider's per-call tool count limit (typically 128). Reduce or split.
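The tool-use-ID mismatch above can be caught before sending. A rough pre-flight check, assuming Anthropic-style content blocks (`{"type": "tool_use", "id": ...}` in assistant messages, `{"type": "tool_result", "tool_use_id": ...}` in the following user message):

```python
def find_mismatched_tool_results(messages):
    """Return tool_use_ids in tool_result blocks that don't match any
    tool_use.id from the immediately preceding assistant message."""
    mismatched = []
    prev_tool_ids = set()
    for msg in messages:
        blocks = msg.get("content", [])
        if not isinstance(blocks, list):
            blocks = []
        if msg["role"] == "assistant":
            # Remember the tool_use ids this assistant turn emitted.
            prev_tool_ids = {b["id"] for b in blocks if b.get("type") == "tool_use"}
        elif msg["role"] == "user":
            for b in blocks:
                if b.get("type") == "tool_result" and b.get("tool_use_id") not in prev_tool_ids:
                    mismatched.append(b.get("tool_use_id"))
    return mismatched
```

Running this in your agent loop before each API call turns an opaque provider 400 into an immediate, local assertion failure.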
Category 4 — Model-Specific Errors
Errors related to specific model capabilities or identifiers.
- Trying to Submit Images Without Vision-Enabled Model — vision model list and routing
- Model Not Found — model identifier typo or deprecated model. Check the `/v1/models` endpoint.
- Context Length Exceeded — total tokens exceed the model's context window. Trim history or use a longer-context model.
- Max Tokens Exceeded — requested `max_tokens` exceeds the model's output limit. Clamp to the model's actual max.
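Clamping `max_tokens` is a one-liner worth centralizing. A sketch with hypothetical model names and limits (look up the real output-token limits in your provider's model docs):

```python
# Hypothetical output-token limits — placeholders, not real model specs.
OUTPUT_LIMITS = {
    "example-small": 4096,
    "example-large": 8192,
}

def clamp_max_tokens(model: str, requested: int, default: int = 4096) -> int:
    """Clamp a requested max_tokens to the model's output limit."""
    limit = OUTPUT_LIMITS.get(model, default)
    return max(1, min(requested, limit))
```

Putting this in the one place that builds requests means a model swap can't silently reintroduce the error.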
Category 5 — Request Format Errors
Malformed requests that fail schema validation.
- Invalid Request: Request Parameters Are Invalid — 12 sub-causes isolated
- Missing Required Field — specific field from schema not included. Check provider docs.
- Invalid JSON — body isn't valid JSON. Check syntax, escaping.
- Unsupported Content Type — send `Content-Type: application/json`.
- Empty Messages Array — the messages array must be non-empty.
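Several of the format errors above are cheap to catch client-side before the request leaves your process. A minimal pre-flight validator for a chat-completions-style body (field names match the common OpenAI-compatible shape; extend for your provider):

```python
import json

def validate_chat_request(body: dict) -> list:
    """Return a list of problems with a chat-style request body."""
    problems = []
    if "model" not in body:
        problems.append("missing required field: model")
    messages = body.get("messages")
    if messages is None:
        problems.append("missing required field: messages")
    elif not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty array")
    try:
        json.dumps(body)  # catches non-serializable values early
    except (TypeError, ValueError):
        problems.append("body is not JSON-serializable")
    return problems
```

A failing local check gives you the exact field name, where the provider's 400 response may only say "invalid request".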
Category 6 — Media and Multimodal Errors
Vision, audio, video specific issues.
- Sora Server Error Processing Request — video generation debugging
- Image Too Large — exceeds model's size or resolution cap
- Unsupported Image Format — stick to PNG, JPEG, GIF, WEBP
- Audio Too Long — Whisper and transcription models cap duration
- Invalid Base64 Encoding — data URL format wrong
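The "Invalid Base64 Encoding" entry usually traces back to one of three mistakes: passing raw bytes, using URL-safe base64, or omitting the `data:<mime>;base64,` prefix. A correct encoding helper as a sketch:

```python
import base64

def image_to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a standard base64 data URL."""
    # b64encode (not urlsafe_b64encode) matches the data-URL format
    # providers expect; the mime prefix must match the actual bytes.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Some provider APIs take the bare base64 string plus a separate media-type field instead of a data URL; check which shape yours expects before wiring this in.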
Category 7 — Network and Infrastructure Errors
Below the application layer.
- HTTP 500 Internal Server Error — provider-side issue. Retry with backoff; check status page.
- HTTP 502 Bad Gateway — upstream provider unreachable. Usually transient.
- HTTP 503 Service Unavailable — provider maintenance or outage. Check status.
- Connection Timeout — network path broken. Check your local connectivity, DNS, firewall.
- SSL Certificate Errors — clock skew, bad intermediate cert. Update system time/certs.
Category 8 — Billing and Account Errors
Financial/contractual issues that surface as API errors.
- Payment Method Required — add card to provider account
- Account Suspended — TOS violation or billing issue. Contact support.
- Free Trial Expired — sign up for paid tier or switch provider
- Region Not Supported — provider doesn't serve your country. Use aggregator or VPN (check TOS).
Category 9 — Cursor / Windsurf / Cline Specific
Tool-layer errors beyond the raw API.
- Is Cursor Slow? Root Causes and Speed Fixes — 7 performance issues
- Cursor Indexer Stuck — clear workspace storage, restart
- Model Not Available — tool-specific model list, not API-level availability
- Fast Requests Depleted — Cursor Pro has 500 fast requests/cycle. After that, slower tier.
Category 10 — Provider-Specific Quirks
Edge cases unique to each provider.
OpenAI:
- `o1` and `o3` reasoning models don't support `temperature`, `top_p`, or penalty parameters
- Assistants API has separate rate limits from Chat Completions
- Batch API has a 24h completion window — don't expect real-time results
Anthropic:
- `max_tokens` is required (not optional)
- Tool use requires strictly alternating `user`/`assistant` message sequences
- Tokenizer changed in Opus 4.7 — expect 0-35% more tokens than 4.6
Google Gemini:
- Different function declaration format than OpenAI
- Safety filters more aggressive than competitors on certain content
- Regional availability varies more than other providers
DeepSeek:
- No vision model as of April 2026 — route images to Claude/GPT/Gemini
- R1 reasoning traces can be very long, inflating token counts
- V4-Pro OpenAI-compat endpoint is most stable
Moonshot/Kimi:
- MCP support is native — tool definitions transfer cleanly
- K2.6 agent swarm has specific extension parameters
General Debug Methodology
If your error isn't in the directory:
Step 1: Read the full error response, not just the top-line message.
try:
    response = client.chat.completions.create(...)
except Exception as e:
    print(vars(e))
    if hasattr(e, 'response'):
        print(e.response.json())
Step 2: Check the provider's status page. status.openai.com, status.anthropic.com, etc.
Step 3: Simplify your request to the smallest possible case that reproduces the error. Often reveals the specific problem field.
Step 4: Compare against a known-working request (curl example from docs).
Step 5: Check for recent changes — did you update the SDK, change config, switch models?
When to Route Through an Aggregator
If you're frequently hitting:
- 529 overloaded errors from Anthropic
- 429 rate limit errors from OpenAI during peak
- Model availability issues across regions
- Multi-provider configuration complexity
An aggregator simplifies dramatically. TokenMix.ai provides OpenAI-compatible access to Claude Opus 4.7, Sonnet 4.6, Haiku 4.5, GPT-5.5, GPT-5.4, DeepSeek V4-Pro, V4-Flash, R1, Kimi K2.6, Gemini 3.1 Pro, and 300+ other models through one API key with:
- Automatic failover when any provider errors
- Unified billing (USD, RMB, Alipay, WeChat)
- Consolidated observability across providers
- Per-token pricing without per-provider contracts
For production workloads where the cost of debugging provider-specific errors outweighs the cost of aggregator abstraction, this is typically the right architectural decision.
Escalation Path
For errors not in this directory and not resolved by general debug methodology:
- Provider support ticket — include full error response, request ID, timestamp, reproduction steps. Response time: hours to days.
- Provider status page subscription — outages aren't always publicized immediately; subscribing catches them early, and waiting often resolves the issue.
- Community channels — r/LocalLLaMA, provider-specific Discords, Stack Overflow often answer faster than official support.
- Aggregator support — TokenMix.ai and similar aggregators often have faster support response than upstream providers because they handle cross-provider routing and can verify whether a specific provider is misbehaving.
Prevention Patterns
Three habits that cut error rate significantly:
1. Always implement exponential backoff retry for transient errors (429, 500, 502, 503, 529). This alone eliminates 80%+ of user-visible failures.
2. Use typed SDK clients instead of raw HTTP requests. The OpenAI, Anthropic, and Google SDKs catch most schema errors at construction time. Faster debug loops, fewer production errors.
3. Route through a proxy layer for production. Whether that's your own middleware or an aggregator like TokenMix.ai, having a single abstraction layer lets you swap providers, handle errors centrally, and roll out fixes across your whole fleet.
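Habit 3's proxy layer can be very thin. A minimal failover-router sketch — provider names and the `send()` callables are placeholders, not a real SDK interface:

```python
class FailoverRouter:
    """Try providers in order; fall through on retryable error statuses."""

    RETRYABLE = {429, 500, 502, 503, 529}

    def __init__(self, providers):
        # providers: list of (name, send) pairs, where
        # send(request) -> (status_code, body)
        self.providers = providers

    def send(self, request):
        last = (None, None, None)
        for name, send in self.providers:
            status, body = send(request)
            if status not in self.RETRYABLE:
                # Success or a non-retryable error (e.g. 400): return it,
                # since another provider won't fix a malformed request.
                return name, status, body
            last = (name, status, body)
        return last  # every provider failed with a retryable error
```

Even this toy version captures the key design choice: only capacity-class errors (429/5xx/529) trigger failover, while request bugs surface immediately instead of being retried across providers.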
FAQ
Is there an official error code standard across providers?
No. Each provider has its own error codes, statuses, and message formats. OpenAI-compatible aggregators normalize this somewhat.
How often does this directory update?
Monthly reviews. Major provider changes (new error types, status code changes) trigger immediate updates.
Can I submit errors I've encountered?
Not directly, but you can reach out via TokenMix.ai support — our team tracks error patterns across 300+ models and incorporates significant findings into this directory.
What if the same error has different fixes on different providers?
That's often the case. Check the provider-specific sections first. If ambiguous, the general debug methodology (minimal repro, status check, SDK update) applies.
Does this cover embedding model errors?
Partial. Most errors in this directory apply to embedding models too (auth, rate limits, request format). Model-specific quirks differ — consult the specific model's docs.
How do I know if an error is my bug or a provider's bug?
Reproducibility. If the same request consistently fails at the same step with the same error, it's likely your bug. If it fails only intermittently, it's likely a provider issue. Status pages and support tickets help confirm.
By TokenMix Research Lab · Updated 2026-04-24
Sources: OpenAI error documentation, Anthropic errors reference, Google AI error handling, Cursor support forum, TokenMix.ai error tracking