TokenMix Research Lab · 2026-04-07

OpenAI Deep Research API 2026: $1.50-$8 per Query, 15-40 Sources

OpenAI Deep Research API Guide: How to Use o3-deep-research and o4-mini-deep-research (2026)

Last Updated: 2026-04-29
Author: TokenMix Research Lab

OpenAI Deep Research API runs autonomous 5-30 minute research workflows: o3-deep-research at $1.50-$8/query (best depth), o4-mini-deep-research at $0.40-$2.50 (70% cheaper). Batch API cuts another 50%.

Deep Research from OpenAI is the first production API that lets you run multi-step, autonomous research tasks through a single API call. Instead of getting one response, the model browses the web, reads documents, synthesizes findings, and returns a structured report — all within one request. TokenMix.ai has been tracking Deep Research API usage patterns across our user base since launch, and the data is clear: this is not a chatbot feature dressed up as research. It is a fundamentally different API pattern with different pricing, latency, and use-case fit.

This guide covers how the OpenAI Deep Research API works, how it compares to Perplexity's deep research offering, real pricing calculations, and when it makes sense to use it versus standard completion APIs.

What Is OpenAI Deep Research?
Deep Research API: Available Models and Pricing
How the Deep Research API Works
Code Example Concepts: Using Deep Research via API
Deep Research vs Standard Chat Completions
OpenAI Deep Research vs Perplexity Deep Research API
Cost Analysis: When Deep Research Saves Money
Best Use Cases for the Deep Research API
How to Choose: Deep Research API or DIY Agent?
Limitations and Trade-offs
What's the Bottom Line on Deep Research API?
FAQ

What Is OpenAI Deep Research?

Deep Research is an agentic API on top of o-series reasoning: it autonomously plans, browses 15-40 web sources, and synthesizes a 2,000-5,000 word cited report in 5-30 minutes per query.

OpenAI Deep Research is an agentic research capability built on top of the o-series reasoning models. When you submit a query, the model does not just generate text from its training data. It autonomously plans a research strategy, browses the web for current information, reads and analyzes multiple sources, cross-references findings, and produces a cited report.

The key difference from standard completions: Deep Research executes a multi-step workflow that can take 5-30 minutes per query. It is designed for complex questions that require synthesizing information from multiple sources — market research, competitive analysis, literature reviews, technical due diligence.

TokenMix.ai monitoring data shows the average Deep Research query processes 15-40 web sources and generates 2,000-5,000 word reports.

Deep Research API: Available Models and Pricing

Two models: o3-deep-research at $2.50/$15/M ($1.50-$8/query, 1.1M context) and o4-mini-deep-research at $0.75/$4.50/M ($0.40-$2.50/query, 400K context). Batch API gives 50% off both.

Two models are currently available for Deep Research via the OpenAI API:

Model	Input (per 1M tokens)	Output (per 1M tokens)	Max Output	Context Window	Typical Query Cost
o3-deep-research	$2.50	$15.00	100K tokens	1.1M	$1.50-$8.00
o4-mini-deep-research	$0.75	$4.50	100K tokens	400K	$0.40-$2.50

Why are individual queries so expensive? Deep Research queries consume far more tokens than standard completions. A single research query typically involves:

10,000-50,000 input tokens (your query + system instructions + internal reasoning)
20,000-80,000 output tokens (the reasoning chain + final report)
Additional token consumption from internal tool calls (web browsing, document reading)

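At $15/M output tokens on o3-deep-research, an 80K output token query costs $1.20 in output alone. Input tokens add comparatively little: 50K input tokens at $2.50/M is about $0.13. The rest of the $1.50-$8.00 per-query range comes from the additional token consumption of internal tool calls listed above, which grows with query complexity.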

The o4-mini-deep-research model cuts this cost by roughly 70% with only moderate quality reduction, based on TokenMix.ai's comparative testing.
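The per-query arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the prices are the ones quoted in the pricing table, while the function name and the 50K/80K token counts are illustrative assumptions, and tool-call tokens would need to be included in whatever counts you pass in.

```python
# Per-1M-token prices (USD) as quoted in the pricing table above.
PRICES = {
    "o3-deep-research": {"input": 2.50, "output": 15.00},
    "o4-mini-deep-research": {"input": 0.75, "output": 4.50},
}


def estimate_token_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the token cost in USD for one query (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# 80K output tokens alone on o3-deep-research.
output_only = estimate_token_cost("o3-deep-research", 0, 80_000)
# The same query with 50K input tokens added.
with_input = estimate_token_cost("o3-deep-research", 50_000, 80_000)
# Identical token counts on the mini model.
mini = estimate_token_cost("o4-mini-deep-research", 50_000, 80_000)

print(f"o3 output only: ${output_only:.2f}")
print(f"o3 with input:  ${with_input:.3f}")
print(f"o4-mini:        ${mini:.4f} ({1 - mini / with_input:.0%} cheaper)")
```

Because both o4-mini prices are 30% of the o3 prices, the saving is 70% for any mix of input and output tokens, as long as the token counts themselves are the same.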

How the Deep Research API Works

Three-step async workflow: submit query, autonomous tool execution (web search, page reading, citations), synthesis output. Use polling for simplicity, streaming for interactive UX with progress updates.

Deep Research uses the Responses API (not the older Chat Completions API). The workflow is fundamentally different from standard completion requests:

Step 1: Submit research query. You send a prompt describing what you want researched. The model interprets the intent and generates an internal research plan.

Step 2: Autonomous execution. The model uses built-in tools — web search, page reading, citation extraction — to gather information. This step runs for 5-30 minutes depending on query complexity.

Step 3: Synthesis and output. The model synthesizes findings into a structured report with citations. The response includes the final report and metadata about sources consulted.

The key architectural detail: Deep Research is inherently asynchronous. You submit a request and poll for completion, or use streaming to receive partial updates. This is not a synchronous request-response pattern.

API Request Structure (Conceptual)

A Deep Research API call uses the responses.create endpoint with the o3-deep-research or o4-mini-deep-research model specified. You provide your research query as the user message, and optionally include system instructions to guide the research scope, output format, or depth level.

The response includes the model's reasoning summary, the final research report with inline citations, and metadata such as sources consulted, tokens used, and processing time.

Streaming and Polling

Because Deep Research queries take minutes, not seconds, you have two options:

Polling: Submit the request, receive a response ID, then poll the status endpoint until the research is complete. This is simpler to implement.

Streaming: Open a streaming connection and receive incremental updates as the model progresses through its research. You get status messages like "searching for X," "reading source Y," and partial report sections as they are generated. This provides a better user experience for interactive applications.
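The polling option can be sketched as a small loop that is independent of any HTTP client. The sketch below is an illustration under assumptions, not OpenAI's reference code: `fetch_status` is a hypothetical callable that you would implement to look up the response by ID and return its status string, and the status names in `TERMINAL` are assumed values that should be checked against the current API reference.

```python
import time
from typing import Callable

# Assumed terminal status names; verify against the API reference.
TERMINAL = {"completed", "failed", "cancelled", "incomplete"}


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_s: float = 15.0,
    timeout_s: float = 45 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns a terminal status or the timeout elapses."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Demo with a fake status source and a recorded (not real) sleep.
fake_statuses = iter(["queued", "in_progress", "in_progress", "completed"])
naps = []
final = poll_until_done(lambda: next(fake_statuses), interval_s=15.0, sleep=naps.append)
print(final, "after", len(naps), "waits")
```

Injecting `sleep` keeps the loop testable without waiting; in production you would leave the default. Since research runs take 5-30 minutes, a polling interval in the tens of seconds and a timeout above 30 minutes are reasonable starting points.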

Code Example Concepts: Using Deep Research via API

API uses POST /v1/responses with model = o3-deep-research; web_search_preview is enabled by default; system prompts steer output format and depth; Batch API drops cost 50% for non-urgent research.

The Deep Research API follows OpenAI's Responses API pattern. Here are the key concepts:

Basic request pattern:

POST /v1/responses
Model: o3-deep-research (or o4-mini-deep-research)
Input: Your research question as user message
Tools: web_search_preview (enabled by default)

System prompt guidance: You can steer the research by including a system message that specifies the output format (bullet points, report, table), depth level (surface scan vs. comprehensive), source preferences (academic, industry, news), and specific aspects to cover or ignore.
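As a rough illustration of the request pattern and system prompt guidance, the sketch below builds (but does not send) a POST to /v1/responses using only the Python standard library. The body field names `input`, `tools`, and `instructions` are assumptions based on the Responses API and should be confirmed against the official reference; the helper name, the example question, and the placeholder key are hypothetical. The web search tool is listed explicitly as a conservative choice, even though the text above describes it as enabled by default.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/responses"


def build_research_request(question, api_key, model="o3-deep-research", instructions=None):
    """Build an unsent HTTP request for one research query (hypothetical helper)."""
    body = {
        "model": model,
        "input": question,  # the research question as the user message
        "tools": [{"type": "web_search_preview"}],  # listed explicitly (assumption)
    }
    if instructions:
        # Steers output format, depth, and source preferences.
        body["instructions"] = instructions
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_research_request(
    "Compare pricing of the three largest managed vector databases.",
    api_key="sk-PLACEHOLDER",
    model="o4-mini-deep-research",
    instructions="Return a report with a comparison table. Prefer vendor documentation.",
)
payload = json.loads(req.data)
print(req.get_method(), req.full_url)
print(payload["model"], payload["tools"])
```

Sending the request would be a call to `urllib.request.urlopen(req)` with a real key; it is left out here so the example runs offline.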

Response handling: The response object contains:

output: The final research report with citations
usage: Token counts (input, output, reasoning)
metadata: Sources list, processing time, search queries used
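To make the response handling concrete, here is a hedged sketch that pulls the report text, cited URLs, and token usage out of a response that has already been parsed into a dict. The nested shape used here (message items containing `output_text` parts with `url_citation` annotations) is an assumption about the Responses API payload, and the sample response is invented for illustration; adjust the keys to whatever your actual responses contain.

```python
def extract_report(response: dict):
    """Return (report_text, cited_urls, usage) from a parsed response dict."""
    texts, urls = [], []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning and tool-call items
        for part in item.get("content", []):
            if part.get("type") != "output_text":
                continue
            texts.append(part.get("text", ""))
            for ann in part.get("annotations", []):
                url = ann.get("url")
                if ann.get("type") == "url_citation" and url and url not in urls:
                    urls.append(url)  # keep first-seen order, drop duplicates
    return "\n".join(texts), urls, response.get("usage", {})


# Invented sample payload, for illustration only.
sample = {
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "message",
            "content": [
                {
                    "type": "output_text",
                    "text": "Vendor A is cheaper at small scale.",
                    "annotations": [
                        {"type": "url_citation", "url": "https://example.com/a"},
                        {"type": "url_citation", "url": "https://example.com/a"},
                        {"type": "url_citation", "url": "https://example.com/b"},
                    ],
                }
            ],
        },
    ],
    "usage": {"input_tokens": 42000, "output_tokens": 61000},
}

report, sources, usage = extract_report(sample)
print(report)
print(sources)
print(usage["input_tokens"], usage["output_tokens"])
```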

Batch processing: Deep Research queries are eligible for OpenAI's Batch API at 50% off. For non-time-sensitive research tasks, this brings the cost of o4-mini-deep-research queries down to approximately $0.20-$1.25 per query — making it viable for bulk research operations.
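A batch submission starts from a JSONL file with one request per line. The sketch below generates those lines and applies the 50% discount to the per-query range quoted above. The line format (`custom_id`, `method`, `url`, `body`) follows the Batch API's input file convention as I understand it, and the body fields and `research-NNNN` ID scheme are illustrative assumptions; uploading the file and creating the batch are not shown.

```python
import json


def to_batch_lines(questions, model="o4-mini-deep-research"):
    """Return one JSONL line per research question (hypothetical helper)."""
    lines = []
    for i, question in enumerate(questions):
        lines.append(json.dumps({
            "custom_id": f"research-{i:04d}",  # used to match results to inputs
            "method": "POST",
            "url": "/v1/responses",
            "body": {
                "model": model,
                "input": question,
                "tools": [{"type": "web_search_preview"}],
            },
        }))
    return lines


companies = ["Acme Corp", "Globex", "Initech"]
lines = to_batch_lines([f"Summarize the funding history of {c}." for c in companies])
print("\n".join(lines))

# 50% batch discount applied to the o4-mini per-query range from the text.
BATCH_DISCOUNT = 0.5
low, high = 0.40 * BATCH_DISCOUNT, 2.50 * BATCH_DISCOUNT
print(f"batched o4-mini range: ${low:.2f}-${high:.2f} per query")
```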

Deep Research vs Standard Chat Completions

100-1000× cost spread per query: GPT-5.4 standard at $0.001-$0.05, Deep Research at $1.50-$8. Deep Research is never a drop-in replacement — different API pattern, different latency (5-30 min vs 1-15s), different output (20-100K tokens vs 4-16K).

Dimension	Standard Completions (GPT-5.4)	Deep Research (o3)
Response Time	1-15 seconds	5-30 minutes
Web Access	No (unless using tools)	Built-in, autonomous
Sources Consulted	0 (training data only)	15-40 per query
Output Length	4K-16K tokens typical	20K-100K tokens typical
Cost per Query	$0.001-$0.05	$1.50-$8.00
Citations	None	Inline with URLs
Accuracy (Current Info)	Limited to training cutoff	Real-time web data
Best For	Quick responses, chat	Research, analysis, reports

The cost difference is 100-1000x per query. This means Deep Research is never a drop-in replacement for standard completions. It is a different product for a different job.

OpenAI Deep Research vs Perplexity Deep Research API

OpenAI is deeper (15-40 sources, 2-5K word reports, $1.50-$8/query); Perplexity is faster and cheaper (10-30 sources, 1-3K word reports, $0.50-$3/query). Pick OpenAI for analytical depth, Perplexity for factual lookups.

Perplexity offers its own deep research capability through its API. Here is how the two compare:

Feature	OpenAI Deep Research	Perplexity Deep Research
Underlying Model	o3 / o4-mini reasoning models	Proprietary (Sonar-based)
Research Depth	15-40 sources, multi-step	10-30 sources, iterative
Response Time	5-30 minutes	2-10 minutes
Output Length	2K-5K word reports	1K-3K word reports
Citation Quality	Inline with URLs	Inline with URLs
Pricing	$1.50-$8.00/query (o3)	$0.50-$3.00/query
Batch Discount	Yes (50% off via Batch API)	No
Custom System Prompts	Full control	Limited
API Maturity	Newer	More mature for search

The key difference: OpenAI Deep Research produces longer, more comprehensive reports with deeper analysis. Perplexity Deep Research is faster, cheaper, and better suited for factual lookups that need current data.

TokenMix.ai's comparative testing shows OpenAI Deep Research is stronger for analytical tasks — market comparisons, technical due diligence, multi-angle analysis. Perplexity excels at factual research — "what is the current state of X" — where speed matters more than depth.

Cost comparison for 100 research queries/month:

Provider	Model	Per Query	Monthly (100 queries)
OpenAI	o3-deep-research	~$4.00 avg	$400
OpenAI	o4-mini-deep-research	~$1.20 avg	$120
OpenAI	o4-mini (Batch, 50% off)	~$0.60 avg	$60
Perplexity	Deep Research	~$1.50 avg	$150

With OpenAI's Batch API discount, o4-mini-deep-research becomes the cheapest deep research option at approximately $0.60 per query.
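The monthly figures in the table are simply average cost times query count, which is easy to re-run for your own volume. A minimal sketch using the per-query averages from the table; the function and dictionary names are illustrative.

```python
# Average per-query costs (USD) taken from the comparison table above.
AVG_COST_PER_QUERY = {
    "OpenAI o3-deep-research": 4.00,
    "OpenAI o4-mini-deep-research": 1.20,
    "OpenAI o4-mini-deep-research (Batch)": 1.20 * 0.5,  # 50% batch discount
    "Perplexity Deep Research": 1.50,
}


def monthly_cost(avg_per_query: float, queries_per_month: int) -> float:
    """Projected monthly spend in USD."""
    return avg_per_query * queries_per_month


for name, avg in AVG_COST_PER_QUERY.items():
    print(f"{name:38s} ${monthly_cost(avg, 100):>9,.2f}")

cheapest = min(AVG_COST_PER_QUERY, key=AVG_COST_PER_QUERY.get)
print("cheapest:", cheapest)

# Heavier usage: 100 o3 queries per day over a 30-day month.
heavy = monthly_cost(4.00, 100 * 30)
print(f"100 o3 queries/day: ${heavy:,.0f}/month")
```

The last line shows how quickly the o3 model adds up at daily volumes: 100 queries per day at the $4 average comes to $12,000 over a 30-day month.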

Cost Analysis: When Deep Research Saves Money

A DIY research agent costs $1.15-$2.15/query in API alone plus 40-80 hours of engineering; Deep Research o4-mini at $0.40-$2.50 is competitive on API cost and eliminates engineering overhead, paying back within month 1 for most teams.

Deep Research seems expensive at $1.50-$8.00 per query. But compare it to the alternative: manually building a research pipeline.

DIY research agent cost (per query):

Multiple GPT-5.4 calls for planning: ~$0.15
15-20 web search API calls: ~$0.30-$0.50
15-20 page scraping and processing calls: ~$0.50-$1.00
Synthesis and report generation: ~$0.10-$0.30
Error handling and retries: ~$0.10-$0.20
Total DIY cost: $1.15-$2.15 per query (plus engineering time)

Deep Research o4-mini at $0.40-$2.50 per query is competitive with DIY approaches on pure API cost. When you factor in the engineering time to build and maintain a custom research pipeline — TokenMix.ai estimates 40-80 hours for a production-quality system — Deep Research pays for itself within the first month for most teams.
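The DIY estimate can be checked by summing the line items listed above. In this sketch the component ranges are copied from the list, with the single-value planning estimate used as both its low and high bound; the variable names are illustrative. Summed this way, the low ends come to $1.15 and the high ends to $2.15.

```python
# (low, high) per-query estimates in USD, copied from the DIY cost list.
DIY_COMPONENTS = {
    "planning calls": (0.15, 0.15),
    "web search API calls": (0.30, 0.50),
    "page scraping and processing": (0.50, 1.00),
    "synthesis and report generation": (0.10, 0.30),
    "error handling and retries": (0.10, 0.20),
}

diy_low = sum(low for low, _ in DIY_COMPONENTS.values())
diy_high = sum(high for _, high in DIY_COMPONENTS.values())

# o4-mini-deep-research per-query range quoted in the text.
MINI_LOW, MINI_HIGH = 0.40, 2.50

# Two ranges overlap when each one starts before the other ends.
ranges_overlap = diy_low <= MINI_HIGH and MINI_LOW <= diy_high

print(f"DIY API cost per query: ${diy_low:.2f}-${diy_high:.2f}")
print(f"o4-mini per query:      ${MINI_LOW:.2f}-${MINI_HIGH:.2f}")
print("ranges overlap:", ranges_overlap)
```

The overlap is what "competitive on pure API cost" means here: neither option is strictly cheaper per query, so the 40-80 hours of engineering time becomes the deciding factor at low volume.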

Best Use Cases for the Deep Research API

Use o3-deep-research for high-stakes due diligence and competitive analysis ($4-8/query); use o4-mini-deep-research for content research and bulk lead research ($0.40-$2.50/query, $0.20-$1.25 batched). Skip for quick fact lookups — use Perplexity.

Use Case	Recommended Model	Typical Cost	Why Deep Research
Competitive analysis reports	o3-deep-research	$4-$8	Comprehensive multi-source synthesis
Market research summaries	o4-mini-deep-research	$1-$2.50	Good enough quality, 70% cheaper
Technical due diligence	o3-deep-research	$5-$8	Needs highest accuracy and depth
Content research for articles	o4-mini-deep-research	$0.40-$1.50	Cost-effective source gathering
Patent/literature reviews	o3-deep-research	$5-$8	Multi-source cross-referencing
Batch lead research (Batch API)	o4-mini (batch)	$0.20-$1.25	50% discount for non-urgent
Quick fact checking	Skip — use Perplexity	$0.50-$1.00	Deep Research is overkill

How to Choose: Deep Research API or DIY Agent?

Use Deep Research API at <500 queries/month (no DIY ROI), build a custom agent at >2000/month (scale economics flip), use Perplexity for real-time factual lookups, hybrid for everything in between.

Your Situation	Recommendation	Reasoning
Need research reports, <500/month	OpenAI Deep Research API	Not worth building custom pipeline
Need research reports, >2000/month	DIY agent on TokenMix.ai	Scale economics favor custom pipeline
Need real-time factual lookups	Perplexity API	Faster, cheaper for factual queries
Need research + custom processing	DIY agent with Deep Research for pre-research	Hybrid approach: Deep Research gathers, your agent processes
Budget under $100/month	o4-mini-deep-research + Batch API	50% batch discount keeps costs viable
Need customized output format	DIY agent	Deep Research output format has limited customization

Limitations and Trade-offs

Five hard constraints: 5-30 min latency (no real-time), $4 average burns $12K/month at 100 queries/day, web source quality varies, no guaranteed coverage of paywalled sources, strict 5-20 concurrent rate limits.

Latency is a hard constraint. 5-30 minutes per query means Deep Research cannot serve real-time user requests. It is strictly for async workflows.

Cost accumulates fast. At $4 average per o3 query, 100 queries/day is $12,000/month. Monitor usage carefully. TokenMix.ai's cost tracking dashboard shows Deep Research as its own line item to prevent budget surprises.

Source quality varies. The model accesses the open web. It can (and does) cite low-quality sources alongside authoritative ones. Always validate critical findings.

No guaranteed coverage. The model decides which sources to consult. It may miss relevant sources that exist behind paywalls, require authentication, or are not indexed by search engines.

Rate limits are strict. Deep Research has lower rate limits than standard completions — typically 5-20 concurrent requests depending on your tier.
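One common way to stay under a concurrency limit is to gate submissions with a semaphore. The sketch below uses `asyncio.Semaphore` with a stand-in coroutine in place of a real research call; the limit of 5 is taken from the low end of the range above, and `fake_research` and `run_bounded` are hypothetical names.

```python
import asyncio


async def run_bounded(jobs, limit):
    """Run zero-argument coroutine factories with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job):
        async with sem:  # waits here while `limit` jobs are already running
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


# Stand-in for a real research call; records how many run at once.
stats = {"in_flight": 0, "peak": 0}


async def fake_research(i):
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # pretend to wait on the API
    stats["in_flight"] -= 1
    return f"report-{i}"


jobs = [lambda i=i: fake_research(i) for i in range(12)]
results = asyncio.run(run_bounded(jobs, limit=5))
print(len(results), "reports, peak concurrency", stats["peak"])
```

`asyncio.gather` returns results in submission order, so `results[i]` corresponds to the i-th query even though the jobs finish at different times.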

What's the Bottom Line on Deep Research API?

Deep Research replaces 40-80 hours of custom agent development with a single API call. o4-mini-deep-research + Batch (about $0.60/query on average) is the rational default for under 500 queries/month; build custom above 2,000.

OpenAI Deep Research is a new API paradigm, not just another model. For research-heavy workflows (competitive analysis, market research, literature reviews) it replaces 40-80 hours of custom agent development with a single API call. The o4-mini variant with Batch API pricing brings average per-query costs down to about $0.60, making it accessible for moderate-volume use cases.

The decision framework is straightforward: if you need fewer than 500 research queries per month, use the Deep Research API directly. If you need more than 2,000, consider building a custom pipeline using lower-cost models through TokenMix.ai's unified API for the bulk processing, with Deep Research reserved for complex queries that justify the premium. In between, a hybrid of the two is the reasonable middle ground.

Real-time pricing and availability data for Deep Research models — alongside 155+ other models — is tracked on TokenMix.ai.

FAQ

How much does OpenAI Deep Research API cost per query?

A single Deep Research query costs $1.50-$8.00 on o3-deep-research and $0.40-$2.50 on o4-mini-deep-research. The cost depends on query complexity, which determines how many tokens are consumed for reasoning and output generation. Using the Batch API reduces these costs by 50%.

Is the Perplexity deep research API cheaper than OpenAI's?

Perplexity Deep Research costs approximately $0.50-$3.00 per query, which is cheaper than OpenAI's o3-deep-research ($1.50-$8.00). However, OpenAI's o4-mini-deep-research with Batch API discount ($0.20-$1.25) is cheaper than Perplexity for non-time-sensitive queries.

Can I use Deep Research for real-time applications?

No. Deep Research queries take 5-30 minutes to complete. It is designed for asynchronous workflows where the user submits a research request and retrieves results later. For real-time information needs, use Perplexity's standard search API or build a custom retrieval pipeline.

What is the difference between o3-deep-research and o4-mini-deep-research?

o3-deep-research uses the full o3 reasoning model — producing more comprehensive reports with deeper analysis but at higher cost ($2.50/$15 per 1M tokens). o4-mini-deep-research uses the smaller o4-mini model — 70% cheaper ($0.75/$4.50) with moderately shorter reports and slightly less analytical depth. For most use cases, o4-mini provides sufficient quality.

Does Deep Research work with OpenAI's Batch API?

Yes. Deep Research queries are eligible for the Batch API, which provides a 50% discount on all token costs. Batch queries have a 24-hour completion window. This makes Deep Research viable for bulk research operations — for example, researching 100 companies at $0.60 per query instead of $1.20.

How does Deep Research compare to building my own research agent?

A production-quality DIY research agent costs approximately $1.15-$2.15 per query in API costs alone, plus 40-80 hours of engineering time to build and maintain. Deep Research (o4-mini) at $0.40-$2.50 per query is cost-competitive on API spend and eliminates the engineering overhead. For volumes above 2,000 queries per month, a custom pipeline may become more economical.

Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: TokenMix.ai Model Tracker, OpenAI Deep Research Documentation, Perplexity API Docs