TokenMix Research Lab · 2026-04-07

OpenAI Deep Research API 2026: $1.50-$8 per Query, 15-40 Sources

OpenAI Deep Research API Guide: How to Use o3-deep-research and o4-mini-deep-research (2026)

Last Updated: 2026-04-29
Author: TokenMix Research Lab

OpenAI Deep Research API runs autonomous 5-30 minute research workflows: o3-deep-research at $1.50-$8/query (best depth), o4-mini-deep-research at $0.40-$2.50 (70% cheaper). Batch API cuts another 50%.

Deep Research from OpenAI is the first production API that lets you run multi-step, autonomous research tasks through a single API call. Instead of getting one response, the model browses the web, reads documents, synthesizes findings, and returns a structured report — all within one request. TokenMix.ai has been tracking Deep Research API usage patterns across our user base since launch, and the data is clear: this is not a chatbot feature dressed up as research. It is a fundamentally different API pattern with different pricing, latency, and use-case fit.

This guide covers how the OpenAI Deep Research API works, how it compares to Perplexity's deep research offering, real pricing calculations, and when it makes sense to use it versus standard completion APIs.


What Is OpenAI Deep Research?

Deep Research is an agentic API on top of o-series reasoning: it autonomously plans, browses 15-40 web sources, and synthesizes a 2,000-5,000 word cited report in 5-30 minutes per query.

OpenAI Deep Research is an agentic research capability built on top of the o-series reasoning models. When you submit a query, the model does not just generate text from its training data. It autonomously plans a research strategy, browses the web for current information, reads and analyzes multiple sources, cross-references findings, and produces a cited report.

The key difference from standard completions: Deep Research executes a multi-step workflow that can take 5-30 minutes per query. It is designed for complex questions that require synthesizing information from multiple sources — market research, competitive analysis, literature reviews, technical due diligence.

TokenMix.ai monitoring data shows the average Deep Research query processes 15-40 web sources and generates 2,000-5,000 word reports.


Deep Research API: Available Models and Pricing

Two models: o3-deep-research at $2.50/$15/M ($1.50-$8/query, 1.1M context) and o4-mini-deep-research at $0.75/$4.50/M ($0.40-$2.50/query, 400K context). Batch API gives 50% off both.

Two models are currently available for Deep Research via the OpenAI API:

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Max Output | Context Window | Typical Query Cost |
| --- | --- | --- | --- | --- | --- |
| o3-deep-research | $2.50 | $15.00 | 100K tokens | 1.1M | $1.50-$8.00 |
| o4-mini-deep-research | $0.75 | $4.50 | 100K tokens | 400K | $0.40-$2.50 |

Why are individual queries so expensive? Deep Research queries consume far more tokens than standard completions. A single research query typically involves:

At $15/M output tokens on o3-deep-research, an 80K output token query costs $1.20 in output alone. Add input tokens and you are at $1.50-$8.00 per query depending on complexity.

The o4-mini-deep-research model cuts this cost by roughly 70% with only moderate quality reduction, based on TokenMix.ai's comparative testing.
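The per-query arithmetic above can be sketched in a few lines. This is a minimal estimate that uses only the per-1M-token prices quoted in this article; the function name `estimate_query_cost` and the 400K input-token figure are illustrative assumptions, not measured values.

```python
# Per-1M-token prices (USD) as quoted in this article; hard-coded, not fetched.
PRICES_PER_MILLION = {
    "o3-deep-research": {"input": 2.50, "output": 15.00},
    "o4-mini-deep-research": {"input": 0.75, "output": 4.50},
}


def estimate_query_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one query from its token counts."""
    price = PRICES_PER_MILLION[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


# The article's example: 80K output tokens on o3-deep-research, output only.
print(f"${estimate_query_cost('o3-deep-research', 0, 80_000):.2f}")  # $1.20

# Same output plus a hypothetical 400K input tokens of fetched page content.
print(f"${estimate_query_cost('o3-deep-research', 400_000, 80_000):.2f}")  # $2.20
print(f"${estimate_query_cost('o4-mini-deep-research', 400_000, 80_000):.2f}")  # $0.66
```

With identical token counts the o4-mini figure is 30% of the o3 figure, which matches the roughly 70% saving quoted above.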


How the Deep Research API Works

Three-step async workflow: submit query, autonomous tool execution (web search, page reading, citations), synthesis output. Use polling for simplicity, streaming for interactive UX with progress updates.

Deep Research uses the Responses API (not the older Chat Completions API). The workflow is fundamentally different from standard completion requests:

Step 1: Submit research query. You send a prompt describing what you want researched. The model interprets the intent and generates an internal research plan.

Step 2: Autonomous execution. The model uses built-in tools — web search, page reading, citation extraction — to gather information. This step runs for 5-30 minutes depending on query complexity.

Step 3: Synthesis and output. The model synthesizes findings into a structured report with citations. The response includes the final report and metadata about sources consulted.

The key architectural detail: Deep Research is inherently asynchronous. You submit a request and poll for completion, or use streaming to receive partial updates. This is not a synchronous request-response pattern.

API Request Structure (Conceptual)

A Deep Research API call uses the responses.create endpoint with the o3-deep-research or o4-mini-deep-research model specified. You provide your research query as the user message, and optionally include system instructions to guide the research scope, output format, or depth level.

The response includes the model's reasoning summary, the final research report with inline citations, and metadata such as sources consulted, tokens used, and processing time.
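As a rough illustration of reading such a response, the sketch below pulls the report text and cited URLs out of a dict-shaped response. It assumes the Responses API layout of `output` items of type `message`, `output_text` content parts, and `url_citation` annotations; verify those field names against the current API reference. The helper `extract_report` and the sample data are hypothetical.

```python
def extract_report(response):
    """Return (report_text, cited_urls) from a dict-shaped response."""
    text_parts, urls = [], []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning summaries and tool-call items
        for part in item.get("content", []):
            if part.get("type") != "output_text":
                continue
            text_parts.append(part.get("text", ""))
            for annotation in part.get("annotations", []):
                url = annotation.get("url")
                # Keep each cited URL once, in order of first appearance.
                if annotation.get("type") == "url_citation" and url and url not in urls:
                    urls.append(url)
    return "".join(text_parts), urls


# Hand-written sample in the assumed shape; not real API output.
sample = {
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "message",
            "content": [
                {
                    "type": "output_text",
                    "text": "Vendor A leads on price.",
                    "annotations": [
                        {"type": "url_citation", "url": "https://example.com/a"},
                        {"type": "url_citation", "url": "https://example.com/a"},
                    ],
                }
            ],
        },
    ]
}

report, sources = extract_report(sample)
print(report)
print(sources)
```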

Streaming and Polling

Because Deep Research queries take minutes, not seconds, you have two options:

Polling: Submit the request, receive a response ID, then poll the status endpoint until the research is complete. This is simpler to implement.

Streaming: Open a streaming connection and receive incremental updates as the model progresses through its research. You get status messages like "searching for X," "reading source Y," and partial report sections as they are generated. This provides a better user experience for interactive applications.
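The polling option can be sketched as a small loop. The status lookup is injected as a function so the example runs without network access; the status names (`queued`, `in_progress`, `completed`, and so on) and the helper `poll_until_done` are assumptions for illustration and should be checked against the API reference.

```python
import time

# Assumed terminal status values; check them against the current API reference.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}


def poll_until_done(fetch_status, response_id, interval_s=30.0,
                    timeout_s=45 * 60, sleep=time.sleep):
    """Call fetch_status(response_id) every interval_s seconds until a terminal status."""
    waited = 0.0
    while True:
        status = fetch_status(response_id)
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"{response_id} still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for a real status lookup: two pending states, then completion.
fake_statuses = iter(["queued", "in_progress", "completed"])
final = poll_until_done(lambda rid: next(fake_statuses), "resp_demo",
                        sleep=lambda seconds: None)
print(final)  # completed
```

With the official Python SDK, `fetch_status` could plausibly be something like `lambda rid: client.responses.retrieve(rid).status`. A 30-second interval is a reasonable starting point given that queries run for minutes.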


Code Example Concepts: Using Deep Research via API

API uses POST /v1/responses with model = o3-deep-research; web_search_preview is declared as a tool on the request; system prompts steer output format and depth; Batch API drops cost 50% for non-urgent research.

The Deep Research API follows OpenAI's Responses API pattern. Here are the key concepts:

Basic request pattern:

POST /v1/responses
Model: o3-deep-research (or o4-mini-deep-research)
Input: Your research question as user message
Tools: web_search_preview (declare it explicitly; OpenAI's documentation requires Deep Research requests to include at least one data-source tool such as web search)

System prompt guidance: You can steer the research by including a system message that specifies the output format (bullet points, report, table), depth level (surface scan vs. comprehensive), source preferences (academic, industry, news), and specific aspects to cover or ignore.
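The request pattern and system-prompt steering described above could be assembled as follows. This is a sketch that only builds the request body and sends nothing; the field names (`model`, `input`, `instructions`, `tools`, `background`) follow the Responses API as commonly documented, and the helper `build_research_request` is hypothetical.

```python
import json

DEEP_RESEARCH_MODELS = {"o3-deep-research", "o4-mini-deep-research"}


def build_research_request(query, model="o4-mini-deep-research", instructions=None):
    """Build a JSON body for POST /v1/responses (hypothetical helper)."""
    if model not in DEEP_RESEARCH_MODELS:
        raise ValueError(f"not a Deep Research model: {model}")
    body = {
        "model": model,
        "input": query,
        # Declared explicitly so the request does not depend on defaults.
        "tools": [{"type": "web_search_preview"}],
        # Assumption: background mode lets a long-running request be polled later.
        "background": True,
    }
    if instructions:
        # System-level guidance: output format, depth, source preferences.
        body["instructions"] = instructions
    return body


body = build_research_request(
    "Compare the top three vector databases for production RAG workloads.",
    instructions="Write a comprehensive report with a summary table. "
                 "Prefer vendor documentation and independent benchmarks.",
)
print(json.dumps(body, indent=2))
```

With the official Python SDK, a body like this could be passed as `client.responses.create(**body)`, assuming a configured client.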

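Response handling: The response object contains the model's reasoning summary, the final research report with inline citations, and metadata such as sources consulted, tokens used, and processing time.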

Batch processing: Deep Research queries are eligible for OpenAI's Batch API at 50% off. For non-time-sensitive research tasks, this brings the cost of o4-mini-deep-research queries down to approximately $0.20-$1.25 per query — making it viable for bulk research operations.
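For bulk runs, Batch API input is a JSONL file with one request per line. The sketch below generates such lines, assuming the standard batch request envelope (`custom_id`, `method`, `url`, `body`) and, per this article, that Deep Research models are accepted on `/v1/responses` in batch mode. The `research-N` ID scheme and the company names are made up for illustration.

```python
import json


def batch_lines(queries, model="o4-mini-deep-research"):
    """Yield one JSONL line per research query in the batch request envelope."""
    for index, query in enumerate(queries):
        yield json.dumps({
            "custom_id": f"research-{index}",  # used to match results to inputs
            "method": "POST",
            "url": "/v1/responses",
            "body": {
                "model": model,
                "input": query,
                "tools": [{"type": "web_search_preview"}],
            },
        })


companies = ["Acme Corp", "Globex", "Initech"]  # hypothetical lead list
lines = list(batch_lines(f"Summarize recent funding and product news for {name}."
                         for name in companies))
print("\n".join(lines))
```

The resulting lines would be written to a file, uploaded, and submitted as a batch job; results arrive within the batch completion window instead of minutes.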


Deep Research vs Standard Chat Completions

100-1000× cost spread per query: GPT-5.4 standard at $0.001-$0.05, Deep Research at $1.50-$8. Deep Research is never a drop-in replacement — different API pattern, different latency (5-30 min vs 1-15s), different output (20-100K tokens vs 4-16K).

| Dimension | Standard Completions (GPT-5.4) | Deep Research (o3) |
| --- | --- | --- |
| Response Time | 1-15 seconds | 5-30 minutes |
| Web Access | No (unless using tools) | Built-in, autonomous |
| Sources Consulted | 0 (training data only) | 15-40 per query |
| Output Length | 4K-16K tokens typical | 20K-100K tokens typical |
| Cost per Query | $0.001-$0.05 | $1.50-$8.00 |
| Citations | None | Inline with URLs |
| Accuracy (Current Info) | Limited to training cutoff | Real-time web data |
| Best For | Quick responses, chat | Research, analysis, reports |

The cost difference is 100-1000x per query. This means Deep Research is never a drop-in replacement for standard completions. It is a different product for a different job.


OpenAI Deep Research vs Perplexity Deep Research API

OpenAI is deeper (15-40 sources, 2-5K word reports, $1.50-$8/query); Perplexity is faster and cheaper (10-30 sources, 1-3K word reports, $0.50-$3/query). Pick OpenAI for analytical depth, Perplexity for factual lookups.

Perplexity offers its own deep research capability through its API. Here is how the two compare:

| Feature | OpenAI Deep Research | Perplexity Deep Research |
| --- | --- | --- |
| Underlying Model | o3 / o4-mini reasoning models | Proprietary (Sonar-based) |
| Research Depth | 15-40 sources, multi-step | 10-30 sources, iterative |
| Response Time | 5-30 minutes | 2-10 minutes |
| Output Length | 2K-5K word reports | 1K-3K word reports |
| Citation Quality | Inline with URLs | Inline with URLs |
| Pricing | $1.50-$8.00/query (o3) | $0.50-$3.00/query |
| Batch Discount | Yes (50% off via Batch API) | No |
| Custom System Prompts | Full control | Limited |
| API Maturity | Newer | More mature for search |

The key difference: OpenAI Deep Research produces longer, more comprehensive reports with deeper analysis. Perplexity Deep Research is faster, cheaper, and better suited for factual lookups that need current data.

TokenMix.ai's comparative testing shows OpenAI Deep Research is stronger for analytical tasks — market comparisons, technical due diligence, multi-angle analysis. Perplexity excels at factual research — "what is the current state of X" — where speed matters more than depth.

Cost comparison for 100 research queries/month:

| Provider | Model | Per Query | Monthly (100 queries) |
| --- | --- | --- | --- |
| OpenAI | o3-deep-research | ~$4.00 avg | $400 |
| OpenAI | o4-mini-deep-research | ~$1.20 avg | $120 |
| OpenAI | o4-mini (Batch, 50% off) | ~$0.60 avg | $60 |
| Perplexity | Deep Research | ~$1.50 avg | $150 |

With OpenAI's Batch API discount, o4-mini-deep-research becomes the cheapest deep research option at approximately $0.60 per query.
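The monthly figures follow directly from the average per-query costs. A small sketch that reproduces them and can be rerun for other volumes; the averages are this article's estimates, and `monthly_costs` is an illustrative helper.

```python
# Average cost per query (USD) as estimated in this article.
AVG_COST_PER_QUERY = {
    "OpenAI o3-deep-research": 4.00,
    "OpenAI o4-mini-deep-research": 1.20,
    "OpenAI o4-mini-deep-research (Batch)": 0.60,
    "Perplexity Deep Research": 1.50,
}


def monthly_costs(queries_per_month):
    """Return the projected monthly cost per option, rounded to cents."""
    return {name: round(avg * queries_per_month, 2)
            for name, avg in AVG_COST_PER_QUERY.items()}


costs = monthly_costs(100)
cheapest = min(costs, key=costs.get)
for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}")
print("Cheapest:", cheapest)
```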


Cost Analysis: When Deep Research Saves Money

A DIY research agent costs $1.15-$2/query in API alone plus 40-80 hours of engineering. Deep Research o4-mini at $0.40-$2.50 is competitive on API cost and eliminates engineering overhead, paying back within month 1 for most teams.

Deep Research seems expensive at $1.50-$8.00 per query. But compare it to the alternative: manually building a research pipeline.

DIY research agent cost (per query): approximately $1.15-$2.00 in API costs alone, before engineering time.

Deep Research o4-mini at $0.40-$2.50 per query is competitive with DIY approaches on pure API cost. When you factor in the engineering time to build and maintain a custom research pipeline — TokenMix.ai estimates 40-80 hours for a production-quality system — Deep Research pays for itself within the first month for most teams.
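One way to reason about this trade-off is a break-even volume: how many queries a DIY pipeline must serve before its per-query savings repay the build cost. The sketch below is illustrative only. The hourly rate and the per-query prices in the example are hypothetical inputs, not TokenMix.ai data.

```python
import math


def break_even_queries(engineering_hours, hourly_rate, hosted_cost, diy_cost):
    """Queries needed before a DIY pipeline's per-query savings repay its build cost.

    Returns None when the DIY pipeline is not cheaper per query, since the
    engineering cost is then never recovered.
    """
    savings_per_query = hosted_cost - diy_cost
    if savings_per_query <= 0:
        return None
    return math.ceil(engineering_hours * hourly_rate / savings_per_query)


# Hypothetical inputs: 40 engineering hours at $100/hour, Deep Research at
# $2.00/query versus a DIY agent at $1.50/query.
print(break_even_queries(40, 100, 2.00, 1.50))  # 8000

# If the hosted API is already cheaper per query, DIY never breaks even.
print(break_even_queries(40, 100, 1.20, 1.50))  # None
```

Under those assumed inputs, a team running 500 queries per month would need 16 months to recover the build cost, which is why low-volume teams rarely benefit from a custom pipeline.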


Best Use Cases for the Deep Research API

Use o3-deep-research for high-stakes due diligence and competitive analysis ($4-8/query); use o4-mini-deep-research for content research and bulk lead research ($0.40-$2.50/query, $0.20-$1.25 batched). Skip for quick fact lookups — use Perplexity.

| Use Case | Recommended Model | Typical Cost | Why Deep Research |
| --- | --- | --- | --- |
| Competitive analysis reports | o3-deep-research | $4-$8 | Comprehensive multi-source synthesis |
| Market research summaries | o4-mini-deep-research | $1-$2.50 | Good enough quality, 70% cheaper |
| Technical due diligence | o3-deep-research | $5-$8 | Needs highest accuracy and depth |
| Content research for articles | o4-mini-deep-research | $0.40-$1.50 | Cost-effective source gathering |
| Patent/literature reviews | o3-deep-research | $5-$8 | Multi-source cross-referencing |
| Batch lead research (Batch API) | o4-mini (batch) | $0.20-$1.25 | 50% discount for non-urgent |
| Quick fact checking | Skip; use Perplexity | $0.50-$1.00 | Deep Research is overkill |

How to Choose: Deep Research API or DIY Agent?

Use Deep Research API at <500 queries/month (no DIY ROI), build a custom agent at >2000/month (scale economics flip), use Perplexity for real-time factual lookups, hybrid for everything in between.

| Your Situation | Recommendation | Reasoning |
| --- | --- | --- |
| Need research reports, <500/month | OpenAI Deep Research API | Not worth building custom pipeline |
| Need research reports, >2000/month | DIY agent on TokenMix.ai | Scale economics favor custom pipeline |
| Need real-time factual lookups | Perplexity API | Faster, cheaper for factual queries |
| Need research + custom processing | DIY agent with Deep Research for pre-research | Hybrid approach: Deep Research gathers, your agent processes |
| Budget under $100/month | o4-mini-deep-research + Batch API | 50% batch discount keeps costs viable |
| Need customized output format | DIY agent | Deep Research output format has limited customization |
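The decision table can be expressed as a simple rule chain. This is a sketch only: the thresholds and recommendations come from the table, but the order in which the rules are checked is an assumption, and the function `recommend` is hypothetical.

```python
def recommend(queries_per_month, realtime_lookups=False, custom_processing=False,
              custom_output=False, monthly_budget=None):
    """Map a situation to the table's recommendation (rule order is an assumption)."""
    if realtime_lookups:
        return "Perplexity API"
    if custom_output:
        return "DIY agent"
    if custom_processing:
        return "DIY agent with Deep Research for pre-research"
    if monthly_budget is not None and monthly_budget < 100:
        return "o4-mini-deep-research + Batch API"
    if queries_per_month < 500:
        return "OpenAI Deep Research API"
    if queries_per_month > 2000:
        return "DIY agent on TokenMix.ai"
    # Between 500 and 2000 queries/month the article suggests a hybrid setup.
    return "Hybrid: Deep Research for complex queries, custom pipeline for the rest"


print(recommend(200))
print(recommend(5000))
print(recommend(100, monthly_budget=80))
```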

Limitations and Trade-offs

Five hard constraints: 5-30 min latency (no real-time), $4 average burns $12K/month at 100 queries/day, web source quality varies, no guaranteed coverage of paywalled sources, strict 5-20 concurrent rate limits.

Latency is a hard constraint. 5-30 minutes per query means Deep Research cannot serve real-time user requests. It is strictly for async workflows.

Cost accumulates fast. At $4 average per o3 query, 100 queries/day is $12,000/month. Monitor usage carefully. TokenMix.ai's cost tracking dashboard shows Deep Research as its own line item to prevent budget surprises.
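The budget arithmetic is worth scripting before committing to a daily volume. A minimal sketch, assuming a 30-day month; the $3,000 budget in the second example is hypothetical.

```python
def projected_monthly_spend(avg_cost_per_query, queries_per_day, days=30):
    """Projected monthly spend in USD at a steady daily query volume."""
    return avg_cost_per_query * queries_per_day * days


def max_queries_per_day(monthly_budget, avg_cost_per_query, days=30):
    """Largest whole number of daily queries that stays within the monthly budget."""
    return int(monthly_budget // (avg_cost_per_query * days))


# The article's example: $4 average per o3 query at 100 queries/day.
print(projected_monthly_spend(4.00, 100))  # 12000.0

# Working backwards from a hypothetical $3,000 monthly budget.
print(max_queries_per_day(3000, 4.00))  # 25
```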

Source quality varies. The model accesses the open web. It can (and does) cite low-quality sources alongside authoritative ones. Always validate critical findings.

No guaranteed coverage. The model decides which sources to consult. It may miss relevant sources that exist behind paywalls, require authentication, or are not indexed by search engines.

Rate limits are strict. Deep Research has lower rate limits than standard completions — typically 5-20 concurrent requests depending on your tier.
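A common way to stay under a concurrency cap is a semaphore around each request. The sketch below uses `asyncio.Semaphore` with a simulated research call so it runs offline; the cap of 5 is the low end of the range quoted above, and `run_with_limit` and `fake_research` are illustrative names.

```python
import asyncio


async def run_with_limit(factories, max_concurrent=5):
    """Run zero-argument coroutine factories with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(factory):
        async with semaphore:
            return await factory()

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(f) for f in factories))


async def _demo():
    in_flight = 0
    peak = 0

    async def fake_research(index):
        # Stand-in for a real Deep Research request; tracks concurrency.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return index

    factories = [lambda i=i: fake_research(i) for i in range(12)]
    results = await run_with_limit(factories, max_concurrent=5)
    return results, peak


results, peak = asyncio.run(_demo())
print(results, "peak concurrency:", peak)
```

In a real client, each factory would submit one research request and wait for its result, so the semaphore bounds how many are open at once.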


What's the Bottom Line on Deep Research API?

Deep Research replaces 40-80 hours of custom agent development with a single API call. o4-mini-deep-research + Batch ($0.60/query) is the rational default for under 500 queries/month; build custom above 2,000.

OpenAI Deep Research is a new API paradigm, not just another model. For research-heavy workflows (competitive analysis, market research, literature reviews) it replaces 40-80 hours of custom agent development with a single API call. The o4-mini variant with Batch API pricing brings per-query costs down to $0.60, making it accessible for moderate-volume use cases.

The decision framework is straightforward: if you need fewer than 500 research queries per month, use the Deep Research API directly. If you need more, consider building a custom pipeline using lower-cost models through TokenMix.ai's unified API for the bulk processing, with Deep Research reserved for complex queries that justify the premium.

Real-time pricing and availability data for Deep Research models — alongside 155+ other models — is tracked on TokenMix.ai.


FAQ

How much does OpenAI Deep Research API cost per query?

A single Deep Research query costs $1.50-$8.00 on o3-deep-research and $0.40-$2.50 on o4-mini-deep-research. The cost depends on query complexity, which determines how many tokens are consumed for reasoning and output generation. Using the Batch API reduces these costs by 50%.

Is the Perplexity deep research API cheaper than OpenAI's?

Perplexity Deep Research costs approximately $0.50-$3.00 per query, which is cheaper than OpenAI's o3-deep-research ($1.50-$8.00). However, OpenAI's o4-mini-deep-research with Batch API discount ($0.20-$1.25) is cheaper than Perplexity for non-time-sensitive queries.

Can I use Deep Research for real-time applications?

No. Deep Research queries take 5-30 minutes to complete. It is designed for asynchronous workflows where the user submits a research request and retrieves results later. For real-time information needs, use Perplexity's standard search API or build a custom retrieval pipeline.

What is the difference between o3-deep-research and o4-mini-deep-research?

o3-deep-research uses the full o3 reasoning model — producing more comprehensive reports with deeper analysis but at higher cost ($2.50/$15 per 1M tokens). o4-mini-deep-research uses the smaller o4-mini model — 70% cheaper ($0.75/$4.50) with moderately shorter reports and slightly less analytical depth. For most use cases, o4-mini provides sufficient quality.

Does Deep Research work with OpenAI's Batch API?

Yes. Deep Research queries are eligible for the Batch API, which provides a 50% discount on all token costs. Batch queries have a 24-hour completion window. This makes Deep Research viable for bulk research operations — for example, researching 100 companies at $0.60 per query instead of $1.20.

How does Deep Research compare to building my own research agent?

A production-quality DIY research agent costs approximately $1.15-$2.00 per query in API costs alone, plus 40-80 hours of engineering time to build and maintain. Deep Research (o4-mini) at $0.40-$2.50 per query is cost-competitive on API spend and eliminates the engineering overhead. For volumes above 2,000 queries per month, a custom pipeline may become more economical.


Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: TokenMix.ai Model Tracker, OpenAI Deep Research Documentation, Perplexity API Docs