OpenAI Deep Research API in 2026: How It Works, What It Costs, and Perplexity Comparison

TokenMix Research Lab · 2026-04-07

OpenAI Deep Research API Guide: How to Use o3-deep-research and o4-mini-deep-research (2026)

Deep Research from OpenAI is the first production API that lets you run multi-step, autonomous research tasks through a single API call. Instead of getting one response, the model browses the web, reads documents, synthesizes findings, and returns a structured report — all within one request. TokenMix.ai has been tracking Deep Research API usage patterns across our user base since launch, and the data is clear: this is not a chatbot feature dressed up as research. It is a fundamentally different API pattern with different pricing, latency, and use-case fit.

This guide covers how the OpenAI Deep Research API works, how it compares to Perplexity's deep research offering, real pricing calculations, and when it makes sense to use it versus standard completion APIs.

[What Is OpenAI Deep Research?]
[Deep Research API: Available Models and Pricing]
[How the Deep Research API Works]
[Code Example Concepts: Using Deep Research via API]
[Deep Research vs Standard Chat Completions]
[OpenAI Deep Research vs Perplexity Deep Research API]
[Cost Analysis: When Deep Research Saves Money]
[Best Use Cases for the Deep Research API]
[How to Choose: Deep Research vs DIY Agent]
[Limitations and Trade-offs]
[Conclusion]
[FAQ]

---

What Is OpenAI Deep Research?

OpenAI Deep Research is an agentic research capability built on top of the o-series reasoning models. When you submit a query, the model does not just generate text from its training data. It autonomously plans a research strategy, browses the web for current information, reads and analyzes multiple sources, cross-references findings, and produces a cited report.

The key difference from standard completions: Deep Research executes a multi-step workflow that can take 5-30 minutes per query. It is designed for complex questions that require synthesizing information from multiple sources — market research, competitive analysis, literature reviews, technical due diligence.

TokenMix.ai monitoring data shows that a typical Deep Research query processes 15-40 web sources and produces a report of 2,000-5,000 words.

---

Deep Research API: Available Models and Pricing

Two models are currently available for Deep Research via the OpenAI API:

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Max Output | Context Window | Typical Query Cost |
| --- | --- | --- | --- | --- | --- |
| o3-deep-research | $2.50 | $15.00 | 100K tokens | 1.1M tokens | $1.50-$8.00 |
| o4-mini-deep-research | $0.75 | $4.50 | 100K tokens | 400K tokens | $0.40-$2.50 |

**Why are individual queries so expensive?** Deep Research queries consume far more tokens than standard completions. A single research query typically involves:

- 10,000-50,000 input tokens (your query + system instructions + internal reasoning)
- 20,000-80,000 output tokens (the reasoning chain + final report)
- Additional token consumption from internal tool calls (web browsing, document reading)

At $15/M output tokens on o3-deep-research, an 80K output token query costs $1.20 in output alone. Add input tokens and you are at $1.50-$8.00 per query depending on complexity.
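The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that uses only the per-token prices from the pricing table; the function name `estimate_query_cost` is illustrative, and the calculation covers plain input and output tokens only, not the tool-call consumption listed above.

```python
def estimate_query_cost(input_tokens: int, output_tokens: int,
                        input_price_per_m: float,
                        output_price_per_m: float) -> float:
    """Return the token cost in USD for one query.

    Prices are expressed in USD per 1M tokens, as in the pricing table.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return round(input_cost + output_cost, 4)


# o3-deep-research prices from the table (USD per 1M tokens).
O3_INPUT, O3_OUTPUT = 2.50, 15.00

# 80K output tokens alone.
output_only = estimate_query_cost(0, 80_000, O3_INPUT, O3_OUTPUT)

# Top of the listed token ranges: 50K input + 80K output tokens.
top_of_ranges = estimate_query_cost(50_000, 80_000, O3_INPUT, O3_OUTPUT)

print(f"output only: ${output_only:.2f}, top of ranges: ${top_of_ranges:.2f}")
```

Note that 50K input plus 80K output tokens comes to roughly $1.33, so the upper part of the quoted $1.50-$8.00 range presumably reflects the additional tokens consumed by internal tool calls.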

The [o4-mini](https://tokenmix.ai/blog/openai-o4-mini-o3-pro)-deep-research model cuts this cost by roughly 70% with only moderate quality reduction, based on TokenMix.ai's comparative testing.

---

How the Deep Research API Works

Deep Research uses the Responses API (not the older Chat Completions API). The workflow is fundamentally different from standard completion requests:

**Step 1: Submit research query.** You send a prompt describing what you want researched. The model interprets the intent and generates an internal research plan.

**Step 2: Autonomous execution.** The model uses built-in tools — web search, page reading, citation extraction — to gather information. This step runs for 5-30 minutes depending on query complexity.

**Step 3: Synthesis and output.** The model synthesizes findings into a structured report with citations. The response includes the final report and metadata about sources consulted.

The key architectural detail: Deep Research is inherently asynchronous. You submit a request and poll for completion, or use streaming to receive partial updates. This is not a synchronous request-response pattern.

API Request Structure (Conceptual)

A Deep Research API call uses the SDK's `responses.create` method (the `POST /v1/responses` endpoint) with the `o3-deep-research` or `o4-mini-deep-research` model specified. You provide your research query as the user message, and optionally include system instructions to guide the research scope, output format, or depth level.

The response includes the model's reasoning summary, the final research report with inline citations, and metadata such as sources consulted, tokens used, and processing time.

Streaming and Polling

Because Deep Research queries take minutes, not seconds, you have two options:

**Polling:** Submit the request, receive a response ID, then poll the status endpoint until the research is complete. This is simpler to implement.

**Streaming:** Open a streaming connection and receive incremental updates as the model progresses through its research. You get status messages like "searching for X," "reading source Y," and partial report sections as they are generated. This provides a better user experience for interactive applications.
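The polling option can be sketched as a small loop. This is an illustration under stated assumptions, not the official client: `fetch_status` is a hypothetical callable that you supply, and the status strings are assumptions modeled on the values the Responses API reports for background requests, so check them against the current API reference.

```python
import time
from typing import Callable

# Assumed terminal states; verify against the current API reference.
TERMINAL_STATES = {"completed", "failed", "cancelled", "incomplete"}


def poll_until_done(fetch_status: Callable[[], dict],
                    interval_s: float = 15.0,
                    timeout_s: float = 45 * 60,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() until the response reaches a terminal state.

    fetch_status is a hypothetical callable returning the latest response
    object as a dict with at least a "status" key. The sleep function is
    injectable so the loop can be tested without real waiting.
    """
    waited = 0.0
    while True:
        response = fetch_status()
        if response.get("status") in TERMINAL_STATES:
            return response
        if waited >= timeout_s:
            raise TimeoutError(f"research still running after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

With the official Python SDK, `fetch_status` could plausibly be something like `lambda: client.responses.retrieve(response_id).model_dump()`. The 45-minute default timeout is an arbitrary choice that leaves headroom above the 5-30 minute range quoted above.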

---

Code Example Concepts: Using Deep Research via API

The Deep Research API follows OpenAI's Responses API pattern. Here are the key concepts:

**Basic request pattern:**

```
POST /v1/responses
Model: o3-deep-research (or o4-mini-deep-research)
Input: Your research question as user message
Tools: web_search_preview (declare it in the request's tools list)
```

**System prompt guidance:** You can steer the research by including a system message that specifies the output format (bullet points, report, table), depth level (surface scan vs. comprehensive), source preferences (academic, industry, news), and specific aspects to cover or ignore.
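As a rough illustration of the request pattern and system prompt guidance, the sketch below assembles a request body as a plain dictionary. The helper name `build_research_request` is hypothetical, and the field names (`instructions`, `tools`, `background`) are assumptions based on the Responses API; confirm them against the current documentation before relying on them.

```python
import json
from typing import Optional


def build_research_request(question: str,
                           instructions: Optional[str] = None,
                           model: str = "o4-mini-deep-research",
                           background: bool = True) -> dict:
    """Assemble a request body for POST /v1/responses (assumed field names)."""
    payload = {
        "model": model,
        "input": question,
        # The web search tool is listed explicitly rather than relied on
        # as a default.
        "tools": [{"type": "web_search_preview"}],
        # Assumed flag for running the request asynchronously.
        "background": background,
    }
    if instructions:
        # System-level guidance: output format, depth, source preferences.
        payload["instructions"] = instructions
    return payload


request_body = build_research_request(
    "Compare pricing models of the three largest managed vector databases.",
    instructions=(
        "Return a report with a comparison table. "
        "Prefer vendor documentation and industry sources over news."
    ),
)
print(json.dumps(request_body, indent=2))
```

With the official Python SDK, a dictionary like this could plausibly be passed as `client.responses.create(**request_body)`.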

**Response handling:** The response object contains:

- `output`: The final research report with citations
- `usage`: Token counts (input, output, reasoning)
- `metadata`: Sources list, processing time, search queries used
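To make response handling concrete, here is a minimal sketch that pulls the report text and cited URLs out of a response represented as a dictionary. The nested shape it assumes (message items containing `output_text` blocks with `url_citation` annotations) is an assumption about the payload, and `extract_report` is an illustrative name; inspect a real response before depending on this layout.

```python
from typing import List, Tuple


def extract_report(response: dict) -> Tuple[str, List[str]]:
    """Return (report_text, unique_source_urls) from a response dict."""
    parts: List[str] = []
    urls: List[str] = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning and tool-call items
        for block in item.get("content", []):
            if block.get("type") != "output_text":
                continue
            parts.append(block.get("text", ""))
            for note in block.get("annotations", []):
                url = note.get("url")
                if note.get("type") == "url_citation" and url and url not in urls:
                    urls.append(url)
    return "\n".join(parts), urls


# Hand-written sample in the assumed shape, not real API output.
sample = {
    "output": [
        {"type": "reasoning", "summary": []},
        {"type": "message", "content": [
            {"type": "output_text", "text": "Report body.", "annotations": [
                {"type": "url_citation", "url": "https://example.com/a"},
                {"type": "url_citation", "url": "https://example.com/b"},
            ]},
        ]},
    ],
}
report, sources = extract_report(sample)
print(report, sources)
```

Deduplicating the URLs keeps the source list short when the report cites the same page several times.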

**Batch processing:** Deep Research queries are eligible for OpenAI's [Batch API](https://tokenmix.ai/blog/openai-batch-api-pricing) at 50% off. For non-time-sensitive research tasks, this brings the cost of o4-mini-deep-research queries down to approximately $0.20-$1.25 per query — making it viable for bulk research operations.

---

Deep Research vs Standard Chat Completions

| Dimension | Standard Completions (GPT-5.4) | Deep Research (o3) |
| --- | --- | --- |
| Response Time | 1-15 seconds | 5-30 minutes |
| Web Access | No (unless using tools) | Built-in, autonomous |
| Sources Consulted | 0 (training data only) | 15-40 per query |
| Output Length | 4K-16K tokens typical | 20K-100K tokens typical |
| Cost per Query | $0.001-$0.05 | $1.50-$8.00 |
| Citations | None | Inline with URLs |
| Accuracy (Current Info) | Limited to training cutoff | Real-time web data |
| Best For | Quick responses, chat | Research, analysis, reports |

The cost difference is roughly 100-1000x per query for typical workloads. This means Deep Research is never a drop-in replacement for standard completions. It is a different product for a different job.

---

OpenAI Deep Research vs Perplexity Deep Research API

Perplexity offers its own deep research capability through its API. Here is how the two compare:

| Feature | OpenAI Deep Research | Perplexity Deep Research |
| --- | --- | --- |
| Underlying Model | o3 / o4-mini reasoning models | Proprietary (Sonar-based) |
| Research Depth | 15-40 sources, multi-step | 10-30 sources, iterative |
| Response Time | 5-30 minutes | 2-10 minutes |
| Output Length | 2K-5K word reports | 1K-3K word reports |
| Citation Quality | Inline with URLs | Inline with URLs |
| Pricing | $1.50-$8.00/query (o3) | $0.50-$3.00/query |
| Batch Discount | Yes (50% off via Batch API) | No |
| Custom System Prompts | Full control | Limited |
| API Maturity | Newer | More mature for search |

**The key difference:** OpenAI Deep Research produces longer, more comprehensive reports with deeper analysis. Perplexity Deep Research is faster, cheaper, and better suited for factual lookups that need current data.

TokenMix.ai's comparative testing shows OpenAI Deep Research is stronger for analytical tasks — market comparisons, technical due diligence, multi-angle analysis. Perplexity excels at factual research — "what is the current state of X" — where speed matters more than depth.

**Cost comparison for 100 research queries/month:**

| Provider | Model | Per Query | Monthly (100 queries) |
| --- | --- | --- | --- |
| OpenAI | o3-deep-research | ~$4.00 avg | $400 |
| OpenAI | o4-mini-deep-research | ~$1.20 avg | $120 |
| OpenAI | o4-mini (Batch, 50% off) | ~$0.60 avg | $60 |
| Perplexity | Deep Research | ~$1.50 avg | $150 |

With OpenAI's Batch API discount, o4-mini-deep-research becomes the cheapest deep research option at approximately $0.60 per query.
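The monthly figures in the table follow directly from the average per-query costs. The sketch below recomputes them so you can substitute your own volume. It uses only the averages and the 50% batch discount quoted in this article; the dictionary keys and function name are illustrative.

```python
# Average per-query costs in USD, taken from the comparison table.
AVG_COST_PER_QUERY = {
    "o3-deep-research": 4.00,
    "o4-mini-deep-research": 1.20,
    "perplexity-deep-research": 1.50,
}
BATCH_DISCOUNT = 0.50
# Per the feature comparison, only the OpenAI models have a batch discount.
BATCH_ELIGIBLE = {"o3-deep-research", "o4-mini-deep-research"}


def monthly_cost(option: str, queries_per_month: int, batch: bool = False) -> float:
    """Return the estimated monthly spend in USD for one option."""
    per_query = AVG_COST_PER_QUERY[option]
    if batch:
        if option not in BATCH_ELIGIBLE:
            raise ValueError(f"no batch discount listed for {option}")
        per_query *= 1 - BATCH_DISCOUNT
    return round(per_query * queries_per_month, 2)


for name in AVG_COST_PER_QUERY:
    print(f"{name}: ${monthly_cost(name, 100):.2f}")
print(f"o4-mini-deep-research (batch): ${monthly_cost('o4-mini-deep-research', 100, batch=True):.2f}")
```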

---

Cost Analysis: When Deep Research Saves Money

Deep Research seems expensive at $1.50-$8.00 per query. But compare it to the alternative: manually building a research pipeline.

**DIY research agent cost (per query):**

- Multiple [GPT-5.4](https://tokenmix.ai/blog/gpt-5-api-pricing) calls for planning: ~$0.15
- 15-20 web search API calls: ~$0.30-$0.50
- 15-20 page scraping and processing calls: ~$0.50-$1.00
- Synthesis and report generation: ~$0.10-$0.30
- Error handling and retries: ~$0.10-$0.20
- **Total DIY cost: $1.15-$2.15 per query** (plus engineering time)

At $0.40-$2.50 per query, o4-mini-deep-research is competitive with DIY approaches on pure API cost. Once you factor in the engineering time to build and maintain a custom research pipeline (TokenMix.ai estimates 40-80 hours for a production-quality system), Deep Research pays for itself within the first month for most teams.
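One way to reason about this trade-off is a break-even calculation: how many queries a DIY pipeline must serve before its lower per-query cost recoups the build effort. The sketch below is illustrative only. The hourly engineering rate is a hypothetical input that does not come from this article, and the function name is made up for the example.

```python
from typing import Optional


def breakeven_queries(build_hours: float, hourly_rate: float,
                      managed_cost: float, diy_cost: float) -> Optional[float]:
    """Queries needed before a DIY pipeline recoups its build cost.

    Returns None when the DIY per-query cost is not lower than the managed
    cost, because the build effort is then never recovered.
    """
    saving_per_query = managed_cost - diy_cost
    if saving_per_query <= 0:
        return None
    return build_hours * hourly_rate / saving_per_query


# Hypothetical: 40 build hours at an assumed $100/hour, compared against the
# ~$4.00 o3-deep-research average and the $1.15 low end of the DIY estimate.
vs_o3 = breakeven_queries(40, 100.0, managed_cost=4.00, diy_cost=1.15)

# Against the ~$1.20 o4-mini-deep-research average, the saving is only $0.05
# per query, so the break-even volume becomes very large.
vs_o4_mini = breakeven_queries(40, 100.0, managed_cost=1.20, diy_cost=1.15)

print(vs_o3, vs_o4_mini)
```

Under these assumed inputs, the DIY route needs on the order of 1,400 queries to break even against o3-deep-research and far more against o4-mini-deep-research. Ongoing maintenance time is not modeled and would push both figures higher.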

---

Best Use Cases for the Deep Research API

| Use Case | Recommended Model | Typical Cost | Why Deep Research |
| --- | --- | --- | --- |
| Competitive analysis reports | o3-deep-research | $4-$8 | Comprehensive multi-source synthesis |
| Market research summaries | o4-mini-deep-research | $1-$2.50 | Good enough quality, 70% cheaper |
| Technical due diligence | o3-deep-research | $5-$8 | Needs highest accuracy and depth |
| Content research for articles | o4-mini-deep-research | $0.40-$1.50 | Cost-effective source gathering |
| Patent/literature reviews | o3-deep-research | $5-$8 | Multi-source cross-referencing |
| Batch lead research (Batch API) | o4-mini (batch) | $0.20-$1.25 | 50% discount for non-urgent |
| Quick fact checking | Skip (use Perplexity) | $0.50-$1.00 | Deep Research is overkill |

---

How to Choose: Deep Research vs DIY Agent

| Your Situation | Recommendation | Reasoning |
| --- | --- | --- |
| Need research reports, <500/month | OpenAI Deep Research API | Not worth building custom pipeline |
| Need research reports, >2000/month | DIY agent on TokenMix.ai | Scale economics favor custom pipeline |
| Need real-time factual lookups | Perplexity API | Faster, cheaper for factual queries |
| Need research + custom processing | DIY agent with Deep Research for pre-research | Hybrid approach: Deep Research gathers, your agent processes |
| Budget under $100/month | o4-mini-deep-research + Batch API | 50% batch discount keeps costs viable |
| Need customized output format | DIY agent | Deep Research output format has limited customization |

---

Limitations and Trade-offs

**Latency is a hard constraint.** 5-30 minutes per query means Deep Research cannot serve real-time user requests. It is strictly for async workflows.

**Cost accumulates fast.** At $4 average per o3 query, 100 queries/day is $12,000/month. Monitor usage carefully. TokenMix.ai's cost tracking dashboard shows Deep Research as its own line item to prevent budget surprises.

**Source quality varies.** The model accesses the open web. It can (and does) cite low-quality sources alongside authoritative ones. Always validate critical findings.

**No guaranteed coverage.** The model decides which sources to consult. It may miss relevant sources that exist behind paywalls, require authentication, or are not indexed by search engines.

**Rate limits are strict.** Deep Research has lower rate limits than standard completions — typically 5-20 concurrent requests depending on your tier.
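A common way to stay under a concurrency ceiling like this is to gate requests with a semaphore. The sketch below shows the pattern with Python's `asyncio`; `run_with_limit` is an illustrative helper, the jobs are placeholder coroutines rather than real API calls, and the limit of 5 is simply the low end of the range quoted above.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_with_limit(jobs: List[Callable[[], Awaitable[str]]],
                         max_concurrent: int = 5) -> List[str]:
    """Run zero-argument coroutine functions with a cap on concurrency."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job: Callable[[], Awaitable[str]]) -> str:
        # At most max_concurrent jobs hold the semaphore at any moment.
        async with semaphore:
            return await job()

    # gather preserves the order of the submitted jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def placeholder_query() -> str:
    """Stand-in for a real Deep Research request."""
    await asyncio.sleep(0.01)
    return "report"


if __name__ == "__main__":
    reports = asyncio.run(run_with_limit([placeholder_query] * 12))
    print(len(reports), "reports finished")
```

In a real pipeline each job would submit a request and wait for its completion, so the semaphore bounds how many research tasks are in flight at once.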

---

Conclusion

OpenAI Deep Research is a new API paradigm, not just another model. For research-heavy workflows such as competitive analysis, market research, and literature reviews, it replaces 40-80 hours of custom agent development with a single API call. The o4-mini variant with Batch API pricing brings the average per-query cost down to about $0.60, making it accessible for moderate-volume use cases.

The decision framework is straightforward: if you need fewer than 500 research queries per month, use the Deep Research API directly. If you need more, consider building a custom pipeline using lower-cost models through TokenMix.ai's unified API for the bulk processing, with Deep Research reserved for complex queries that justify the premium.

Real-time pricing and availability data for Deep Research models — alongside 155+ other models — is tracked on [TokenMix.ai](https://tokenmix.ai).

---

FAQ

How much does OpenAI Deep Research API cost per query?

A single Deep Research query costs $1.50-$8.00 on o3-deep-research and $0.40-$2.50 on o4-mini-deep-research. The cost depends on query complexity, which determines how many tokens are consumed for reasoning and output generation. Using the Batch API reduces these costs by 50%.

Is the Perplexity deep research API cheaper than OpenAI's?

Perplexity Deep Research costs approximately $0.50-$3.00 per query, which is cheaper than OpenAI's o3-deep-research ($1.50-$8.00). However, OpenAI's o4-mini-deep-research with Batch API discount ($0.20-$1.25) is cheaper than Perplexity for non-time-sensitive queries.

Can I use Deep Research for real-time applications?

No. Deep Research queries take 5-30 minutes to complete. It is designed for asynchronous workflows where the user submits a research request and retrieves results later. For real-time information needs, use Perplexity's standard search API or build a custom retrieval pipeline.

What is the difference between o3-deep-research and o4-mini-deep-research?

o3-deep-research uses the full o3 reasoning model — producing more comprehensive reports with deeper analysis but at higher cost ($2.50/$15 per 1M tokens). o4-mini-deep-research uses the smaller o4-mini model — 70% cheaper ($0.75/$4.50) with moderately shorter reports and slightly less analytical depth. For most use cases, o4-mini provides sufficient quality.

Does Deep Research work with OpenAI's Batch API?

Yes. Deep Research queries are eligible for the Batch API, which provides a 50% discount on all token costs. Batch queries have a 24-hour completion window. This makes Deep Research viable for bulk research operations — for example, researching 100 companies at $0.60 per query instead of $1.20.

How does Deep Research compare to building my own research agent?

A production-quality DIY research agent costs approximately $1.15-$2.15 per query in API costs alone, plus 40-80 hours of engineering time to build and maintain. Deep Research (o4-mini) at $0.40-$2.50 per query is cost-competitive on API spend and eliminates the engineering overhead. For volumes above 2,000 queries per month, a custom pipeline may become more economical.

---

*Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: [TokenMix.ai Model Tracker](https://tokenmix.ai), [OpenAI Deep Research Documentation](https://platform.openai.com/docs), [Perplexity API Docs](https://docs.perplexity.ai)*