TokenMix Research Lab · 2026-04-13

Call AI API in Python: One Code for 5 Providers (2026 Guide)

How to Call AI API in Python: Universal Code for OpenAI, Anthropic, Google, DeepSeek, and Groq (2026)

Last Updated: 2026-04-29
Author: TokenMix Research Lab

One Python pattern works with nearly every major AI provider. The OpenAI SDK has become the de facto standard: OpenAI, DeepSeek, Groq, and others all accept the same API format, so the same `openai` package can call them. Anthropic and Google ship their own SDKs, which follow a similar pattern with different method and field names. This tutorial gives you working Python code for all five providers, explains the key parameters, and shows how to call any AI API through a single TokenMix.ai endpoint. All code tested and verified as of April 2026.

Quick Reference: 5 Providers in 5 Code Snippets
Prerequisites: What You Need Before Starting
The Universal Pattern: How All AI API Calls Work
Provider 1: OpenAI (GPT Models)
Provider 2: Anthropic (Claude Models)
Provider 3: Google (Gemini Models)
Provider 4: DeepSeek
Provider 5: Groq (Llama and Open Models)
The Universal Approach: All Providers via TokenMix.ai
Key Parameters Explained
Streaming Responses in Python
Error Handling Best Practices
How to Choose Which Provider to Call
Conclusion
FAQ

Quick Reference: 5 Providers in 5 Code Snippets

Provider	SDK Package	Base URL	Model Example
OpenAI	`openai`	`https://api.openai.com/v1`	`gpt-4.1-mini`
Anthropic	`anthropic`	`https://api.anthropic.com`	`claude-sonnet-4-20250514`
Google	`google-generativeai`	Google AI Studio	`gemini-2.0-flash`
DeepSeek	`openai` (compatible)	`https://api.deepseek.com`	`deepseek-chat`
Groq	`openai` (compatible)	`https://api.groq.com/openai/v1`	`llama-4-scout-17b-16e-instruct`
TokenMix.ai	`openai` (compatible)	`https://api.tokenmix.ai/v1`	Any model from any provider

Prerequisites: What You Need Before Starting

Required:

Python 3.8 or later. Check with python --version.
pip package manager. Comes with Python.
An API key from at least one provider. See our DeepSeek API key tutorial for a step-by-step guide.

Install the SDKs:

# Install all provider SDKs at once
pip install openai anthropic google-generativeai

# Or install only what you need
pip install openai          # Works for OpenAI, DeepSeek, Groq, TokenMix.ai
pip install anthropic       # For Anthropic Claude
pip install google-generativeai  # For Google Gemini

Set up your API keys as environment variables:

# Add to your .env file or export directly
export OPENAI_API_KEY="sk-your-openai-key"
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"
export GOOGLE_API_KEY="your-google-key"
export DEEPSEEK_API_KEY="sk-your-deepseek-key"
export GROQ_API_KEY="gsk_your-groq-key"
export TOKENMIX_API_KEY="your-tokenmix-key"

Never hardcode API keys in your source code. Use environment variables or a secrets manager.
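One way to enforce this rule is a small helper that reads a key from the environment and fails early with a clear message when it is missing. This is a minimal sketch; the `require_env` name is hypothetical and not part of any SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Fail before any API call is attempted, naming the missing variable
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With a helper like this, a client could be created as `OpenAI(api_key=require_env("OPENAI_API_KEY"))`, so a missing key surfaces as a readable error rather than an authentication failure later on.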

The Universal Pattern: How All AI API Calls Work

Every AI API call in Python follows the same four-step pattern, regardless of provider.

# Step 1: Import and initialize the client
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))  # Read the key from the environment

# Step 2: Define your messages
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Python?"}
]

# Step 3: Make the API call
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=messages,
    max_tokens=300,
    temperature=0.7
)

# Step 4: Extract the response
answer = response.choices[0].message.content
tokens_used = response.usage.total_tokens
print(answer)
print(f"Tokens used: {tokens_used}")

This pattern works for OpenAI, DeepSeek, Groq, and any OpenAI-compatible provider. Anthropic and Google use slightly different syntax but the same conceptual flow.
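Since the OpenAI-compatible providers differ only in API key and base URL, those two values can live in a small lookup table. The sketch below uses the base URLs and environment variable names listed earlier in this guide; the `PROVIDERS` table and the `client_kwargs` helper are hypothetical names chosen for illustration.

```python
import os

# Base URLs and key variable names as listed earlier in this guide.
PROVIDERS = {
    "openai": {"base_url": "https://api.openai.com/v1", "key_env": "OPENAI_API_KEY"},
    "deepseek": {"base_url": "https://api.deepseek.com", "key_env": "DEEPSEEK_API_KEY"},
    "groq": {"base_url": "https://api.groq.com/openai/v1", "key_env": "GROQ_API_KEY"},
    "tokenmix": {"base_url": "https://api.tokenmix.ai/v1", "key_env": "TOKENMIX_API_KEY"},
}


def client_kwargs(provider: str) -> dict:
    """Build keyword arguments for OpenAI(...) for an OpenAI-compatible provider."""
    try:
        config = PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider}") from None
    return {
        "api_key": os.environ.get(config["key_env"]),
        "base_url": config["base_url"],
    }
```

A client for any of these providers could then be created with `OpenAI(**client_kwargs("groq"))`, leaving the rest of the four-step pattern unchanged.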

Provider 1: OpenAI (GPT Models)

OpenAI is the most widely used AI API provider. Their SDK sets the standard that other providers follow.

Installation: pip install openai

Complete example:

import os
from openai import OpenAI

# Initialize client
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# Simple chat completion
response = client.chat.completions.create(
    model="gpt-4.1-mini",  # Budget model, great for most tasks
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain list comprehension in Python."}
    ],
    max_tokens=300,
    temperature=0.3
)

print(response.choices[0].message.content)
print(f"Input tokens: {response.usage.prompt_tokens}")
print(f"Output tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")

Available models:

Model	Best For	Input Price
`gpt-5.4`	Complex reasoning, best quality	$2.50/M
`gpt-4.1`	Strong general purpose	$2.00/M
`gpt-4.1-mini`	Best value, most tasks	$0.40/M
`gpt-4.1-nano`	Simple tasks, lowest cost	$0.10/M
`o4-mini`	Reasoning-heavy tasks	$1.10/M

Provider 2: Anthropic (Claude Models)

Anthropic uses its own SDK with a different syntax. The messages structure is similar but the client initialization and response format differ.

Installation: pip install anthropic

Complete example:

import os
import anthropic

# Initialize client
client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

# Chat completion
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=300,
    system="You are a concise technical assistant.",  # System prompt is separate
    messages=[
        {"role": "user", "content": "Explain list comprehension in Python."}
    ]
)

print(response.content[0].text)
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")

Key differences from OpenAI:

System prompt is a separate parameter, not in the messages array
Response text is in response.content[0].text, not response.choices[0].message.content
Usage is input_tokens and output_tokens, not prompt_tokens and completion_tokens
The method is client.messages.create(), not client.chat.completions.create()
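If you already have OpenAI-style message lists, the first difference can be bridged with a small conversion step. The sketch below splits the system message out of the list so it can be passed as the separate `system` parameter; `split_system_prompt` is a hypothetical helper name, not part of the Anthropic SDK.

```python
from typing import Dict, List, Tuple


def split_system_prompt(
    messages: List[Dict[str, str]],
) -> Tuple[str, List[Dict[str, str]]]:
    """Separate system messages from the rest of an OpenAI-style message list."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    # Multiple system messages are joined into one prompt string
    return "\n\n".join(system_parts), chat


openai_style = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Explain list comprehension in Python."},
]
system, chat = split_system_prompt(openai_style)
print(system)
print(chat)
```

The two results map onto the call shown above as `client.messages.create(model=..., max_tokens=300, system=system, messages=chat)`. When there is no system message the helper returns an empty string, in which case it is probably cleaner to leave the `system` argument out.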

Available models:

Model	Best For	Input Price
`claude-opus-4-20250514`	Best quality, complex tasks	$15.00/M
`claude-sonnet-4-20250514`	Strong general purpose	$3.00/M
`claude-3-5-haiku-20241022`	Fast, budget option	$0.80/M

Provider 3: Google (Gemini Models)

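Google offers two SDK options: the google-generativeai package for Google AI Studio, and the Vertex AI SDK for enterprise. Here we use the simpler AI Studio approach. Note that Google now recommends its newer google-genai SDK for new projects, so check the current Gemini documentation before building on google-generativeai long term.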

Installation: pip install google-generativeai

Complete example:

import os
import google.generativeai as genai

# Initialize
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Create model instance
model = genai.GenerativeModel(
    model_name="gemini-2.0-flash",
    system_instruction="You are a concise technical assistant."
)

# Generate response
response = model.generate_content("Explain list comprehension in Python.")

print(response.text)
print(f"Input tokens: {response.usage_metadata.prompt_token_count}")
print(f"Output tokens: {response.usage_metadata.candidates_token_count}")

Key differences from OpenAI:

Uses genai.configure() instead of client initialization
Creates a model object first, then calls generate_content()
System instruction is set at model creation, not per-request
Response text is response.text directly
Token counts are in response.usage_metadata
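Because each SDK reports token usage under different field names, code that logs usage across providers needs a small normalization step. The sketch below relies only on the field names shown in the examples in this guide; the `token_counts` helper and the provider labels are hypothetical, and the demo object is a stand-in rather than a real SDK response.

```python
from types import SimpleNamespace


def token_counts(provider: str, response) -> dict:
    """Return input/output token counts using each provider's field names."""
    if provider == "anthropic":
        usage = response.usage
        return {"input": usage.input_tokens, "output": usage.output_tokens}
    if provider == "google":
        meta = response.usage_metadata
        return {"input": meta.prompt_token_count, "output": meta.candidates_token_count}
    # OpenAI-compatible providers (OpenAI, DeepSeek, Groq)
    usage = response.usage
    return {"input": usage.prompt_tokens, "output": usage.completion_tokens}


# Stand-in object shaped like a Gemini response, for demonstration only
fake_gemini = SimpleNamespace(
    usage_metadata=SimpleNamespace(prompt_token_count=12, candidates_token_count=48)
)
print(token_counts("google", fake_gemini))
```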

Available models:

Model	Best For	Input Price
`gemini-3.1-pro`	Complex tasks, long context	$1.25/M
`gemini-2.0-flash`	Fast, budget, 1M context	$0.075/M

For a detailed Google vs OpenAI comparison, see our OpenAI vs Google AI API guide.

Provider 4: DeepSeek

DeepSeek uses an OpenAI-compatible API. You use the same openai Python package -- just change the base URL and API key.

Installation: pip install openai (same package as OpenAI)

Complete example:

import os
from openai import OpenAI

# Initialize with DeepSeek endpoint
client = OpenAI(
    api_key=os.environ.get("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com"
)

# Same syntax as OpenAI
response = client.chat.completions.create(
    model="deepseek-chat",  # DeepSeek V3
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain list comprehension in Python."}
    ],
    max_tokens=300,
    temperature=0.3
)

print(response.choices[0].message.content)
print(f"Total tokens: {response.usage.total_tokens}")

The code is identical to the OpenAI example except for three values: the API key, the base URL, and the model name. This is the main benefit of OpenAI-compatible APIs -- almost no learning curve.

Available models:

Model	Best For	Input Price
`deepseek-chat`	General purpose (V3)	$0.14/M
`deepseek-reasoner`	Complex reasoning (R1)	$0.55/M

For a complete setup guide, see our DeepSeek API key tutorial.

Provider 5: Groq (Llama and Open Models)

Groq hosts open-source models on custom LPU hardware for ultra-fast inference. Like DeepSeek, it uses the OpenAI-compatible format.

Installation: pip install openai (same package)

Complete example:

import os
from openai import OpenAI

# Initialize with Groq endpoint
client = OpenAI(
    api_key=os.environ.get("GROQ_API_KEY"),
    base_url="https://api.groq.com/openai/v1"
)

# Same syntax, different models
response = client.chat.completions.create(
    model="llama-4-scout-17b-16e-instruct",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain list comprehension in Python."}
    ],
    max_tokens=300,
    temperature=0.3
)

print(response.choices[0].message.content)
print(f"Total tokens: {response.usage.total_tokens}")

Available models on Groq:

Model	Best For	Speed
`llama-4-scout-17b-16e-instruct`	General purpose	Ultra-fast (~200ms TTFT)
`llama-4-maverick-17b-128e-instruct`	Complex tasks	Fast
`llama-3.3-70b-versatile`	Quality-focused	Fast

The Universal Approach: All Providers via TokenMix.ai

If you want to call any model from any provider through a single endpoint, TokenMix.ai provides a unified OpenAI-compatible API.

One client, any model:

import os
from openai import OpenAI

# Single client for all providers
client = OpenAI(
    api_key=os.environ.get("TOKENMIX_API_KEY"),
    base_url="https://api.tokenmix.ai/v1"
)

# Call OpenAI models
gpt_response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Hello from GPT!"}]
)

# Call DeepSeek models
ds_response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello from DeepSeek!"}]
)

# Call Google models
gemini_response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Hello from Gemini!"}]
)

# Call Anthropic models
claude_response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello from Claude!"}]
)

Why this matters:

One API key instead of five
One billing dashboard instead of five
Same code pattern for every model
Switch models by changing one string
Automatic failover if a provider goes down

TokenMix.ai supports 300+ models from all major providers. Check available models and pricing at TokenMix.ai.
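Because the model is selected by a single string, a client-side fallback across models is easy to sketch. The example below is an illustration only and is separate from any failover the service performs on its side; `first_successful` is a hypothetical helper, and the demo uses a fake call function in place of a real client.

```python
from typing import Callable, Iterable


def first_successful(models: Iterable[str], call: Callable[[str], str]) -> str:
    """Try each model name in order and return the first successful result."""
    errors = []
    for model in models:
        try:
            return call(model)
        except Exception as exc:  # Deliberately broad in this sketch
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All models failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    """Stand-in for a real API call; pretends the first model is unavailable."""
    if model == "gpt-4.1-mini":
        raise ConnectionError("provider unavailable")
    return f"response from {model}"


print(first_successful(["gpt-4.1-mini", "deepseek-chat"], fake_call))
```

With a real client, `call` could be a function that runs `client.chat.completions.create(model=model, messages=messages)` and returns `response.choices[0].message.content`. In production you would likely catch only the SDK's retryable exceptions rather than `Exception`.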

Key Parameters Explained

Every AI API call accepts these common parameters. Understanding them is essential for controlling quality and cost.

Parameter	What It Does	Recommended Values
`model`	Which AI model to use	See provider tables above
`messages`	Conversation history (system + user + assistant)	Always include system prompt
`max_tokens`	Maximum output length	Set based on expected response size
`temperature`	Randomness (0 = most deterministic, higher = more varied)	0-0.3 for factual, 0.7-1.0 for creative
`top_p`	Nucleus sampling (alternative to temperature)	Usually leave at 1.0
`stream`	Return tokens as they generate	`True` for chat UIs
`stop`	Stop sequences (halt generation at specific text)	Useful for structured output

Temperature guide:

Use Case	Temperature	Why
Code generation	0-0.2	Deterministic, consistent output
Data extraction	0	Exact, reproducible results
General Q&A	0.3-0.5	Balanced accuracy and naturalness
Creative writing	0.7-1.0	Varied, creative responses
Brainstorming	0.9-1.0	Maximum diversity
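The guide above can be encoded as a lookup so that call sites name a task instead of repeating magic numbers. This is a minimal sketch: the task labels and the `sampling_params` helper are hypothetical, and each value is simply the midpoint of the corresponding range in the table.

```python
# Midpoints of the ranges in the temperature guide; task labels are hypothetical.
TEMPERATURE_BY_TASK = {
    "code": 0.1,
    "extraction": 0.0,
    "qa": 0.4,
    "creative": 0.85,
    "brainstorm": 0.95,
}


def sampling_params(task: str, max_tokens: int = 300) -> dict:
    """Return temperature and max_tokens keyword arguments for a task label."""
    if task not in TEMPERATURE_BY_TASK:
        raise ValueError(f"Unknown task: {task}")
    return {"temperature": TEMPERATURE_BY_TASK[task], "max_tokens": max_tokens}


print(sampling_params("code"))
```

The result can be unpacked into a request, for example `client.chat.completions.create(model="gpt-4.1-mini", messages=messages, **sampling_params("code"))`.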

Streaming Responses in Python

Streaming returns tokens as they are generated instead of waiting for the full response. It is essential for chat interfaces, where users expect to see text appear immediately.

from openai import OpenAI

client = OpenAI()  # With no arguments, the SDK reads OPENAI_API_KEY from the environment

# Streaming response
stream = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a short poem about Python."}],
    stream=True
)

# Print tokens as they arrive
full_response = ""
for chunk in stream:
    # Skip chunks that carry no choices or no text
    if chunk.choices and chunk.choices[0].delta.content:
        token = chunk.choices[0].delta.content
        print(token, end="", flush=True)
        full_response += token

print()  # New line after stream ends

Streaming works with all OpenAI-compatible providers (OpenAI, DeepSeek, Groq, TokenMix.ai). For Anthropic, the syntax is slightly different:

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=300,
    messages=[{"role": "user", "content": "Write a short poem about Python."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
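To keep the consuming code the same for both styles, the OpenAI-style chunk loop can be wrapped in a generator that yields plain text, much like Anthropic's `text_stream`. The sketch below assumes only the chunk attributes used in the example above (`choices[0].delta.content`); `iter_text` is a hypothetical helper, and the fake chunks are stand-ins for real SDK objects.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(stream: Iterable) -> Iterator[str]:
    """Yield text deltas from OpenAI-style chunks, skipping empty ones."""
    for chunk in stream:
        if not chunk.choices:
            continue  # A chunk may carry no choices, e.g. a trailing usage chunk
        text = chunk.choices[0].delta.content
        if text:
            yield text


def fake_chunk(text):
    """Build a stand-in chunk with the same attribute shape as an SDK chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [
    fake_chunk("Hello"),
    SimpleNamespace(choices=[]),
    fake_chunk(None),
    fake_chunk(" world"),
]
print("".join(iter_text(fake_stream)))
```

With a real stream, `for text in iter_text(stream): print(text, end="", flush=True)` would read the same as the Anthropic loop.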

Error Handling Best Practices

Production code needs robust error handling. Here is a complete example.

import os
import time
from openai import OpenAI, RateLimitError, APIConnectionError, APIStatusError

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def call_ai_api(messages, model="gpt-4.1-mini", max_retries=3):
    """Call the API, retrying on rate limits, network errors, and 5xx responses."""

    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                max_tokens=500,
                temperature=0.3
            )
            return {
                "content": response.choices[0].message.content,
                "tokens": response.usage.total_tokens,
                "model": response.model
            }

        except RateLimitError:
            # Wait and retry with exponential backoff
            wait = 2 ** attempt
            print(f"Rate limited. Waiting {wait}s...")
            time.sleep(wait)

        except APIConnectionError:
            # Network issue -- retry
            print("Connection error. Retrying...")
            time.sleep(1)

        except APIStatusError as e:
            # APIStatusError carries the HTTP status code; the base APIError does not
            if e.status_code >= 500:
                # Server error -- retry
                time.sleep(2)
            else:
                # Client error (400, 401, etc.) -- do not retry
                raise

    raise RuntimeError(f"Failed after {max_retries} attempts")

# Usage
result = call_ai_api([
    {"role": "user", "content": "What is Python?"}
])
print(result["content"])

For comprehensive error handling including multi-provider failover, see our 429 error solutions guide.
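The retry loop above waits a fixed `2 ** attempt` seconds on rate limits. When many clients retry at once, a common refinement is to randomize the wait and cap it so retries do not arrive in lockstep. The sketch below shows one such variant, often called full jitter; the `backoff_delay` name and its default values are assumptions for illustration.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Return a random delay between 0 and min(cap, base * 2**attempt) seconds."""
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)


for attempt in range(4):
    print(f"attempt {attempt}: sleep up to {min(30.0, 2 ** attempt)}s, "
          f"chose {backoff_delay(attempt):.2f}s")
```

In the function above, `time.sleep(backoff_delay(attempt))` could replace `time.sleep(wait)`.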

How to Choose Which Provider to Call

Your Need	Best Provider	Best Model	Why
Cheapest possible	DeepSeek	`deepseek-chat`	$0.14/M input
Best free tier	Google	`gemini-2.0-flash`	No credit card needed
Best coding	Anthropic	`claude-sonnet-4`	Top SWE-bench scores
Fastest inference	Groq	`llama-4-scout`	200ms first token
Largest ecosystem	OpenAI	`gpt-4.1-mini`	Most tools and tutorials
One API for everything	TokenMix.ai	Any model	300+ models, one key

For a detailed provider comparison, see our guide on choosing the right LLM API.

Conclusion

Calling an AI API in Python follows the same pattern across all providers: initialize a client, define messages, call the API, extract the response. The OpenAI SDK works directly with OpenAI, DeepSeek, and Groq. Anthropic and Google have their own SDKs with minor syntax differences.

The fastest path to using all providers: install the openai package, point it at TokenMix.ai's endpoint, and switch between 300+ models by changing a single string. One API key, one bill, any model from any provider.

Get started with any provider today. Check real-time model availability and pricing at TokenMix.ai.

FAQ

What is the easiest AI API to call from Python?

OpenAI is the easiest to start with because it has the largest number of tutorials and community resources. The core code is only a few statements: import the SDK, initialize the client, call the API, and print the response. DeepSeek and Groq use the identical code pattern (same SDK, different base URL), so they are equally easy once you know the OpenAI pattern.

Do I need different Python packages for each AI provider?

No. The openai package works for OpenAI, DeepSeek, Groq, and any OpenAI-compatible provider (including TokenMix.ai). You only need anthropic for Claude and google-generativeai for Gemini if calling them directly. Through TokenMix.ai, the openai package accesses all providers.

How do I handle API keys securely in Python?

Store API keys as environment variables and read them with os.environ.get("KEY_NAME"). Never hardcode keys in source files. For production, use a secrets manager (AWS Secrets Manager, HashiCorp Vault, or similar). Add .env to your .gitignore to prevent accidental commits.

What is the difference between streaming and non-streaming API calls?

Non-streaming waits for the complete response before returning. Streaming returns tokens as they are generated, allowing real-time display. Use streaming for chat interfaces (better UX) and non-streaming for batch processing (simpler code). Streaming uses the same number of tokens and costs the same.

Can I call multiple AI providers from the same Python script?

Yes. Either initialize multiple clients (one per provider) or use TokenMix.ai as a unified endpoint that routes to any provider. The unified approach is simpler -- one client, one API key, switch models by changing the model name string.

How much does it cost to make 1,000 AI API calls in Python?

At 400 tokens per call (simple chat), 1,000 calls cost: $0.08 on DeepSeek V3, $0.10 on GPT-4.1 nano, $0.40 on GPT-4.1 mini, $2.00 on GPT-4.1, $3.60 on Claude Sonnet 4. For detailed cost breakdowns, check our AI API cost per request guide on TokenMix.ai.
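The arithmetic behind such estimates is tokens times price per million, summed over input and output. The sketch below reproduces the totals above under stated assumptions: each call is split into 200 input and 200 output tokens, input prices are the ones from the tables in this guide, and the output prices are assumptions that this guide does not quote. Check current provider pricing before relying on any of them.

```python
# (input, output) prices in USD per million tokens.
# Input prices come from the tables in this guide; output prices are assumptions.
PRICES = {
    "deepseek-chat": (0.14, 0.28),
    "gpt-4.1-nano": (0.10, 0.40),
    "gpt-4.1-mini": (0.40, 1.60),
    "gpt-4.1": (2.00, 8.00),
    "claude-sonnet-4": (3.00, 15.00),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int, calls: int = 1) -> float:
    """Estimate total cost in USD for a number of identical calls."""
    in_price, out_price = PRICES[model]
    per_call = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_call * calls


# Assumed split: 200 input + 200 output tokens per call, 1,000 calls
for model in PRICES:
    print(f"{model}: ${cost_usd(model, 200, 200, calls=1000):.2f}")
```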

Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: OpenAI Python SDK, Anthropic Python SDK, Google AI Python SDK, TokenMix.ai
