TokenMix Research Lab · 2026-04-13

How to Call AI API in Python: Universal Code for OpenAI, Anthropic, Google, DeepSeek, and Groq (2026)

One Python pattern works with every major AI provider. The OpenAI SDK has become the universal standard -- OpenAI, DeepSeek, Groq, and others all use the same API format. Anthropic and Google have their own SDKs but follow similar patterns. This tutorial gives you working Python code for all five providers, explains the key parameters, and shows how to call any AI API through a single TokenMix.ai endpoint. All code tested and verified as of April 2026.

Quick Reference: 5 Providers in 5 Code Snippets

| Provider | SDK Package | Base URL | Model Example |
|---|---|---|---|
| OpenAI | openai | https://api.openai.com/v1 | gpt-4.1-mini |
| Anthropic | anthropic | https://api.anthropic.com | claude-sonnet-4-20250514 |
| Google | google-generativeai | Google AI Studio | gemini-2.0-flash |
| DeepSeek | openai (compatible) | https://api.deepseek.com | deepseek-chat |
| Groq | openai (compatible) | https://api.groq.com/openai/v1 | llama-4-scout-17b-16e-instruct |
| TokenMix.ai | openai (compatible) | https://api.tokenmix.ai/v1 | Any model from any provider |

Prerequisites: What You Need Before Starting

Required:

  1. Python 3.8 or later. Check with python --version.
  2. pip package manager. Comes with Python.
  3. An API key from at least one provider. See our DeepSeek API key tutorial for a step-by-step guide.

Install the SDKs:

# Install all provider SDKs at once
pip install openai anthropic google-generativeai

# Or install only what you need
pip install openai          # Works for OpenAI, DeepSeek, Groq, TokenMix.ai
pip install anthropic       # For Anthropic Claude
pip install google-generativeai  # For Google Gemini

Set up your API keys as environment variables:

# Add to your .env file or export directly
export OPENAI_API_KEY="sk-your-openai-key"
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"
export GOOGLE_API_KEY="your-google-key"
export DEEPSEEK_API_KEY="sk-your-deepseek-key"
export GROQ_API_KEY="gsk_your-groq-key"
export TOKENMIX_API_KEY="your-tokenmix-key"

Never hardcode API keys in your source code. Use environment variables or a secrets manager.
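A small helper can make the environment-variable approach fail fast with a clear message, instead of surfacing as a confusing authentication error deep in a request. This is a minimal sketch using only the standard library; the `require_key` function is our own naming, not part of any provider SDK.

```python
import os

def require_key(name: str) -> str:
    """Read an API key from the environment; fail fast if it is missing."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(
            f"Missing environment variable {name}. "
            f"Set it with: export {name}=..."
        )
    return key

# Example: client = OpenAI(api_key=require_key("OPENAI_API_KEY"))
```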


The Universal Pattern: How All AI API Calls Work

Every AI API call in Python follows the same four-step pattern, regardless of provider.

# Step 1: Import and initialize the client
from openai import OpenAI
client = OpenAI(api_key="your-key")

# Step 2: Define your messages
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Python?"}
]

# Step 3: Make the API call
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=messages,
    max_tokens=300,
    temperature=0.7
)

# Step 4: Extract the response
answer = response.choices[0].message.content
tokens_used = response.usage.total_tokens
print(answer)
print(f"Tokens used: {tokens_used}")

This pattern works for OpenAI, DeepSeek, Groq, and any OpenAI-compatible provider. Anthropic and Google use slightly different syntax but the same conceptual flow.


Provider 1: OpenAI (GPT Models)

OpenAI is the most widely used AI API provider. Their SDK sets the standard that other providers follow.

Installation: pip install openai

Complete example:

import os
from openai import OpenAI

# Initialize client
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# Simple chat completion
response = client.chat.completions.create(
    model="gpt-4.1-mini",  # Budget model, great for most tasks
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain list comprehension in Python."}
    ],
    max_tokens=300,
    temperature=0.3
)

print(response.choices[0].message.content)
print(f"Input tokens: {response.usage.prompt_tokens}")
print(f"Output tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")

Available models:

| Model | Best For | Input Price |
|---|---|---|
| gpt-5.4 | Complex reasoning, best quality | $2.50/M |
| gpt-4.1 | Strong general purpose | $2.00/M |
| gpt-4.1-mini | Best value, most tasks | $0.40/M |
| gpt-4.1-nano | Simple tasks, lowest cost | $0.10/M |
| o4-mini | Reasoning-heavy tasks | $1.10/M |

Provider 2: Anthropic (Claude Models)

Anthropic uses its own SDK with a different syntax. The messages structure is similar but the client initialization and response format differ.

Installation: pip install anthropic

Complete example:

import os
import anthropic

# Initialize client
client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

# Chat completion
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=300,
    system="You are a concise technical assistant.",  # System prompt is separate
    messages=[
        {"role": "user", "content": "Explain list comprehension in Python."}
    ]
)

print(response.content[0].text)
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")

Key differences from OpenAI:

  1. The system prompt is a separate system parameter, not a message in the list.
  2. max_tokens is required, not optional.
  3. The reply text is at response.content[0].text rather than response.choices[0].message.content.
  4. Token usage is reported as usage.input_tokens and usage.output_tokens, with no combined total field.

Available models:

| Model | Best For | Input Price |
|---|---|---|
| claude-opus-4-20250514 | Best quality, complex tasks | $15.00/M |
| claude-sonnet-4-20250514 | Strong general purpose | $3.00/M |
| claude-haiku-3-5-20241022 | Fast, budget option | $0.80/M |

Provider 3: Google (Gemini Models)

Google offers two SDK options: the google-generativeai package for Google AI Studio, and the Vertex AI SDK for enterprise. Here we use the simpler AI Studio approach.

Installation: pip install google-generativeai

Complete example:

import os
import google.generativeai as genai

# Initialize
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Create model instance
model = genai.GenerativeModel(
    model_name="gemini-2.0-flash",
    system_instruction="You are a concise technical assistant."
)

# Generate response
response = model.generate_content("Explain list comprehension in Python.")

print(response.text)
print(f"Input tokens: {response.usage_metadata.prompt_token_count}")
print(f"Output tokens: {response.usage_metadata.candidates_token_count}")

Key differences from OpenAI:

  1. You configure the SDK once with genai.configure() instead of constructing a client object.
  2. The system prompt is passed as system_instruction when creating the model instance.
  3. generate_content() takes a plain string rather than a messages list.
  4. The reply text is response.text; token counts live under response.usage_metadata.

Available models:

| Model | Best For | Input Price |
|---|---|---|
| gemini-3.1-pro | Complex tasks, long context | $1.25/M |
| gemini-2.0-flash | Fast, budget, 1M context | $0.075/M |

For a detailed Google vs OpenAI comparison, see our OpenAI vs Google AI API guide.


Provider 4: DeepSeek

DeepSeek uses an OpenAI-compatible API. You use the same openai Python package -- just change the base URL and API key.

Installation: pip install openai (same package as OpenAI)

Complete example:

import os
from openai import OpenAI

# Initialize with DeepSeek endpoint
client = OpenAI(
    api_key=os.environ.get("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com"
)

# Same syntax as OpenAI
response = client.chat.completions.create(
    model="deepseek-chat",  # DeepSeek V3
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain list comprehension in Python."}
    ],
    max_tokens=300,
    temperature=0.3
)

print(response.choices[0].message.content)
print(f"Total tokens: {response.usage.total_tokens}")

The code is identical to OpenAI except for two lines: the API key and the base URL. This is the beauty of OpenAI-compatible APIs -- zero learning curve.
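Since only those two lines change between OpenAI-compatible providers, you can centralize them in a small registry. This is a sketch; the `PROVIDERS` table and `client_config` helper are our own naming, not part of the openai SDK, and the base URLs are the ones listed in this guide.

```python
import os

# Base URLs and key variables for the OpenAI-compatible providers in this guide.
PROVIDERS = {
    "openai":   {"base_url": "https://api.openai.com/v1",      "key_env": "OPENAI_API_KEY"},
    "deepseek": {"base_url": "https://api.deepseek.com",       "key_env": "DEEPSEEK_API_KEY"},
    "groq":     {"base_url": "https://api.groq.com/openai/v1", "key_env": "GROQ_API_KEY"},
    "tokenmix": {"base_url": "https://api.tokenmix.ai/v1",     "key_env": "TOKENMIX_API_KEY"},
}

def client_config(provider: str) -> dict:
    """Return keyword arguments for OpenAI(...) pointed at the given provider."""
    cfg = PROVIDERS[provider]
    return {
        "base_url": cfg["base_url"],
        "api_key": os.environ.get(cfg["key_env"], ""),
    }
```

With this in place, `OpenAI(**client_config("deepseek"))` and `OpenAI(**client_config("groq"))` differ by nothing but the string you pass.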

Available models:

| Model | Best For | Input Price |
|---|---|---|
| deepseek-chat | General purpose (V3) | $0.14/M |
| deepseek-reasoner | Complex reasoning (R1) | $0.55/M |

For a complete setup guide, see our DeepSeek API key tutorial.


Provider 5: Groq (Llama and Open Models)

Groq hosts open-source models on custom LPU hardware for ultra-fast inference. Like DeepSeek, it uses the OpenAI-compatible format.

Installation: pip install openai (same package)

Complete example:

import os
from openai import OpenAI

# Initialize with Groq endpoint
client = OpenAI(
    api_key=os.environ.get("GROQ_API_KEY"),
    base_url="https://api.groq.com/openai/v1"
)

# Same syntax, different models
response = client.chat.completions.create(
    model="llama-4-scout-17b-16e-instruct",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain list comprehension in Python."}
    ],
    max_tokens=300,
    temperature=0.3
)

print(response.choices[0].message.content)
print(f"Total tokens: {response.usage.total_tokens}")

Available models on Groq:

| Model | Best For | Speed |
|---|---|---|
| llama-4-scout-17b-16e-instruct | General purpose | Ultra-fast (~200ms TTFT) |
| llama-4-maverick-17b-128e-instruct | Complex tasks | Fast |
| llama-3.3-70b-versatile | Quality-focused | Fast |

The Universal Approach: All Providers via TokenMix.ai

If you want to call any model from any provider through a single endpoint, TokenMix.ai provides a unified OpenAI-compatible API.

One client, any model:

import os
from openai import OpenAI

# Single client for all providers
client = OpenAI(
    api_key=os.environ.get("TOKENMIX_API_KEY"),
    base_url="https://api.tokenmix.ai/v1"
)

# Call OpenAI models
gpt_response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Hello from GPT!"}]
)

# Call DeepSeek models
ds_response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello from DeepSeek!"}]
)

# Call Google models
gemini_response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Hello from Gemini!"}]
)

# Call Anthropic models
claude_response = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello from Claude!"}]
)

Why this matters:

  1. One SDK (openai), one API key, one bill -- every provider behind a single endpoint.
  2. Switching providers means changing only the model string, not your client code.
  3. No separate accounts, dashboards, or rate limits to manage per provider.

TokenMix.ai supports 300+ models from all major providers. Check available models and pricing at TokenMix.ai.


Key Parameters Explained

Every AI API call accepts these common parameters. Understanding them is essential for controlling quality and cost.

| Parameter | What It Does | Recommended Values |
|---|---|---|
| model | Which AI model to use | See provider tables above |
| messages | Conversation history (system + user + assistant) | Always include system prompt |
| max_tokens | Maximum output length | Set based on expected response size |
| temperature | Randomness (0 = deterministic, 1 = creative) | 0-0.3 for factual, 0.7-1.0 for creative |
| top_p | Nucleus sampling (alternative to temperature) | Usually leave at 1.0 |
| stream | Return tokens as they generate | True for chat UIs |
| stop | Stop sequences (halt generation at specific text) | Useful for structured output |
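All of these parameters are keyword arguments to chat.completions.create, so a thin wrapper that applies your defaults keeps call sites short. This is a sketch; `chat_params` is a hypothetical helper of ours, not an SDK function.

```python
def chat_params(model, messages, *, max_tokens=300, temperature=0.7,
                top_p=1.0, stream=False, stop=None):
    """Build a kwargs dict to unpack into client.chat.completions.create(**...)."""
    params = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }
    if stop is not None:
        params["stop"] = stop  # only include stop sequences when given
    return params
```

Usage: `client.chat.completions.create(**chat_params("gpt-4.1-mini", messages, temperature=0))`.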

Temperature guide:

| Use Case | Temperature | Why |
|---|---|---|
| Code generation | 0-0.2 | Deterministic, consistent output |
| Data extraction | 0 | Exact, reproducible results |
| General Q&A | 0.3-0.5 | Balanced accuracy and naturalness |
| Creative writing | 0.7-1.0 | Varied, creative responses |
| Brainstorming | 0.9-1.0 | Maximum diversity |
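The guide above can live in code as a simple lookup with a balanced fallback. This is a sketch mirroring the table; the category names and the `default_temperature` helper are our own.

```python
# Default temperatures per use case, mirroring the guide above.
TEMPERATURE_DEFAULTS = {
    "code": 0.1,        # deterministic, consistent output
    "extraction": 0.0,  # exact, reproducible results
    "qa": 0.4,          # balanced accuracy and naturalness
    "creative": 0.8,    # varied, creative responses
    "brainstorm": 1.0,  # maximum diversity
}

def default_temperature(use_case: str) -> float:
    """Look up a starting temperature; fall back to a balanced 0.7."""
    return TEMPERATURE_DEFAULTS.get(use_case, 0.7)
```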

Streaming Responses in Python

Streaming returns tokens as they are generated, instead of waiting for the full response. Essential for chat interfaces.

from openai import OpenAI

client = OpenAI()

# Streaming response
stream = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a short poem about Python."}],
    stream=True
)

# Print tokens as they arrive
full_response = ""
for chunk in stream:
    if chunk.choices[0].delta.content:
        token = chunk.choices[0].delta.content
        print(token, end="", flush=True)
        full_response += token

print()  # New line after stream ends

Streaming works with all OpenAI-compatible providers (OpenAI, DeepSeek, Groq, TokenMix.ai). For Anthropic, the syntax is slightly different:

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=300,
    messages=[{"role": "user", "content": "Write a short poem about Python."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Error Handling Best Practices

Production code needs robust error handling. Here is a complete example.

import os
import time
from openai import OpenAI, RateLimitError, APIError, APIConnectionError

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def call_ai_api(messages, model="gpt-4.1-mini", max_retries=3):
    """Make an AI API call with proper error handling."""
    
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                max_tokens=500,
                temperature=0.3
            )
            return {
                "content": response.choices[0].message.content,
                "tokens": response.usage.total_tokens,
                "model": response.model
            }
        
        except RateLimitError:
            # Wait and retry with exponential backoff
            wait = 2 ** attempt
            print(f"Rate limited. Waiting {wait}s...")
            time.sleep(wait)
        
        except APIConnectionError:
            # Network issue -- retry
            print("Connection error. Retrying...")
            time.sleep(1)
        
        except APIError as e:
            # Only APIStatusError subclasses carry an HTTP status code
            status = getattr(e, "status_code", None)
            if status is not None and status >= 500:
                # Server error -- retry
                time.sleep(2)
            else:
                # Client error (400, 401, etc.) -- do not retry
                raise
    
    raise Exception(f"Failed after {max_retries} retries")

# Usage
result = call_ai_api([
    {"role": "user", "content": "What is Python?"}
])
print(result["content"])

For comprehensive error handling including multi-provider failover, see our 429 error solutions guide.


How to Choose Which Provider to Call

| Your Need | Best Provider | Best Model | Why |
|---|---|---|---|
| Cheapest possible | DeepSeek | deepseek-chat | $0.14/M input |
| Best free tier | Google | gemini-2.0-flash | No credit card needed |
| Best coding | Anthropic | claude-sonnet-4 | Top SWE-bench scores |
| Fastest inference | Groq | llama-4-scout | 200ms first token |
| Largest ecosystem | OpenAI | gpt-4.1-mini | Most tools and tutorials |
| One API for everything | TokenMix.ai | Any model | 300+ models, one key |

For a detailed provider comparison, see our guide on choosing the right LLM API.


Conclusion

Calling an AI API in Python follows the same pattern across all providers: initialize a client, define messages, call the API, extract the response. The OpenAI SDK works directly with OpenAI, DeepSeek, and Groq. Anthropic and Google have their own SDKs with minor syntax differences.

The fastest path to using all providers: install the openai package, point it at TokenMix.ai's endpoint, and switch between 300+ models by changing a single string. One API key, one bill, any model from any provider.

Get started with any provider today. Check real-time model availability and pricing at TokenMix.ai.


FAQ

What is the easiest AI API to call from Python?

OpenAI is the easiest to start with due to the largest number of tutorials and community resources. The core code is a few lines: import, initialize the client, call the API, print the response. DeepSeek and Groq use the identical code pattern (same SDK, different base URL), so they are equally easy once you know the OpenAI pattern.

Do I need different Python packages for each AI provider?

No. The openai package works for OpenAI, DeepSeek, Groq, and any OpenAI-compatible provider (including TokenMix.ai). You only need anthropic for Claude and google-generativeai for Gemini if calling them directly. Through TokenMix.ai, the openai package accesses all providers.

How do I handle API keys securely in Python?

Store API keys as environment variables and read them with os.environ.get("KEY_NAME"). Never hardcode keys in source files. For production, use a secrets manager (AWS Secrets Manager, HashiCorp Vault, or similar). Add .env to your .gitignore to prevent accidental commits.

What is the difference between streaming and non-streaming API calls?

Non-streaming waits for the complete response before returning. Streaming returns tokens as they are generated, allowing real-time display. Use streaming for chat interfaces (better UX) and non-streaming for batch processing (simpler code). Streaming uses the same number of tokens and costs the same.

Can I call multiple AI providers from the same Python script?

Yes. Either initialize multiple clients (one per provider) or use TokenMix.ai as a unified endpoint that routes to any provider. The unified approach is simpler -- one client, one API key, switch models by changing the model name string.

How much does it cost to make 1,000 AI API calls in Python?

At 400 tokens per call (simple chat), 1,000 calls cost: $0.08 on DeepSeek V3, $0.10 on GPT-4.1 nano, $0.40 on GPT-4.1 mini, $2.00 on GPT-4.1, $3.60 on Claude Sonnet 4. For detailed cost breakdowns, check our AI API cost per request guide on TokenMix.ai.
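You can sanity-check estimates like these with one line of arithmetic. The sketch below treats every token at a single blended per-million rate, which is a simplification (real bills price input and output tokens separately); `cost_usd` is our own helper.

```python
def cost_usd(calls: int, tokens_per_call: int, price_per_million: float) -> float:
    """Estimated spend at a blended per-million-token price."""
    return calls * tokens_per_call * price_per_million / 1_000_000

# 1,000 calls at 400 tokens each, priced at $0.40 per million tokens:
print(f"${cost_usd(1000, 400, 0.40):.2f}")  # → $0.16
```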


Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: OpenAI Python SDK, Anthropic Python SDK, Google AI Python SDK, TokenMix.ai