TokenMix Research Lab · 2026-04-10

LangChain Tutorial 2026: Python Guide from First Chain to RAG Pipeline and Agents
LangChain is the most popular framework for building LLM-powered applications in Python, with over 100,000 GitHub stars and support for 80+ model providers. This LangChain tutorial covers everything from installation to production deployment: your first chain, building a RAG pipeline, creating agents with tool use, and choosing the right LLMs. Whether you are new to LangChain or upgrading from an older version, this guide gives you working code and practical architecture decisions for 2026.
Table of Contents
- [Quick Reference: LangChain Core Concepts]
- [What Is LangChain and Why Use It in 2026]
- [Installation and Setup]
- [Your First LangChain Chain]
- [Prompt Templates and Output Parsers]
- [Building a RAG Pipeline with LangChain]
- [Creating Agents with Tool Use]
- [Which LLMs to Use with LangChain]
- [LangChain vs Alternatives]
- [Production Best Practices]
- [How to Choose Your LangChain Architecture]
- [Conclusion]
- [FAQ]
Quick Reference: LangChain Core Concepts
| Concept | What it does | When to use |
|---|---|---|
| Chain (LCEL) | Composes LLM calls with data transformations | Every LLM application |
| Prompt Template | Structures input to the LLM | When you need consistent prompt formatting |
| Retriever | Fetches relevant documents from a data source | RAG applications |
| Agent | LLM decides which tools to call and in what order | Dynamic, multi-step workflows |
| Tool | A function the agent can invoke | When agents need external capabilities |
| Memory | Stores conversation history | Chatbots and multi-turn interactions |
| Output Parser | Structures LLM output into typed objects | When you need structured data from LLMs |
What Is LangChain and Why Use It in 2026
LangChain is a Python (and JavaScript) framework that provides abstractions for building applications with LLMs. It standardizes the interface for calling different models, chaining operations, and integrating external data sources and tools.
Why LangChain still matters in 2026:
- Unified interface across 80+ model providers -- switch from OpenAI to Anthropic with one line
- Battle-tested RAG components (document loaders, text splitters, vector store integrations)
- Agent framework for building autonomous workflows
- LangSmith integration for tracing, monitoring, and evaluation
- LangGraph for complex, stateful agent workflows
When not to use LangChain:
- Simple, single-model API calls (just use the provider SDK directly)
- When you need maximum control over every HTTP request
- If your team finds the abstraction layers add more complexity than they remove
The key architectural change in 2026: LangChain Expression Language (LCEL) is the standard way to compose chains. The older LLMChain and SequentialChain classes are deprecated. This tutorial uses LCEL exclusively.
Installation and Setup
Step 1: Install core packages
```bash
pip install langchain langchain-openai langchain-community
pip install chromadb  # For RAG examples
```
Step 2: Install provider-specific packages (choose the ones you need)
```bash
pip install langchain-anthropic     # For Claude models
pip install langchain-google-genai  # For Gemini models
pip install langchain-groq          # For Groq-hosted models (pricing: https://tokenmix.ai/blog/groq-api-pricing)
```
Step 3: Set your API keys
```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
# Or use a TokenMix.ai unified API key for all providers
export TOKENMIX_API_KEY="tm-..."
```
Step 4: Verify installation
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
response = llm.invoke("Say hello")
print(response.content)
# Output: Hello! How can I help you today?
```
If this runs without errors, your setup is complete.
Your First LangChain Chain
A chain in LangChain is a sequence of operations composed using the pipe (|) operator. Here is the simplest useful chain: a prompt template connected to an LLM connected to an output parser.
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define components
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that explains {topic} concepts."),
    ("human", "{question}"),
])
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.3)
output_parser = StrOutputParser()

# Compose the chain using LCEL
chain = prompt | llm | output_parser

# Run the chain
result = chain.invoke({
    "topic": "machine learning",
    "question": "What is gradient descent in 2 sentences?",
})
print(result)
```
What happens at each step:
- `prompt` formats the input variables into chat messages
- `llm` sends the formatted prompt to GPT-4o-mini
- `output_parser` extracts the string content from the response
You can add steps to this chain: logging, caching, retry logic, or parallel branches. That is the power of LCEL -- every component is composable.
Prompt Templates and Output Parsers
Prompt Templates keep system instructions separate from user-supplied values and ensure consistent formatting across calls.
```python
from langchain_core.prompts import (
    ChatPromptTemplate,
    FewShotChatMessagePromptTemplate,
)

# Basic template
template = ChatPromptTemplate.from_messages([
    ("system", "You are a {role}. Respond in {language}."),
    ("human", "{query}"),
])

# With few-shot examples
examples = [
    {"input": "2+2", "output": "4"},
    {"input": "3+3", "output": "6"},
]
example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}"),
])
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)
```
Output Parsers convert unstructured LLM text into structured data.
```python
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

class ProductReview(BaseModel):
    sentiment: str = Field(description="positive, negative, or neutral")
    score: int = Field(description="1-10 rating")
    summary: str = Field(description="One sentence summary")

parser = JsonOutputParser(pydantic_object=ProductReview)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Analyze this product review.\n{format_instructions}"),
    ("human", "{review}"),
])

chain = prompt | llm | parser
result = chain.invoke({
    "review": "Great product, fast shipping, exactly as described.",
    "format_instructions": parser.get_format_instructions(),
})
# result is a dict: {"sentiment": "positive", "score": 9, "summary": "..."}
```
Building a RAG Pipeline with LangChain
RAG (Retrieval-Augmented Generation) is LangChain's most common production use case. Here is a complete implementation.
Step 1: Load and split documents
```python
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load documents (web pages, PDFs, etc.)
loader = WebBaseLoader("https://docs.example.com/api-reference")
docs = loader.load()

# Split into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=512,
    chunk_overlap=50,
)
chunks = splitter.split_documents(docs)
print(f"Loaded {len(chunks)} chunks")
```
Step 2: Create vector store
```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db",
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
```
Step 3: Build the RAG chain
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system", """Answer based on the provided context. If the context
doesn't contain the answer, say "I don't have that information."

Context: {context}"""),
    ("human", "{question}"),
])
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.1)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Query your knowledge base
answer = rag_chain.invoke("What are the rate limits?")
print(answer)
```
This three-step setup handles most RAG applications. The retriever fetches relevant chunks, the prompt grounds the LLM, and the chain orchestrates everything.
Creating Agents with Tool Use
Agents let the LLM decide which tools to call based on the user's query. This is essential for applications that need to take actions, not just answer questions.
Define tools:
```python
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    # Simplified example -- a real tool would call a weather API
    return f"Weather in {city}: 72F, sunny"

@tool
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    try:
        result = eval(expression)  # Use a safer evaluator in production
        return str(result)
    except Exception as e:
        return f"Error: {e}"

@tool
def search_docs(query: str) -> str:
    """Search the knowledge base for information."""
    docs = retriever.invoke(query)
    return "\n".join(doc.page_content for doc in docs[:3])

tools = [get_weather, calculate, search_docs]
```
Create the agent:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use tools when needed."),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
llm = ChatOpenAI(model="gpt-4o", temperature=0)
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# The agent decides which tools to use
result = executor.invoke({
    "input": "What's 15% of 8500, and what's the weather in Tokyo?",
    "chat_history": [],
})
print(result["output"])
```
The agent will automatically call `calculate("8500 * 0.15")` and `get_weather("Tokyo")` to answer both parts of the question.
Which LLMs to Use with LangChain
LangChain supports 80+ model providers. Here are the best options for different use cases, with pricing tracked by TokenMix.ai.
| Use case | Recommended model | Price (input/output per 1M tokens) | Why |
|---|---|---|---|
| General chains | GPT-4o-mini | $0.15 / $0.60 | Cheap, fast, good enough for most tasks |
| Complex reasoning | Claude Sonnet 4.6 | $3 / |