TokenMix Research Lab · 2026-04-24

"Last Message Was Not an Assistant Message" Error: Debug Steps (2026)

The "Agent 1: Last message was not an assistant message" error is Anthropic's Claude Agent SDK telling you that your message history ended on a user turn (often one carrying tool_result blocks) when the agent expected an assistant response. It's a message-sequence validation error, not an API outage, and it typically means your agent loop has a logic bug in how it stitches together conversation history. This guide covers the five patterns that trigger it, how to debug each one, and the canonical fix. Tested on Anthropic SDK 0.68+ (April 2026) and Claude Agent SDK 1.2.

What the Error Actually Means

Anthropic's Messages API enforces strict alternation: user → assistant → user → assistant. Tool-use flows break this into a specific sub-sequence:

  1. user (or initial prompt)
  2. assistant (may include tool_use blocks)
  3. user (must contain tool_result blocks matching the tool_use IDs)
  4. assistant (the model's response after seeing the tool result)

If your agent tries to continue execution — e.g., call another tool, request another completion — before step 4 completes, the SDK raises this error because the message sequence is in an invalid state.

The message "Last message was not an assistant message" means: the API expected position 4 to exist before you asked for more work, but position 4 never happened.
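As a concrete illustration (roles only; the get_weather tool, the toolu_01 ID, and the content strings are hypothetical), here is a valid tool-use sequence next to the truncated state that triggers the error:

```python
# Valid: the history ends on the assistant's response to the tool result.
valid = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
         "input": {"city": "Paris"}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01", "content": "18°C, clear"}]},
    {"role": "assistant", "content": [
        {"type": "text", "text": "It's 18°C and clear in Paris."}]},
]

# Invalid: step 4 is missing -- the history ends on the tool_result turn,
# so asking the agent for more work from here raises the error.
invalid = valid[:3]

assert valid[-1]["role"] == "assistant"
assert invalid[-1]["role"] == "user"
```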

Debug Step 1 — Print Your Message Array Just Before the Failing Call

Add a one-liner to dump the messages array before your next API call:

import json
# content may be a plain string or a list of blocks; default to "text"
block_type = lambda c: c[0].get("type", "text") if isinstance(c, list) and c else "text"
print(json.dumps([{"role": m["role"], "type": block_type(m.get("content"))} for m in messages], indent=2))

Look at the final message's role. If it's user with tool_result blocks, you haven't yet called the API to get the assistant's response to those results. That's the bug.

The fix: always call messages.create(...) after sending tool_result and append the returned assistant message to your history before doing anything else.

Debug Step 2 — Check That Tool Results Have Matching tool_use_ids

Every tool_result block must have a tool_use_id that matches a tool_use block in the immediately preceding assistant message. If IDs don't match, Anthropic either rejects the request or returns an error that surfaces as this "not an assistant message" symptom.

Validate with:

def validate_tool_ids(messages):
    """Flag tool_use IDs without a matching tool_result, and vice versa."""
    for i, msg in enumerate(messages):
        # Skip non-assistant turns and plain-string content
        if msg["role"] != "assistant" or not isinstance(msg.get("content"), list):
            continue
        tool_use_ids = {b["id"] for b in msg["content"] if b.get("type") == "tool_use"}
        if not tool_use_ids:
            continue
        if i + 1 < len(messages) and messages[i + 1]["role"] == "user" \
                and isinstance(messages[i + 1].get("content"), list):
            tool_result_ids = {b["tool_use_id"] for b in messages[i + 1]["content"]
                               if b.get("type") == "tool_result"}
            missing = tool_use_ids - tool_result_ids
            extra = tool_result_ids - tool_use_ids
            if missing or extra:
                print(f"Mismatch at index {i}: missing={missing}, extra={extra}")

Debug Step 3 — Look for Dropped Assistant Messages

Some agent frameworks (especially custom loops built on top of the raw Messages API) accidentally filter out assistant messages with only tool_use content and no text. The filter logic assumes "empty assistant = skip," which breaks the sequence.

Check your message filtering logic for patterns like:

# WRONG — drops tool-use-only assistant messages
messages = [m for m in messages if m.get("content") and m["content"][0].get("text")]

Fix: preserve all assistant messages regardless of whether they contain text.
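A safer filter, sketched with the same data shapes (the toolu_01 ID is hypothetical), keeps every assistant turn and only drops turns with genuinely empty content:

```python
def keep_message(m: dict) -> bool:
    # Keep every assistant turn -- tool_use-only content is still
    # required for the sequence to validate.
    if m["role"] == "assistant":
        return True
    # For user turns, drop only if there is literally no content.
    return bool(m.get("content"))

messages = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "f", "input": {}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01", "content": "ok"}]},
]
filtered = [m for m in messages if keep_message(m)]
assert len(filtered) == 3  # the tool_use-only assistant turn survives
```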

Debug Step 4 — Inspect Streaming Accumulators

If you're using streaming (stream=True), the error often means your streaming accumulator didn't finalize the assistant message before you tried to continue the loop. Typical broken pattern:

# WRONG — partial accumulation
async for event in stream:
    if event.type == "content_block_delta":
        partial += event.delta.text
# Never actually appended a complete assistant message

Fix: always wait for message_stop before appending:

async for event in stream:
    if event.type == "message_stop":
        messages.append({"role": "assistant", "content": final_content_blocks})
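A fuller accumulator can be sketched as a pure function over the event stream. The event names follow Anthropic's documented streaming types (content_block_start, content_block_delta, message_stop), but plain dicts stand in for SDK event objects here so the logic is testable in isolation:

```python
def accumulate_stream(events, messages):
    """Build the assistant message from streaming events, appending it
    only once message_stop confirms the message is complete."""
    blocks = []
    for event in events:
        etype = event["type"]
        if etype == "content_block_start":
            # Start a fresh block; copy so we don't mutate the event.
            blocks.append(dict(event["content_block"]))
        elif etype == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "text_delta":
                blocks[-1]["text"] = blocks[-1].get("text", "") + delta["text"]
        elif etype == "message_stop":
            # Only now is the assistant turn finalized and safe to append.
            messages.append({"role": "assistant", "content": blocks})
    return messages

# Mocked event sequence standing in for SDK stream events.
events = [
    {"type": "content_block_start", "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hel"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "lo"}},
    {"type": "message_stop"},
]
history = accumulate_stream(events, [{"role": "user", "content": "hi"}])
assert history[-1]["role"] == "assistant"
```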

Debug Step 5 — Verify Tool Handlers Don't Mutate the Array

Agent frameworks sometimes share the messages array between the main loop and tool executors via reference. If a tool handler mutates the array mid-execution (adds, removes, reorders messages), the sequence state becomes inconsistent.

Fix: pass an immutable copy into tool handlers. Never mutate from inside a tool.

The Five Patterns That Trigger This Error

Pattern 1 — Calling API Before Appending Assistant Response

# BROKEN
response = client.messages.create(messages=messages, tools=tools)
# Oops, forgot to append response to messages before continuing
messages.append({"role": "user", "content": [tool_result]})

Fix: always append the assistant response before the next user turn.
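The corrected ordering, sketched with a stubbed response so the append sequence is explicit (a real loop gets response from client.messages.create; the stub and toolu_01 ID are stand-ins):

```python
class _StubResponse:
    """Stand-in for the SDK response object."""
    content = [{"type": "tool_use", "id": "toolu_01", "name": "f", "input": {}}]

messages = [{"role": "user", "content": "start"}]
tool_result = {"type": "tool_result", "tool_use_id": "toolu_01", "content": "ok"}

response = _StubResponse()  # real code: client.messages.create(...)
# FIXED: append the assistant turn FIRST, then the tool_result turn.
messages.append({"role": "assistant", "content": response.content})
messages.append({"role": "user", "content": [tool_result]})

assert [m["role"] for m in messages] == ["user", "assistant", "user"]
```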

Pattern 2 — Mixing Sync and Async Handlers

Some frameworks allow both sync and async tool handlers. Mixing them in the same loop can create race conditions where a sync handler completes before the async accumulator writes the assistant message.

Fix: pick one paradigm. Either all sync or all async.

Pattern 3 — Retry Logic Without History Reset

When a tool call fails and you retry the API call, you may be retrying from a state that already has a tool_result appended — but no assistant response yet.

Fix: before retry, trim the trailing tool_result and re-request the assistant message from the last clean assistant state.
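One way to implement the rewind, as a minimal sketch: drop trailing non-assistant turns so the retry starts from a consistent state (pop the final assistant tool_use turn too if you want the model to re-plan the call). The example history and toolu_01 ID are hypothetical:

```python
def rewind_to_last_assistant(messages: list) -> list:
    """Return a copy of the history trimmed so it ends on an assistant
    message -- the clean state to retry a failed tool round from."""
    trimmed = list(messages)
    while trimmed and trimmed[-1]["role"] != "assistant":
        trimmed.pop()
    return trimmed

history = [
    {"role": "user", "content": "start"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "f", "input": {}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01", "content": "boom"}]},
]
clean = rewind_to_last_assistant(history)
assert clean[-1]["role"] == "assistant"
```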

Pattern 4 — Multi-Agent Handoff Bug

If your agent hands off to another agent mid-conversation (a common pattern in CrewAI and LangGraph), the receiving agent sometimes inherits the messages array in a state where the last message is a tool_result from the sender.

Fix: when handing off, ensure the sender has already processed its own tool result and the final message in the array is an assistant response.

Pattern 5 — Token Limit Truncation

If your history exceeds the model's context window, Anthropic may return an error that bubbles up as "last message was not an assistant message" after your code attempts to recover. The underlying issue is context length, not message sequence.

Fix: implement context window management — truncate old messages while preserving the required alternation pattern.
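A sketch of alternation-preserving truncation. It uses a message count as a simplistic stand-in for real token counting; the point it demonstrates is that the kept window must not open on a dangling tool_result whose matching tool_use was cut away:

```python
def truncate_history(messages, max_messages=20):
    """Drop the oldest turns while preserving the alternation pattern.
    max_messages is a proxy for token counting in real code."""
    if len(messages) <= max_messages:
        return list(messages)
    start = len(messages) - max_messages

    def is_clean_start(msg):
        # The window must open on a user turn with no tool_result blocks.
        if msg["role"] != "user":
            return False
        content = msg["content"]
        return not (isinstance(content, list)
                    and any(b.get("type") == "tool_result" for b in content))

    while start < len(messages) and not is_clean_start(messages[start]):
        start += 1
    return messages[start:]

history = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "f", "input": {}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01", "content": "ok"}]},
    {"role": "assistant", "content": [{"type": "text", "text": "a1"}]},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": [{"type": "text", "text": "a2"}]},
]
truncated = truncate_history(history, max_messages=3)
```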

Canonical Fix Pattern for Agent Loops

Here's the structure that prevents this error entirely:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def run_agent_loop(initial_prompt: str, tools: list) -> str:
    messages = [{"role": "user", "content": initial_prompt}]

    while True:
        response = client.messages.create(
            model="claude-opus-4-7",
            messages=messages,
            tools=tools,
            max_tokens=4096,
        )

        # CRITICAL — always append assistant response immediately
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            return response.content[0].text

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })

        messages.append({"role": "user", "content": tool_results})
        # Loop continues — next API call will get the assistant response

The invariant: messages[-1] is always either the initial user prompt, an assistant response, or a user tool_result block that's about to be responded to. Never leave the loop with the array in any other state.
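The invariant can be enforced cheaply with an assertion helper called right before each create call. A minimal sketch (assertions only; real code might raise a custom exception instead):

```python
def assert_valid_tail(messages):
    """Check the loop invariant before each API call."""
    assert messages, "history must not be empty"
    last = messages[-1]
    assert last["role"] in ("user", "assistant"), f"bad role: {last['role']}"
    if last["role"] == "user" and len(messages) > 1:
        # A non-initial user turn must answer the preceding assistant turn.
        assert messages[-2]["role"] == "assistant"

# Passes: initial prompt only.
assert_valid_tail([{"role": "user", "content": "hi"}])

# Fails: two consecutive user turns break alternation.
try:
    assert_valid_tail([{"role": "user", "content": "a"},
                       {"role": "user", "content": "b"}])
    raised = False
except AssertionError:
    raised = True
assert raised
```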

If You're Debugging Through an Aggregator

If you're routing Claude through an OpenAI-compatible aggregator (including TokenMix.ai which supports Claude Opus 4.7, Sonnet 4.6, and Haiku 4.5 alongside GPT-5.5, DeepSeek V4, and 300+ other models), the tool-use sequence follows OpenAI's function calling format, which is slightly different.
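As a rough comparison of the two shapes (not an exhaustive converter; the get_weather tool and toolu_01 ID are hypothetical):

```python
import json

# Anthropic Messages API shape: tool_use block inside the assistant turn,
# tool_result block inside the following user turn.
anthropic_turns = [
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
         "input": {"city": "Paris"}}]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01", "content": "18°C"}]},
]

# OpenAI-compatible shape: tool_calls on the assistant message (arguments
# as a JSON *string*), answered by a separate "tool"-role message.
openai_turns = [
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "toolu_01", "type": "function", "function": {
            "name": "get_weather",
            "arguments": json.dumps({"city": "Paris"})}}]},
    {"role": "tool", "tool_call_id": "toolu_01", "content": "18°C"},
]
```

A wrapper converting between them must translate both the block nesting and the arguments encoding (dict vs JSON string); missing either one produces exactly the sequence errors described above.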

If your error only appears when switching between formats, it's a format conversion bug in your wrapper, not a message sequence issue.

FAQ

Does this error happen on the Anthropic API directly or only through SDKs?

It's a sequence validation error that the API enforces. SDKs surface it with a more specific message ("last message was not an assistant message"), but the raw API returns an invalid_request_error with similar semantics.

Can I just retry the failing call?

No. Retrying without fixing the sequence state reproduces the error. You must trim or add the missing assistant message before retrying.

Why doesn't the SDK auto-fix this?

Because the "correct" fix depends on your intent — did you mean to call the API to get the assistant response, or did you want to inject a synthetic assistant message? The SDK can't guess, so it raises the error and lets you decide.

Does this affect Claude Code or Claude Desktop?

Claude Code and Claude Desktop manage message sequences internally and don't expose this error to users. It only surfaces when you're building against the raw Messages API or Claude Agent SDK.

How do I avoid this error entirely in production?

Use a well-tested agent framework (LangGraph, Claude Agent SDK latest, OpenAI Agents SDK) instead of rolling your own loop. If you must roll your own, follow the canonical pattern above religiously. Route through TokenMix.ai if you want to test the same agent logic against multiple models — Claude Opus 4.7, Claude Sonnet 4.6, GPT-5.5, and DeepSeek V4-Pro all expose identical OpenAI-compatible tool-use semantics through the aggregator, making cross-model testing straightforward.



Sources: Anthropic Messages API documentation, Claude Agent SDK GitHub, Anthropic tool use guide, TokenMix.ai multi-model API