n8n AI Agent Infinite Loop Fix: Why Overlapping Tools Self-Call

A field-tested walkthrough of why n8n AI Agent nodes recursively call themselves when tool descriptions overlap, plus three prevention patterns I now apply to every production workflow.

n8n AI Agent Infinite Loop Fix (2026)

Last Tuesday I got pinged at 11:47pm because a client's n8n instance had burned through 2.3 million OpenAI tokens in about four hours. The workflow was a customer-support triage agent that, in theory, ran maybe 40 times a day. Instead, a single inbound webhook had spawned an AI Agent node that called its own sub-tool, which called the agent back, which called the tool again, and so on until the workflow finally timed out at the n8n execution ceiling. Classic n8n AI agent infinite loop, and the fix turned out to be embarrassingly small once I understood what was actually happening.

I've now hit this pattern on three separate customer projects since upgrading to n8n 1.74 in early 2026, so I want to write down exactly how I diagnose it and the three guardrails I now bake into every agent workflow. None of this is in the official docs as of this writing, and the GitHub thread where it gets discussed is long and meandering. If you've watched your token meter go vertical for no obvious reason, this is probably you.

The symptom: an agent that keeps re-asking itself the same question

The giveaway is in the execution log. Open a runaway execution in the n8n editor, expand the AI Agent node, and scroll through the intermediate steps. In a healthy run you'll see something like:

Thought: I need the customer's order status.
Action: lookup_order
Action Input: { "order_id": "A-1042" }
Observation: { "status": "shipped", "carrier": "DHL" }
Thought: I have what I need.
Final Answer: Your order shipped via DHL on...

In a broken run, the Action and Observation pair just repeats. Same tool, same input, same output, sometimes 30 or 40 times before the node finally errors with a maxIterations exception. The model genuinely believes it hasn't done the work yet, because the tool description it was given overlaps with another tool, and the planner cannot tell which one was supposed to produce the answer.

On my client's workflow, the overlap was between a tool called get_customer and another called lookup_account. Both descriptions boiled down to "fetch a customer record by email or ID". The agent kept calling one, getting data back, and then deciding it should probably try the other as well in case that was the one it was meant to use. Then it would loop back. The LangChain tool calling docs stress this: a tool's name, description, and argument schema are all the model has to go on when selecting a tool, and ambiguity there is fatal.

Diagnosis recipe: a 5-minute check you can run right now

Before changing anything, confirm you actually have an overlap problem and not a different failure mode (token-limit truncation, malformed JSON in a tool response, or a memory node that's evicting context mid-run). Here's the sequence I run every time.

Step one: turn on verbose logging for the agent. In n8n you do this by setting the AI Agent node's "Return Intermediate Steps" toggle to true and re-running with a small test input. Then pipe the output through this little Code node:

// Code node: summarise tool call frequency from agent output
const steps = $input.first().json.intermediateSteps || [];
const counts = {};
for (const step of steps) {
  const toolName = step.action?.tool ?? 'unknown';
  const inputKey = JSON.stringify(step.action?.toolInput ?? {});
  const key = `${toolName}::${inputKey}`;
  counts[key] = (counts[key] || 0) + 1;
}
const repeats = Object.entries(counts)
  .filter(([, n]) => n > 1)
  .sort((a, b) => b[1] - a[1]);
return [{ json: { totalSteps: steps.length, repeats } }];

The repeats array lists every tool-and-input pair that fired more than once. If any entry has a count above two, you have a loop. If the same input shows up against two different tools, called alternately, you have the overlap variant I'm describing here.
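The frequency summary above counts identical tool-and-input pairs, so the alternating variant is easier to spot if you group by input instead. Here is a minimal Python sketch of that check, for when you export the intermediate steps and inspect them outside n8n. It assumes each step has the same action.tool and action.toolInput shape the Code node reads. The sample steps are made up for illustration.

```python
import json
from collections import defaultdict


def find_alternating_inputs(steps):
    """Return inputs that were sent to more than one distinct tool."""
    tools_by_input = defaultdict(list)
    for step in steps:
        action = step.get("action") or {}
        tool = action.get("tool", "unknown")
        # Canonical JSON so key order does not split identical inputs
        key = json.dumps(action.get("toolInput", {}), sort_keys=True)
        tools_by_input[key].append(tool)
    return {
        key: tools
        for key, tools in tools_by_input.items()
        if len(set(tools)) > 1
    }


# Hypothetical sample data, not taken from a real execution
sample_steps = [
    {"action": {"tool": "get_customer", "toolInput": {"email": "a@example.com"}}},
    {"action": {"tool": "lookup_account", "toolInput": {"email": "a@example.com"}}},
    {"action": {"tool": "get_customer", "toolInput": {"email": "a@example.com"}}},
    {"action": {"tool": "lookup_order", "toolInput": {"order_id": "A-1042"}}},
]
overlaps = find_alternating_inputs(sample_steps)
print(overlaps)
```

Any key in the result is an input the agent could not decide how to route, and the tool names listed against it are the pair whose descriptions need separating.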

Step two: list every tool the agent has access to and dump their descriptions side by side. I literally paste them into a markdown table in my notes app. You're looking for any pair where a naive reader couldn't tell which tool to pick from a one-sentence task. If you can't tell, GPT-4o-mini definitely can't, and even Claude Sonnet 4.6 will hesitate.
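If the tool list is long, a crude word-overlap score can shortlist the pairs worth reading closely. The sketch below uses Jaccard similarity over description words. The stopword list, the 0.2 threshold, and the lookup_order description are all assumptions I picked for illustration, and the score is a triage aid rather than a substitute for reading the descriptions yourself.

```python
import re
from itertools import combinations

# Small hand-picked stopword list (an assumption; extend for your domain)
STOPWORDS = {"a", "an", "the", "by", "or", "and", "up", "using", "for", "to", "of"}


def tokens(description):
    """Lowercase word set for a description, minus stopwords."""
    words = re.findall(r"[a-z]+", description.lower())
    return {w for w in words if w not in STOPWORDS}


def overlap_pairs(tools, threshold=0.2):
    """Return (tool_a, tool_b, score) for pairs at or above the threshold."""
    flagged = []
    for (name_a, desc_a), (name_b, desc_b) in combinations(tools.items(), 2):
        words_a, words_b = tokens(desc_a), tokens(desc_b)
        union = words_a | words_b
        score = len(words_a & words_b) / len(union) if union else 0.0
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return sorted(flagged, key=lambda item: -item[2])


tools = {
    "get_customer": "Fetch customer details by email or ID",
    "lookup_account": "Look up an account record using email or account ID",
    # Hypothetical description for a third tool
    "lookup_order": "Return shipping status and carrier for an order number",
}
flagged = overlap_pairs(tools)
print(flagged)
```

On these three descriptions only the get_customer and lookup_account pair is flagged, because "email" and "id" are the shared input words that make the two tools interchangeable to the planner.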

Step three: check the system prompt. If the prompt says something like "use the available tools to answer", that's not enough. The agent needs explicit routing rules. I'll cover the prompt fix in the prevention section below.

One thing to rule out early: if you're using the Memory sub-node and the buffer window is too small, the agent can genuinely forget that it already called a tool. That presents identically to an overlap loop. Bump the window to at least 20 messages and re-test before blaming tool descriptions. The n8n memory buffer documentation covers the window sizing tradeoffs.

Prevention pattern 1: disjoint tool descriptions with explicit "use when" clauses

The single highest-impact change. Rewrite every tool description to start with a "Use when..." clause that is mutually exclusive with every other tool. Here's the before-and-after from my customer's workflow:

// BEFORE - ambiguous: both tools claim "email or ID" lookups
const before = [
  {
    name: "get_customer",
    description: "Fetch customer details by email or ID",
  },
  {
    name: "lookup_account",
    description: "Look up an account record using email or account ID",
  },
];

// AFTER - disjoint: distinct input shape, named outputs, explicit negative clause
const after = [
  {
    name: "get_customer",
    description: "Use when you have an email address and need profile data (name, signup date, plan tier). Do NOT use for billing or order history.",
  },
  {
    name: "lookup_account",
    description: "Use when you have an internal account ID (starts with ACC-) and need billing status. Do NOT use for profile data or emails.",
  },
];

Notice three things. First, each description names a specific input shape (email vs ACC- prefix). Second, each description names specific output fields the tool returns. Third, each description has an explicit negative clause telling the agent what NOT to use it for. That negative clause is the part most people skip, and it's what kills the loop dead.
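These conventions are easy to check mechanically. Here is a small lint sketch, assuming you keep tool descriptions somewhere a script can read them. The function name and the exact phrases it looks for are my own choices, not anything n8n enforces.

```python
import re


def lint_description(description):
    """Return a list of problems found in one tool description."""
    problems = []
    # Convention 1: description opens with a "Use when" clause
    if not description.strip().lower().startswith("use when"):
        problems.append("missing leading 'Use when' clause")
    # Convention 2: description contains an explicit negative clause
    if not re.search(r"\bdo not use\b", description, flags=re.IGNORECASE):
        problems.append("missing 'Do NOT use' negative clause")
    return problems


before = "Fetch customer details by email or ID"
after = (
    "Use when you have an email address and need profile data "
    "(name, signup date, plan tier). Do NOT use for billing or order history."
)
print(lint_description(before))
print(lint_description(after))
```

The lint only checks that the clauses exist. Whether they are actually mutually exclusive across tools still needs a human read.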

After this change alone, my client's runaway workflow went from averaging 8.4 tool calls per execution to 1.9. The agent picks the right tool the first time because the description tells it the boundary.

Prevention pattern 2: hard cap maxIterations and add a circuit breaker

Even with clean descriptions, you want a safety net. The AI Agent node has a maxIterations setting that defaults to 10 in n8n 1.74. I set this to 5 for any agent that should only need one or two tool calls, and I add an Error Trigger workflow that fires a Slack alert if any agent run ever actually hits the cap. A run that hits maxIterations is always a bug, never a feature.

For belt-and-braces, I also wrap the agent in a meta-workflow that tracks total cost per upstream trigger. Here's the Python equivalent I use when prototyping the same logic outside n8n, which makes the circuit-breaker idea more explicit:

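import json
from dataclasses import dataclass, field


@dataclass
class AgentBudget:
    """Per-run guard: caps tool calls and tokens, rejects exact repeat calls."""
    max_tool_calls: int = 5
    max_tokens: int = 20_000
    tool_calls: int = 0
    tokens_used: int = 0
    seen: set = field(default_factory=set)

    def check(self, tool_name: str, tool_input: dict) -> None:
        """Call before every tool invocation; raises to break the circuit."""
        self.tool_calls += 1
        if self.tool_calls > self.max_tool_calls:
            raise RuntimeError(f"Tool call cap exceeded ({self.max_tool_calls})")
        # Canonical JSON so key order and nested values don't hide a repeat
        sig = (tool_name, json.dumps(tool_input, sort_keys=True, default=str))
        if sig in self.seen:
            raise RuntimeError(f"Repeated tool call detected: {sig}")
        self.seen.add(sig)

    def add_tokens(self, count: int) -> None:
        """Call after every model response with the tokens it consumed."""
        self.tokens_used += count
        if self.tokens_used > self.max_tokens:
            raise RuntimeError(f"Token budget exceeded ({self.max_tokens})")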

The same shape works as an n8n Code node placed inside a sub-workflow tool. Track the budget in workflow static data, throw early, and let the parent workflow's error branch handle it. You'll never get billed for a four-hour runaway again.
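The per-trigger cost tracking mentioned earlier could look something like the sketch below. TriggerLedger, its method names, and the trigger IDs are hypothetical. In n8n the equivalent state would live in workflow static data or an external store rather than in a Python object.

```python
from collections import defaultdict


class TriggerLedger:
    """Accumulates token usage per upstream trigger and trips at a limit."""

    def __init__(self, max_tokens_per_trigger=20_000):
        self.max_tokens_per_trigger = max_tokens_per_trigger
        self.usage = defaultdict(int)

    def record(self, trigger_id, tokens):
        """Add tokens to a trigger's running total; raise once over budget."""
        self.usage[trigger_id] += tokens
        if self.usage[trigger_id] > self.max_tokens_per_trigger:
            raise RuntimeError(
                f"Trigger {trigger_id} exceeded {self.max_tokens_per_trigger} tokens"
            )
        return self.usage[trigger_id]


ledger = TriggerLedger(max_tokens_per_trigger=1_000)
ledger.record("webhook-1", 600)
ledger.record("webhook-2", 600)  # separate trigger, separate budget
try:
    ledger.record("webhook-1", 600)  # pushes webhook-1 to 1200
except RuntimeError as exc:
    print(exc)
```

Keying the budget on the trigger rather than the execution matters when one inbound event fans out into several agent runs, since each run can look cheap on its own.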

Prevention pattern 3: route before you reason

For any agent with more than three tools, I no longer let the model pick from the full tool list on the first hop. I put a Switch node in front of the AI Agent, fed by a cheap classification step, that sorts the inbound request into a class and decides which subset of tools gets exposed. The agent only ever sees the two or three tools it could possibly need for that request class.

This works because nothing stops you from running several AI Agent nodes in one workflow, each wired to its own tools. You can have one "support-triage" AI Agent node configured with tools A, B, C, and a separate "billing-question" AI Agent node configured with tools D, E, F, then a Switch node upstream that picks which agent to invoke. The classifier can be a regex if your inputs are structured enough, or a cheap one-shot LLM call (gpt-4o-mini works fine) if they aren't. The OpenAI function calling guide recommends keeping the number of functions exposed at once small for better selection accuracy, which matches what I see in practice.

The cost of this pattern is a slightly more complex workflow graph. The benefit is that each agent only ever has to choose between truly disjoint options, and the planner can't even consider tools that would create overlap. I've never seen a routed agent enter an infinite loop, full stop.
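For structured inputs, the routing step can be as small as an ordered list of regexes. This is a sketch only: the class labels, the patterns, and the tool-to-class mapping are assumptions for illustration. In n8n the same decision would sit in a Switch node that picks between agents.

```python
import re

# Hypothetical mapping from request class to the tools that agent may see
TOOLSETS = {
    "billing": ["lookup_account"],
    "orders": ["lookup_order"],
    "profile": ["get_customer"],
}

# Ordered routes: the first matching pattern wins
ROUTES = [
    (re.compile(r"\bACC-\d+\b"), "billing"),
    (re.compile(r"\b(order|shipping|tracking)\b", re.IGNORECASE), "orders"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "profile"),
]


def route(message, default="profile"):
    """Return (request_class, tool_names) for an inbound message."""
    for pattern, label in ROUTES:
        if pattern.search(message):
            return label, TOOLSETS[label]
    return default, TOOLSETS[default]


print(route("Why was ACC-2291 charged twice?"))
print(route("Where is my order A-1042?"))
print(route("Update the plan for jo@example.com"))
```

Route order is the design decision here. Putting the most specific pattern first (the ACC- prefix) means a message mentioning both an account ID and an email goes to billing rather than profile.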

Caveats and what I won't recommend

A few things I tried that didn't work, so you don't have to. Putting "do not call the same tool twice" in the system prompt: ignored about 30% of the time, especially under load. Lowering the model temperature: helped marginally, didn't fix the root cause. Switching from gpt-4o to Claude Sonnet 4.6: shifted the failure mode rather than eliminating it (Claude was more likely to stop early instead of looping, but still picked the wrong tool when descriptions overlapped).

Also: don't use the Auto-Fixing Output Parser as a loop guard. On a parse failure it quietly makes another LLM call to repair the output, which adds spend exactly when the run is already misbehaving and hides the failure you need to see. If you need structured output, define a single return_answer tool that the agent must call to finish, and validate the schema in the next node.
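The validation step after a return_answer tool can stay very plain. Here is a sketch assuming a two-field payload. The field names answer and tools_used are hypothetical, so substitute whatever schema your return_answer tool actually declares.

```python
# Hypothetical schema: field name -> required Python type
REQUIRED_FIELDS = {"answer": str, "tools_used": list}


def validate_return_answer(payload):
    """Return a list of schema errors; an empty list means the payload is valid."""
    if not isinstance(payload, dict):
        return ["payload is not an object"]
    errors = []
    for field_name, expected_type in REQUIRED_FIELDS.items():
        if field_name not in payload:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(payload[field_name], expected_type):
            errors.append(
                f"wrong type for {field_name}: expected {expected_type.__name__}"
            )
    return errors


good = {"answer": "Your order shipped via DHL.", "tools_used": ["lookup_order"]}
bad = {"answer": 42}
print(validate_return_answer(good))
print(validate_return_answer(bad))
```

Failing loudly here, instead of asking a model to repair the output, keeps a malformed answer visible as an error rather than as extra token spend.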

One more thing for anyone running self-hosted n8n in queue mode. Each execution is one worker job, so if your workers are configured with a long timeout, a single runaway can monopolise a worker for hours. Set a per-workflow execution timeout (workflow Settings → Timeout Workflow) of something sane like 120 seconds for agent workflows. Related reading: my notes on n8n queue mode worker tuning cover the timeout interactions in more detail, and controlling OpenAI function-calling costs goes deeper on the budget circuit-breaker pattern.

Closing the loop, literally

The fix that ended my client's 11pm incident was three lines of edited tool descriptions and a maxIterations of 5. Total time from diagnosis to deployed fix was about 25 minutes once I understood what to look for. The pattern is so consistent across the cases I've seen that I now audit tool descriptions as the first step in any agent code review, before I even look at the prompt.

If you're building agent workflows in n8n in 2026, treat tool descriptions as production-critical strings. They are the planner's entire view of what your system can do, and ambiguity there costs real money. Write them like you'd write API documentation for a colleague who has to pick between two endpoints with no other context, because that's exactly what your model is doing on every single iteration.

FAQ

Does this happen with all models or just GPT-4?

I've reproduced it on gpt-4o, gpt-4o-mini, Claude Sonnet 4.6, and Gemini 2.0 Flash. Smaller and cheaper models loop more readily because they're worse at planning, but even frontier models will loop given sufficiently overlapping tool descriptions. It's a prompt-engineering problem, not a model problem.

Should I just set maxIterations to 2 and call it done?

Tempting, but no. A hard cap protects your wallet but masks the underlying bug. Your agent will silently fail to complete legitimate multi-step tasks. Fix the tool descriptions first, then set a cap that's just above your real expected maximum. Mine is usually 5 for support agents and 10 for research agents.
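One way to pick "just above your real expected maximum" is to derive it from the tool-call counts of runs you know were healthy. This is a trivial sketch: the function name, the headroom default, and the sample counts are made up.

```python
def suggest_cap(observed_call_counts, headroom=1):
    """Suggest a maxIterations value: worst healthy run plus some headroom."""
    if not observed_call_counts:
        raise ValueError("need at least one healthy run to size the cap")
    return max(observed_call_counts) + headroom


# Hypothetical tool-call counts from five healthy executions
healthy_runs = [1, 2, 2, 3, 1]
print(suggest_cap(healthy_runs))
```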

Can the agent itself detect that it's looping and stop?

Sort of. You can add a sentence to the system prompt like "If you find yourself about to call the same tool with the same input you already used, stop and return what you have." This works perhaps 70% of the time in my testing, which is not good enough for production. Use it as defence-in-depth, never as the primary fix.

Does this affect the new n8n native AI nodes added in 1.74?

Yes. The native AI Agent node uses the same LangChain.js planner under the hood as the older Tools Agent. The behaviour is identical. Anything I've written here applies to both, and I expect it to keep applying through at least the n8n 2.0 release later this year unless they ship a different planner architecture.

About the Author: Priya Ramaswamy

Priya spent four years at Zapier building the Tables product before leaving in 2023 to consult on agent infrastructure for Series A startups. She's shipped custom n8n nodes for two YC-backed companies (a clinical-trial logistics platform and a freight broker), and her PR adding streaming-token support to LangChain's Bedrock chat wrapper was merged in early 2024. Most of her current work is unglamorous: helping ops teams replace 40-step Make.com scenarios with a single LangGraph state machine, then arguing with their CFO about token budgets. She writes here about the parts of agent work that vendor blogs skip - eval harnesses that don't lie, retry logic that survives a rate-limited Anthropic endpoint at 2am, and why 'just add a vector DB' is almost always the wrong answer. Based in Toronto. Eight years total in workflow tooling.