Minbook
Agents & Architecture Active 3 posts

Locating Agent Failure

Evaluation of LLM agents is moving from final-answer accuracy down to failure localization at the trajectory, span, and claim level. A method-first reading of seven arXiv papers from 2025-2026.

About this series

Agent evaluation that only checked whether the answer was right has hit a wall. The longer the task and the more tools it touches, the more the path to the same answer decides cost and trust. Seven papers answer one question in different ways: where did the agent go wrong.

The series is directly useful for builders wiring their own multi-agent flows, consultants advising on agent adoption, and PMs scrutinizing the reliability of evaluation metrics. We read for method, not benchmark scores.

Part 1 lays out why outcome evaluation breaks and maps the seven papers. From Part 2 onward, we look at how each paper localizes failure, one paper at a time. The approaches split into five strands: inductive taxonomy, instrumentation, verification layers, environment design, and metric formulation.

3 episodes
