guides
Handling escalations
Pipemason retries failures automatically up to the per-phase budget in config.yml. When the budget's exhausted the run escalates: it pauses, surfaces the failing artifact, and waits on you.
How an escalation surfaces
- Status on the run card flips to
escalated. - The dashboard shows the failing phase, the last agent reasoning, and a diff of what the agent tried.
- Pipemason writes an entry to
iterations.logwith the failure_class. - If you have notifications enabled, you get one.
1. Read the agent's last reasoning
Open the run on the dashboard. The escalation panel includes:
- The phase name and which gate failed.
- The last 2–3 iterations the agent attempted (model output + tool calls).
- The failure_class — usually one of
gate_failed,spec_underspecified,contract_violation,test_drift,retry_budget_exhausted.
2. Edit the frozen artifact
Most escalations are fixable by amending the input artifact for the failing phase. Open .pipeline/ on disk:
spec_underspecified→ editspec.md; clarify the AC the agent struggled with.contract_violation→ editcontracts/*.ts; tighten or loosen the type the impl can't satisfy.test_drift→ edittest-plan.md; either remove the impossible row or rephrase it.gate_failed→ varies; read the gate's expectations and patch the artifact it's reading.
Heads-up
3. Resume the run
pipemason resume <run-id>
The monitor re-evaluates the failing gate against the amended artifact. If it passes, the run continues from where it stopped. If the gate still fails, the run re-escalates with a fresh iterations.log entry.
4. When to cancel instead
Sometimes the right move is to abandon the run and start fresh with a better-scoped ticket:
pipemason cancel <run-id>
Cancel is idempotent and final. The branch stays in place so you can inspect it; nothing's force-pushed.
Common patterns
- Spec is too ambiguous. The analyst's best-effort interpretation is wrong. Edit
spec.mdand resume. - Test asks for the impossible. The test-plan phase wrote an AC the impl can't satisfy without breaking something else. Loosen the AC or split the work into two stories.
- Cross-domain contract is wrong. The contract phase wrote a type the API and web disagree on. Edit
contracts/*.tsto match the actual runtime; resume from the failing impl phase. - A flaky test escalated. Mark it
test.skipin the failing test file (with a TODO), commit, resume. Fix the test in a follow-up.