Failures
FailureClass
Bases: StrEnum
Classification of a pipeline failure for deciding retry behavior.
RECOVERABLE: The failure is transient and the node should be retried.
TERMINAL: The failure is permanent and the pipeline should stop.
AMBIGUOUS: The failure may or may not be recoverable; limited retries are attempted.
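As a rough illustration of how the three members might drive a retry decision, here is a minimal sketch. The string values, the should_retry helper, and its ambiguous_limit parameter are assumptions made for this example and are not part of the documented API; the stand-in enum uses (str, Enum) so it also runs on Python versions without StrEnum.

```python
from enum import Enum


class FailureClass(str, Enum):
    """Stand-in for the documented StrEnum; the string values are assumed."""

    RECOVERABLE = "recoverable"
    TERMINAL = "terminal"
    AMBIGUOUS = "ambiguous"


def should_retry(failure_class: FailureClass, attempt: int, ambiguous_limit: int = 1) -> bool:
    """Hypothetical helper: decide whether a failed attempt (1-based) is retried."""
    if failure_class is FailureClass.TERMINAL:
        # Permanent failure: stop the pipeline.
        return False
    if failure_class is FailureClass.AMBIGUOUS:
        # Unclear failure: allow only a small, fixed number of retries.
        return attempt <= ambiguous_limit
    # RECOVERABLE: transient failure, retry.
    return True
```

In the real module the retry ceiling presumably comes from a FailurePolicy rather than a hard-coded limit.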
FailurePolicy
Bases: BaseModel
Retry behavior for a given failure class.
Controls max_retries (the maximum number of retry attempts) and backoff_seconds (the delay between retries, with jitter applied).
Example
policy = FailurePolicy(max_retries=5, backoff_seconds=2.0)  # retry up to five times, 2-second jittered backoff
policy = FailurePolicy(max_retries=0, backoff_seconds=0.0)  # never retry
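The documentation does not say how the jitter is computed. The sketch below shows one common choice, "full jitter", where the actual delay is drawn uniformly between zero and backoff_seconds. The dataclass is a stand-in for the pydantic model, and jittered_delay is a hypothetical helper, not a documented function.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FailurePolicy:
    """Stand-in for the documented pydantic model (same two fields)."""

    max_retries: int
    backoff_seconds: float


def jittered_delay(policy: FailurePolicy, rng=random) -> float:
    """Hypothetical helper. Assumption: full jitter, uniform in [0, backoff_seconds]."""
    return rng.uniform(0.0, policy.backoff_seconds)


# A seeded generator keeps the delay reproducible in tests.
delay = jittered_delay(FailurePolicy(max_retries=5, backoff_seconds=2.0), random.Random(0))
```

Randomizing the delay spreads out retries from nodes that failed at the same moment, so they do not all hit the same dependency again at once.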
NodeContext
dataclass
Context passed to failure classifiers for making classification decisions.
Carries the node_id that raised, the 1-based attempt number, and
the run_id of the pipeline.
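To show how a classifier might use this context, here is a small sketch. The field types, the frozen dataclass, the classifier signature (exception and context in, a FailureClass or None out), and the give_up_after_three function are all assumptions for illustration; only the three field names come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureClass(str, Enum):
    """Stand-in for the documented enum; the string values are assumed."""

    RECOVERABLE = "recoverable"
    TERMINAL = "terminal"
    AMBIGUOUS = "ambiguous"


@dataclass(frozen=True)
class NodeContext:
    """Stand-in with the three documented fields; the types are assumed."""

    node_id: str
    attempt: int  # 1-based attempt number
    run_id: str


def give_up_after_three(exc: Exception, context: NodeContext) -> Optional[FailureClass]:
    """Hypothetical classifier: connection errors become terminal from attempt 3 on."""
    if isinstance(exc, ConnectionError) and context.attempt >= 3:
        return FailureClass.TERMINAL
    # No opinion: let later classifiers or the built-in rules decide.
    return None
```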
classify(exc, context, classifiers=None)
Determine the failure class of an exception.
Custom classifiers are checked first in order. If none return a result, built-in rules apply:
- ContractViolation → TERMINAL
- BudgetExceeded → RECOVERABLE
- TimeoutError → RECOVERABLE
- ValueError → AMBIGUOUS
- Everything else → AMBIGUOUS
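The lookup order described above can be sketched as follows. This is not the library's implementation: ContractViolation, BudgetExceeded, FailureClass, and NodeContext are redefined here as stand-ins so the block runs on its own, and the convention that a custom classifier returns None when it has no opinion is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class FailureClass(str, Enum):
    RECOVERABLE = "recoverable"
    TERMINAL = "terminal"
    AMBIGUOUS = "ambiguous"


class ContractViolation(Exception):
    """Stand-in for the library's exception of the same name."""


class BudgetExceeded(Exception):
    """Stand-in for the library's exception of the same name."""


@dataclass(frozen=True)
class NodeContext:
    node_id: str
    attempt: int
    run_id: str


# Built-in rules from the list above, checked in order.
_BUILTIN_RULES = (
    (ContractViolation, FailureClass.TERMINAL),
    (BudgetExceeded, FailureClass.RECOVERABLE),
    (TimeoutError, FailureClass.RECOVERABLE),
    (ValueError, FailureClass.AMBIGUOUS),
)


def classify(exc, context, classifiers=None):
    # Custom classifiers run first; the first non-None result wins.
    for classifier in classifiers or ():
        result = classifier(exc, context)
        if result is not None:
            return result
    # Fall back to the built-in rules.
    for exc_type, failure_class in _BUILTIN_RULES:
        if isinstance(exc, exc_type):
            return failure_class
    # Anything unrecognized is treated as ambiguous.
    return FailureClass.AMBIGUOUS
```

Because custom classifiers are consulted first, a caller can override a built-in rule, for example by treating TimeoutError as TERMINAL for one particular node.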
RetryBudget
Tracks retry counts per (run_id, node_id) pair and enforces limits.
increment(run_id, node_id)
Increment and return the retry count for the given run/node pair.
exhausted(run_id, node_id, policy)
Return True if the retry count has reached the policy's max_retries.
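A minimal sketch of such a tracker, assuming an in-memory dictionary keyed by (run_id, node_id). The storage choice is an assumption, and FailurePolicy is a dataclass stand-in for the pydantic model so the block is self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailurePolicy:
    """Stand-in for the documented pydantic model."""

    max_retries: int
    backoff_seconds: float


class RetryBudget:
    """Sketch: per-(run_id, node_id) retry counter held in memory."""

    def __init__(self):
        self._counts = {}

    def increment(self, run_id, node_id):
        # Increment and return the retry count for this run/node pair.
        key = (run_id, node_id)
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]

    def exhausted(self, run_id, node_id, policy):
        # True once the count has reached the policy's max_retries.
        return self._counts.get((run_id, node_id), 0) >= policy.max_retries


budget = RetryBudget()
policy = FailurePolicy(max_retries=2, backoff_seconds=1.0)
first = budget.increment("run-1", "load")   # count is now 1
second = budget.increment("run-1", "load")  # count is now 2
```

With this comparison, a policy whose max_retries is 0 is exhausted before any retry is recorded, which matches the "never retry" reading of such a policy.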