====== Agentic Uncertainty ======

Uncertainty quantification (UQ) in LLM agents is a critical and underexplored challenge. Unlike single-turn question answering, agentic workflows involve sequential decisions where errors compound -- each step's uncertainty inherits from and amplifies prior steps. This page covers the "Spiral of Hallucination" phenomenon and the UProp framework for principled uncertainty propagation.

===== The Compounding Problem =====

In single-turn LLM interactions, uncertainty is localized to the current generation. In agentic systems, the agent operates in a partially observable environment where:

  * The true environment state $s_t$ is latent
  * The agent maintains a belief state $b_t(s_t) = P(s_t | h_t)$ based on history $h_t$
  * Internal cognitive errors (hallucinations, logic gaps, memory failures) become external constraints for future actions
  * Small early errors can cascade into catastrophic failures across the planning horizon

===== Spiral of Hallucination =====

The **Spiral of Hallucination** (arXiv:2601.15703) identifies how early epistemic errors in LLM agents propagate irreversibly through the context window, creating a self-reinforcing cycle of degraded reasoning.
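To make the cascade concrete, here is a minimal illustrative simulation (not taken from the paper): each step's error rate inherits and amplifies the accumulated error of prior steps, so trajectory-level reliability decays much faster than independent per-step errors would predict. The `amplification` parameter is a hypothetical knob standing in for how strongly a corrupted context degrades later steps.

```python
def trajectory_success(n_steps, base_error=0.05, amplification=1.0):
    """Probability that every step in a trajectory succeeds when each
    step's error rate is inflated by the errors inherited from prior steps.

    amplification == 1.0 models independent per-step errors;
    amplification  > 1.0 models the spiral: inherited error compounds.
    """
    p_success = 1.0
    error = base_error
    for _ in range(n_steps):
        p_success *= (1.0 - error)
        # Inherited uncertainty: the next step's error rate grows with
        # everything that has already gone into the context.
        error = min(1.0, error * amplification)
    return p_success

# Illustrative comparison over a 10-step horizon:
#   trajectory_success(10, 0.05, 1.0)  ≈ 0.60  (independent errors)
#   trajectory_success(10, 0.05, 1.3)  ≈ 0.07  (compounding spiral)
```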
=== Key Findings ===

  * Early errors (hallucinations, logic gaps, tool misinterpretation) propagate through the agent's context, biasing all subsequent reasoning steps
  * Unlike single-step generation, agentic workflows suffer from **state uncertainty** -- the agent's internal belief diverges from reality, and this divergence compounds
  * Self-reflection alone leads to "aimless corrections" that fail to halt the spiral
  * Existing UQ methods act passively (diagnosing risk without intervention), which is insufficient for agentic settings

=== POMDP Formulation ===

The paper models agent reliability failures as a Partially Observable Markov Decision Process (POMDP), distinguishing:

  * **Epistemic uncertainty**: Reducible via better reasoning or knowledge -- the agent's ignorance about the true state
  * **Aleatoric uncertainty**: Irreducible randomness inherent in the environment

=== Dual-Process UQ Framework ===

Inspired by dual-process theory from cognitive science:

  * **System 1 (fast, forward propagation)**: Memory-aware uncertainty propagation that prevents error spread at each step -- a proactive "immune system" for the agent's reasoning chain
  * **System 2 (slow, inverse correction)**: Reflective calibration that detects deviations and applies targeted corrections before errors solidify into the context

This bridges passive sensing (traditional UQ) with active reasoning (intervention), improving long-horizon task performance, calibration, and self-awareness.

===== UProp: Information-Theoretic Uncertainty Propagation =====

**UProp** (Duan et al., 2025) introduces a principled, information-theoretic framework for decomposing uncertainty in sequential agent decisions.
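One way to read the dual-process framework operationally is as a control loop: System 1 attaches an uncertainty estimate to every forward step, and System 2 is invoked only when that estimate signals a deviation, before the action is committed to the context. The sketch below assumes a hypothetical `agent` interface (`step` and `reflect` are illustrative names, not an API from the paper).

```python
def run_agent(agent, task, max_steps=20, deviation_threshold=0.7):
    """Dual-process control loop sketch.

    Assumes a hypothetical `agent` object exposing:
      - step(task, context) -> (action, uncertainty)   # System 1: fast step + UQ
      - reflect(context, action) -> corrected_action   # System 2: slow correction
    """
    context = []
    for _ in range(max_steps):
        # System 1: forward propagation -- every step carries an
        # uncertainty estimate alongside the proposed action.
        action, uncertainty = agent.step(task, context)
        # System 2: inverse correction, triggered only when propagated
        # uncertainty suggests the belief state has deviated.
        if uncertainty > deviation_threshold:
            action = agent.reflect(context, action)
        # Anything appended here solidifies into the context for all
        # later steps -- hence correction must happen before this point.
        context.append(action)
        if action == "DONE":
            break
    return context
```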
=== Uncertainty Decomposition ===

UProp decomposes LLM sequential decision uncertainty into two parts:

  * **Internal uncertainty**: Intrinsic to the current decision -- what existing single-turn UQ methods measure (e.g., token entropy, sampling variance)
  * **Extrinsic uncertainty**: A Mutual Information (MI) quantity $I(d_t; d_{<t})$ between the current decision $d_t$ and the preceding decisions $d_{<t}$ -- the uncertainty the current step inherits from the trajectory so far

The decomposition can be sketched in code as follows; `llm.sample_decisions` is an assumed interface that returns a probability distribution over candidate decisions:

<code python>
import numpy as np

class UncertaintyPropagator:
    """UProp-style uncertainty estimation for agentic decisions."""

    def __init__(self, llm, n_trajectories=10):
        self.llm = llm
        self.n_traj = n_trajectories

    def estimate_uncertainty(self, history, current_query):
        """Decompose uncertainty into internal + extrinsic components."""
        # Internal: uncertainty of the current decision in isolation.
        p_marginal = self.llm.sample_decisions(current_query, n=self.n_traj)
        internal = self._entropy(p_marginal)

        # Extrinsic: pointwise MI between the current decision and the
        # trajectory history. Averaging log-ratios under the conditional
        # gives KL(p(d_t | history) || p(d_t)), the per-history
        # contribution to I(d_t; d_{<t}).
        p_conditional = self.llm.sample_decisions(
            current_query, context=history, n=self.n_traj
        )
        pmi = np.log((p_conditional + 1e-10) / (p_marginal + 1e-10))
        extrinsic = float(np.sum(p_conditional * pmi))

        return {"internal": internal,
                "extrinsic": extrinsic,
                "total": internal + extrinsic}

    def should_intervene(self, uncertainty, threshold=0.7):
        """Dual-process: trigger System 2 correction if extrinsic
        uncertainty exceeds the threshold."""
        return uncertainty["extrinsic"] > threshold

    @staticmethod
    def _entropy(probs):
        return float(-np.sum(probs * np.log(probs + 1e-10)))
</code>

===== How Confidence Degrades Across Agent Steps =====

The degradation of agent confidence across sequential steps follows a characteristic pattern:

  - **Steps 1-2**: Uncertainty is primarily internal; the agent operates near its single-turn accuracy
  - **Steps 3-5**: Extrinsic uncertainty begins to accumulate as earlier decisions constrain the solution space
  - **Step 5+**: The spiral effect -- if early errors exist, belief-state divergence accelerates non-linearly
  - **Long horizon**: Without intervention, total uncertainty can exceed the threshold for reliable operation

The key insight is that //total agent uncertainty is not simply the sum of per-step uncertainties// -- the extrinsic (inherited) component introduces multiplicative compounding effects.

===== Mathematical Framework =====

For a sequence of agent decisions $d_1, d_2, \ldots, d_T$:

$U_{\text{total}}(d_t) = \underbrace{H(d_t)}_{\text{internal}} + \underbrace{I(d_t; d_{<t})}_{\text{extrinsic}}$
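As a worked numeric check of this decomposition, consider a toy joint distribution over a binary previous decision $d_{<t}$ and a binary current decision $d_t$ (values chosen purely for illustration, not from the paper). The marginal entropy $H(d_t)$ gives the internal term, and $I(d_t; d_{<t})$ computed from the joint table gives the extrinsic term:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability outcomes."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(joint):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) from a joint table P(x, y)."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1)
    py = joint.sum(axis=0)
    return entropy(px) + entropy(py) - entropy(joint.ravel())

# Toy joint over (d_{<t}, d_t): current decision correlates with the
# previous one, so some of its uncertainty is inherited.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])

internal = entropy(joint.sum(axis=0))    # H(d_t): single-turn entropy
extrinsic = mutual_information(joint)    # I(d_t; d_{<t}): inherited part
total = internal + extrinsic             # U_total(d_t)
```

Here `internal` is $\ln 2 \approx 0.693$ nats (a uniform marginal), while the correlation with the previous decision contributes a strictly positive extrinsic term; with independent decisions the MI, and hence the extrinsic term, would be zero.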