====== Least-to-Most Prompting ======

**Least-to-Most Prompting (LtM)** is a prompting paradigm introduced by Zhou et al. at Google Research in 2022 that enables LLMs to solve complex problems by decomposing them into an ordered sequence of simpler sub-problems, solved from easiest to hardest. Each sub-problem's solution is passed as context to the next, enabling compositional generalization far beyond what standard few-shot prompting achieves.

<code>
graph LR
    A[Complex Question Q] --> B[Decompose]
    B --> C[Sub-Q1 easiest]
    C --> D[Answer A1]
    D --> E[Sub-Q2 + A1 context]
    E --> F[Answer A2]
    F --> G[Sub-Qn + all prior As]
    G --> H[Final Answer]
</code>

===== Overview =====

Standard few-shot prompting and even Chain-of-Thought (CoT) prompting struggle when test problems are harder or longer than the provided examples. LtM overcomes this limitation by explicitly separating the //planning// of how to decompose a problem from the //execution// of solving each piece. This separation allows models to generalize to problems significantly more complex than those in the prompt exemplars.

===== Two-Stage Approach =====

LtM operates in two distinct stages:

  - **Decomposition Stage**: The LLM is prompted with exemplars showing how to break a complex problem $Q$ into an ordered list of sub-problems $Q_1, Q_2, \ldots, Q_n$, where complexity increases monotonically. The output is only the sub-problem list -- no solving occurs.
  - **Sequential Solving Stage**: For each sub-problem $Q_i$ (from $i=1$ to $n$), the LLM generates answer $A_i$ conditioned on all prior question-answer pairs. The prompt explicitly appends this accumulated context.
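The two-stage loop can be sketched without any model calls. In the minimal sketch below, ''decompose'' and ''solve_step'' are hypothetical stand-ins for the two LLM prompts, hand-coded here for the last-letter-concatenation task used in the original paper, so only the control flow (decompose first, then solve with accumulated context) is illustrated:

```python
def decompose(question_words):
    # Stage 1: order sub-problems from easiest (one word) to hardest (all words).
    # A real implementation would prompt an LLM for this list.
    return [question_words[: i + 1] for i in range(len(question_words))]

def solve_step(subproblem_words, context):
    # Stage 2: reuse the previous answer instead of recomputing from scratch.
    # Concatenate the prior result with the last letter of the newest word.
    prev_answer = context[-1][1] if context else ""
    return prev_answer + subproblem_words[-1][-1]

def least_to_most(words):
    context = []  # accumulated (sub-problem, answer) pairs
    for sub in decompose(words):
        context.append((sub, solve_step(sub, context)))
    return context[-1][1]  # answer to the hardest sub-problem

print(least_to_most(["think", "machine", "learning"]))  # -> keg
```

The key structural point is that ''context'' grows monotonically: step $i$ sees all pairs $(Q_1, A_1), \ldots, (Q_{i-1}, A_{i-1})$, exactly as in the sequential solving stage above.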
===== Formal Description =====

The decomposition yields an ordered sequence of sub-problems:

$$Q \rightarrow \{Q_1, Q_2, \ldots, Q_n\} \quad \text{where } \text{complexity}(Q_i) \leq \text{complexity}(Q_{i+1})$$

Each sub-problem is solved with full prior context:

$$A_i = \text{LLM}(Q_i \mid \{(Q_1, A_1), (Q_2, A_2), \ldots, (Q_{i-1}, A_{i-1})\})$$

The final answer $A_n$ addresses the original problem $Q$. This sequential conditioning creates an auditable chain of reasoning in which each step builds on previously solved prerequisites.

===== How It Differs from Chain-of-Thought =====

^ Aspect ^ Chain-of-Thought ^ Least-to-Most ^
| Structure | Monolithic: interleaved reasoning and answers in a single prompt | Separated: decomposition phase distinct from solving phase |
| Ordering | Implicit in exemplar patterns | Explicit least-to-most ordering |
| Context | Fixed exemplar context | Growing context: each step sees all prior solutions |
| Generalization | Struggles with OOD harder-than-seen problems | Robust to problems harder than exemplars |
| Dependency | No explicit dependency graph | Explicit dependency chain between sub-problems |

===== Code Example =====

<code python>
import openai  # client below is an openai.OpenAI instance created by the caller

def least_to_most(question, exemplars, client):
    # Stage 1: decompose the problem into sub-problems, easiest first
    decompose_prompt = (
        "Break the following problem into simpler sub-problems, "
        "ordered from easiest to hardest.\n\n"
    )
    for ex in exemplars["decomposition"]:
        decompose_prompt += f"Q: {ex['question']}\nSub-problems: {ex['subproblems']}\n\n"
    decompose_prompt += f"Q: {question}\nSub-problems:"
    decomposition = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": decompose_prompt}],
        temperature=0,
    ).choices[0].message.content
    subproblems = [s.strip() for s in decomposition.split("\n") if s.strip()]

    # Stage 2: solve sequentially, appending each (question, answer) pair
    # so every later step sees all earlier solutions
    solved_context = []
    for subproblem in subproblems:
        solve_prompt = "Solve each sub-problem using prior answers.\n\n"
        for ex in exemplars["solving"]:
            solve_prompt += f"{ex}\n\n"
        for sq, sa in solved_context:
            solve_prompt += f"Q: {sq}\nA: {sa}\n\n"
        solve_prompt += f"Q: {subproblem}\nA:"
        answer = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": solve_prompt}],
            temperature=0,
        ).choices[0].message.content
        solved_context.append((subproblem, answer))

    # The answer to the hardest (last) sub-problem is the final answer
    return solved_context[-1][1]
</code>

===== Experimental Results =====

Results use 8-shot exemplars at temperature 0; the original paper reports numbers for both GPT-3 code-davinci-002 and PaLM-540B:

^ Benchmark ^ Standard Prompting ^ CoT ^ Least-to-Most ^ Key Insight ^
| SCAN (length-4 OOD) | 14.3% | -- | 99.7% | ~7x gain on compositional generalization |
| DROP | -- | 69.1% | 78.2% | +9.1 points via entity extraction then arithmetic |
| GSM8K | -- | 58.0% | 74.4% | Solves 7-step problems from 3-step demos |

The most striking result is on SCAN, where LtM achieves near-perfect accuracy on compositional commands that are far longer and more complex than any seen in the exemplars. This demonstrates the power of explicit decomposition for out-of-distribution generalization.

===== References =====

  * [[https://arxiv.org/abs/2205.10625|Zhou et al., "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models", arXiv:2205.10625 (2022)]]

===== See Also =====

  * [[chain_of_verification|Chain-of-Verification (CoVe)]]
  * [[step_back_prompting|Step-Back Prompting]]
  * [[skeleton_of_thought|Skeleton-of-Thought]]