Least-to-Most Prompting (LtM) is a prompting paradigm introduced by Zhou et al. at Google Research in 2022 that enables LLMs to solve complex problems by decomposing them into an ordered sequence of simpler sub-problems, solved from easiest to hardest. Each sub-problem's solution is passed as context to the next, enabling compositional generalization far beyond what standard few-shot prompting achieves.
Standard few-shot prompting and even Chain-of-Thought (CoT) struggle when test problems are harder or longer than the provided examples. LtM overcomes this limitation by explicitly separating the planning of how to decompose a problem from the execution of solving each piece. This separation allows models to generalize to problems significantly more complex than those in the prompt exemplars.
LtM operates in two distinct stages: first *decomposition*, in which the model breaks the original problem into simpler sub-problems, and then *sequential solving*, in which each sub-problem is answered in order with the prior answers in context.

Decomposition yields an ordered set of sub-problems:
$$Q \rightarrow \{Q_1, Q_2, \ldots, Q_n\} \quad \text{where } \text{complexity}(Q_i) \leq \text{complexity}(Q_{i+1})$$
Each sub-problem is solved with full prior context:
$$A_i = \text{LLM}(Q_i \mid \{(Q_1, A_1), (Q_2, A_2), \ldots, (Q_{i-1}, A_{i-1})\})$$
The final answer $A_n$ addresses the original problem $Q$. This sequential conditioning creates an auditable chain of reasoning in which each step builds directly on the answers to its prerequisites.
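The growing-context pattern is easiest to see on the last-letter concatenation task from the LtM paper: each sub-problem is a longer prefix of the word list, and each step only adds one letter to the previous answer. A minimal deterministic sketch, with a stub solver standing in for the LLM:

```python
# Toy illustration of LtM's growing context on last-letter concatenation.
# The "solver" here is a deterministic stub, not a model call, so the
# chain of (sub-problem, answer) pairs is easy to inspect.

def decompose(words):
    # Least-to-most ordering: 1-word prefix, 2-word prefix, ..., full list.
    return [words[:i + 1] for i in range(len(words))]

def solve_step(subproblem_words, prior):
    # Each step reuses the previous answer (the concatenation so far)
    # and only does the incremental work: append one more last letter.
    prev = prior[-1][1] if prior else ""
    return prev + subproblem_words[-1][-1]

def least_to_most_toy(words):
    context = []  # [(sub-problem, answer), ...] grows with every step
    for sub in decompose(words):
        context.append((sub, solve_step(sub, context)))
    return context[-1][1]  # answer to the full problem
```

For example, `least_to_most_toy(["think", "machine", "learning"])` builds up `"k"`, then `"ke"`, then `"keg"`; a 10-word input works identically even though the decomposition exemplars might only show 3-word cases, which is exactly the length generalization LtM targets.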
| Aspect | Chain-of-Thought | Least-to-Most |
|---|---|---|
| Structure | Monolithic: interleaved reasoning and answers in a single prompt | Separated: decomposition phase distinct from solving phase |
| Ordering | Implicit in exemplar patterns | Explicit least-to-most ordering |
| Context | Fixed exemplar context | Growing context: each step sees all prior solutions |
| Generalization | Struggles with OOD harder-than-seen problems | Robust to problems harder than exemplars |
| Dependency | No explicit dependency graph | Explicit dependency chain between sub-problems |
A reference implementation of the two stages against the OpenAI chat completions API:

```python
def least_to_most(question, exemplars, client):
    """Two-stage Least-to-Most prompting.

    `client` is an OpenAI client (e.g. openai.OpenAI()); `exemplars`
    holds few-shot examples for the decomposition and solving stages.
    """
    # Stage 1: decompose into sub-problems, ordered easiest to hardest.
    decompose_prompt = (
        "Break the following problem into simpler sub-problems, "
        "ordered from easiest to hardest.\n\n"
    )
    for ex in exemplars["decomposition"]:
        decompose_prompt += f"Q: {ex['question']}\nSub-problems: {ex['subproblems']}\n\n"
    decompose_prompt += f"Q: {question}\nSub-problems:"

    decomposition = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": decompose_prompt}],
        temperature=0,
    ).choices[0].message.content
    subproblems = [s.strip() for s in decomposition.split("\n") if s.strip()]

    # Stage 2: solve sequentially, passing all prior (Q, A) pairs forward.
    solved_context = []
    for subproblem in subproblems:
        solve_prompt = "Solve each sub-problem using prior answers.\n\n"
        for ex in exemplars["solving"]:
            solve_prompt += f"{ex}\n\n"
        for sq, sa in solved_context:
            solve_prompt += f"Q: {sq}\nA: {sa}\n\n"
        solve_prompt += f"Q: {subproblem}\nA:"

        answer = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": solve_prompt}],
            temperature=0,
        ).choices[0].message.content
        solved_context.append((subproblem, answer))

    return solved_context[-1][1]  # answer to the hardest (final) sub-problem
```
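For offline experimentation, `least_to_most` only relies on one method of the client, `chat.completions.create`, returning an object exposing `.choices[0].message.content`. A stub that replays canned responses through that same interface (the question and replies below are illustrative, not real model output):

```python
from types import SimpleNamespace

class StubClient:
    """Replays canned responses through the attribute path the OpenAI
    client exposes: client.chat.completions.create(...)."""

    def __init__(self, replies):
        self._replies = list(replies)
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, **_kwargs):
        # Pop the next canned reply and wrap it in the response shape
        # that least_to_most unpacks: .choices[0].message.content
        content = self._replies.pop(0)
        message = SimpleNamespace(content=content)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

# One reply for the decomposition call, then one per sub-problem.
client = StubClient([
    "How many apples does Anna start with?\n"
    "How many apples remain after she gives 3 away?",
    "Anna starts with 5 apples.",
    "5 - 3 = 2, so 2 apples remain.",
])
```

Passing this stub as `client` to `least_to_most` (with empty exemplar lists) walks both stages deterministically: the first reply is split into two sub-problems, the next two replies answer them in order, and the last reply is returned as the final answer.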
Evaluated on PaLM-540B with 8-shot exemplars and temperature 0:
| Benchmark | Standard Prompting | CoT | Least-to-Most | Key Insight |
|---|---|---|---|---|
| SCAN (length-4 OOD) | 14.3% | – | 99.7% | ~7x gain on compositional generalization |
| DROP | – | 69.1% | 78.2% | +9.1 points via entity extraction then arithmetic |
| GSM8K | – | 58.0% | 74.4% | Solves 7-step problems from 3-step demos |
The most striking result is on SCAN, where LtM achieves near-perfect accuracy on compositional commands that are far longer and more complex than any seen in the exemplars. This demonstrates the power of explicit decomposition for out-of-distribution generalization.