====== Least-to-Most Prompting ======
**Least-to-Most Prompting (LtM)** is a prompting paradigm introduced by Zhou et al. at Google Research in 2022 that enables LLMs to solve complex problems by decomposing them into an ordered sequence of simpler sub-problems, solved from easiest to hardest. Each sub-problem's solution is passed as context to the next, enabling compositional generalization far beyond what standard few-shot prompting achieves.
<code mermaid>
graph LR
  A[Complex Question Q] --> B[Decompose]
  B --> C[Sub-Q1, easiest]
  C --> D[Answer A1]
  D --> E[Sub-Q2 + A1 as context]
  E --> F[Answer A2]
  F --> G[Sub-Qn + all prior answers]
  G --> H[Final Answer]
</code>
===== Overview =====
Standard few-shot prompting and even Chain-of-Thought (CoT) struggle when test problems are harder or longer than the provided examples. LtM overcomes this limitation by explicitly separating the //planning// of how to decompose a problem from the //execution// of solving each piece. This separation allows models to generalize to problems significantly more complex than those in the prompt exemplars.
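A toy instance makes this concrete. The sketch below runs the least-to-most pattern on the last-letter concatenation task used in the paper, with a hypothetical deterministic solver standing in for the LLM at both stages (the names ''decompose'' and ''solve_subproblem'' are illustrative, not from the paper):

<code python>
# Least-to-most on last-letter concatenation ("machine learning" -> "eg").
# A deterministic stand-in plays the role of the LLM for both stages.

def decompose(words):
    # Stage 1: one sub-problem per prefix, ordered easiest (1 word)
    # to hardest (all words).
    return [words[: i + 1] for i in range(len(words))]

def solve_subproblem(prefix, prior_answer):
    # Stage 2: each step reuses the previous answer and appends one
    # letter, instead of re-deriving the whole concatenation.
    return prior_answer + prefix[-1][-1]

def last_letter_concat(sentence):
    words = sentence.split()
    answer = ""
    for prefix in decompose(words):
        answer = solve_subproblem(prefix, answer)
    return answer

print(last_letter_concat("machine learning"))  # -> "eg"
</code>

Because each step conditions on the prior answer, the same two-word exemplar pattern extends unchanged to arbitrarily long inputs, which is exactly the compositional generalization LtM targets.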
===== Two-Stage Approach =====
LtM operates in two distinct stages:
- **Decomposition Stage**: The LLM is prompted with exemplars showing how to break a complex problem $Q$ into an ordered list of sub-problems $Q_1, Q_2, \ldots, Q_n$, where complexity increases monotonically. The output is only the sub-problem list -- no solving occurs.
- **Sequential Solving Stage**: For each sub-problem $Q_i$ (from $i=1$ to $n$), the LLM generates answer $A_i$ conditioned on all prior question-answer pairs. The prompt explicitly appends this accumulated context.
===== Formal Description =====
The decomposition yields an ordered sequence of sub-problems:
$$Q \rightarrow (Q_1, Q_2, \ldots, Q_n) \quad \text{where } \text{complexity}(Q_i) \leq \text{complexity}(Q_{i+1})$$
Each sub-problem is solved with full prior context:
$$A_i = \text{LLM}(Q_i \mid \{(Q_1, A_1), (Q_2, A_2), \ldots, (Q_{i-1}, A_{i-1})\})$$
The final answer $A_n$ addresses the original problem $Q$. This sequential conditioning creates an auditable chain of reasoning in which each step builds on the sub-problems already solved.
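The recursion can be sketched directly. Here a toy stub (a hypothetical stand-in for the LLM call, not a real model) records how much prior context each step receives, showing that $A_i$ is conditioned on all $i-1$ earlier question-answer pairs:

<code python>
def solve_least_to_most(subproblems, llm):
    # llm(q, context) plays the role of LLM(Q_i | {(Q_1,A_1),...,(Q_{i-1},A_{i-1})})
    context = []  # accumulated (Q_j, A_j) pairs, j < i
    for q in subproblems:
        a = llm(q, tuple(context))  # A_i conditioned on all prior pairs
        context.append((q, a))
    return context  # context[-1][1] is the final answer A_n

# Toy stub: "answers" each sub-problem by reporting the context it saw.
trace = []
def stub_llm(question, context):
    trace.append((question, len(context)))
    return f"answer with {len(context)} prior steps"

chain = solve_least_to_most(["Q1", "Q2", "Q3"], stub_llm)
print(chain[-1])  # -> ('Q3', 'answer with 2 prior steps')
</code>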
===== How It Differs from Chain-of-Thought =====
^ Aspect ^ Chain-of-Thought ^ Least-to-Most ^
| Structure | Monolithic: interleaved reasoning and answers in a single prompt | Separated: decomposition phase distinct from solving phase |
| Ordering | Implicit in exemplar patterns | Explicit least-to-most ordering |
| Context | Fixed exemplar context | Growing context: each step sees all prior solutions |
| Generalization | Struggles with OOD harder-than-seen problems | Robust to problems harder than exemplars |
| Dependency | No explicit dependency graph | Explicit dependency chain between sub-problems |
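The structural difference shows up directly in the prompt text. A minimal sketch for the same question (the question, sub-questions, and sub-answer below are made up for illustration):

<code python>
question = "Elsa has 5 apples. Anna has 2 more than Elsa. How many in total?"

# CoT: one monolithic prompt; reasoning and answer come back in a
# single completion.
cot_prompt = f"Q: {question}\nA: Let's think step by step."

# LtM: separate calls; each solving prompt grows with prior answers.
sub_q1 = "How many apples does Anna have?"
a1 = "Anna has 2 more than Elsa's 5, so 7 apples."
sub_q2 = "How many apples do they have in total?"
ltm_prompt_step2 = (
    f"{question}\n\n"
    f"Q: {sub_q1}\nA: {a1}\n\n"   # prior sub-problem, already solved
    f"Q: {sub_q2}\nA:"            # current sub-problem
)
</code>

The second LtM call sees the first sub-answer verbatim, whereas CoT must carry all intermediate reasoning inside a single generation.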
===== Code Example =====
<code python>
import openai  # the caller constructs the client, e.g. client = openai.OpenAI()

def least_to_most(question, exemplars, client):
    # Stage 1: decompose the question into sub-problems, easiest first.
    decompose_prompt = (
        "Break the following problem into simpler sub-problems, "
        "ordered from easiest to hardest.\n\n"
    )
    for ex in exemplars["decomposition"]:
        decompose_prompt += f"Q: {ex['question']}\nSub-problems: {ex['subproblems']}\n\n"
    decompose_prompt += f"Q: {question}\nSub-problems:"
    decomposition = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": decompose_prompt}],
        temperature=0,
    ).choices[0].message.content
    subproblems = [s.strip() for s in decomposition.split("\n") if s.strip()]

    # Stage 2: solve each sub-problem in order, appending every prior
    # question-answer pair to the prompt.
    solved_context = []
    for subproblem in subproblems:
        solve_prompt = "Solve each sub-problem using prior answers.\n\n"
        for ex in exemplars["solving"]:
            solve_prompt += f"{ex}\n\n"
        for sq, sa in solved_context:
            solve_prompt += f"Q: {sq}\nA: {sa}\n\n"
        solve_prompt += f"Q: {subproblem}\nA:"
        answer = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": solve_prompt}],
            temperature=0,
        ).choices[0].message.content
        solved_context.append((subproblem, answer))

    return solved_context[-1][1]  # answer to the hardest sub-problem = final answer
</code>
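To exercise this function offline, a stub only needs to reproduce the slice of the client surface it touches: ''client.chat.completions.create(...).choices[0].message.content''. The sketch below is a hypothetical test double, not part of the OpenAI SDK:

<code python>
from types import SimpleNamespace

class StubClient:
    """Mimics the client surface used above, returning canned replies in order."""
    def __init__(self, replies):
        self._replies = iter(replies)
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    def _create(self, model, messages, temperature=0):
        msg = SimpleNamespace(content=next(self._replies))
        return SimpleNamespace(choices=[SimpleNamespace(message=msg)])

client = StubClient(["first reply", "second reply"])
out = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "hi"}],
    temperature=0,
).choices[0].message.content
print(out)  # -> "first reply"
</code>

Passing such a stub as the ''client'' argument lets the prompt-assembly logic be unit-tested without any network access.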
===== Experimental Results =====
Representative results reported by Zhou et al. (headline numbers with GPT-3 code-davinci-002, greedy decoding):
^ Benchmark ^ Standard Prompting ^ CoT ^ Least-to-Most ^ Key Insight ^
| SCAN (length-4 OOD) | 14.3% | -- | 99.7% | ~7x gain on compositional generalization |
| DROP | -- | 69.1% | 78.2% | +9.1 points via entity extraction, then arithmetic |
| GSM8K | -- | 58.0% | 74.4% | Solves 7-step problems from 3-step demos |
The most striking result is on SCAN, where LtM achieves near-perfect accuracy on compositional commands that are far longer and more complex than any seen in the exemplars. This demonstrates the power of explicit decomposition for out-of-distribution generalization.
===== References =====
* [[https://arxiv.org/abs/2205.10625|Zhou et al., "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models", arXiv:2205.10625 (2022)]]
===== See Also =====
* [[chain_of_verification|Chain-of-Verification (CoVe)]]
* [[step_back_prompting|Step-Back Prompting]]
* [[skeleton_of_thought|Skeleton-of-Thought]]