====== Graph of Thoughts ======

**Graph of Thoughts (GoT)** is a reasoning framework introduced by Besta et al. (2024) that models LLM reasoning as an arbitrary directed graph $G = (V, E)$, where nodes $V$ represent individual "thoughts" (intermediate reasoning steps) and edges $E$ denote dependencies between them. By generalizing from linear chains to arbitrary graph topologies, GoT enables cycles, merges, and refinement operations that capture complex reasoning patterns, such as combining subproblem solutions in the style of dynamic programming.

===== Motivation =====

Prior prompting frameworks impose rigid structural constraints on reasoning. Chain-of-Thought (CoT) restricts reasoning to a linear sequence, while Tree-of-Thoughts (ToT) allows branching but not merging of independent paths. Many real-world problems require combining partial solutions, iterative refinement, and feedback loops: operations that demand a graph topology.

===== Framework Architecture =====

GoT defines reasoning as the programmatic construction of a directed graph through four core operations:

  * **Generation** — from a parent thought, prompt the LLM to produce $k$ child thoughts, enabling branching exploration
  * **Aggregation** — merge multiple parent thoughts into a single child (in-degree > 1), synthesizing partial solutions
  * **Refinement** — iteratively improve a thought by feeding it back through the LLM with downstream context
  * **Scoring** — evaluate each thought with an LLM-based heuristic, such as error counting or voting across states

These operations compose into arbitrary graph topologies. A controller orchestrates graph construction, an LLM executor processes individual thoughts, and a scoring module ranks candidates.
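The four operations above can be sketched as plain Python functions over a ''Thought'' node type. This is a minimal illustration with a stub ''llm'' callable; the names ''Thought'', ''generate'', ''aggregate'', ''refine'', and ''score'' are assumptions for this sketch, not the paper's API.

<code python>
# Minimal sketch of GoT's four core operations (illustrative names only).
from dataclasses import dataclass, field

@dataclass
class Thought:
    content: str
    score: float = 0.0
    parents: list = field(default_factory=list)   # incoming edges of the graph

def generate(llm, parent, k):
    """Generation: prompt the LLM for k child thoughts of one parent."""
    return [Thought(llm(f"expand #{i}: {parent.content}"), parents=[parent])
            for i in range(k)]

def aggregate(llm, parents):
    """Aggregation: merge several parents into one child (in-degree > 1)."""
    merged = llm("merge: " + " | ".join(p.content for p in parents))
    return Thought(merged, parents=list(parents))

def refine(llm, thought):
    """Refinement: feed a thought back through the LLM to improve it."""
    return Thought(llm(f"refine: {thought.content}"), parents=[thought])

def score(evaluator, thoughts):
    """Scoring: rank candidate thoughts by an evaluator heuristic."""
    for t in thoughts:
        t.score = evaluator(t)
    return sorted(thoughts, key=lambda t: t.score, reverse=True)
</code>

With a real model plugged in as ''llm'', a controller would chain these calls: generate candidates, score and prune them, aggregate the survivors, then refine the merged result.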
===== Graph Operations =====

The classes of reasoning graphs expressible by each framework form a hierarchy:

$$G_{\text{GoT}} \supseteq G_{\text{ToT}} \supseteq G_{\text{CoT}}$$

where CoT is a path graph (each node has in-degree and out-degree $\leq 1$), ToT is a tree (in-degree $\leq 1$, unbounded out-degree), and GoT permits arbitrary connectivity, including cycles.

The pipeline below sketches a GoT operation graph for a sorting task. Note that ''got_framework'', ''count_errors'', and ''test_data'' are illustrative names for this example, not the paper's reference implementation, whose interface differs:

<code python>
# Illustrative GoT pipeline; got_framework is a hypothetical API, and
# count_errors / test_data are assumed to be defined elsewhere.
from got_framework import operations, OperationsGraph

# Build a GoT reasoning graph for a sorting task
graph = OperationsGraph()

# Phase 1: generate 10 candidate partial solutions from the input
graph.append(operations.Generate(num_parents=1, num_children=10))

# Phase 2: score each candidate with an error-count heuristic
graph.append(operations.Score(
    num_inputs=1,
    combined=False,
    scoring_fn=lambda thought: -count_errors(thought)  # fewer errors, higher score
))

# Phase 3: keep the best 3 candidates
graph.append(operations.KeepBestN(n=3))

# Phase 4: aggregate the top candidates into a merged solution
graph.append(operations.Aggregate(num_parents=3, num_children=1))

# Phase 5: refine the merged solution
graph.append(operations.Refine(num_iterations=2))

# Phase 6: evaluate against ground truth
graph.append(operations.GroundTruth(test_cases=test_data))

result = graph.execute(llm=model, input_thought=problem)
</code>

===== Comparison with CoT and ToT =====

^ Method ^ Structure ^ Branching ^ Merging ^ Cycles ^ Expressiveness ^
| CoT | Linear chain | No | No | No | Sequential reasoning only |
| ToT | Tree | Yes | No | No | Parallel exploration, backtracking |
| GoT | Arbitrary graph | Yes | Yes | Yes | Full graph: aggregation, refinement, feedback |

GoT subsumes both CoT (as a path subgraph) and ToT (as a tree subgraph) while enabling fundamentally new reasoning patterns through aggregation and cyclic refinement.
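The degree constraints that separate the three structures can be checked mechanically. A small sketch, illustrative rather than part of any GoT library, that classifies a reasoning graph given as an edge list:

<code python>
# Classify a reasoning graph by the in-/out-degree bounds discussed above:
# path (CoT), tree (ToT), or general graph (GoT). Illustrative helper only.
from collections import Counter

def classify_topology(edges):
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    nodes = set(indeg) | set(outdeg)
    if all(indeg[n] <= 1 and outdeg[n] <= 1 for n in nodes):
        return "CoT"   # path: a single chain of thoughts
    if all(indeg[n] <= 1 for n in nodes):
        return "ToT"   # tree: branching, but no merging
    return "GoT"       # merging (in-degree > 1) requires a general graph
</code>

A single aggregation edge pair such as ''(0, 2), (1, 2)'' is exactly what pushes a reasoning graph out of the tree class and into GoT territory.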
===== Key Results =====

  * **Sorting tasks**: GoT increased sorting quality by roughly 62% over ToT while reducing cost by more than 31%, through iterative aggregation and refinement
  * **Document merging**: graph-structured reasoning produced higher-quality outputs by merging content from diverse sources
  * **Token efficiency**: aggregation and candidate pruning reduce total tokens compared to exhaustive ToT search
  * **Set intersection and keyword counting**: GoT's flexible topologies yielded higher accuracy by enabling subproblem decomposition and recombination

===== Mathematical Formulation =====

Each thought $t_i \in V$ is scored by a value function:

$$v(t_i) = \text{LLM}_{\text{eval}}(t_i, \text{context}(t_i))$$

Selection uses a ranking function over candidates:

$$t^* = \arg\max_{t_i \in \text{candidates}} v(t_i)$$

Aggregation combines multiple thoughts:

$$t_{\text{agg}} = \text{LLM}_{\text{gen}}(t_{p_1}, t_{p_2}, \ldots, t_{p_k})$$

where $t_{p_1}, \ldots, t_{p_k}$ are the parent thoughts being merged.

===== References =====

  * [[https://arxiv.org/abs/2308.09687|Besta et al. "Graph of Thoughts: Solving Elaborate Problems with Large Language Models" (arXiv:2308.09687)]]
  * [[https://arxiv.org/abs/2201.11903|Wei et al. "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" (arXiv:2201.11903)]]
  * [[https://arxiv.org/abs/2305.10601|Yao et al. "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" (arXiv:2305.10601)]]

===== See Also =====

  * [[tree_of_thoughts]]
  * [[chain_of_thought]]
  * [[language_agent_tree_search]]
  * [[buffer_of_thoughts]]