LLM-based agents are transforming data science workflows by autonomously executing end-to-end machine learning pipelines. DatawiseAgent (2025) introduces a notebook-centric framework that mimics how human data scientists work – iteratively planning, coding, debugging, and refining within Jupyter notebooks.
DatawiseAgent models the data science workflow as a Finite State Transducer (FST) with four orchestrated states connected by a transition function $\delta$ that responds to action signals and computational feedback:
States: $\{q_{plan}, q_{inc}, q_{debug}, q_{filter}\}$

Transition: $\delta(q, \text{signal}) \rightarrow q'$
The state transition function ensures that errors from planning or execution trigger appropriate repair sequences while constraints prevent infinite loops.
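The transducer above can be sketched as a plain transition table. This is a minimal illustration, not the paper's implementation: the signal names and the exact set of entries are assumptions chosen to match the transitions stated later in this article.

```python
# Minimal sketch of the four-state transducer. State and signal names
# follow the article's notation; the full table is an illustrative
# assumption.
TRANSITIONS = {
    ("plan", "error"): "debug",        # planning error -> repair
    ("plan", "ok"): "incremental",     # plan accepted -> execute stepwise
    ("incremental", "error"): "debug", # execution error -> repair
    ("incremental", "done"): "filter", # subtask finished -> clean up
    ("debug", "fixed"): "filter",      # repaired code -> clean up
    ("debug", "error"): "debug",       # retry (bounded by a turn limit)
    ("filter", "clean"): "plan",       # consolidated -> next subtask
}

def delta(state: str, signal: str) -> str:
    """Transition function delta(q, signal) -> q'."""
    return TRANSITIONS[(state, signal)]
```

Bounding repeated `("debug", "error")` self-loops with a turn limit is what prevents the infinite loops the constraint above refers to.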
DFS-like Adaptive Re-Planning: The agent explores the solution space as a tree structure. After completing a subtask, it evaluates whether to backtrack (explore sibling nodes), advance deeper, or terminate. This enables dynamic adaptation when initial strategies fail.
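The backtrack-or-advance behavior can be sketched as a depth-first search over a subtask tree. The `attempt` callback and the 0.3 score threshold are hypothetical stand-ins for the agent's subtask execution and self-evaluation.

```python
# Hedged sketch of DFS-style re-planning: subtasks form a tree; after
# each attempt the agent advances into children, backtracks to a
# sibling, or terminates. `attempt` scores a subtask (hypothetical).
def solve(node, attempt, threshold=0.3):
    score = attempt(node)                 # run the subtask, self-evaluate
    if score < threshold:
        return None                       # signal caller to try a sibling
    results = [node]
    for child in node.get("children", []):
        sub = solve(child, attempt, threshold)
        if sub is None:                   # child failed -> backtrack and
            continue                      # explore the next sibling
        results.extend(sub)               # otherwise advance deeper
    return results
```

A failed subtree simply returns `None`, so the parent falls through to the next sibling, which is the "explore sibling nodes" option described above.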
Incremental Execution: Rather than generating entire solutions at once, the agent progressively produces text and code step-by-step for each subtask, incorporating real-time execution feedback to handle LLM limitations and task interdependencies.
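A minimal sketch of that feedback loop, assuming hypothetical `generate_step` and `execute` interfaces in place of the agent's real components:

```python
# Incremental generation: produce one small code chunk at a time and
# feed real execution output back into the next generation step.
# `generate_step` and `execute` are illustrative stand-ins.
def run_incrementally(subtask, generate_step, execute, max_steps=10):
    context = []                             # accumulated (code, output) pairs
    for _ in range(max_steps):
        code = generate_step(subtask, context)  # next chunk, given feedback
        if code is None:                     # generator signals completion
            break
        output = execute(code)               # real execution feedback
        context.append((code, output))
    return context
```

Because each chunk is conditioned on the outputs of the previous ones, interdependencies between steps are resolved with observed results rather than with the model's guesses.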
Self-Debugging: When execution errors occur, the agent analyzes faulty code using fine-grained execution feedback. It iteratively refines code through LLM-based diagnosis, handling both syntax and logic errors across multiple repair attempts.
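The repair cycle can be sketched as a bounded retry loop; `execute` and `llm_fix` below are hypothetical interfaces standing in for the notebook kernel and the LLM diagnosis step.

```python
# Bounded self-repair: on error, ask the model to diagnose and rewrite
# the cell using the execution feedback, up to `max_attempts` times.
def self_debug(code, execute, llm_fix, max_attempts=3):
    result = execute(code)
    for _ in range(max_attempts):
        if not result.get("error"):
            return code, result                  # success
        code = llm_fix(code, result["error"])    # LLM-based diagnosis
        result = execute(code)
    return code, result                          # may still be failing
```

The attempt cap plays the same role as the FST's loop constraints: a cell that cannot be repaired is handed off rather than retried forever.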
Post-Filtering: After debugging, this cleanup stage removes errors and redundancies to produce clean, executable code. It learns from past mistakes to prevent error accumulation in subsequent cells.
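A rough sketch of such a cleanup pass, assuming cells are represented as simple dicts with `source` and optional `error` fields (an illustrative representation, not the framework's actual data model):

```python
# Post-filtering sketch: keep only cells that executed cleanly and drop
# verbatim duplicates, yielding one consolidated script.
def post_filter(cells):
    seen, kept = set(), []
    for cell in cells:
        if cell.get("error"):        # drop failed debugging attempts
            continue
        src = cell["source"].strip()
        if src in seen:              # drop redundant repeats
            continue
        seen.add(src)
        kept.append(src)
    return "\n\n".join(kept)
```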
All agent-environment interactions occur through Jupyter notebook cells – markdown cells for planning and observations, code cells for execution. This single interface unifies communication between the agent and its computational environment, keeping reasoning, code, and results in one coherent artifact.
The trajectory through the FST can be expressed as:
$$\tau = (q_0, a_0, q_1, a_1, \ldots, q_n)$$
where each state $q_i$ represents one of the four processing stages. The transition function incorporates execution signals:
$$\delta(q_{plan}, \text{error}) \rightarrow q_{debug}$$

$$\delta(q_{debug}, \text{fixed}) \rightarrow q_{filter}$$

$$\delta(q_{filter}, \text{clean}) \rightarrow q_{plan}$$
class DatawiseAgent:
    """FST-driven data-science agent operating on a Jupyter notebook."""

    def __init__(self, notebook, llm, max_turns=20):
        self.notebook = notebook
        self.llm = llm
        self.state = "plan"
        self.max_turns = max_turns
        self.history = []

    def run(self, task_description):
        self.notebook.add_markdown(task_description)
        subtask, code, result = None, None, None
        for turn in range(self.max_turns):
            self.history.append((self.state, turn))  # record state before transition
            if self.state == "plan":
                # DFS-like adaptive re-planning: backtrack on low-scoring subtasks
                subtask = self.llm.plan(self.history, task_description)
                if subtask.score < 0.3:
                    self.notebook.rollback_to(subtask.branch_point)
                self.state = "execute"
            elif self.state == "execute":
                # Incremental execution with real-time feedback
                code = self.llm.generate_code(subtask)
                result = self.notebook.execute_cell(code)
                self.state = "debug" if result.has_error else "plan"
            elif self.state == "debug":
                # Self-debugging: diagnose the error and retry the repaired cell
                code = self.llm.diagnose_and_fix(result.error, code)
                result = self.notebook.execute_cell(code)
                self.state = "filter" if not result.has_error else "debug"
            elif self.state == "filter":
                # Post-filtering: consolidate clean, executable cells
                clean_code = self.llm.post_filter(self.notebook.cells)
                self.notebook.consolidate(clean_code)
                self.state = "plan"
        return self.notebook