====== Agentic AI Workflows ======

An **agentic AI workflow** is a process in which AI agents autonomously perceive, decide, reason through multi-step tasks, and execute actions with minimal human intervention. Unlike single-shot prompting, agentic workflows place a model inside a loop: it plans, acts, observes results, and iterates until a goal is satisfied.((Andrew Ng introduced the term to mainstream practitioner audiences at the Sequoia AI Ascent conference, March 2024: [[https://x.com/AndrewYNg/status/1770897666702233815|Andrew Ng on X, March 2024]])) Ng's key insight captures the paradigm shift: //"Better workflows beat better models."//

===== Definition =====

A workflow earns the label //agentic// when it exhibits all of the following properties:

  * **Autonomy**: the agent selects its next action without step-by-step human direction.
  * **Iterative reasoning**: outputs are refined across multiple passes rather than returned in one shot.
  * **Tool integration**: the agent can invoke external capabilities such as web search, code execution, database queries, and REST APIs.
  * **Adaptive planning**: if a sub-task fails or new information arrives, the agent revises its plan on the fly.
  * **Memory**: state is preserved across steps (in-context, vector store, or external database) so earlier results inform later decisions.

Together, these properties allow a small model running inside a well-designed loop to outperform a much larger model running in zero-shot mode, a finding repeatedly confirmed on coding benchmarks (see [[#benchmarks|Benchmarks]] below).

===== Four Design Patterns =====

Andrew Ng identified four foundational design patterns for agentic systems at the Sequoia AI Ascent conference in March 2024.((Full thread: [[https://x.com/AndrewYNg/status/1773393357022298617|Andrew Ng on X, agentic design patterns, March 2024]]))

==== Reflection ====

The agent critiques its own output and iterates.
A separate //critic// call (which may use the same model) scores or annotates the draft; the generator then revises. This simple loop pushed GPT-4 from **67% to 88%** pass@1 on the HumanEval coding benchmark, without any change to the underlying model weights.

==== Tool Use (ReAct) ====

The agent interleaves //Reasoning// and //Acting// steps (the ReAct pattern): it emits a thought, calls a tool (web search, Python interpreter, SQL query, external API), observes the result, and continues reasoning. Tool use breaks the knowledge-cutoff barrier and enables real-time data access.

==== Planning ====

The agent decomposes a high-level goal into an ordered sequence of sub-tasks, executes them, and replans when a step fails or returns unexpected results. Planning agents can handle objectives that span dozens of tool calls and require backtracking.

==== Multi-Agent ====

Specialised agents collaborate: one orchestrates, while others execute domain-specific subtasks (coding, research, QA, summarisation). Systems such as **ChatDev** model an entire software company as a society of agents with distinct roles. Multi-agent architectures increase parallelism and allow each agent to stay within a focused context window.

==== Human-in-the-Loop (Emerging Fifth Pattern) ====

Practitioners increasingly treat deliberate human checkpoints as a first-class design element rather than an afterthought. The agent pauses at high-stakes decision points, presents its reasoning, and waits for approval before proceeding. This pattern is especially prominent in enterprise deployments where auditability and compliance are required.
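The Reflection pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not code from any framework: ''call_model'' is a hypothetical stand-in for a real LLM API call, stubbed here so the example is self-contained.

```python
# Minimal sketch of the Reflection pattern: generate, critique, revise.
# `call_model` is a hypothetical stand-in for a real LLM API call,
# stubbed with canned responses so the example runs offline.

def call_model(role: str, prompt: str) -> str:
    """Fake LLM. A real system would call a hosted model here."""
    if role == "critic":
        # The stub critic rejects drafts that lack a docstring.
        return "PASS" if '"""' in prompt else "FAIL: add a docstring"
    if "Feedback:" in prompt:
        # After critique, the stub generator returns a documented version.
        return 'def add(a, b):\n    """Return a + b."""\n    return a + b'
    return "def add(a, b):\n    return a + b"  # first one-shot draft

def reflect(task: str, max_iters: int = 3) -> str:
    """Generator/critic loop: revise until the critic passes the draft."""
    draft = call_model("generator", task)           # initial draft
    for _ in range(max_iters):
        verdict = call_model("critic", draft)       # critique pass
        if verdict == "PASS":
            break                                   # good enough, stop
        # Feed the critique back to the generator and revise.
        draft = call_model("generator", f"{task}\nFeedback: {verdict}")
    return draft

result = reflect("write an add function")
```

With a real model behind ''call_model'', the same loop structure applies unchanged; only the stubbed responses are replaced by API calls.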
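The ReAct pattern can be sketched the same way. As a rough, self-contained illustration: ''call_model'', the ''Action:''/''Observation:'' text format, and the toy ''TOOLS'' registry below are all hypothetical, not taken from any real SDK.

```python
# Minimal sketch of a ReAct-style loop: the model emits an action, the
# runtime executes the named tool, and the observation is fed back.
# `call_model` is a hypothetical stub so the example runs offline.

def call_model(history: list[str]) -> str:
    """Fake LLM returning the next thought/action given the transcript."""
    if not any(line.startswith("Observation:") for line in history):
        return "Action: calc[2 + 3]"   # decide to use the calculator tool
    return "Final Answer: 5"           # finish once a result was observed

# Toy tool registry. eval() is for illustration only: never eval
# untrusted input in a real agent runtime.
TOOLS = {"calc": lambda expr: str(eval(expr))}

def react_loop(question: str, max_steps: int = 5) -> str:
    """Interleave reasoning and acting until a final answer is produced."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = call_model(history)
        history.append(step)
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        if step.startswith("Action:"):
            # Parse "Action: name[argument]" and execute the tool.
            name, _, arg = step.removeprefix("Action: ").partition("[")
            result = TOOLS[name](arg.rstrip("]"))
            history.append(f"Observation: {result}")
    return "max steps reached"

answer = react_loop("What is 2 + 3?")
```

The same transcript-and-dispatch structure underlies production ReAct implementations; frameworks differ mainly in how actions are encoded (free text versus structured tool-call messages).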
===== Benchmarks =====

HumanEval pass@1 results illustrate the leverage that agentic scaffolding adds on top of raw model capability:

^ Setup ^ HumanEval pass@1 ^
| GPT-3.5, zero-shot | 48% |
| GPT-4, zero-shot | 67% |
| GPT-3.5, agentic (iterative) | >67% (exceeds GPT-4 zero-shot) |
| GPT-4, agentic (AlphaCodium flow) | **95.1%** |

The AlphaCodium result, achieved by wrapping GPT-4 in a multi-step code-generation and test-driven refinement loop, exceeds the zero-shot score by more than 28 percentage points without any fine-tuning.

===== Frameworks =====

The following open-source and managed frameworks are widely used to build agentic AI workflows:

^ Framework ^ Maintainer ^ Primary abstraction ^ Notes ^
| [[https://github.com/langchain-ai/langgraph|LangGraph]] | LangChain | Stateful directed graph | Fine-grained control over agent loops; see [[langgraph]] |
| [[https://github.com/crewAIInc/crewAI|CrewAI]] | CrewAI Inc. | Role-based crew of agents | High-level; see [[crewai]] |
| [[https://github.com/microsoft/autogen|AutoGen]] | Microsoft | Conversational multi-agent | Research-grade; see [[autogen]] |
| [[https://github.com/run-llama/llama_index|LlamaIndex]] | LlamaIndex | Data-centric agent pipelines | Strong RAG integration |
| [[https://platform.openai.com/docs/guides/agents|OpenAI Agents SDK]] | OpenAI | Handoffs and guardrails | First-party SDK for GPT models |
| [[https://aws.amazon.com/bedrock/agents/|Amazon Bedrock Agents]] | AWS | Managed agent runtime | Enterprise-managed; native AWS tooling |
| [[https://cloud.google.com/vertex-ai/generative-ai/docs/agent-builder/agents|Google Vertex AI Agent Builder]] | Google Cloud | Managed agent runtime | Integrates Gemini models and Google Search |

===== Enterprise Adoption =====

Adoption has moved from research to production faster than most technology cycles:

  * **PwC** reports that **79% of organisations** are running AI agents in production as of early 2026, up from negligible deployment two years prior.
  * **AT&T** processes **8 billion tokens per day** through its internal agentic stack and has achieved a **90% cost reduction** compared with earlier approaches, a case study in scaling agent infrastructure to enterprise traffic.((AT&T agentic AI deep-dive: [[https://www.techbuddies.io/2026/02/27/inside-atts-agentic-ai-stack-how-8-billion-tokens-a-day-led-to-a-90-cost-cut/|Inside AT&T's Agentic AI Stack, TechBuddies, Feb 2026]]))
  * **Deloitte Tech Trends 2026** cautions that organisations deploying agents at scale must address trust, observability, and governance before expanding agent autonomy further.((Deloitte Technology Trends 2026, Agentic AI Strategy: [[https://www.deloitte.com/us/en/insights/topics/technology-management/tech-trends/2026/agentic-ai-strategy.html|Deloitte, 2026]]))

The AT&T cost-reduction figures underline Ng's workflow-over-model thesis: infrastructure and orchestration design matter as much as model selection.

===== See Also =====

  * [[agent_orchestration]]
  * [[react_framework]]
  * [[plan_and_execute_agents]]
  * [[multi_agent_systems]]
  * [[crewai]]
  * [[langgraph]]
  * [[autogen]]