====== Agentic AI Workflows ======

An **agentic AI workflow** is a process in which AI agents autonomously perceive, decide, reason through multi-step tasks, and execute actions with minimal human intervention. Unlike single-shot prompting, agentic workflows place a model inside a loop: it plans, acts, observes results, and iterates until a goal is satisfied.((Andrew Ng introduced the term to mainstream practitioner audiences at the Sequoia AI Ascent conference, March 2024: [[https://x.com/AndrewYNg/status/1770897666702233815|Andrew Ng on X, March 2024]])) Ng's key insight captures the paradigm shift: //"Better workflows beat better models."//

===== Definition =====

A workflow earns the label //agentic// when it exhibits all of the following properties:

  * **Autonomy**: the agent selects its next action without step-by-step human direction.
  * **Iterative reasoning**: outputs are refined across multiple passes rather than returned in one shot.
  * **Tool integration**: the agent can invoke external capabilities such as web search, code execution, database queries, and REST APIs.
  * **Adaptive planning**: if a sub-task fails or new information arrives, the agent revises its plan on the fly.
  * **Memory**: state is preserved across steps (in-context, vector store, or external database) so earlier results inform later decisions.

Together, these properties allow a small model running inside a well-designed loop to outperform a much larger model running in zero-shot mode, a finding repeatedly confirmed on coding benchmarks (see [[#benchmarks|Benchmarks]] below).

===== Four Design Patterns =====

Andrew Ng identified four foundational design patterns for agentic systems at the Sequoia AI Ascent conference in March 2024.((Full thread: [[https://x.com/AndrewYNg/status/1773393357022298617|Andrew Ng on X, agentic design patterns, March 2024]]))

==== Reflection ====

The agent critiques its own output and iterates.
A separate //critic// call (which may use the same model) scores or annotates the draft; the generator then revises. This simple loop pushed GPT-4 from **67% to 88%** pass@1 on the HumanEval coding benchmark, without any change to the underlying model weights.

==== Tool Use (ReAct) ====

The agent interleaves //Reasoning// and //Acting// steps (the ReAct pattern): it emits a thought, calls a tool (web search, Python interpreter, SQL query, external API), observes the result, and continues reasoning. Tool use breaks the knowledge-cutoff barrier and enables real-time data access.

==== Planning ====

The agent decomposes a high-level goal into an ordered sequence of sub-tasks, executes them, and replans when a step fails or returns unexpected results. Planning agents can handle objectives that span dozens of tool calls and require backtracking.

==== Multi-Agent ====

Specialised agents collaborate: one orchestrates, while others execute domain-specific subtasks (coding, research, QA, summarisation). Systems such as **ChatDev** model an entire software company as a society of agents with distinct roles. Multi-agent architectures increase parallelism and allow each agent to stay within a focused context window.

==== Human-in-the-Loop (Emerging Fifth Pattern) ====

Practitioners increasingly treat deliberate human checkpoints as a first-class design element rather than an afterthought. The agent pauses at high-stakes decision points, presents its reasoning, and waits for approval before proceeding. This pattern is especially prominent in enterprise deployments where auditability and compliance are required.
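The Reflection pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not code from any framework: ''call_model'' is a hypothetical stand-in for a real LLM API call, stubbed here so the example is self-contained.

```python
# Minimal sketch of the Reflection pattern: generate, critique, revise.
# `call_model` is a hypothetical stand-in for a real LLM API call,
# stubbed with canned responses so the example runs offline.

def call_model(role: str, prompt: str) -> str:
    """Fake LLM. A real system would call a hosted model here."""
    if role == "critic":
        # The stub critic rejects drafts that lack a docstring.
        return "PASS" if '"""' in prompt else "FAIL: add a docstring"
    if "Feedback:" in prompt:
        # After critique, the stub generator returns a documented version.
        return 'def add(a, b):\n    """Return a + b."""\n    return a + b'
    return "def add(a, b):\n    return a + b"  # first one-shot draft

def reflect(task: str, max_iters: int = 3) -> str:
    """Generator/critic loop: revise until the critic passes the draft."""
    draft = call_model("generator", task)           # initial draft
    for _ in range(max_iters):
        verdict = call_model("critic", draft)       # critique pass
        if verdict == "PASS":
            break                                   # good enough, stop
        # Feed the critique back to the generator and revise.
        draft = call_model("generator", f"{task}\nFeedback: {verdict}")
    return draft

result = reflect("write an add function")
```

With a real model behind ''call_model'', the same loop structure applies unchanged; only the stubbed responses are replaced by API calls.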
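The ReAct pattern can be sketched the same way. As a rough, self-contained illustration: ''call_model'', the ''Action:''/''Observation:'' text format, and the toy ''TOOLS'' registry below are all hypothetical, not taken from any real SDK.

```python
# Minimal sketch of a ReAct-style loop: the model emits an action, the
# runtime executes the named tool, and the observation is fed back.
# `call_model` is a hypothetical stub so the example runs offline.

def call_model(history: list[str]) -> str:
    """Fake LLM returning the next thought/action given the transcript."""
    if not any(line.startswith("Observation:") for line in history):
        return "Action: calc[2 + 3]"   # decide to use the calculator tool
    return "Final Answer: 5"           # finish once a result was observed

# Toy tool registry. eval() is for illustration only: never eval
# untrusted input in a real agent runtime.
TOOLS = {"calc": lambda expr: str(eval(expr))}

def react_loop(question: str, max_steps: int = 5) -> str:
    """Interleave reasoning and acting until a final answer is produced."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = call_model(history)
        history.append(step)
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        if step.startswith("Action:"):
            # Parse "Action: name[argument]" and execute the tool.
            name, _, arg = step.removeprefix("Action: ").partition("[")
            result = TOOLS[name](arg.rstrip("]"))
            history.append(f"Observation: {result}")
    return "max steps reached"

answer = react_loop("What is 2 + 3?")
```

The same transcript-and-dispatch structure underlies production ReAct implementations; frameworks differ mainly in how actions are encoded (free text versus structured tool-call messages).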
===== Benchmarks =====

HumanEval pass@1 results illustrate the leverage that agentic scaffolding adds on top of raw model capability:

^ Setup ^ HumanEval pass@1 ^
| GPT-3.5, zero-shot | 48% |
| GPT-4, zero-shot | 67% |
| GPT-3.5, agentic (iterative) | >67% (exceeds GPT-4 zero-shot) |
| GPT-4, agentic (AlphaCodium flow) | **95.1%** |

The AlphaCodium result, achieved by wrapping GPT-4 in a multi-step code-generation and test-driven refinement loop, exceeds the zero-shot score by more than 28 percentage points without any fine-tuning.

===== Frameworks =====

The following open-source and managed frameworks are widely used to build agentic AI workflows:

^ Framework ^ Maintainer ^ Primary abstraction ^ Notes ^
| [[https://github.com/langchain-ai/langgraph|LangGraph]] | LangChain | Stateful directed graph | Fine-grained control over agent loops; see [[langgraph]] |
| [[https://github.com/crewAIInc/crewAI|CrewAI]] | CrewAI Inc. | Role-based crew of agents | High-level; see [[crewai]] |
| [[https://github.com/microsoft/autogen|AutoGen]] | Microsoft | Conversational multi-agent | Research-grade; see [[autogen]] |
| [[https://github.com/run-llama/llama_index|LlamaIndex]] | LlamaIndex | Data-centric agent pipelines | Strong RAG integration |
| [[https://platform.openai.com/docs/guides/agents|OpenAI Agents SDK]] | OpenAI | Handoffs and guardrails | First-party SDK for GPT models |
| [[https://aws.amazon.com/bedrock/agents/|Amazon Bedrock Agents]] | AWS | Managed agent runtime | Enterprise-managed; native AWS tooling |
| [[https://cloud.google.com/vertex-ai/generative-ai/docs/agent-builder/agents|Google Vertex AI Agent Builder]] | Google Cloud | Managed agent runtime | Integrates Gemini models and Google Search |

===== Enterprise Adoption =====

Adoption has moved from research to production faster than most technology cycles:

  * **PwC** reports that **79% of organisations** are running AI agents in production as of early 2026, up from negligible deployment two years prior.
  * **AT&T** processes **8 billion tokens per day** through its internal agentic stack and has achieved a **90% cost reduction** compared with earlier approaches, a case study in scaling agent infrastructure to enterprise traffic.((AT&T agentic AI deep-dive: [[https://www.techbuddies.io/2026/02/27/inside-atts-agentic-ai-stack-how-8-billion-tokens-a-day-led-to-a-90-cost-cut/|Inside AT&T's Agentic AI Stack, TechBuddies, Feb 2026]]))
  * **Deloitte Tech Trends 2026** cautions that organisations deploying agents at scale must address trust, observability, and governance before expanding agent autonomy further.((Deloitte Technology Trends 2026, Agentic AI Strategy: [[https://www.deloitte.com/us/en/insights/topics/technology-management/tech-trends/2026/agentic-ai-strategy.html|Deloitte, 2026]]))

The AT&T cost-reduction figures underline Ng's workflow-over-model thesis: infrastructure and orchestration design matter as much as model selection.

===== See Also =====

  * [[agent_orchestration]]
  * [[react_framework]]
  * [[plan_and_execute_agents]]
  * [[multi_agent_systems]]
  * [[crewai]]
  * [[langgraph]]
  * [[autogen]]