Haystack is an open-source AI orchestration framework by deepset for building customizable, production-ready LLM applications through modular pipelines. With 24.6K GitHub stars and active development since 2019, it is one of the longest-standing AI frameworks, evolving from a question-answering system to a comprehensive pipeline-based orchestration platform.
framework python pipelines rag production orchestration deepset
Haystack was created by deepset (Berlin, Germany) in 2019 as an open-source question-answering framework focused on extractive QA over documents. Over six years, it has evolved through multiple paradigm shifts — from neural search with dense retrievers (2020-2021) to RAG pipelines (2022-2023) to the fully redesigned Haystack 2.0 with component-based architecture and agent workflows (2023-2024). The framework emphasizes production-readiness with built-in observability, async execution, and Kubernetes integration.
Haystack's architecture is pipeline-centric: strongly typed components are wired together into a directed graph, and the resulting pipelines run on an infrastructure layer that provides observability, async execution, and Kubernetes deployment.
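To make the component-and-connection model concrete, here is a minimal plain-Python sketch of a pipeline graph. This is an illustration of the idea only, not Haystack's actual API; the names below (`MiniPipeline`, `add`, `connect`, `run`) are invented for the example.

```python
# Minimal sketch of a pipeline-as-graph: named components, explicit
# connections, and a run() that routes each component's output dict to
# the next component along the graph's edges. Invented for illustration;
# Haystack's real Pipeline API differs.

class MiniPipeline:
    def __init__(self):
        self.components = {}   # name -> callable taking/returning a dict
        self.connections = []  # (sender_name, receiver_name) edges

    def add(self, name, fn):
        self.components[name] = fn

    def connect(self, sender, receiver):
        self.connections.append((sender, receiver))

    def run(self, name, data):
        # Run one component, then push its output along outgoing edges.
        out = self.components[name](data)
        for sender, receiver in self.connections:
            if sender == name:
                out = self.run(receiver, out)
        return out

# Two toy components: a "retriever" and a "prompt builder".
def retriever(inputs):
    docs = [d for d in inputs["corpus"] if inputs["query"].lower() in d.lower()]
    return {"documents": docs, "query": inputs["query"]}

def prompt_builder(inputs):
    joined = " ".join(inputs["documents"])
    return {"prompt": f"Context: {joined}\nQuestion: {inputs['query']}"}

pipe = MiniPipeline()
pipe.add("retriever", retriever)
pipe.add("prompt", prompt_builder)
pipe.connect("retriever", "prompt")

result = pipe.run("retriever", {
    "corpus": ["Haystack builds pipelines.", "Unrelated text."],
    "query": "Haystack",
})
print(result["prompt"])
```

The design point the sketch captures is that components never call each other directly: all dataflow goes through declared connections, which is what lets a framework like Haystack validate types at the edges and visualize the graph.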
Building a RAG pipeline with Haystack 2.x:
```python
from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.dataclasses import Document

# Set up document store with sample data
doc_store = InMemoryDocumentStore()
doc_store.write_documents([
    Document(content="Haystack is an AI orchestration framework by deepset."),
    Document(content="It supports modular pipelines for RAG and search."),
    Document(content="Haystack 2.0 introduced component-based architecture."),
])

# Build RAG pipeline
template = """
Given these documents, answer the question.
Documents:
{% for doc in documents %}{{ doc.content }}{% endfor %}
Question: {{ question }}
"""

rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=doc_store))
rag_pipeline.add_component("prompt", PromptBuilder(template=template))
rag_pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o"))
rag_pipeline.connect("retriever", "prompt.documents")
rag_pipeline.connect("prompt", "llm")

result = rag_pipeline.run({
    "retriever": {"query": "What is Haystack?"},
    "prompt": {"question": "What is Haystack?"}
})
print(result["llm"]["replies"][0])
```
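A practical consequence of the graph model is that pipelines can be serialized for deployment: Haystack 2.x pipelines can be dumped to and loaded from YAML (`Pipeline.dumps()` / `Pipeline.loads()`). The abbreviated fragment below sketches the general shape of such a file; it is illustrative, not an exact dump of the pipeline above.

```yaml
# Illustrative shape of a serialized Haystack 2.x pipeline (abbreviated).
components:
  retriever:
    type: haystack.components.retrievers.in_memory.bm25_retriever.InMemoryBM25Retriever
    init_parameters:
      top_k: 10
  prompt:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |
        Given these documents, answer the question. ...
  llm:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      model: gpt-4o
connections:
  - sender: retriever.documents
    receiver: prompt.documents
  - sender: prompt.prompt
    receiver: llm.prompt
```

Storing pipelines as data rather than code is what enables the Kubernetes-style deployment workflow mentioned above: the same YAML definition can be versioned, reviewed, and loaded by a serving process.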
| Aspect | Haystack | LangChain |
|---|---|---|
| Core Paradigm | Pipeline DAGs with visual UI | Chains/Agents with LCEL |
| Modularity | Strong typing, 200+ components | Flexible, vast integrations |
| Production | Built-in observability, K8s, async | Requires LangSmith/LangServe |
| RAG Focus | Optimized for search/retrieval | General-purpose agents |
| History | Since 2019 (6+ years) | Since 2022 |
| Stars | 24.6K (steady growth) | 131K (larger community) |