ChemCrow: Augmenting LLMs with Chemistry Tools

ChemCrow is an LLM-powered chemistry agent introduced by Bran et al. (2023). It augments GPT-4 with 18 expert-designed chemistry tools so that it can autonomously perform tasks in organic synthesis, drug discovery, and materials design¹⁾. The paper, which has 795 citations, demonstrates that domain-specific tool augmentation lets LLMs bridge the gap between computational and experimental chemistry, up to and including executing syntheses on robotic platforms.

arXiv:2304.05376

The 18 Chemistry Expert Tools

ChemCrow integrates tools across four categories:

General Tools

WebSearch: Query the web for chemical information
LitSearch: Search scientific literature databases
Python REPL: Execute computational chemistry code
HumanInput: Query a human expert when uncertain

Molecular Analysis and Design

Name2SMILES / CAS2SMILES / SMILES2Name: Chemical name and identifier conversions
FuncGroups: Identify functional groups in molecules
Similarity: Compute molecular similarity scores using Tanimoto coefficient
SMILES2Weight: Calculate molecular weight from SMILES representation
ModifyMol: Generate structural modifications for retro/forward synthesis
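
The Tanimoto coefficient used by the Similarity tool is the ratio of shared features to total features, $T(A, B) = |A \cap B| / |A \cup B|$. The following is a minimal sketch of that formula, assuming each molecule has already been reduced to a set of fingerprint feature identifiers. The function name and the feature sets are illustrative placeholders, not ChemCrow's own implementation, which would derive fingerprints with a cheminformatics toolkit.

```python
def tanimoto(features_a: set, features_b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two fingerprint feature sets."""
    union = features_a | features_b
    if not union:
        # Two empty fingerprints have no defined similarity;
        # this sketch returns 0.0 by convention.
        return 0.0
    return len(features_a & features_b) / len(union)


# Hypothetical feature IDs standing in for real fingerprint bits.
mol_a = {1, 4, 7, 9, 12}
mol_b = {1, 4, 8, 9, 15}

score = tanimoto(mol_a, mol_b)
print(f"Tanimoto similarity: {score:.3f}")
```

Here three features are shared out of seven distinct ones, so the score is 3/7. Identical sets score 1.0 and disjoint sets score 0.0.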

Safety and Compliance

ChemicalWeaponCheck: Screen compounds against chemical weapons databases (CAS-based)
ExplosiveCheck: Detect explosive properties via PubChem
PatentChecker: Verify patent status of compounds
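
A CAS-based screen of the kind described above can be sketched in two steps: validate the CAS Registry Number's check digit, then look the number up in a list of controlled entries. The sketch below assumes a caller-supplied list. The function names and `DEMO_CONTROLLED` are hypothetical, and the single demo entry is water, used purely as a stand-in.

```python
import re

# CAS Registry Numbers have the form NNNNNNN-NN-N (2 to 7 digits, 2 digits, check digit).
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas: str) -> bool:
    """Validate the format and check digit of a CAS Registry Number."""
    match = CAS_PATTERN.match(cas.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost non-check digit.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == check_digit


def screen_cas(cas: str, controlled: frozenset) -> str:
    """Return 'invalid', 'flagged', or 'clear' for a CAS number."""
    if not is_valid_cas(cas):
        return "invalid"
    return "flagged" if cas.strip() in controlled else "clear"


# Placeholder list: water's CAS number stands in for a controlled entry.
# It is NOT a controlled substance; a real screen would load a curated list.
DEMO_CONTROLLED = frozenset({"7732-18-5"})

for cas in ("7732-18-5", "64-17-5", "64-17-6"):
    print(cas, "->", screen_cas(cas, DEMO_CONTROLLED))
```

Reporting malformed input as "invalid" rather than "clear" keeps the screen fail-closed: an identifier that cannot be verified is never treated as safe.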

Reaction and Synthesis

RXNPredict: Predict reaction products using RXN4Chemistry
RXNPlanner: Generate multi-step synthetic routes with conditions and solvents
NamedRxnFinder: Identify named reactions in transformations

ReAct Reasoning Loop

ChemCrow employs the ReAct framework for iterative reasoning²⁾. At each step $t$, the agent generates:

$$a_t = \text{LLM}(\text{Thought}_t, \text{Action}_t, \text{Input}_t \mid h_{<t})$$

where $h_{<t}$ is the history of previous thought-action-observation triples. The agent selects from the 18 tools based on the current reasoning state and observation feedback.
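
One step of this loop can be illustrated by parsing a model completion into its thought, action, and input. The sketch below assumes the completion uses plain-text `Thought:`, `Action:`, `Action Input:`, and `Final Answer:` labels, which is a common ReAct prompt convention. The exact format ChemCrow uses is not given here, and `Step` and `parse_step` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Step(NamedTuple):
    thought: str
    action: Optional[str]
    action_input: Optional[str]
    final_answer: Optional[str]


# A step is a thought followed by either a final answer or an action with its input.
STEP_RE = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*"
    r"(?:Final Answer:\s*(?P<final>.*)"
    r"|Action:\s*(?P<action>.*?)\s*Action Input:\s*(?P<input>.*))$",
    re.DOTALL,
)


def parse_step(completion: str) -> Step:
    """Split one ReAct-style completion into its labelled parts."""
    match = STEP_RE.search(completion.strip())
    if match is None:
        raise ValueError("completion does not follow the Thought/Action format")
    return Step(
        thought=match.group("thought"),
        action=match.group("action"),
        action_input=match.group("input"),
        final_answer=match.group("final"),
    )


tool_step = parse_step(
    "Thought: I need the SMILES first.\nAction: Name2SMILES\nAction Input: caffeine"
)
last_step = parse_step("Thought: I now know the answer.\nFinal Answer: C8H10N4O2")
print(tool_step)
print(last_step)
```

After a tool runs, its observation is appended to the history together with the parsed thought and action, and that history conditions the next completion.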

System Architecture

graph TD
    A[User Chemistry Query] --> B[GPT-4 Reasoning Engine]
    B --> C{Select Tool}
    C --> D[General Tools]
    C --> E[Molecular Analysis]
    C --> F[Safety Checks]
    C --> G[Reaction and Synthesis]
    D --> H[Observation]
    E --> H
    F --> H
    G --> H
    H --> I{Safety Verified?}
    I -->|No| J[Halt: Safety Violation]
    I -->|Yes| K{Task Complete?}
    K -->|No| B
    K -->|Yes| L[Final Answer with Citations]
    G --> M[RXN4Chemistry API]
    G --> N[RoboRXN Robotic Platform]
    E --> O[PubChem Database]

Code Example

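# Simplified ChemCrow agent with tool selection.
# NOTE: pubchem_lookup, rxn4chemistry_predict, rxn4chemistry_plan,
# check_cw_database, check_explosive, tanimoto_similarity, identify_groups,
# is_final_answer, and compile_answer are placeholder helpers that are not
# defined here; they stand in for the real tool backends.
TOOLS = {
    "Name2SMILES": lambda name: pubchem_lookup(name, "smiles"),
    "RXNPredict": lambda rxn: rxn4chemistry_predict(rxn),
    "RXNPlanner": lambda target: rxn4chemistry_plan(target),
    "ChemicalWeaponCheck": lambda cas: check_cw_database(cas),
    "ExplosiveCheck": lambda smiles: check_explosive(smiles),
    "Similarity": lambda pair: tanimoto_similarity(*pair),
    "FuncGroups": lambda smiles: identify_groups(smiles),
}


def chemcrow_run(query, llm, max_steps=15):
    history = []  # (thought, action, observation) triples
    for _ in range(max_steps):
        thought, action, action_input = llm.reason(query, history)
        if action not in TOOLS:
            # Feed the error back so the model can pick a valid tool next step.
            history.append((thought, action, f"Unknown tool: {action}"))
            continue
        # Safety gate: always check before synthesis.
        # Both checks are assumed to return True when the compound is safe.
        if action in ("RXNPlanner", "RXNPredict"):
            smiles = TOOLS["Name2SMILES"](action_input)
            if not TOOLS["ChemicalWeaponCheck"](smiles):
                return "Halted: safety violation detected"
            if not TOOLS["ExplosiveCheck"](smiles):
                return "Halted: explosive compound detected"
        observation = TOOLS[action](action_input)
        history.append((thought, action, observation))
        if is_final_answer(thought):
            return compile_answer(history)
    # Step budget exhausted without reaching a final answer.
    return "Stopped: maximum number of steps reached"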
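
Because the example above depends on external services, the same loop can be exercised offline with stand-ins. The following is a self-contained sketch in which every name (`ScriptedLLM`, `run_agent`, `STUB_TOOLS`, the `"Final"` action) is hypothetical: the "LLM" replays a fixed script and the tools return canned values instead of calling PubChem or RXN4Chemistry.

```python
from typing import Callable, Dict, List, Tuple

History = List[Tuple[str, str, str]]  # (thought, action, observation)


class ScriptedLLM:
    """Stand-in for a real model: replays fixed (thought, action, input) steps."""

    def __init__(self, script: List[Tuple[str, str, str]]):
        self._script = list(script)

    def reason(self, query: str, history: History) -> Tuple[str, str, str]:
        # One scripted step per call; the step index equals the history length.
        return self._script[len(history)]


def run_agent(query: str, llm: ScriptedLLM,
              tools: Dict[str, Callable[[str], str]], max_steps: int = 5):
    history: History = []
    for _ in range(max_steps):
        thought, action, action_input = llm.reason(query, history)
        if action == "Final":
            # The scripted model signals completion with a pseudo-action.
            return action_input, history
        tool = tools.get(action)
        observation = tool(action_input) if tool else f"Unknown tool: {action}"
        history.append((thought, action, observation))
    return "Stopped: step limit reached", history


# Stub tools with canned outputs; real tools would query external services.
STUB_TOOLS = {
    "Name2SMILES": lambda name: {"ethanol": "CCO"}.get(name.lower(), "not found"),
    "SMILES2Weight": lambda smiles: {"CCO": "46.07"}.get(smiles, "unknown"),
}

script = [
    ("I need the SMILES string first.", "Name2SMILES", "ethanol"),
    ("Now I can look up the weight.", "SMILES2Weight", "CCO"),
    ("I have the answer.", "Final", "Ethanol (CCO) weighs about 46.07 g/mol."),
]

answer, trace = run_agent(
    "What is the molecular weight of ethanol?", ScriptedLLM(script), STUB_TOOLS
)
print(answer)
for thought, action, observation in trace:
    print(f"{action}: {observation}")
```

Swapping the scripted model for a real one and the stubs for live tool calls leaves the control flow unchanged, which makes a harness like this a convenient way to test the loop logic in isolation.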

Demonstrated Capabilities

Autonomous synthesis: Planned and executed the synthesis of an insect repellent and three organocatalysts
Chromophore discovery: Guided the discovery of a novel chromophore with target optical properties
Safety enforcement: Proactively screens compounds against chemical weapons and explosives databases
Robotic execution: Integrated with IBM RoboRXN for automated wet-lab synthesis³⁾
Expert evaluation: Outperforms vanilla GPT-4 on chemistry-specific tasks according to expert assessment

References

¹⁾ Bran et al. "ChemCrow: Augmenting large-language models with chemistry tools" (2023). arXiv:2304.05376

²⁾ Yao et al. "ReAct: Synergizing Reasoning and Acting in Language Models" (2022)

³⁾ IBM RXN for Chemistry Platform
