Function Calling

Function calling (also called tool calling) is the mechanism that enables large language models to invoke external functions, APIs, and tools in response to user requests. The model itself does not run anything: it emits a structured request, and the calling application executes it. This capability turns LLMs from passive text generators into action-taking agents that can query databases, call APIs, execute code, and orchestrate multi-step workflows. It is a foundational building block of most agentic AI systems.

How Function Calling Works

Function calling operates as a structured loop between the LLM and external tools:

Definition — You provide the LLM with function definitions including names, descriptions, and JSON Schema parameter specifications
Detection — The model determines when a function call is needed based on user input
Invocation — The model outputs structured JSON matching the function signature instead of natural language
Execution — Your application executes the function with the provided arguments
Response — The result is fed back to the model, which generates a final response

Models like GPT-4 and Claude are fine-tuned specifically to detect when functions should be called and to produce correctly formatted JSON output.
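The five steps above can be sketched without any network access. In this minimal illustration, `fake_model`, `REGISTRY`, `run_loop`, and the stubbed `get_weather` are hypothetical names invented for the example; `fake_model` stands in for a real provider call and simply imitates the detect, invoke, and respond behavior.

```python
import json

def get_weather(city, unit="celsius"):
    # Stub standing in for a real weather API call.
    return {"city": city, "temp": 21, "unit": unit}

# Maps function names the model may request to local callables.
REGISTRY = {"get_weather": get_weather}

def fake_model(messages, tools):
    """Hypothetical stand-in for an LLM call; not a real provider API."""
    last = messages[-1]
    if last["role"] == "tool":
        # Step 5: a tool result is available, so answer in natural language.
        data = json.loads(last["content"])
        return {"role": "assistant",
                "content": f"It is {data['temp']} degrees {data['unit']} in {data['city']}."}
    # Steps 2-3: decide a call is needed and emit structured arguments.
    return {"role": "assistant", "content": None,
            "tool_calls": [{"id": "call_1", "name": "get_weather",
                            "arguments": json.dumps({"city": "Tokyo"})}]}

def run_loop(user_text, tools, max_turns=5):
    messages = [{"role": "user", "content": user_text}]
    for _ in range(max_turns):
        reply = fake_model(messages, tools)
        messages.append(reply)
        calls = reply.get("tool_calls")
        if not calls:
            return reply["content"]  # final natural-language answer
        for call in calls:
            # Step 4: the application, not the model, runs the function.
            result = REGISTRY[call["name"]](**json.loads(call["arguments"]))
            messages.append({"role": "tool", "tool_call_id": call["id"],
                             "content": json.dumps(result)})
    raise RuntimeError("model did not produce a final answer")

# Step 1: definitions are passed in; only a placeholder is needed for the stub.
answer = run_loop("Weather in Tokyo?", tools=[{"name": "get_weather"}])
print(answer)
```

The `max_turns` cap is a design choice for the sketch: it keeps a model that never stops requesting tools from looping forever.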

Cross-Provider Implementation

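All major providers support function calling and describe parameters with JSON Schema, but their native request and response formats differ slightly. A compatibility layer such as LiteLLM accepts one OpenAI-style tool definition and translates it for each provider: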

from litellm import completion
 
# Universal tool definition works across providers
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["city"]
            }
        }
    }
]
 
# LiteLLM translates the same call for OpenAI, Anthropic, Google, and open-source models
for model in ["gpt-4", "claude-sonnet-4-20250514", "gemini/gemini-pro"]:
    response = completion(
        model=model,
        messages=[{"role": "user", "content": "Weather in Tokyo?"}],
        tools=tools,
        tool_choice="auto"
    )
    tool_calls = response.choices[0].message.tool_calls
    if tool_calls:
        print(f"{model}: {tool_calls[0].function.name}({tool_calls[0].function.arguments})")
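In OpenAI-style responses the `arguments` field is a JSON string rather than a parsed object, so it is worth parsing and checking it before running anything. The sketch below is one possible approach; `parse_arguments` is a hypothetical helper that checks only `required` and `enum`, and a full JSON Schema validator would cover much more.

```python
import json

def parse_arguments(raw, schema):
    """Parse a tool call's JSON argument string and run minimal checks."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    # Every name listed under "required" must be present.
    missing = [k for k in schema.get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    # Values for enum-constrained properties must be one of the allowed values.
    for key, value in args.items():
        allowed = schema.get("properties", {}).get(key, {}).get("enum")
        if allowed is not None and value not in allowed:
            raise ValueError(f"{key}={value!r} is not one of {allowed}")
    return args

# Same parameter schema as the get_weather definition above.
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "City name"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

ok = parse_arguments('{"city": "Tokyo", "unit": "celsius"}', weather_schema)
print(ok)
```

Returning the validation error to the model as the tool result, instead of raising to the user, often lets the model correct its own call on the next turn.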

Provider Differences

Provider	API Field	Parallel Calls	Structured Output	Notable Feature
OpenAI	`tools`	Yes	JSON mode, strict mode	Function calling fine-tuning support
Anthropic	`tools`	Yes	Tool use with `tool_result`	Forced tool use with `tool_choice`
Google Gemini	`tools`	Yes	Function declarations	Automatic grounding with Search
Open Source	Varies	Model-dependent	Via grammar constraints	BFCL leaderboard rankings
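Although the table shows the same `tools` field for several providers, the shape of each entry differs. As an illustration, Anthropic's Messages API expects `name`, `description`, and `input_schema` at the top level of each tool, where the OpenAI-style format nests them under `function` and calls the schema `parameters`. The converter below is a hypothetical helper that sketches roughly what a compatibility layer has to do; check the current provider reference before relying on the exact field names.

```python
def to_anthropic_tool(openai_tool):
    """Convert one OpenAI-style tool definition to Anthropic's tool shape."""
    fn = openai_tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        # The JSON Schema itself is unchanged; only the key name differs.
        "input_schema": fn["parameters"],
    }

openai_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

anthropic_tool = to_anthropic_tool(openai_tool)
print(anthropic_tool)
```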

Parallel Tool Calls

Modern LLMs can request several function calls in a single response when the operations are independent of each other. The application can then run them concurrently, which reduces end-to-end latency for workflows that need multiple data fetches:

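# The model can return several tool_calls in a single response.
# e.g., "What's the weather in Tokyo and New York?"
# Returns: [call(get_weather, city="Tokyo"), call(get_weather, city="New York")]

import asyncio
import json

async def execute_parallel_calls(tool_calls, registry):
    # registry maps function names to async functions (coroutines);
    # arguments arrive as a JSON string and must be parsed first.
    tasks = [registry[call.function.name](**json.loads(call.function.arguments))
             for call in tool_calls]
    # gather runs the coroutines concurrently and returns results in call order
    return await asyncio.gather(*tasks)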
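A self-contained usage sketch of the same pattern follows. It assumes OpenAI-style tool call objects, imitated here with `SimpleNamespace`; `make_call`, `run_calls`, and the stubbed async `get_weather` are hypothetical names for illustration. It also uses `return_exceptions=True` so that one failing call is reported back to the model instead of cancelling the whole batch.

```python
import asyncio
import json
from types import SimpleNamespace

async def get_weather(city, unit="celsius"):
    await asyncio.sleep(0.01)  # stands in for network latency
    if city == "Atlantis":
        raise LookupError(f"unknown city: {city}")
    return {"city": city, "temp": 21, "unit": unit}

def make_call(call_id, name, **kwargs):
    """Build an object shaped like an OpenAI-style tool call (hypothetical helper)."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=json.dumps(kwargs)),
    )

async def run_calls(tool_calls, registry):
    tasks = [registry[c.function.name](**json.loads(c.function.arguments))
             for c in tool_calls]
    # Exceptions are returned in place of results rather than raised.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    messages = []
    for call, result in zip(tool_calls, results):
        if isinstance(result, Exception):
            content = json.dumps({"error": str(result)})
        else:
            content = json.dumps(result)
        # Each result is tied back to its request through tool_call_id.
        messages.append({"role": "tool", "tool_call_id": call.id, "content": content})
    return messages

calls = [make_call("call_1", "get_weather", city="Tokyo"),
         make_call("call_2", "get_weather", city="Atlantis")]
tool_messages = asyncio.run(run_calls(calls, {"get_weather": get_weather}))
for message in tool_messages:
    print(message)
```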

Structured Outputs

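Structured output mode constrains the model's response to a provided JSON Schema. OpenAI's strict mode (`strict: true`) enforces the schema during token generation for a supported subset of JSON Schema, so returned arguments parse and validate against it. Without an equivalent strict setting, as in standard Anthropic tool use, the model is trained to follow the schema but conformance is not guaranteed, so arguments should still be validated. Even under strict mode, applications need to handle refusals and responses truncated by token limits.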
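As a rough sketch of what a strict-mode definition looks like, OpenAI's strict mode expects `strict: true` on the function, `additionalProperties: false` on the parameter object, and every property listed under `required`, with optional fields expressed as nullable types. The helper `to_strict_tool` below is hypothetical and handles only top-level properties; nested objects would need the same treatment.

```python
import copy

def to_strict_tool(tool):
    """Return a strict-mode-style copy of an OpenAI-style tool definition."""
    strict = copy.deepcopy(tool)  # leave the caller's definition untouched
    fn = strict["function"]
    params = fn["parameters"]
    required = set(params.get("required", []))
    for name, spec in params["properties"].items():
        if name not in required:
            # Optional fields become required-but-nullable.
            if isinstance(spec.get("type"), str):
                spec["type"] = [spec["type"], "null"]
            if "enum" in spec and None not in spec["enum"]:
                spec["enum"] = spec["enum"] + [None]
    params["required"] = list(params["properties"])
    params["additionalProperties"] = False
    fn["strict"] = True
    return strict

weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

strict_tool = to_strict_tool(weather_tool)
print(strict_tool["function"]["parameters"])
```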

Benchmarking

The Berkeley Function Calling Leaderboard (BFCL) is a widely used benchmark for evaluating function calling across models. Its test categories include simple calls, parallel calls, selection among multiple functions, and relevance detection (recognizing when no function should be called).
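To make the categories concrete, the sketch below scores predicted calls against expected ones by exact match on name and arguments, ignoring call order. This is not BFCL's own evaluator, which is considerably more involved; `normalize`, `calls_match`, `accuracy`, and the test cases are all hypothetical and invented for illustration.

```python
import json
from collections import Counter

def normalize(call):
    # Canonical form: function name plus arguments serialized with sorted keys.
    return (call["name"], json.dumps(call["arguments"], sort_keys=True))

def calls_match(predicted, expected):
    # Multiset comparison, so parallel calls may arrive in any order.
    return Counter(map(normalize, predicted)) == Counter(map(normalize, expected))

def accuracy(cases):
    hits = sum(calls_match(c["predicted"], c["expected"]) for c in cases)
    return hits / len(cases)

tokyo = {"name": "get_weather", "arguments": {"city": "Tokyo"}}
nyc = {"name": "get_weather", "arguments": {"city": "New York"}}
osaka = {"name": "get_weather", "arguments": {"city": "Osaka"}}

cases = [
    {"expected": [tokyo], "predicted": [tokyo]},            # simple call
    {"expected": [tokyo, nyc], "predicted": [nyc, tokyo]},  # parallel, reordered
    {"expected": [], "predicted": []},                      # relevance: no call needed
    {"expected": [tokyo], "predicted": [osaka]},            # wrong argument
]

score = accuracy(cases)
print(score)
```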
