Function calling (also called tool use) allows LLMs to invoke external functions, APIs, and tools instead of generating free-form text. The model decides when to call a function, generates structured arguments, and your code executes the function and feeds results back. This guide covers implementation across major providers with practical patterns.
The function calling flow has four steps:

1. Your application sends the prompt together with tool definitions (name, description, JSON Schema parameters).
2. The model decides whether a tool is needed and returns a structured call with the function name and arguments.
3. Your code validates the arguments, executes the function, and appends the result to the conversation.
4. The model reads the result and produces its final response.

The LLM never executes code directly. It only generates the function name and arguments; your application handles execution, giving you full control over security and validation.
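To make the loop concrete before touching any SDK, here is a provider-agnostic mock of the four steps. Everything in it is a stand-in: the hardcoded fake_model plays the role of the LLM, and get_weather is a stub.

```python
import json

def fake_model(messages, tools):
    # Step 2 stand-in: a real LLM would decide this; here we hardcode a call
    if messages[-1]["role"] == "user":
        return {"tool_call": {"name": "get_weather",
                              "arguments": '{"location": "Paris"}'}}
    return {"text": "It is 18°C in Paris."}

def get_weather(location):
    return {"location": location, "temp_c": 18}

messages = [{"role": "user", "content": "Weather in Paris?"}]
reply = fake_model(messages, tools=["get_weather"])        # steps 1-2
call = reply["tool_call"]
result = get_weather(**json.loads(call["arguments"]))      # step 3
messages.append({"role": "tool", "content": json.dumps(result)})
final = fake_model(messages, tools=["get_weather"])        # step 4
# final["text"] → "It is 18°C in Paris."
```

The same shape underlies every provider below; only the message formats differ.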
1) OpenAI uses the tools parameter with JSON Schema definitions:
```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
                "units": {"type": "string", "enum": ["metric", "imperial"]}
            },
            "required": ["location", "units"],  # strict mode requires every property
            "additionalProperties": False
        },
        "strict": True
    }
}]
```
```python
from openai import OpenAI
import json

client = OpenAI()
messages = [{"role": "user", "content": "What is the weather in Paris?"}]

# Step 1: Send with tools
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

# Step 2: Handle tool calls
message = response.choices[0].message
if message.tool_calls:
    messages.append(message)  # Add assistant message with tool calls (once)
    for tool_call in message.tool_calls:
        args = json.loads(tool_call.function.arguments)
        result = get_weather(**args)  # Execute the actual function
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result)
        })

# Step 3: Get final response
final = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)
```
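The dispatch from tool name to Python function is usually kept in a simple registry. A minimal sketch, with get_weather as a hypothetical stub:

```python
import json

# Hypothetical stub standing in for a real weather lookup
def get_weather(location, units="metric"):
    return {"location": location, "temp": 18, "units": units}

# Registry mapping tool names (as declared in the schema) to callables
TOOL_FUNCTIONS = {"get_weather": get_weather}

def dispatch(tool_name, arguments_json):
    args = json.loads(arguments_json)
    return TOOL_FUNCTIONS[tool_name](**args)

dispatch("get_weather", '{"location": "Paris"}')
# → {'location': 'Paris', 'temp': 18, 'units': 'metric'}
```

Keeping a registry rather than an if/elif chain makes it trivial to derive the tools list and the dispatch table from the same source of truth.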
The strict: true parameter enforces that the model's output exactly matches the JSON Schema, preventing hallucinated parameters. Always set additionalProperties: false with strict mode, and note that strict mode requires every property to be listed in required.

2) Claude uses the tools array with input_schema for parameter definitions:
```python
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a location",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["metric", "imperial"]}
        },
        "required": ["location"]
    }
}]
```
```python
import anthropic
import json

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "What is the weather in Paris?"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)

# Handle tool_use blocks in response content
messages.append({"role": "assistant", "content": response.content})  # append once
for block in response.content:
    if block.type == "tool_use":
        result = get_weather(**block.input)
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result)
            }]
        })
```
Claude returns tool_use content blocks. Tool results are sent back as tool_result blocks within a user message, matched by the tool_use_id.
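The tool_result envelope is easy to get subtly wrong, so it can help to build it with a small helper. A sketch (the helper name and the toolu_ id are illustrative):

```python
import json

def make_tool_result(tool_use_id, result):
    # Tool results go back inside a *user* message, matched to the
    # originating tool_use block by its id
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": json.dumps(result),
        }],
    }

msg = make_tool_result("toolu_abc123", {"temp": 18})
```

Note the contrast with OpenAI, where results are top-level messages with role "tool" rather than blocks inside a user message.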
Gemini uses functionDeclarations within the tools parameter:
```python
tools = [{
    "function_declarations": [{
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }]
}]
```
```python
import google.generativeai as genai

model = genai.GenerativeModel("gemini-2.0-flash", tools=tools)
chat = model.start_chat()
response = chat.send_message("What is the weather in Paris?")

# Handle function calls
for part in response.parts:
    if part.function_call:
        result = get_weather(**dict(part.function_call.args))
        response = chat.send_message(
            genai.protos.Content(parts=[genai.protos.Part(
                function_response=genai.protos.FunctionResponse(
                    name=part.function_call.name,
                    response={"result": result}
                )
            )])
        )
```
Beyond tool calling, you can enforce structured JSON responses: pass response_format: {"type": "json_schema", "json_schema": {...}} for guaranteed schema compliance. Strict mode (strict: true in OpenAI) constrains the model's token generation to produce only valid JSON matching the schema. This eliminates malformed output at a small latency cost.
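As a sketch, a strict json_schema payload has this shape (field nesting per OpenAI's structured outputs; the weather_report schema itself is illustrative):

```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",
        "strict": True,          # constrain generation to this schema
        "schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["location", "temp_c"],   # strict: all properties required
            "additionalProperties": False,
        },
    },
}
```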
3) Models can return multiple tool calls in a single response when tasks are independent:
```python
import asyncio
import json

# Model returns several tool calls at once
tool_calls = response.choices[0].message.tool_calls

# Execute in parallel (execute_tool is an async helper you define)
async def run_parallel(tool_calls):
    tasks = [execute_tool(tc) for tc in tool_calls]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_parallel(tool_calls))

# Feed all results back, matched by tool_call_id
for tc, result in zip(tool_calls, results):
    messages.append({
        "role": "tool",
        "tool_call_id": tc.id,
        "content": json.dumps(result)
    })
```
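The fan-out itself can be demonstrated without any API, using a stubbed async executor in place of real tools (execute_tool here is a hypothetical stand-in that simulates I/O):

```python
import asyncio

async def execute_tool(name, args):
    # Stub: pretend each tool spends 0.1s waiting on I/O
    await asyncio.sleep(0.1)
    return {"tool": name, "args": args}

async def run_parallel(calls):
    return await asyncio.gather(*(execute_tool(n, a) for n, a in calls))

calls = [("get_weather", {"location": "Paris"}),
         ("get_weather", {"location": "Tokyo"}),
         ("get_time", {"tz": "UTC"})]
results = asyncio.run(run_parallel(calls))
# Three results after roughly 0.1s total rather than ~0.3s sequentially
```

asyncio.gather preserves input order, so results can be zipped back against the original tool calls safely.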
Parallel execution significantly reduces latency when the model needs data from multiple independent sources.

4) Robust error handling is critical, since tool execution can fail:
```python
import json
import jsonschema

def safe_execute_tool(tool_call):
    try:
        args = json.loads(tool_call.function.arguments)
        # Validate against the registered schema before executing
        jsonschema.validate(args, tool_schemas[tool_call.function.name])
        result = tool_functions[tool_call.function.name](**args)
        return {"success": True, "data": result}
    except json.JSONDecodeError:
        return {"success": False, "error": "Invalid JSON arguments"}
    except jsonschema.ValidationError as e:
        return {"success": False, "error": f"Invalid parameters: {e.message}"}
    except Exception as e:
        return {"success": False, "error": f"Execution failed: {str(e)}"}
```
Return error information as the tool result so the model can reason about the failure and try an alternative approach.
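Before wiring in a full JSON Schema validator, the error-as-result contract can be sketched with a minimal hand-rolled check (an illustrative helper, not part of any SDK):

```python
import json

def validate_args(arguments_json, required, allowed):
    # Minimal check: parse, then verify required and allowed keys
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError:
        return {"success": False, "error": "Invalid JSON arguments"}
    missing = [k for k in required if k not in args]
    extra = [k for k in args if k not in allowed]
    if missing or extra:
        return {"success": False,
                "error": f"missing={missing} unexpected={extra}"}
    return {"success": True, "args": args}

validate_args('{"location": "Paris"}', ["location"], ["location", "units"])
# → {'success': True, 'args': {'location': 'Paris'}}
```

Whatever the validator, the key property is the same: failures come back as data the model can read, not as exceptions that kill the loop.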
In summary: set strict: true and additionalProperties: false to prevent hallucinated parameters; always match each tool result to its originating call via tool_call_id (OpenAI) or tool_use_id (Claude); and remember that omitting additionalProperties: false lets the model invent extra parameters.