====== How to Use Function Calling ======

Function calling (also called tool use) allows LLMs to invoke external functions, APIs, and tools instead of generating free-form text. The model decides when to call a function, generates structured arguments, and your code executes the function and feeds results back. This guide covers implementation across major providers with practical patterns.

===== How It Works =====

The function calling flow has four steps:

  - **Define tools** -- describe available functions with names, descriptions, and JSON Schema parameter definitions
  - **Send to LLM** -- include tool definitions alongside the user message
  - **Parse tool calls** -- the model returns structured tool call requests instead of (or alongside) text
  - **Execute and return** -- run the function, send results back to the model for final reasoning

The LLM never executes code directly. It only generates the function name and arguments. Your application handles execution, giving you full control over security and validation.
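The four steps above can be sketched provider-agnostically. This is a minimal illustration, not any vendor's API: the ''TOOLS'' registry, the ''get_weather'' stub, and the shape of the tool-call dicts are assumptions standing in for your real client code.

```python
# Provider-agnostic sketch of steps 1, 3, and 4. The TOOLS registry and the
# tool-call dict shape are illustrative assumptions, not a real provider API.
import json

def get_weather(location, units="metric"):
    # Stub implementation for illustration only
    return {"location": location, "temp_c": 18}

TOOLS = {"get_weather": get_weather}  # step 1: the tools your app exposes

def handle_turn(tool_calls, messages):
    """Steps 3-4: parse each tool call, execute it, append the result."""
    for call in tool_calls:
        name, raw_args = call["name"], call["arguments"]
        func = TOOLS.get(name)
        if func is None:
            result = {"error": f"unknown tool: {name}"}
        else:
            # The application, not the model, executes the function
            result = func(**json.loads(raw_args))
        messages.append({"role": "tool", "tool_call_id": call["id"],
                         "content": json.dumps(result)})
    return messages
```

Note that the unknown-tool branch returns an error as data rather than raising, so the model can see the failure and recover.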
((Source: [[https://developers.openai.com/api/docs/guides/function-calling/|OpenAI Function Calling Guide]]))

===== OpenAI Function Calling =====

OpenAI uses the ''tools'' parameter with JSON Schema definitions:

=== Tool Definition ===

<code python>
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
                "units": {"type": "string", "enum": ["metric", "imperial"]}
            },
            "required": ["location"],
            "additionalProperties": False
        },
        "strict": True
    }
}]
</code>

=== Complete Flow ===

<code python>
from openai import OpenAI
import json

client = OpenAI()
messages = [{"role": "user", "content": "What is the weather in Paris?"}]

# Step 1: Send with tools
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

# Step 2: Handle tool calls
message = response.choices[0].message
if message.tool_calls:
    messages.append(message)  # Add the assistant message (with its tool calls) once
    for tool_call in message.tool_calls:
        args = json.loads(tool_call.function.arguments)
        result = get_weather(**args)  # Execute the actual function
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result)
        })

# Step 3: Get final response
final = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)
</code>

The ''strict: true'' parameter enforces that the model's output exactly matches the JSON Schema, preventing hallucinated parameters. Always set ''additionalProperties: false'' with strict mode.
((Source: [[https://developers.openai.com/api/docs/guides/function-calling/|OpenAI Function Calling Guide]]))

===== Anthropic Claude Tool Use =====

Claude uses the ''tools'' array with ''input_schema'' for parameter definitions:

=== Tool Definition ===

<code python>
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a location",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["metric", "imperial"]}
        },
        "required": ["location"]
    }
}]
</code>

=== Complete Flow ===

<code python>
import json
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "What is the weather in Paris?"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)

# Handle tool_use blocks in the response content
messages.append({"role": "assistant", "content": response.content})
for block in response.content:
    if block.type == "tool_use":
        result = get_weather(**block.input)
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result)
            }]
        })
</code>

Claude returns ''tool_use'' content blocks. Tool results are sent back as ''tool_result'' blocks within a user message, matched by the ''tool_use_id''.
===== Google Gemini Function Calling =====

Gemini uses ''functionDeclarations'' within the ''tools'' parameter:

=== Tool Definition ===

<code python>
tools = [{
    "function_declarations": [{
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }]
}]
</code>

=== Complete Flow ===

<code python>
import google.generativeai as genai

model = genai.GenerativeModel("gemini-2.0-flash", tools=tools)
chat = model.start_chat()
response = chat.send_message("What is the weather in Paris?")

# Handle function calls (check truthiness, since proto parts
# always expose a function_call attribute)
for part in response.parts:
    if part.function_call:
        result = get_weather(**dict(part.function_call.args))
        response = chat.send_message(
            genai.protos.Content(parts=[genai.protos.Part(
                function_response=genai.protos.FunctionResponse(
                    name=part.function_call.name,
                    response={"result": result}
                )
            )])
        )
</code>

===== Structured Output =====

Beyond tool calling, you can enforce structured JSON responses:

  * **OpenAI** -- use ''response_format: {"type": "json_schema", "json_schema": {...}}'' for guaranteed schema compliance
  * **Anthropic** -- instruct via system prompt with examples; use tool use with a single "output" tool for enforcement
  * **All providers** -- validate responses client-side with Pydantic or jsonschema

Strict mode (''strict: true'' in OpenAI) constrains the model's token generation to only produce valid JSON matching the schema. This eliminates malformed output at a small latency cost.
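Client-side validation deserves its own illustration. The sketch below uses only the standard library to show the kind of check that the jsonschema or Pydantic packages perform far more thoroughly; the schema and payload are illustrative assumptions.

```python
# Minimal client-side validation sketch, standard library only. In practice,
# use jsonschema or Pydantic; the schema and payload here are assumptions.
import json

schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "temp_c": {"type": "number"},
    },
    "required": ["location", "temp_c"],
}

TYPE_MAP = {"string": str, "number": (int, float), "object": dict}

def validate(payload: dict, schema: dict) -> list:
    """Return a list of violations; an empty list means the payload conforms."""
    errors = []
    for key in schema.get("required", []):
        if key not in payload:
            errors.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key in payload and not isinstance(payload[key], TYPE_MAP[spec["type"]]):
            errors.append(f"wrong type for {key}: expected {spec['type']}")
    return errors

raw = '{"location": "Paris", "temp_c": 18.5}'
print(validate(json.loads(raw), schema))  # empty list: payload conforms
```

A real validator also handles nested objects, arrays, enums, and edge cases (for example, ''bool'' being a subclass of ''int'' in Python), which is exactly why the dedicated libraries are recommended.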
((Source: [[https://developers.openai.com/api/docs/guides/function-calling/|OpenAI Function Calling Guide]]))

===== Parallel Function Calls =====

Models can return multiple tool calls in a single response when tasks are independent:

<code python>
import asyncio
import json

# Model returns 3 tool calls at once
tool_calls = response.choices[0].message.tool_calls  # List of 3

# Execute in parallel (execute_tool must be an async function)
async def run_parallel(tool_calls):
    tasks = [execute_tool(tc) for tc in tool_calls]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_parallel(tool_calls))

# Feed all results back
for tc, result in zip(tool_calls, results):
    messages.append({
        "role": "tool",
        "tool_call_id": tc.id,
        "content": json.dumps(result)
    })
</code>

Parallel execution significantly reduces latency when the model needs data from multiple independent sources. ((Source: [[https://developers.openai.com/api/docs/guides/function-calling/|OpenAI Function Calling Guide]]))

===== Error Handling =====

Robust error handling is critical since tool execution can fail:

<code python>
import json
import jsonschema

def safe_execute_tool(tool_call):
    try:
        args = json.loads(tool_call.function.arguments)
        # Validate against the schema before executing
        jsonschema.validate(args, tool_schemas[tool_call.function.name])
        result = tool_functions[tool_call.function.name](**args)
        return {"success": True, "data": result}
    except json.JSONDecodeError:
        return {"success": False, "error": "Invalid JSON arguments"}
    except jsonschema.ValidationError as e:
        return {"success": False, "error": f"Invalid parameters: {e.message}"}
    except Exception as e:
        return {"success": False, "error": f"Execution failed: {str(e)}"}
</code>

Return error information as the tool result so the model can reason about the failure and try an alternative approach.
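Feeding results (and errors) back is usually wrapped in a loop that keeps calling the model until it stops requesting tools. A minimal sketch, assuming hypothetical ''call_model'' and ''safe_execute_tool'' helpers that stand in for your provider client and executor:

```python
# Sketch of the overall agent loop. call_model() and safe_execute_tool() are
# hypothetical stand-ins: call_model returns a message-like dict, and
# safe_execute_tool returns a success-or-error dict as described above.
import json

MAX_ROUNDS = 5  # guard against runaway tool-call loops

def agent_loop(messages, call_model, safe_execute_tool):
    for _ in range(MAX_ROUNDS):
        reply = call_model(messages)
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply["content"]  # no more tools requested: final answer
        messages.append(reply)       # keep the assistant turn in history
        for tc in calls:
            result = safe_execute_tool(tc)  # errors come back as data
            messages.append({"role": "tool", "tool_call_id": tc["id"],
                             "content": json.dumps(result)})
    return "Gave up after too many tool rounds."
```

Because failures are returned as tool results rather than raised, the model sees them on the next round and can retry with corrected arguments or choose a different tool.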
===== Best Practices =====

  * **Write clear descriptions** -- the model relies on function and parameter descriptions to decide when and how to call tools
  * **Use strict mode** -- set ''strict: true'' and ''additionalProperties: false'' to prevent hallucinated parameters
  * **Limit tool count** -- keep to 5-10 tools per request; too many can confuse the model
  * **Use enums** -- constrain parameter values with enums wherever possible
  * **Validate everything** -- never trust the model's output; validate JSON and schema before execution
  * **Match tool_call IDs** -- always pair results with the correct ''tool_call_id'' or ''tool_use_id''
  * **Handle missing tools** -- if the model calls a function that does not exist, return a clear error
  * **Test edge cases** -- test with ambiguous queries, missing parameters, and adversarial inputs

===== Common Pitfalls =====

  * **Loose schemas** -- omitting ''additionalProperties: false'' lets the model invent extra parameters
  * **Ignoring call IDs** -- failing to match tool results to their call IDs breaks the conversation loop
  * **No client-side validation** -- trusting the model's JSON output without parsing leads to runtime crashes
  * **Over-complex schemas** -- deeply nested schemas confuse models; flatten where possible
  * **Missing error handling** -- unhandled tool failures cause the entire agent to crash
  * **Not feeding results back** -- the model needs to see tool results to generate a final answer

===== See Also =====

  * [[how_to_create_an_agent|How to Create an Agent]]
  * [[how_to_build_an_ai_assistant|How to Build an AI Assistant]]
  * [[how_to_build_a_chatbot|How to Build a Chatbot]]
  * [[how_to_implement_guardrails|How to Implement Guardrails]]

===== References =====