On-Device Agents

On-Device Agents are AI agent systems that run entirely on local hardware – smartphones, tablets, PCs, and edge devices – without requiring cloud connectivity for inference. By shifting computation to the device, on-device agents deliver ultra-low latency, privacy-by-design data handling, and offline reliability. Recent advances in small language models and specialized function-calling architectures have made agentic capabilities practical on mobile hardware.

Overview

Cloud-based AI agents require network round-trips for every inference call, adding latency, incurring per-request API costs, and sending potentially sensitive user data to external servers. On-device agents eliminate these constraints: the model runs on the local processor, functions in airplane mode, incurs no per-request charges, and processes sensitive data without it ever leaving the device.

The key challenge is fitting capable models – especially those with function-calling abilities – into the memory and compute constraints of mobile hardware while maintaining accuracy and battery efficiency.
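To make the memory constraint concrete, here is a back-of-the-envelope sketch of weight memory at common quantization levels. The arithmetic is illustrative only (weights alone; real runtimes add activation and KV-cache overhead), and the 270M figure is chosen to match the model size discussed below.

```python
# Rough memory estimate for model weights at different quantization levels.
# Illustrative arithmetic only, not measured on-device footprints.

def weight_memory_mb(num_params: int, bits_per_weight: int) -> float:
    """Approximate weight memory in MB (1 MB = 2**20 bytes)."""
    return num_params * bits_per_weight / 8 / 2**20

params_270m = 270_000_000  # a 270M-parameter model

for bits in (32, 16, 8, 4):
    print(f"{bits}-bit: {weight_memory_mb(params_270m, bits):.0f} MB")
```

Quantizing from 32-bit floats to 4-bit integers shrinks the weight footprint roughly eightfold, which is what makes sub-billion-parameter models viable in a phone's memory budget.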

Google AI Edge

Google AI Edge is Google's platform for deploying AI models on mobile and edge devices. The ecosystem includes a showcase application (available on Android and iOS) demonstrating on-device AI capabilities powered by Gemma and other open-weight models. In February 2026, Google announced major updates, including on-device function calling support and cross-platform iOS availability.

Gemini Nano

Gemini Nano is the smallest model in Google's Gemini family and is designed for on-device inference.

FunctionGemma

Released in December 2025, FunctionGemma is a specialized version of Gemma 3 (270M parameters) fine-tuned specifically for function calling.

AI Edge Function Calling SDK

A library enabling developers to use function calling with on-device LLMs. The pipeline involves:

  1. Define function declarations (names, parameters, types)
  2. Format prompts for the LLM including function schemas
  3. Parse LLM outputs to detect function calls
  4. Execute detected function calls with appropriate parameters

On-Device Function Calling

The real power of on-device agents emerges when models can invoke functions – opening apps, adjusting settings, creating calendar entries, or navigating to destinations. This transforms passive text generation into active device interaction.

from dataclasses import dataclass
 
@dataclass
class FunctionDeclaration:
    name: str
    description: str
    parameters: dict
 
# Define available device functions
device_functions = [
    FunctionDeclaration(
        name="create_calendar_event",
        description="Create a new calendar event",
        parameters={
            "title": {"type": "string", "required": True},
            "datetime": {"type": "string", "format": "iso8601"},
            "duration_minutes": {"type": "integer", "default": 60}
        }
    ),
    FunctionDeclaration(
        name="set_alarm",
        description="Set a device alarm",
        parameters={
            "time": {"type": "string", "format": "HH:MM"},
            "label": {"type": "string", "default": "Alarm"}
        }
    ),
    FunctionDeclaration(
        name="navigate_to",
        description="Open navigation to a destination",
        parameters={
            "destination": {"type": "string", "required": True},
            "mode": {"type": "string", "enum": ["driving", "walking"]}
        }
    ),
]
 
class OnDeviceAgent:
    def __init__(self, model, functions):
        self.model = model
        self.functions = {f.name: f for f in functions}
 
    def process_input(self, user_input: str) -> dict:
        schema_text = self._format_schemas()
        prompt = f"Functions:\n{schema_text}\nUser: {user_input}\nAction:"
        output = self.model.generate(prompt)
 
        if self._is_function_call(output):
            func_name, params = self._parse_function_call(output)
            return {"type": "function_call", "name": func_name, "params": params}
        return {"type": "text", "content": output}
 
    def _format_schemas(self) -> str:
        return "\n".join(
            f"- {f.name}: {f.description}"
            for f in self.functions.values()
        )

    def _is_function_call(self, output: str) -> bool:
        # Treat output of the form name(arg=value, ...) as a call,
        # but only if the name matches a declared function.
        head = output.strip().split("(", 1)[0]
        return head in self.functions

    def _parse_function_call(self, output: str):
        # Minimal parser assuming the model emits name(key="value", ...);
        # a production parser would validate against the declared schema.
        text = output.strip()
        name, _, rest = text.partition("(")
        params = {}
        args = rest.rstrip(")").strip()
        if args:
            for pair in args.split(","):
                key, _, value = pair.partition("=")
                params[key.strip()] = value.strip().strip("\"'")
        return name, params

Key Challenges

The central challenges, as noted above, are fitting capable function-calling models into the limited memory and compute of mobile hardware, preserving accuracy under aggressive model compression, and keeping sustained inference battery-efficient.

Frameworks and Tools

Framework                  Platform                Purpose
Google AI Edge Gallery     Android, iOS            Showcase and SDK for on-device models
FunctionGemma              Cross-platform          270M parameter function-calling model
Gemini Nano                Android, Chrome         Built-in on-device inference
TensorFlow Lite / LiteRT   Android, iOS, embedded  Model deployment runtime
Core ML                    iOS, macOS              Apple on-device ML framework
Qualcomm AI Engine         Android (Snapdragon)    Hardware-accelerated inference
