Personalized Agents from Human Feedback

Personalized Agents from Human Feedback (PAHF) is a framework introduced by Liang et al. from Meta Superintelligence Labs and Princeton University (arXiv:2602.16173) for continual personalization of AI agents through online learning from live human interaction. PAHF addresses a fundamental limitation of current agents: they are powerful but fail to align with the idiosyncratic, evolving preferences of individual users. The framework operationalizes a three-step interaction loop with explicit per-user memory, enabling agents to learn initial preferences from scratch and rapidly adapt to preference shifts without relying on static datasets.

The Personalization Gap

Modern AI agents optimize for average user preferences through RLHF and instruction tuning, but individual users have unique, evolving needs. Prior approaches share two key limitations: they learn from static datasets collected offline, and they optimize for an average user rather than the individual.

PAHF bridges this gap with an online continual learning framework that treats each interaction as a learning opportunity.

The Three-Step PAHF Loop

PAHF operationalizes personalization through a continuous three-step interaction loop:

1. Pre-Action Clarification: Before taking action, the agent proactively asks questions to resolve ambiguities in user preferences. This prevents errors from partial observability and accelerates initial learning.

2. Preference-Grounded Actions: The agent selects actions by retrieving stored preferences from explicit per-user memory and grounding its decisions in those preferences.

3. Post-Action Feedback: After acting, the agent integrates human corrections and reactions to update memory, handling preference drift and correcting miscalibrated beliefs.

# Illustration of the PAHF three-step interaction loop
# (helper methods such as has_ambiguity and execute are elided)
class PAHFAgent:
    def __init__(self, base_model, memory_store):
        self.model = base_model
        self.memory = memory_store  # per-user explicit memory
 
    def interact(self, user_id: str, task: dict) -> dict:
        user_prefs = self.memory.retrieve(user_id)
 
        # Step 1: Pre-action clarification
        if self.has_ambiguity(task, user_prefs):
            clarification = self.model.generate_question(task, user_prefs)
            user_response = self.get_user_input(clarification)
            self.memory.update(user_id, self.extract_prefs(user_response))
            user_prefs = self.memory.retrieve(user_id)
 
        # Step 2: Preference-grounded action
        action = self.model.select_action(
            task=task,
            preferences=user_prefs,
            strategy="preference_grounded"
        )
        result = self.execute(action)
 
        # Step 3: Post-action feedback integration
        feedback = self.get_user_feedback(result)
        if feedback.has_correction:
            self.memory.update(user_id, feedback.new_preferences)
 
        return result

Explicit Per-User Memory

PAHF maintains a dynamic, user-specific store of preferences that is updated continuously from both pre-action clarification answers and post-action corrections. This design enables rapid adaptation to new users (no cold start with historical data required) and graceful handling of preference shifts.
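A minimal sketch of such a per-user memory store, assuming a simple key-value preference representation. The class and method names below are illustrative, not taken from the paper:

```python
from collections import defaultdict

class PerUserMemory:
    """Illustrative explicit per-user preference store (not the paper's implementation)."""

    def __init__(self):
        # user_id -> {preference_key: preference_value}
        self._store = defaultdict(dict)

    def retrieve(self, user_id: str) -> dict:
        # Return a copy so callers cannot mutate memory accidentally
        return dict(self._store[user_id])

    def update(self, user_id: str, new_prefs: dict) -> None:
        # Later feedback overwrites earlier beliefs, handling preference drift
        self._store[user_id].update(new_prefs)

# Example: a new user starts with no history, then their preference shifts
memory = PerUserMemory()
memory.update("alice", {"tone": "formal"})
memory.update("alice", {"tone": "casual"})   # drift overwrites the old value
print(memory.retrieve("alice"))              # {'tone': 'casual'}
```

Because memory is keyed by user, a brand-new user simply starts with an empty preference dictionary, which is what makes cold-start learning from feedback possible.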

Four-Phase Evaluation Protocol

PAHF introduces a rigorous four-phase evaluation protocol that tests both initial learning and adaptation:

| Phase | Description | What it Tests |
|---|---|---|
| Phase 1 | Learn initial preferences from scratch via feedback | Cold-start learning ability |
| Phase 2 | Exploit learned preferences without additional feedback | Preference retention and grounding |
| Phase 3 | Adapt to preference/persona shifts | Online adaptation speed |
| Phase 4 | Demonstrate post-shift exploitation | Updated preference stability |
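The four phases can be sketched as a simple evaluation harness. The phase names and the feedback-gating logic below are assumptions for illustration, not the paper's harness:

```python
def run_four_phase_eval(agent, user_id, tasks_per_phase, shift_preferences):
    """Sketch of the four-phase protocol: feedback is only available in the
    learning phases (1 and 3); phases 2 and 4 test exploitation alone."""
    results = {}
    for phase in (1, 2, 3, 4):
        if phase == 3:
            shift_preferences(user_id)          # induce a persona/preference shift
        feedback_enabled = phase in (1, 3)      # learn in 1 and 3, exploit in 2 and 4
        scores = [agent.run(user_id, task, feedback_enabled)
                  for task in tasks_per_phase[phase]]
        results[phase] = sum(scores) / len(scores)
    return results
```

The key design point is that phases 2 and 4 withhold feedback entirely, so any score the agent earns there must come from preferences already stored in memory.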

Formal Learning Dynamics

PAHF's learning dynamics are quantified by the cumulative personalization error (ACPE), which sums the per-round gap between predicted and true preferences; as the agent learns, the per-round terms shrink:

<latex>

\text{ACPE}(T) = \sum_{t=1}^{T} \| \hat{p}_t - p_t^* \|^2

</latex>

where $\hat{p}_t$ is the agent's predicted preference at time $t$ and $p_t^*$ is the true user preference. PAHF with dual feedback channels (pre-action + post-action) achieves lower ACPE than either channel alone:

<latex>

\text{ACPE}_{\text{dual}} \leq \min\left(\text{ACPE}_{\text{pre-only}}, \text{ACPE}_{\text{post-only}}\right)

</latex>

Pre-action clarification minimizes initial errors by resolving ambiguity upfront, while post-action feedback enables fast correction when predictions are wrong.
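This intuition can be checked numerically. The toy simulation below uses assumed error dynamics (each active feedback channel halves the gap between the predicted and true preference vector each round) and is not taken from the paper:

```python
import numpy as np

def simulate_acpe(pre_action=False, post_action=False, T=50, seed=0):
    """Toy ACPE simulation under assumed halving dynamics: pre-action
    clarification shrinks the gap before the error is incurred, while
    post-action feedback shrinks it afterward."""
    rng = np.random.default_rng(seed)
    p_true = rng.normal(size=4)   # fixed true user preference vector
    p_hat = np.zeros(4)           # agent's estimate, starts uninformed
    acpe = 0.0
    for _ in range(T):
        if pre_action:                      # clarify before acting
            p_hat += 0.5 * (p_true - p_hat)
        acpe += float(np.sum((p_hat - p_true) ** 2))
        if post_action:                     # correct after acting
            p_hat += 0.5 * (p_true - p_hat)
    return acpe

pre  = simulate_acpe(pre_action=True)
post = simulate_acpe(post_action=True)
dual = simulate_acpe(pre_action=True, post_action=True)
print(dual <= min(pre, post))   # True under these dynamics
```

Under these dynamics the pre-only channel also beats the post-only channel, since clarification removes error before it is ever incurred, mirroring the paper's argument that pre-action questions minimize initial errors.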

Benchmarks and Results

PAHF is evaluated on two large-scale benchmarks covering all four phases of the protocol.

Across both benchmarks, PAHF consistently outperforms baselines.

Post-drift adaptation (Phase 3) is particularly strong: PAHF matches or exceeds single-channel adaptation speed while maintaining lower cumulative error throughout the process.
