====== Agent Personalization ======

Agent personalization enables LLM-powered agents to learn user preferences over time, maintaining persistent profiles that adapt communication style, decision-making, and tool use to individual users. Rather than treating every interaction as stateless, personalized agents build cumulative models of user behavior, preferences, and goals across sessions.

===== The Personalization Gap =====

Most LLM agents today are stateless --- they forget everything between sessions. Users must repeatedly restate preferences, correct communication styles, and re-explain context. This creates friction that limits agent adoption for long-term use cases like personal assistants, healthcare companions, and productivity tools.

Personalized agents address this through four interdependent capabilities:

  * **Profile modeling:** Inferring user traits, preferences, and goals from interactions
  * **Memory:** Persisting relevant information across sessions
  * **Planning:** Adapting task decomposition to user patterns
  * **Action execution:** Tailoring outputs to user communication preferences

===== PersonaMem =====

PersonaMem-v2 (University of Pennsylvania, 2025) is the state-of-the-art dataset for LLM personalization research. It simulates 1,000 realistic user-chatbot interactions across 300+ scenarios, with 20,000+ user preferences and 128k-token context windows. Critically, most preferences are //implicitly// revealed --- users do not explicitly state "I prefer formal language" but reveal it through interaction patterns, mirroring real-world behavior. The dataset enables evaluation of reinforcement fine-tuning for long-context reasoning about user understanding and preference extraction.

===== VARS: Vector-Adapted Retrieval Scoring =====

VARS (UIUC, March 2026) is a pipeline-agnostic, frozen-backbone framework for personalization without per-user fine-tuning.
Each user is represented by long-term and short-term vectors in a shared preference space:

$$\mathbf{s}_u = \alpha \cdot \mathbf{v}_{long} + (1 - \alpha) \cdot \mathbf{v}_{short}$$

These vectors bias retrieval scoring over structured preference memory and are updated online from weak scalar rewards (e.g., thumbs up/down). On the MultiSessionCollab benchmark, VARS reduces timeout rates and user effort while matching strong baselines in task success --- the key benefit is //interaction efficiency// rather than raw accuracy gains.

<code python>
# Simplified VARS-style user preference scoring
import numpy as np

class UserPreferenceModel:
    def __init__(self, dim=768, alpha=0.7, lr=0.01):
        self.v_long = np.zeros(dim)   # long-term preference vector
        self.v_short = np.zeros(dim)  # short-term session vector
        self.alpha = alpha            # blend weight between the two vectors
        self.lr = lr                  # learning rate for reward updates

    def score(self, candidates):
        # Blend long- and short-term vectors, then dot against each candidate
        user_vec = self.alpha * self.v_long + (1 - self.alpha) * self.v_short
        return np.array([np.dot(user_vec, c) for c in candidates])

    def update(self, chosen_embed, reward):
        # A weak scalar reward (e.g., +1/-1 from thumbs up/down) nudges the
        # session vector; the long-term vector tracks it via an exponential
        # moving average.
        self.v_short += self.lr * reward * chosen_embed
        self.v_long = 0.99 * self.v_long + 0.01 * self.v_short

    def retrieve_personalized(self, query_results, top_k=5):
        # Re-rank retrieval results by personalized score; sort on the score
        # only, since result dicts are not comparable on ties.
        embeddings = [r["embedding"] for r in query_results]
        scores = self.score(embeddings)
        ranked = sorted(zip(scores, query_results),
                        key=lambda pair: pair[0], reverse=True)
        return [item for _, item in ranked[:top_k]]
</code>

===== PersonaAgent =====

PersonaAgent (Amazon/UIC, 2025) is the first personalized LLM agent framework for multi-turn, long-horizon alignment. It integrates two modules:

  * **Personalized memory:** Combines episodic memory (specific past interactions) and semantic memory (generalized user knowledge) to build a coherent user model
  * **Personalized action:** Enables tool use tailored to user preferences and history

At the core, a //persona// --- a unique system prompt per user --- functions as an intermediary: it leverages memory insights to control agent actions, while action outcomes refine the persona over time.
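The persona-as-intermediary loop can be sketched as follows. This is a minimal illustration of the idea, not PersonaAgent's actual API: all function and variable names here (`build_persona_prompt`, `refine_persona`, the memory lists) are hypothetical.

<code python>
# Illustrative sketch of a persona-as-system-prompt loop.
# Names and structure are hypothetical, not PersonaAgent's real interface.

def build_persona_prompt(episodic, semantic):
    """Compose a per-user system prompt from memory insights."""
    recent = "\n".join(f"- {e}" for e in episodic[-3:])  # specific past interactions
    traits = "\n".join(f"- {t}" for t in semantic)       # generalized user knowledge
    return (
        "You are a personalized assistant for this user.\n"
        f"Known traits and preferences:\n{traits}\n"
        f"Relevant recent interactions:\n{recent}"
    )

def refine_persona(semantic, outcome_note):
    """Action outcomes feed back into semantic memory, refining the persona."""
    if outcome_note not in semantic:
        semantic.append(outcome_note)
    return semantic

# Usage: build the prompt, act, then fold the outcome back into memory
episodic = ["Asked for a terse summary of a PR", "Rejected a verbose draft"]
semantic = ["Prefers concise, technical answers"]
prompt = build_persona_prompt(episodic, semantic)
semantic = refine_persona(semantic, "Responds well to bullet-point plans")
</code>

The point of the sketch is the feedback cycle: memory shapes the system prompt, the prompt shapes actions, and action outcomes update memory for the next turn.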
===== Preference Learning Approaches =====

| **Approach** | **Mechanism** | **Pros** | **Cons** |
| Explicit feedback | Thumbs up/down, ratings | Clear signal | User fatigue, sparse |
| Implicit signals | Click patterns, dwell time, edits | Rich, continuous | Noisy, indirect |
| Reinforcement fine-tuning | RLHF/DPO on user data | Deep adaptation | Compute-heavy, per-user cost |
| Frozen-backbone (VARS) | Vector updates, no fine-tuning | Scalable, instant | Limited expressiveness |
| Persona prompting | Dynamic system prompt | Zero-cost, flexible | Context window limits |

===== Communication Style Adaptation =====

Effective personalization extends beyond content to //how// agents communicate:

  * **Formality level:** Adjusting between casual and professional registers
  * **Verbosity:** Matching user preference for concise vs. detailed responses
  * **Proactivity:** Learning when users want suggestions vs. waiting for instructions
  * **Domain vocabulary:** Adopting user-specific terminology and jargon
  * **Emotional tone:** Calibrating empathy and encouragement levels

These adaptations are typically captured as lightweight persona parameters updated from interaction signals, stored alongside factual preferences in the user profile.

===== References =====

  * [[https://arxiv.org/abs/2512.06688|PersonaMem-v2: Personalized Intelligence via Implicit User Personas (UPenn, 2025)]]
  * [[https://arxiv.org/abs/2603.20939|User Preference Modeling via VARS (UIUC, 2026)]]
  * [[https://arxiv.org/abs/2602.22680|Toward Personalized LLM-Powered Agents: Foundations and Evaluation (ShanghaiTech, 2026)]]
  * [[https://openreview.net/forum?id=PersonaAgent|PersonaAgent: LLM Agents Meet Personalization at Test Time (Amazon/UIC, 2025)]]
  * [[https://arxiv.org/abs/2510.07925|Persistent Memory and User Profiles for LLM Agents (Mercedes-Benz/Ulm, 2025)]]

===== See Also =====

  * [[collective_agent_behavior]]
  * [[small_language_model_agents]]
  * [[agent_cost_optimization]]