Contextual priming is the practice of injecting specific text, instructions, or examples into a prompt to bias a large language model's internal representations and steer its outputs toward desired behaviors without altering model weights. 1)
The technique leverages the transformer's attention mechanisms and embeddings to activate relevant patterns, drawing on parallels from cognitive science where prior stimuli influence subsequent human processing. 2)
When a prompt is submitted to an LLM, its text is mapped to vector embeddings that the transformer's layers process through attention heads, evolving into contextualized representations that shape token prediction. By carefully constructing the preceding context, practitioners can activate specific knowledge patterns, stylistic tendencies, or reasoning modes within the model.
The process works because transformers attend to all tokens in the context window simultaneously. Earlier tokens exert influence on how later tokens are generated, meaning that strategically placed context shapes every subsequent output.
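The mechanism above can be sketched in a few lines. The helper below is a minimal, hypothetical illustration (the function name and the generation call are assumptions, not a specific library's API): it simply prepends priming text to a task, so that every generated token can attend to the primer.

```python
def build_primed_prompt(priming_context: str, task: str) -> str:
    """Prepend priming text so every task token can attend to it.

    Because transformer attention spans the full context window,
    tokens in `priming_context` influence the contextualized
    representations of everything generated after `task`.
    """
    return f"{priming_context.strip()}\n\n{task.strip()}"

# The primer activates a stylistic pattern before the task is stated.
prompt = build_primed_prompt(
    priming_context="Respond in concise, formal prose. Avoid contractions.",
    task="Explain why the sky is blue.",
)
# `prompt` would then be sent to the model via whatever generation
# API is in use, e.g. generate(prompt) -- call omitted here.
```

The point is not the string concatenation itself but its position: because the primer precedes the task, its tokens are attended to during every subsequent prediction step.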
System prompts set foundational context at the start of interactions, defining the model's role, tone, or constraints to guide all responses. For example, priming with “You are an assistant trained to speak like Shakespeare” biases outputs toward Elizabethan language and style. 3)
In chatbot scenarios, iterative system-level priming builds dynamic context across turns, reducing generic responses by simulating natural conversation buildup. The system prompt acts as a persistent primer that shapes every response the model generates.
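A persistent system primer can be sketched as follows, assuming an OpenAI-style role/content message list (the exact API call is omitted and the assistant reply shown is a placeholder): the system prompt is re-sent at position zero on every turn, while accumulated turns supply the conversational buildup.

```python
# The system primer stays at index 0 of every request, so it biases
# each response; the rolling history adds conversational context.
SYSTEM_PRIMER = {
    "role": "system",
    "content": "You are an assistant trained to speak like Shakespeare.",
}

def build_turn(history: list, user_message: str) -> list:
    """Return the full message list for the next request."""
    return [SYSTEM_PRIMER, *history, {"role": "user", "content": user_message}]

history = []
messages = build_turn(history, "How do I boil an egg?")
# After the model replies, append both sides to the rolling history
# (the assistant text below is an invented placeholder):
history.append(messages[-1])
history.append({"role": "assistant", "content": "Prithee, heat thy water..."})
messages = build_turn(history, "And how long for a soft yolk?")
```

Keeping the primer at the head of every request, rather than only in the first turn, is what makes it a persistent rather than one-shot influence.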
Few-shot priming provides input-output examples before the task prompt, activating abstract patterns like reasoning chains or style reproduction. Chain-of-thought prompting is a prominent example, where step-by-step reasoning demonstrations prime the model to produce similarly structured logical outputs. 4)
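A minimal sketch of few-shot chain-of-thought priming, with invented example problems: worked examples that spell out intermediate reasoning are placed before the real question, priming the model to emit similarly structured step-by-step answers.

```python
# Two worked examples whose answers show explicit reasoning steps.
COT_EXAMPLES = [
    ("A farmer has 3 pens with 4 sheep each. How many sheep in total?",
     "Each pen holds 4 sheep and there are 3 pens. 3 * 4 = 12. Answer: 12"),
    ("Tom reads 5 pages a day for 6 days. How many pages in total?",
     "He reads 5 pages per day for 6 days. 5 * 6 = 30. Answer: 30"),
]

def few_shot_prompt(question: str) -> str:
    """Format the primes and the target question in a shared Q/A template."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in COT_EXAMPLES)
    return f"{shots}\n\nQ: {question}\nA:"

prompt = few_shot_prompt("A pack holds 8 pencils. How many pencils in 7 packs?")
```

Because the primes share an abstract pattern (restate quantities, multiply, state the answer) rather than any particular vocabulary, the model is nudged toward reproducing the reasoning structure, not the surface words.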
Structural priming studies show that LLMs assign higher likelihood to target sentences matching the abstract structure of prior examples, even without lexical overlap. These effects scale with the number of primes provided. 5)
Contextual priming in LLMs parallels human structural priming, where exposure to a sentence structure biases the production or comprehension of similar structures, suggesting that abstract linguistic knowledge is being activated. In humans, priming strengthens with prime-target similarity and with repeated exposure, mirroring LLM behavior in which additional prime sentences amplify the effect. 6)
Cognitive psychology's spreading activation theory, in which activating one concept spreads facilitation to related concepts in a semantic network, provides an analogy for how LLM embeddings and attention propagate influence across the context window. 7)
Several structured approaches maximize the effectiveness of contextual priming:
The power of contextual priming extends to adversarial applications. The “Response Attack” technique demonstrates how mildly harmful prior responses placed in the context can prime an LLM toward policy-violating outputs, exploiting priming's covert influence on model judgments. This highlights the importance of understanding priming dynamics for both beneficial use and safety. 11)