Generate Knowledge Prompting

Generate Knowledge Prompting is a two-step prompting technique that improves commonsense reasoning by first generating relevant knowledge statements from a language model, then using those statements to augment the question-answering prompt. The method treats LLMs as flexible sources of external knowledge without requiring task-specific supervision or structured knowledge bases.1)

How It Works

The method operates through two distinct phases:

Step 1: Knowledge Generation

A language model generates question-related knowledge statements through few-shot prompting. The generation prompt consists of three components:

  1. An instruction describing the knowledge generation task.
  2. Human-written demonstrations (typically five per task) that show examples of relevant knowledge statements.
  3. A placeholder for the new question.
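The three-part prompt can be sketched as follows. This is a minimal illustration of the structure, not the paper's exact template: the instruction wording, demonstration content, and function name are assumptions.

```python
def build_knowledge_prompt(question, demonstrations,
                           instruction="Generate some knowledge about the input."):
    """Assemble the three-part few-shot prompt: instruction,
    human-written demonstrations, and the new question as a placeholder."""
    parts = [instruction, ""]
    for demo_question, demo_knowledge in demonstrations:
        parts.append(f"Input: {demo_question}")
        parts.append(f"Knowledge: {demo_knowledge}")
        parts.append("")
    # The model continues generating after the trailing "Knowledge:" cue.
    parts.append(f"Input: {question}")
    parts.append("Knowledge:")
    return "\n".join(parts)

# Example demonstration pair (illustrative, not from the paper)
demos = [("Do birds fly?", "Most birds are capable of flight.")]
prompt = build_knowledge_prompt("Do penguins fly?", demos)
```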

The model generates multiple knowledge statements (M=20 in the original paper) using nucleus sampling with p=0.5. Generation is terminated when output exceeds 64 tokens or encounters a newline character. Repetitions and empty strings are discarded.2)
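A sketch of the sampling loop with the paper's stated settings (M=20, nucleus sampling with p=0.5, 64-token cap, newline stop, with repetitions and empty strings dropped). The `sample_from_lm` callable is a hypothetical stand-in for whatever LM API is used.

```python
def generate_knowledge(question, sample_from_lm, m=20):
    """Sample up to M knowledge statements for a question,
    discarding duplicates and empty generations.

    `sample_from_lm` is a stand-in for any LM call supporting
    nucleus sampling (top_p), a token limit, and a stop string.
    """
    seen, statements = set(), []
    for _ in range(m):
        text = sample_from_lm(question, top_p=0.5,
                              max_tokens=64, stop="\n").strip()
        if text and text not in seen:  # drop empties and repetitions
            seen.add(text)
            statements.append(text)
    return statements
```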

Step 2: Knowledge Integration

Each generated knowledge statement is concatenated with the original question to create M knowledge-augmented questions:

q_0 = q (original question)
q_1 = [k_1 || q]
q_2 = [k_2 || q]
...
q_M = [k_M || q]

The model makes predictions on each augmented question, and the highest-confidence prediction across all versions is selected as the final answer. This approach requires no joint fine-tuning for knowledge integration.
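The integration step can be sketched as below. The `predict` callable is a hypothetical scorer mapping a (possibly knowledge-augmented) question to an answer and a confidence; in the paper this confidence is the model's own probability for the predicted answer.

```python
def answer_with_knowledge(question, knowledge, predict):
    """Select the highest-confidence prediction across the original
    question and its M knowledge-augmented variants.

    `predict` is a stand-in that maps a prompt string to
    (answer, confidence).
    """
    # q_0 is the bare question; q_1..q_M prepend one knowledge statement each.
    candidates = [question] + [f"{k} {question}" for k in knowledge]
    best_answer, best_conf = None, float("-inf")
    for q in candidates:
        answer, conf = predict(q)
        if conf > best_conf:
            best_answer, best_conf = answer, conf
    return best_answer
```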

Benchmark Results

Generate Knowledge Prompting achieved state-of-the-art results on multiple commonsense reasoning benchmarks, including NumerSense, CommonsenseQA, CommonsenseQA 2.0, and QASC.3)

The method outperformed template-based knowledge generation methods like Self-Talk while performing comparably to retrieval-based systems. It improved performance across all four commonsense reasoning tasks tested.

Critical Factors

Three factors were identified as critical to effectiveness:

  1. Quality of knowledge: The relevance and accuracy of generated statements directly impacts answer quality.
  2. Quantity of knowledge: Performance improves with additional knowledge statements, up to a point of diminishing returns.
  3. Integration strategy: How knowledge is combined during inference matters significantly.

Qualitative analysis revealed that generated knowledge statements cover diverse knowledge types and can transform commonsense QA into explicit reasoning procedures (such as deduction) that language models process more effectively.


References

1)
Liu et al. 2022, "Generated Knowledge Prompting for Commonsense Reasoning"
2)
Liu et al. 2022, Section 3.1
3)
Liu et al. 2022, experimental results