Reasoning-on-Tap

Reasoning-on-Tap is the idea that advanced reasoning capabilities (chain-of-thought processing, multi-step deduction, extended thinking) can be engaged, scaled back, or turned off depending on the task, rather than being always active. It treats reasoning as a dial rather than a fixed setting: users pay for deep thinking only when the problem demands it. 1)

The Two-System Analogy

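The concept mirrors Daniel Kahneman's System 1 / System 2 framework from cognitive psychology, in which System 1 is fast, automatic, and intuitive, while System 2 is slow, deliberate, and effortful.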

Traditional LLMs operate primarily in System 1 mode. Reasoning-on-Tap adds the ability to engage System 2 when needed, then disengage it to save cost and latency. 2)

How Models Implement It

Different providers offer reasoning as a toggleable capability:

Cost Implications

Reasoning-on-Tap fundamentally changes the economics of AI inference:

The key insight is that most queries do not need deep reasoning. A customer service chatbot answering FAQs should not pay the cost of a model solving differential equations. Reasoning-on-Tap allows organizations to match compute cost to task complexity. 5)
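
The effect of matching compute cost to task complexity can be illustrated with a small calculation. This is only a sketch: the per-query prices and the 10% reasoning share below are made-up numbers chosen for illustration, not figures from any provider, and `ModeCost` and `blended_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeCost:
    """Hypothetical average cost of one query in a given mode (US dollars)."""
    name: str
    cost_per_query: float


def blended_cost(standard: ModeCost, reasoning: ModeCost, reasoning_share: float) -> float:
    """Average cost per query when `reasoning_share` of traffic uses reasoning mode."""
    if not 0.0 <= reasoning_share <= 1.0:
        raise ValueError("reasoning_share must be between 0 and 1")
    return ((1.0 - reasoning_share) * standard.cost_per_query
            + reasoning_share * reasoning.cost_per_query)


# Assumed prices, for illustration only.
STANDARD = ModeCost("standard", 0.001)
REASONING = ModeCost("reasoning", 0.020)

always_on = blended_cost(STANDARD, REASONING, 1.0)   # every query reasons
routed = blended_cost(STANDARD, REASONING, 0.10)     # only 10% of queries reason

print(f"always-on reasoning: ${always_on:.4f} per query")
print(f"routed (10% reasoning): ${routed:.4f} per query")
```

Under these assumed numbers, sending only one query in ten to reasoning mode cuts the average cost per query from $0.0200 to $0.0029. The real saving depends entirely on actual prices and on how much of the traffic needs deep reasoning.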

Impact on Model Selection

Reasoning-on-Tap shifts the model selection landscape from choosing a single model to routing between modes:

Task Type                  | Appropriate Mode     | Cost Profile
Simple Q&A, chat           | Standard (System 1)  | Low cost, low latency
Summarization, translation | Standard (System 1)  | Low cost, low latency
Math, logic, proofs        | Reasoning (System 2) | Higher cost, higher latency
Complex code generation    | Reasoning (System 2) | Higher cost, higher latency
Multi-step analysis        | Reasoning (System 2) | Higher cost, higher latency

Intelligent routing systems can automatically detect when a query warrants reasoning mode, optimizing cost without user intervention.
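
A routing layer of this kind could be sketched as below. This is a deliberately naive, assumption-laden illustration: the keyword list, the length threshold, and the function name `choose_mode` are all hypothetical, and a production router would more likely use a trained classifier than a regular expression.

```python
import re

# Hypothetical cue words suggesting multi-step work (math, proofs, code, analysis).
REASONING_HINTS = re.compile(
    r"\b(prove|proof|derive|solve|debug|refactor|step[- ]by[- ]step|optimi[sz]e|analy[sz]e)\b",
    re.IGNORECASE,
)


def choose_mode(query: str, long_query_words: int = 150) -> str:
    """Return "reasoning" or "standard" for a query using simple heuristics.

    The threshold and keywords are illustrative assumptions, not tuned values.
    """
    if REASONING_HINTS.search(query):
        return "reasoning"
    # Very long prompts are treated as likely multi-step tasks.
    if len(query.split()) > long_query_words:
        return "reasoning"
    return "standard"


for q in ("What are your opening hours?",
          "Prove that the sum of two even numbers is even."):
    print(f"{choose_mode(q):9s} <- {q}")
```

The returned label would then select the mode for the downstream model call. How that mode is requested is provider-specific and is left out of this sketch.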

Relationship to Inference-Time Compute

Reasoning-on-Tap is closely tied to the broader shift toward inference-time compute scaling described in Post-Training RL vs Model Scaling. Instead of spending more on pre-training, reasoning models spend more at inference — generating multiple candidate solutions, verifying each one, and selecting the best. Harder problems get more thinking time; easy problems get less. 6)
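
The generate, verify, and select loop described above can be sketched as a best-of-N search, where N plays the role of the thinking budget. This is a toy illustration only: the candidate list and the scoring function are stand-ins for a model's sampled solutions and a verifier, and `best_of_n` is a hypothetical helper, not any provider's actual mechanism.

```python
from itertools import islice
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def best_of_n(candidates: Iterable[T], score: Callable[[T], float], n: int) -> Optional[T]:
    """Score up to `n` candidates and return the highest-scoring one.

    A larger `n` means more inference-time compute. Returns None when n is 0
    or no candidates are available.
    """
    best: Optional[T] = None
    best_score = float("-inf")
    for candidate in islice(candidates, n):
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best


# Toy problem: find an integer root of x^2 - 5x + 6.
# The list stands in for sampled model outputs (assumed, for illustration).
CANDIDATES = [1, 2, 4, 7]


def residual_score(x: int) -> float:
    """Toy verifier: a smaller residual means a better candidate."""
    return -abs(x * x - 5 * x + 6)


print(best_of_n(CANDIDATES, residual_score, n=1))  # small budget: only the first guess is checked
print(best_of_n(CANDIDATES, residual_score, n=4))  # larger budget: finds the exact root 2
```

With a budget of one candidate the search settles for the first guess. With a budget of four it finds a candidate the verifier scores as exact. This mirrors, in miniature, how more thinking time can help on harder problems.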

Limitations

See Also

References