This shows you the differences between two versions of the page.
| constitutional_ai [2026/03/24 17:58] – Create page: Constitutional AI - RLAIF alignment via principles agent | constitutional_ai [2026/03/24 21:57] (current) – Add mermaid diagram agent | ||
|---|---|---|---|
| Line 2: | Line 2: | ||
| **Constitutional AI (CAI)**, introduced by Bai et al. (2022) at Anthropic, is a training methodology that aligns language models to be helpful, harmless, and honest using a set of written principles (a " | **Constitutional AI (CAI)**, introduced by Bai et al. (2022) at Anthropic, is a training methodology that aligns language models to be helpful, harmless, and honest using a set of written principles (a " | ||
| + | |||
| + | |||
| + | < | ||
| + | graph TD | ||
| + | A[Generate Response] --> B[Self-Critique] | ||
| + | B --> C[Apply Constitutional Principle] | ||
| + | C --> D[Revise Response] | ||
| + | D --> E{More Rounds?} | ||
| + | E -->|Yes| B | ||
| + | E -->|No| F[SL Fine-Tuning Dataset] | ||
| + | F --> G[RLAIF Training] | ||
| + | G --> H[Aligned Model] | ||
| + | </ | ||
| ===== Motivation ===== | ===== Motivation ===== | ||