====== Open-Weights vs Open-Source AI ======

The AI community increasingly distinguishes between **open-weights** and **open-source** models — two terms that are frequently conflated but describe meaningfully different levels of openness. The distinction matters for reproducibility, trust, competition, and the future of AI governance. ((Source: [[https://opensource.org/ai/open-weights|OSI - Open Weights]]))

===== Open-Weights Models =====

An open-weights model releases the **trained model parameters** (weights and biases) that determine how the model processes inputs and generates outputs. Users can download, deploy, fine-tune, and build on these weights.

What open-weights typically provides:
  * The final trained weights
  * A model card describing capabilities and limitations
  * Inference code or compatibility with standard frameworks

What open-weights typically **does not** provide:
  * Training data or data composition details
  * Full training code and hyperparameters
  * Sufficient information to reproduce the model from scratch

Examples: **Meta Llama** (3, 3.1, 4), **Mistral**, **Google Gemma** ((Source: [[https://www.oracle.com/artificial-intelligence/ai-open-weights-models/|Oracle - Open Weights Models]]))

===== Open-Source AI (OSI Definition) =====

The **Open Source Initiative** (OSI) published its Open Source AI Definition (v1.0) in 2024, setting a rigorous standard.
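The contrast between the open-weights artifact checklist above and the OSI requirements that follow can be sketched as a small classifier. This is a hypothetical illustration only — the function and artifact names (''classify_release'', ''training_code'', etc.) are invented for this page, not part of any official OSI tooling:

```python
# Hypothetical sketch: classify a model release by the artifacts it ships.
# Artifact names mirror the checklists on this page; this is not an OSI tool.

OSI_REQUIRED = {"weights", "training_code", "data_info", "eval_methodology"}

def classify_release(artifacts, unrestricted_license=False):
    """Return 'open-source (OSI)', 'open-weights', or 'closed'."""
    artifacts = set(artifacts)
    # OSI open source: full artifact set AND an unrestricted license.
    if OSI_REQUIRED <= artifacts and unrestricted_license:
        return "open-source (OSI)"
    # Open-weights: the trained parameters are available, little else required.
    if "weights" in artifacts:
        return "open-weights"
    return "closed"

# A Llama-style release: weights and a model card, restrictive license.
print(classify_release({"weights", "model_card"}))                # open-weights
# An OLMo-style release: full artifact set, unrestricted license.
print(classify_release(OSI_REQUIRED, unrestricted_license=True))  # open-source (OSI)
```

The point of the sketch is that open-weights is a property of //one// artifact (the weights), while OSI open source is a conjunction of several artifacts plus licensing terms.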
To qualify as open-source, an AI model must provide:
  * **Model weights** and parameters
  * **Training code** sufficient to reproduce the training process
  * **Training data** information detailed enough for a skilled person to rebuild a substantially equivalent system
  * **Evaluation methodology** and results
  * **Unrestricted licensing** — no limitations on use, modification, or redistribution ((Source: [[https://opensource.org/ai/open-weights|OSI - Open Weights]]))

Examples that approach this standard: **OLMo** (Allen Institute for AI), **Pythia** (EleutherAI)

===== The Meta Llama Controversy =====

Meta's decision to label the Llama model family as "open source" sparked significant debate. Critics point out that Llama:
  * Does not release training data
  * Does not release full training code
  * Includes **licensing restrictions** (e.g., restrictions on using Llama to train competing foundation models, and usage caps for very large deployments)

These restrictions violate the traditional open-source principle of unrestricted use. Meta argues that releasing weights provides meaningful transparency and community benefit, even without full reproducibility. The debate highlighted the need for clearer terminology — hence the adoption of "open-weights" as a distinct category. ((Source: [[https://simoninstitute.ch/blog/post/open-ai-models-an-introduction|Simon Institute - Open AI Models]]))

===== Why the Distinction Matters =====

The difference between open-weights and open-source has practical consequences:

**For reproducibility**: Open-source models can be independently verified and reproduced. Open-weights models must be taken on trust — users cannot confirm how they were trained or what data they learned from.

**For safety and auditing**: Full open-source enables independent safety research, bias auditing, and vulnerability analysis. Open-weights provides limited insight into training-time decisions.
**For competition**: Open-weights enables downstream fine-tuning and deployment, fostering an ecosystem of adapted models. But without training data and code, competitors cannot truly replicate or build equivalent base models.

**For regulatory compliance**: Some regulatory frameworks may require transparency about training data — a requirement open-weights alone cannot satisfy. ((Source: [[https://opensource.org/ai/open-weights|OSI - Open Weights]]))

===== The Spectrum of Openness =====

In practice, AI model openness exists on a spectrum:

^ Level ^ Provides ^ Example ^
| Closed / Proprietary | API access only | GPT-4, Claude |
| Open-weights | Weights + inference code | Llama, Mistral |
| Open-weights + data info | Weights + data documentation | Falcon 2 |
| Full open-source (OSI) | Weights + code + data + unrestricted license | OLMo |

===== 2025-2026 Developments =====

  * **OpenAI** released its first open-weights model since GPT-2 in 2025, signaling a shift away from fully closed approaches under competitive pressure. ((Source: [[https://www.globaltechcouncil.org/ai/open-weight-ai-model/|Global Tech Council - Open Weight AI]]))
  * Enterprise adoption of open-weights models continues to grow, with surveys showing over 50% of generative AI deployments using open-weights models for cost control and customization.
  * The OSI definition is gaining traction as a reference standard, though few frontier models fully comply due to the cost and complexity of sharing training data at scale.

===== See Also =====

  * [[lora_adapter|What Is a LoRA Adapter]]
  * [[inference_economics|Inference Economics]]
  * [[model_velocity_vs_stability|Model Velocity vs Model Stability]]

===== References =====