====== Repository-Centric Learning ======

**Repository-Centric Learning (RCL)** is a training paradigm for small language models that prioritizes deep vertical mastery of individual software repositories over broad horizontal exposure across many codebases. Introduced with SWE-Spot by Peng et al. (2026), RCL proposes that compact models must internalize the "physics" of a target software environment through parametric knowledge acquisition rather than relying on costly inference-time search.

<code>
graph LR
    A[Repository] --> B[Unit 1: Design Patterns]
    A --> C[Unit 2: Implementation Details]
    A --> D[Unit 3: Evolution History]
    A --> E[Unit 4: Runtime Behavior]
    B --> F[RCL Curriculum]
    C --> F
    D --> F
    E --> F
    F --> G[Fine-tune Base Model]
    G --> H[Repo-Expert SLM]
</code>

===== The Problem with Task-Centric Learning =====

The prevailing approach to training coding models follows a **Task-Centric Learning (TCL)** paradigm: expose the model to as many diverse repositories and tasks as possible, in the hope that it learns generalizable coding skills. This works for large frontier models with enormous parameter budgets, but it fails for Small Language Models (SLMs) because of a fundamental capability gap. SLMs trained with TCL:

  * Lack inference-time generalization to unfamiliar codebases
  * Must rely on expensive retrieval-augmented generation (RAG) and search at inference time
  * Cannot build deep understanding of any single repository's patterns, idioms, and architecture
  * Suffer cold-start problems when encountering new projects

===== The RCL Paradigm Shift =====

RCL inverts the TCL assumption.
Instead of learning a little about many repositories, the model learns //everything// about a specific repository:

^ Dimension ^ Task-Centric Learning (TCL) ^ Repository-Centric Learning (RCL) ^
| Breadth vs. depth | Horizontal (many repos) | Vertical (single repo) |
| Knowledge location | Inference-time search | Parametric (in weights) |
| Generalization | Cross-repo transfer | Repo-specific mastery |
| Inference cost | High (RAG, search) | Low (direct generation) |
| Cold start | Every new task | One-time training |

===== Four-Unit Repository-Centric Experience =====

RCL transforms static codebases into interactive learning signals through a structured four-unit curriculum:

**Unit 1: Design.** The model learns the repository's high-level architectural patterns -- module organization, dependency structures, design decisions, and API contracts. This builds an understanding of //why// the code is structured as it is.

**Unit 2: Implementation.** Focuses on code-level detail -- writing, debugging, and understanding function implementations, class hierarchies, and coding idioms specific to the project.

**Unit 3: Evolution.** The model studies the repository's version history -- commit patterns, refactoring trajectories, how features were added over time, and how bugs were fixed. This captures the temporal dynamics of software development.

**Unit 4: Runtime.** Incorporates execution traces, test behaviors, and dynamic properties that cannot be inferred from static code alone. This grounds the model's understanding in actual program behavior.
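To make the curriculum concrete, one plausible shape for an individual training example is a (unit, prompt, target) triple. The sketch below uses hypothetical field names and sample records; neither is taken from the SWE-Spot paper:

```python
# Hypothetical shape of a single RCL training example; the field names
# and sample records are illustrative, not from the SWE-Spot paper.
from dataclasses import dataclass

@dataclass
class RCLExample:
    unit: str    # one of "design", "implementation", "evolution", "runtime"
    prompt: str  # question or task posed about the repository
    target: str  # expected answer, grounded in the repository itself

examples = [
    RCLExample("design", "Which module owns HTTP routing?", "src/router"),
    RCLExample("evolution",
               "Why was the cache split into read and write paths?",
               "A refactor separated them to reduce tail latency."),
]
```

A full curriculum would mix many such triples drawn from all four units before fine-tuning.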
<code python>
# Conceptual illustration of the RCL training pipeline.
# Repository and fine_tune are illustrative helpers, not a real API.
class RepositoryCentricExperience:
    def __init__(self, repo_path):
        self.repo = Repository(repo_path)

    def generate_design_examples(self):
        # Unit 1: Architecture and design patterns
        return self.repo.extract_module_relationships()

    def generate_implementation_examples(self):
        # Unit 2: Code writing and debugging
        return self.repo.extract_function_implementations()

    def generate_evolution_examples(self):
        # Unit 3: Version history and change patterns
        return self.repo.extract_commit_trajectories()

    def generate_runtime_examples(self):
        # Unit 4: Execution traces and test behaviors
        return self.repo.extract_test_execution_traces()

    def train_repo_expert(self, base_model):
        # Concatenate all four units into one curriculum, then fine-tune
        curriculum = (
            self.generate_design_examples()
            + self.generate_implementation_examples()
            + self.generate_evolution_examples()
            + self.generate_runtime_examples()
        )
        return fine_tune(base_model, curriculum)
</code>

===== Internalizing Repository Physics =====

The central metaphor of RCL is that each software repository has its own "physics" -- a set of core rules, dependency patterns, idioms, conventions, and dynamics that govern how the codebase behaves and evolves. Just as a physics engine must understand gravity and collision to simulate a world, a coding agent must understand a repository's internal logic to operate effectively within it.

RCL embeds this physics directly into the model's weights during training, eliminating the need for inference-time discovery through RAG or search. The model develops an intuitive understanding analogous to the deep familiarity experienced developers build with codebases they work on daily.
===== Key Results =====

SWE-Spot-4B, trained with RCL, achieves strong results:

  * **Outperforms open-weight models up to 8x larger**, including Meta's CWM and Qwen3-Coder-30B
  * **Matches or surpasses efficiency-focused commercial models** such as GPT-4.1-mini and GPT-5-nano
  * Demonstrates **higher training sample efficiency** -- fewer examples are needed for comparable performance
  * Achieves **lower inference cost** -- no RAG overhead or search is required
  * Excels across multiple SWE tasks: issue resolution, test generation, feature implementation, and repo Q&A

These results break established scaling trends, demonstrating that repository mastery is a distinct capability dimension that complements general coding ability.

===== Theoretical Implications =====

RCL suggests that for building efficient intelligence in constrained settings, the path forward is not always scale -- it is depth. A small model that deeply understands its operational environment can outperform a much larger model with only shallow familiarity:

$$\text{Effectiveness} = f(\text{depth}_{\text{repo}}) \gg g(\text{breadth}_{\text{tasks}}) \quad \text{for SLMs}$$

===== References =====

  * [[https://arxiv.org/abs/2601.21649|Peng et al. (2026). SWE-Spot: Building Small Repo-Experts with Repository-Centric Learning. arXiv:2601.21649]]

===== See Also =====

  * [[small_language_models|Small Language Models]]
  * [[code_agents|Code Agents]]
  * [[software_engineering_benchmarks|Software Engineering Benchmarks]]
  * [[retrieval_augmented_generation|Retrieval-Augmented Generation]]