====== Music Composition Agents ======
Multi-agent LLM systems for music composition deploy role-specialized agents for melody, harmony, accompaniment, and quality review, collaboratively producing symbolic music through iterative feedback loops.
===== Overview =====
Music composition requires coordinating multiple musical elements: melody, harmony, rhythm, instrumentation, and structure. Multi-agent LLM frameworks address this by assigning specialized roles to different agents, mirroring how human composers and arrangers collaborate. CoComposer(([[https://arxiv.org/abs/2509.00132|"CoComposer: Multi-Agent Collaborative Music Composition with LLMs." arXiv:2509.00132, 2025.]])) uses five role-specialized agents for iterative symbolic music creation in ABC notation, while WeaveMuse(([[https://arxiv.org/abs/2509.11183|"WeaveMuse: Open Multi-Agent Framework for Multimodal Music." arXiv:2509.11183, 2025.]])) provides an open framework for multimodal music tasks spanning text, notation, audio, and visual modalities.
===== CoComposer: Multi-Agent Collaborative Composition =====
CoComposer deploys five specialized agents using AutoGen group chats:
* **Leader Agent**: Analyzes user prompts and decomposes them into musical specifications (title, genre, key, chord progression, instruments, tempo, rhythm)
* **Melody Agent**: Generates the main melody in ABC notation with MIDI instrument info, tempo, rhythm, key, and genre parameters
* **Accompaniment Agent**: Creates harmony and supporting parts synchronized with the melody, handling chord progressions and rhythmic alignment
* **Revision Agent**: Receives feedback and applies targeted modifications to improve composition quality
* **Review Agent**: Evaluates the overall composition against quality criteria and provides structured feedback
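The division of labor above can be sketched as a set of role-specific system prompts. The wording below is hypothetical (the paper's actual prompts are not reproduced here); only the five-role split is from the source:

```python
# Hypothetical system prompts for the five CoComposer roles.
# The exact wording is an assumption; the role split follows the paper.
ROLE_PROMPTS = {
    "leader": (
        "Decompose the user's request into a musical specification: "
        "title, genre, key, chord progression, instruments, tempo, rhythm."
    ),
    "melody": (
        "Write the main melody in ABC notation, following the given "
        "key, tempo, rhythm, genre, and MIDI instrument."
    ),
    "accompaniment": (
        "Write harmony and supporting voices in ABC notation, "
        "synchronized bar-by-bar with the provided melody."
    ),
    "revision": (
        "Apply the reviewer's feedback as targeted edits to the melody "
        "and accompaniment without rewriting them wholesale."
    ),
    "review": (
        "Evaluate the composition against the specification and return "
        "structured feedback plus a quality score in [0, 1]."
    ),
}
```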
The compositional workflow can be modeled as an iterative function:
$$ f: P \to (M, A), \quad R(M, A) \to (M', A') $$
where $P$ is the prompt, $M$ is melody, $A$ is accompaniment, and $R$ is the review feedback function that drives iterative refinement.
**Streamlined Design**: CoComposer uses 5 agents (vs. ComposerX's 6), eliminating the separate Instrument agent to reduce communication rounds while maintaining quality.
===== WeaveMuse: Open Multimodal Framework =====
WeaveMuse supports multimodal music tasks through:
* **Specialist Agents**: Interpret requirements, validate outputs across formats (ABC, MIDI, audio)
* **Manager Agent**: Selects tools, sequences actions, maintains state, handles user turns
* **Intermodal Loops**: Analysis-synthesis-render cycles across text, symbolic notation, audio, and visual modalities
WeaveMuse emphasizes reproducibility with interchangeable open-source models and supports cross-format constraint validation for rhythmic and harmonic coherence.
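Cross-format rhythmic validation of the kind WeaveMuse performs can be illustrated with a minimal bar-count check on two ABC voices. This is a standard-library sketch, not WeaveMuse's actual validator, which handles far more than bar alignment:

```python
def bar_counts(abc_body: str) -> list[int]:
    """Count bars per music line of an ABC voice body, skipping header fields."""
    counts = []
    for line in abc_body.splitlines():
        line = line.strip()
        # Skip blank lines and header fields such as "K:C" or "M:4/4".
        if not line or (len(line) > 1 and line[1] == ":" and line[0].isalpha()):
            continue
        # Each non-empty segment between "|" delimiters is one bar.
        counts.append(len([b for b in line.split("|") if b.strip()]))
    return counts

def rhythmically_aligned(melody: str, accompaniment: str) -> bool:
    """Two voices are coherent only if their total bar counts match."""
    return sum(bar_counts(melody)) == sum(bar_counts(accompaniment))

melody = "K:C\n|CDEF GABc|cBAG FEDC|"
accomp = "K:C\n|[CEG]4 [CEG]4|[CEG]4 [CEG]4|"
assert rhythmically_aligned(melody, accomp)
```

A real validator would additionally check note durations against the meter and harmonic content against the chord progression; bar counting is only the cheapest coherence gate.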
===== Quality Evaluation =====
CoComposer is evaluated using AudioBox-Aesthetics on four criteria:
$$ Q_{total} = w_{CE} \cdot Q_{CE} + w_{CU} \cdot Q_{CU} + w_{PC} \cdot Q_{PC} + w_{PQ} \cdot Q_{PQ} $$
where CE = Content Enjoyment, CU = Content Usefulness, PC = Production Complexity, PQ = Production Quality.
===== Code Example =====
<code python>
from dataclasses import dataclass, field
from enum import Enum

class AgentRole(Enum):
    LEADER = "leader"
    MELODY = "melody"
    ACCOMPANIMENT = "accompaniment"
    REVISION = "revision"
    REVIEW = "review"

@dataclass
class MusicSpec:
    title: str
    genre: str
    key: str
    tempo: int
    time_signature: str
    chord_progression: list[str] = field(default_factory=list)
    instruments: list[str] = field(default_factory=list)

class CoComposerSystem:
    def __init__(self, llm_model: str = "gpt-4o"):
        # MusicAgent wraps one LLM-backed role (definition omitted here).
        self.agents = {role: MusicAgent(role, llm_model) for role in AgentRole}

    def compose(self, user_prompt: str, max_iterations: int = 3) -> str:
        # 1. Leader decomposes the prompt into a MusicSpec.
        spec = self.agents[AgentRole.LEADER].decompose(user_prompt)
        # 2. Melody first, then accompaniment synchronized to it.
        melody_abc = self.agents[AgentRole.MELODY].generate(spec)
        accomp_abc = self.agents[AgentRole.ACCOMPANIMENT].generate(spec, melody_abc)
        # 3. Review/revise loop until the quality threshold is met.
        for _ in range(max_iterations):
            review = self.agents[AgentRole.REVIEW].evaluate(
                melody_abc, accomp_abc, spec
            )
            if review.score >= 0.85:
                break
            melody_abc, accomp_abc = self.agents[AgentRole.REVISION].revise(
                melody_abc, accomp_abc, review.feedback
            )
        return self.merge_abc(melody_abc, accomp_abc)

    def merge_abc(self, melody: str, accomp: str) -> str:
        # Combine both voices into a single two-staff ABC tune.
        return (
            "X:1\n%%score 1 2\n"
            f'V:1 name="Melody"\n{melody}\n'
            f'V:2 name="Accompaniment"\n{accomp}'
        )
</code>
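The ''MusicSpec'' fields map directly onto standard ABC header fields. The helper below (a sketch, not part of CoComposer) makes the correspondence explicit:

```python
def spec_to_abc_header(title: str, key: str, tempo: int,
                       time_signature: str) -> str:
    """Render the core ABC tune header from specification fields.

    Mapping (per the ABC notation standard):
      X: reference number, T: title, M: meter,
      Q: tempo (quarter notes per minute), K: key (last header field).
    """
    return (
        f"X:1\n"
        f"T:{title}\n"
        f"M:{time_signature}\n"
        f"Q:1/4={tempo}\n"
        f"K:{key}"
    )

# Hypothetical specification values for illustration.
header = spec_to_abc_header("Evening Study", "Dm", 96, "3/4")
```

Placing ''K:'' last matters: in ABC, the key field terminates the header, and anything after it is parsed as the tune body.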
===== Architecture =====
<code>
graph TD
    A[User Prompt] --> B[Leader Agent]
    B --> C[Music Specification]
    C --> D[Melody Agent]
    C --> E[Accompaniment Agent]
    D --> F[Melody - ABC Notation]
    E --> G[Accompaniment - ABC Notation]
    F --> H[Review Agent]
    G --> H
    H --> I{Quality Threshold?}
    I -->|Pass| J[Final Composition]
    I -->|Fail| K[Revision Agent]
    K --> L[Feedback-Driven Edits]
    L --> D
    L --> E
    J --> M[ABC to MIDI]
    M --> N[Audio Rendering]
    subgraph WeaveMuse Extension
        O[Manager Agent] --> P[Text Analysis]
        O --> Q[Notation Validation]
        O --> R[Audio Synthesis]
        O --> S[Visual Score]
    end
</code>
===== Key Results =====
^ Metric ^ CoComposer ^ ComposerX ^ Single Agent ^
| Generation success rate | 100% | 100% | 100% |
| Content Enjoyment (CE) | Higher | Baseline | Lower |
| Production Complexity (PC) | Higher | Baseline | Lower |
| Production Quality (PQ) | Higher | Baseline | Comparable |
| Agent count | 5 | 6 | 1 |
| Communication rounds | Fewer | More | None |
===== See Also =====
* [[image_editing_agents|Image Editing Agents]]
* [[video_editing_agents|Video Editing Agents]]
* [[game_playing_agents|Game Playing Agents]]
===== References =====