The Meta Training and Inference Accelerator (MTIA) is a family of custom AI chips developed by Meta in partnership with Broadcom and manufactured by TSMC. Originally designed for ranking and recommendation workloads, MTIA has rapidly evolved into a multi-generational inference platform serving billions of users across Meta's platforms.
Meta's AI workload landscape spans four major quadrants: training and inference, crossed with recommendation models and generative AI. At planetary scale (3+ billion daily users), general-purpose GPUs proved economically unsustainable for inference workloads, and MTIA was created to make inference at that scale cost-effective.
Meta has committed to one of the fastest custom chip iteration cycles in the industry — four generations in under two years:
| Generation | Status | Primary Workload | Key Specs |
|---|---|---|---|
| MTIA 300 | In production | Ranking and recommendation training | Current production chip deployed at scale |
| MTIA 400 | Testing complete | GenAI inference | 5x compute over MTIA 300; 50% more HBM bandwidth; 400% higher FP8 FLOPS |
| MTIA 450 | In development | GenAI inference | Doubles HBM bandwidth to 18.4 TB/s (from 9.2 TB/s on MTIA 400) |
| MTIA 500 | Roadmap | GenAI inference at scale | 4.5x HBM bandwidth and 25x compute FLOPS vs MTIA 300 |
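The bandwidth figures in the table can be cross-checked against each other. Only the MTIA 400 (9.2 TB/s) and MTIA 450 (18.4 TB/s) numbers are stated directly; the sketch below back-calculates illustrative MTIA 300 and MTIA 500 values from the quoted ratios (50% more for 400 vs 300, 4.5x for 500 vs 300) and should not be read as official specs.

```python
# Back-calculate per-generation HBM bandwidth from the ratios quoted above.
# MTIA 300 and MTIA 500 values are derived, not officially published.

mtia_400_bw = 9.2                 # TB/s, stated
mtia_450_bw = mtia_400_bw * 2     # TB/s, "doubles HBM bandwidth" -> 18.4
mtia_300_bw = mtia_400_bw / 1.5   # TB/s, 400 has "50% more" than 300
mtia_500_bw = mtia_300_bw * 4.5   # TB/s, "4.5x HBM bandwidth vs MTIA 300"

for name, bw in [("MTIA 300", mtia_300_bw), ("MTIA 400", mtia_400_bw),
                 ("MTIA 450", mtia_450_bw), ("MTIA 500", mtia_500_bw)]:
    print(f"{name}: ~{bw:.1f} TB/s")
```

Note that the derived MTIA 500 figure (~27.6 TB/s) works out to exactly 3x the MTIA 400's bandwidth, consistent with the stated multipliers.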
All chips are built on the RISC-V architecture and manufactured by TSMC, with Broadcom as design partner.
A key architectural differentiator is MTIA's memory hierarchy: rather than relying on costly HBM alone, MTIA pairs large on-chip SRAM with LPDDR, optimizing for inference workloads whose memory access patterns differ from training. This model-chip co-design approach lets Meta tailor the hardware to its specific neural network architectures.
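To see why inference memory access patterns differ, consider a roofline-style estimate of arithmetic intensity (FLOPs per byte moved) for a dense layer. At the low batch sizes typical of online inference, weight traffic dominates and the layer is memory-bandwidth-bound, which is the regime a large-SRAM-plus-LPDDR hierarchy targets. The sketch below is generic and illustrative, not MTIA-specific; all shapes are hypothetical.

```python
# Illustrative roofline-style estimate: FLOPs per byte moved for a dense
# layer y = x @ W with fp16 (2-byte) elements. Not an MTIA model.

def arithmetic_intensity(batch, d_in, d_out, bytes_per_elem=2):
    """FLOPs per byte of memory traffic for one dense layer."""
    flops = 2 * batch * d_in * d_out                  # multiply-accumulates
    bytes_moved = bytes_per_elem * (batch * d_in      # input activations
                                    + d_in * d_out    # weights
                                    + batch * d_out)  # output activations
    return flops / bytes_moved

# A hypothetical 4096x4096 layer: at batch 1 each weight is fetched for a
# single input, so memory bandwidth limits throughput; at batch 512 the same
# weight traffic is amortized across many inputs and compute dominates.
print(arithmetic_intensity(1, 4096, 4096))    # ~1 FLOP/byte -> memory-bound
print(arithmetic_intensity(512, 4096, 4096))  # hundreds of FLOPs/byte
```

Keeping hot weights and activations in SRAM cuts the bytes that must come from external memory, raising effective arithmetic intensity without paying for HBM capacity across the whole fleet.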
Meta's MTIA program is part of a broader hyperscaler trend of building custom inference silicon to reduce dependence on NVIDIA GPUs. Similar efforts include Google's TPU, Amazon's Trainium/Inferentia, and Microsoft's Maia. Meta's approach is distinguished by its aggressive iteration speed and RISC-V architecture choice.