Meta MTIA Chip

The Meta Training and Inference Accelerator (MTIA) is a family of custom AI chips developed by Meta in partnership with Broadcom and manufactured by TSMC. Originally designed for ranking and recommendation workloads, MTIA has rapidly evolved into a multi-generational inference platform serving billions of users across Meta's platforms. 1)

Why Custom Silicon

Meta's AI workload landscape spans four major domains: training vs inference, crossed with recommendation models vs generative AI. At planetary scale (3+ billion daily users), general-purpose GPUs proved economically unsustainable for inference workloads. MTIA was created to cut per-inference cost and reduce Meta's dependence on general-purpose GPUs for these workloads.

Chip Generations

Meta has committed to one of the fastest custom chip iteration cycles in the industry — four generations in under two years:

Generation | Status           | Primary Workload                    | Key Specs
MTIA 300   | In production    | Ranking and recommendation training | Current production chip deployed at scale
MTIA 400   | Testing complete | GenAI inference                     | 5x compute over MTIA 300; 50% more HBM bandwidth; 400% higher FP8 FLOPS
MTIA 450   | In development   | GenAI inference                     | Doubles HBM bandwidth to 18.4 TB/s (from 9.2 TB/s on MTIA 400)
MTIA 500   | Roadmap          | GenAI inference at scale            | 4.5x HBM bandwidth and 25x compute FLOPS vs MTIA 300

All chips are built on RISC-V architecture and manufactured by TSMC with design partnership from Broadcom. 3)
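The table's relative multipliers can be cross-checked with a back-of-envelope calculation. This sketch assumes the "50% more HBM bandwidth" figure for MTIA 400 and the "4.5x" figure for MTIA 500 are both measured against MTIA 300; the implied MTIA 300 and MTIA 500 bandwidths are derived values, not figures stated in this article.

```python
# Assumption: MTIA 400's "50% more HBM bandwidth" and MTIA 500's "4.5x"
# are both relative to MTIA 300. Only the 9.2 and 18.4 TB/s figures are
# stated in the article; the rest is arithmetic.

MTIA_400_HBM_TBPS = 9.2   # stated for MTIA 400
MTIA_450_HBM_TBPS = 18.4  # stated: double MTIA 400

# MTIA 400 = 1.5x MTIA 300, so divide back out.
mtia_300_tbps = MTIA_400_HBM_TBPS / 1.5

# MTIA 500 = 4.5x MTIA 300.
mtia_500_tbps = 4.5 * mtia_300_tbps

print(f"Implied MTIA 300 HBM bandwidth: {mtia_300_tbps:.1f} TB/s")  # ~6.1
print(f"Implied MTIA 500 HBM bandwidth: {mtia_500_tbps:.1f} TB/s")  # ~27.6
```

Under that assumption, the roadmap implies roughly a 4.5x bandwidth jump from about 6.1 TB/s to about 27.6 TB/s across the family.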

Architecture

A key architectural differentiator is MTIA's memory hierarchy: instead of costly HBM alone, MTIA uses large SRAM alongside LPDDR, optimizing for inference workloads where memory access patterns differ from training. This model-chip co-design approach allows Meta to tailor the hardware to its specific neural network architectures. 4)
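The trade-off behind that memory hierarchy can be illustrated with a simple roofline-style check. This is a generic sketch with hypothetical peak-performance numbers (not MTIA specs): batch-1 inference kernels such as a GEMV move roughly as many weight bytes as they perform multiply-adds, so they tend to be limited by DRAM bandwidth rather than compute, which is why serving weights from large on-chip SRAM pays off.

```python
# Roofline-style sketch with hypothetical numbers (not MTIA specs),
# showing why small-batch inference is typically memory-bound.

def bound(flops: float, bytes_moved: float,
          peak_flops: float, peak_bw: float) -> str:
    """Classify a kernel as compute- or memory-bound on a simple roofline."""
    t_compute = flops / peak_flops      # seconds if compute were the limit
    t_memory = bytes_moved / peak_bw    # seconds if bandwidth were the limit
    return "compute-bound" if t_compute > t_memory else "memory-bound"

# Hypothetical accelerator: 100 TFLOP/s peak, 1 TB/s DRAM bandwidth.
PEAK_FLOPS = 100e12
PEAK_BW = 1e12

# Batch-1 GEMV over a 4096x4096 fp16 weight matrix (2 bytes/element):
n = 4096
flops = 2 * n * n        # one multiply-add (2 FLOPs) per weight
bytes_moved = 2 * n * n  # weight traffic dominates at batch 1

print(bound(flops, bytes_moved, PEAK_FLOPS, PEAK_BW))  # memory-bound
```

With these numbers the memory time exceeds the compute time by two orders of magnitude, so any weight reuse out of SRAM translates directly into lower DRAM bandwidth demand.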

Deployment

Industry Context

Meta's MTIA program is part of a broader hyperscaler trend of building custom inference silicon to reduce dependence on NVIDIA GPUs. Similar efforts include Google's TPU, Amazon's Trainium/Inferentia, and Microsoft's Maia. Meta's approach is distinguished by its aggressive iteration speed and RISC-V architecture choice. 6)

See Also

References