The Meta Training and Inference Accelerator (MTIA) is a family of custom AI chips developed by Meta in partnership with Broadcom and manufactured by TSMC. Originally designed for ranking and recommendation workloads, MTIA has rapidly evolved into a multi-generational inference platform serving billions of users across Meta's platforms.
Meta's AI workload landscape spans four major quadrants: training and inference, crossed with recommendation models and generative AI. At planetary scale (3+ billion daily users), general-purpose GPUs proved economically unsustainable for inference workloads, and MTIA was created to make inference at that scale cost-effective.
Meta has committed to one of the fastest custom chip iteration cycles in the industry — four generations in under two years:
| Generation | Status | Primary Workload | Key Specs |
|---|---|---|---|
| MTIA 300 | In production | Ranking and recommendation training | Current production chip deployed at scale |
| MTIA 400 | Testing complete | GenAI inference | 5x compute over MTIA 300; 50% more HBM bandwidth; 400% higher FP8 FLOPS |
| MTIA 450 | In development | GenAI inference | Doubles HBM bandwidth to 18.4 TB/s (from 9.2 TB/s on MTIA 400) |
| MTIA 500 | Roadmap | GenAI inference at scale | 4.5x HBM bandwidth and 25x compute FLOPS vs MTIA 300 |
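The bandwidth figures in the table can be cross-checked against each other. Only the MTIA 400 (9.2 TB/s) and MTIA 450 (18.4 TB/s) numbers are stated directly; the sketch below back-calculates illustrative MTIA 300 and MTIA 500 values from the quoted ratios (50% more for 400 vs 300, 4.5x for 500 vs 300) and should not be read as official specs.

```python
# Back-calculate per-generation HBM bandwidth from the ratios quoted above.
# MTIA 300 and MTIA 500 values are derived, not officially published.

mtia_400_bw = 9.2                 # TB/s, stated
mtia_450_bw = mtia_400_bw * 2     # TB/s, "doubles HBM bandwidth" -> 18.4
mtia_300_bw = mtia_400_bw / 1.5   # TB/s, 400 has "50% more" than 300
mtia_500_bw = mtia_300_bw * 4.5   # TB/s, "4.5x HBM bandwidth vs MTIA 300"

for name, bw in [("MTIA 300", mtia_300_bw), ("MTIA 400", mtia_400_bw),
                 ("MTIA 450", mtia_450_bw), ("MTIA 500", mtia_500_bw)]:
    print(f"{name}: ~{bw:.1f} TB/s")
```

Note that the derived MTIA 500 figure (~27.6 TB/s) works out to exactly 3x the MTIA 400's bandwidth, consistent with the stated multipliers.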
All chips are built on the RISC-V architecture and manufactured by TSMC, with Broadcom as design partner.
A key architectural differentiator is MTIA's memory hierarchy: rather than relying on costly HBM alone, MTIA pairs large on-chip SRAM with LPDDR, optimizing for inference workloads whose memory access patterns differ from training. This model-chip co-design approach lets Meta tailor the hardware to its specific neural network architectures.
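To see why inference memory access patterns differ, consider a roofline-style estimate of arithmetic intensity (FLOPs per byte moved) for a dense layer. At the low batch sizes typical of online inference, weight traffic dominates and the layer is memory-bandwidth-bound, which is the regime a large-SRAM-plus-LPDDR hierarchy targets. The sketch below is generic and illustrative, not MTIA-specific; all shapes are hypothetical.

```python
# Illustrative roofline-style estimate: FLOPs per byte moved for a dense
# layer y = x @ W with fp16 (2-byte) elements. Not an MTIA model.

def arithmetic_intensity(batch, d_in, d_out, bytes_per_elem=2):
    """FLOPs per byte of memory traffic for one dense layer."""
    flops = 2 * batch * d_in * d_out                  # multiply-accumulates
    bytes_moved = bytes_per_elem * (batch * d_in      # input activations
                                    + d_in * d_out    # weights
                                    + batch * d_out)  # output activations
    return flops / bytes_moved

# A hypothetical 4096x4096 layer: at batch 1 each weight is fetched for a
# single input, so memory bandwidth limits throughput; at batch 512 the same
# weight traffic is amortized across many inputs and compute dominates.
print(arithmetic_intensity(1, 4096, 4096))    # ~1 FLOP/byte -> memory-bound
print(arithmetic_intensity(512, 4096, 4096))  # hundreds of FLOPs/byte
```

Keeping hot weights and activations in SRAM cuts the bytes that must come from external memory, raising effective arithmetic intensity without paying for HBM capacity across the whole fleet.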
Meta's MTIA program is part of a broader hyperscaler trend of building custom inference silicon to reduce dependence on NVIDIA GPUs. Similar efforts include Google's TPU, Amazon's Trainium/Inferentia, and Microsoft's Maia. Meta's approach is distinguished by its aggressive iteration speed and RISC-V architecture choice.