Table of Contents

Together AI

Together AI is a full-stack AI cloud platform specializing in fast inference, fine-tuning, pre-training, and GPU cluster management for open-source models. Founded as an inference-focused startup, Together AI reached a $3.3 billion valuation and $300 million annualized revenue by September 2025, serving companies including Cursor, Decagon, and Cartesia.1)

Overview

Together AI provides developers and researchers with a unified API to run, train, fine-tune, and deploy open-source AI models across text, image, video, code, and voice modalities. The platform emphasizes end-to-end workflows from training to production, supporting over 200 open-source models with an OpenAI-compatible API for seamless integration.2)

Supported Models

The platform supports a broad range of open-source models including:

Inference

Together AI achieves up to 2.75x faster serverless inference compared to competitors through GPU optimizations, low-bit quantization (FP4/FP8), and ATLAS (Adaptive Speculative Decoding), which provides up to 4x acceleration via runtime learning.3)

The platform offers two inference tiers:

Fine-Tuning

Together AI provides a full fine-tuning platform supporting LoRA and DPO methods for task-specific model customization using proprietary data. The platform also supports pre-training from scratch on GPU clusters, with seamless transition from training to inference endpoints.4)

Pricing

Pricing ranges from $0.10 to $3.50 per million tokens depending on model size and optimization level. Batch inference is available at 50% lower cost. The platform claims approximately 60% cost reduction overall through quantization and inference optimizations.5)

Recent Developments

See Also

References