Together AI
confidence: high · 3 watch questions · 5 evidence citations
01 TEAM & SIZE
Founded 2022. Co-founders: Vipul Ved Prakash (CEO, ex-Topsy → Apple), Ce Zhang (CTO), Tri Dao (Chief Scientist, FlashAttention / Mamba co-author), Chris Ré (Stanford), Percy Liang (Stanford). Leadership: Charles Zedlewski (CPO), Albert Meixner (SVP Eng), Mahadev Konar (SVP Eng Infra). Headcount: Tracxn ~355 (Mar 2026); PitchBook ~335; getlatka 2024 listed 287. Stanford research DNA + production systems team. [S-6-002, S-6-004]
02 FUNDING & STAGE
Total raised ~$534M / 4 rounds (Crunchbase). Latest: $305M Series B, Feb 20 2025, General Catalyst-led, co-led by Prosperity7, $3.3B post-money. Prior: $106M Mar 2024, Salesforce Ventures-led, $1.25B valuation; $102.5M Series A Nov 2023, Kleiner Perkins-led. Earlier backers: Kleiner Perkins, Lux, NEA, a16z, Nvidia, SV Angel, Emad Mostaque. DCD (Sep 2025) reports a ~$1B raise sought at 2x+ the $3.3B valuation; no Series C announced as of 2026-04-20 (unverified). [S-6-001, S-6-005, S-6-008]
03 PRODUCT STATE
Four surfaces on "AI Acceleration Cloud":
- Inference — serverless (pay-per-token, 200+ open models), dedicated model endpoints (up to 43% cheaper on-demand), container inference (media), batch (up to 30B tokens async).
- Fine-tuning — managed LoRA / full fine-tune on open weights.
- Training / GPU clusters — self-serve Instant GPU Clusters (launched Sep 2025), H100/H200/B200/GB200 NVL72 w/ InfiniBand; B200 $4/hr reserved, $5.50 on-demand; H100 $1.76–$2.39.
- Adjacent — Code Sandbox (agent exec), Managed Storage (zero-egress), Together Kernel Collection. Open-source legacy: RedPajama-v1/v2 (>100T tokens, used by Snowflake Arctic, Salesforce XGen, AI2 OLMo), OpenChatKit. [S-6-003, S-6-006]
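The B200 rates listed above imply a reserved-capacity discount that is easy to check. A minimal sketch, assuming the two quoted prices ($4.00/hr reserved, $5.50/hr on-demand) are comparable per GPU-hour; the function name is illustrative and not part of any Together API.

```python
def reserved_discount(reserved_per_hr: float, on_demand_per_hr: float) -> float:
    """Fractional saving of the reserved rate relative to the on-demand rate."""
    if on_demand_per_hr <= 0:
        raise ValueError("on-demand rate must be positive")
    return 1.0 - reserved_per_hr / on_demand_per_hr


# Rates as quoted in this profile (assumed to be USD per GPU-hour).
b200_discount = reserved_discount(4.00, 5.50)
print(f"B200 reserved vs on-demand: {b200_discount:.1%} cheaper")  # 27.3% cheaper
```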
04 GTM MOTION
PLG bottom-up (450k+ developers per company claims) + enterprise sales for dedicated capacity. Named customers: Salesforce, Zoom, SK Telecom, Washington Post, Cognition (Devin), Hedra, Zomato, Krea, Cartesia. Revenue mix: ~30–40% per-token API, 60–70% GPU rentals. Geo: Hypertec + 5C Group Europe partnership up to 100k GPUs by 2028; live DCs Maryland (Jul 2025), Memphis, Sweden (Sep 2025); ~200 MW N. America. [S-6-004, S-6-007]
05 CORE THESIS
"AI-native cloud" — verticalize the stack (kernels → schedulers → inference runtime → datasets) so open-model inference beats hyperscaler price/perf, and capture enterprises refusing OpenAI lock-in. Bet: open models stay within 6–12mo of frontier; a neutral compute layer with research-grade optimization (FlashAttention lineage) wins the long tail of fine-tunes + agent inference. Distinct from Nous Research (model lab) — Together is the cloud. ARR: $130M end-2024 → $300M annualized Sep 2025. [S-6-004, S-6-007]
06 PUBLIC SIGNALS INVENTORY
07 52-WEEK QUESTIONS
- Q1: Does the rumored ~$1B Series C close in 2026, at what multiple of the $3.3B Feb-2025 valuation, and does it cross $10B?
- Q2: Does per-token API share rise above 50% of revenue (signaling developer PLG winning) or fall below 25% (reservation/cluster-dominant, commoditizing)?
- Q3: Do major Blackwell (GB200) clusters ship on time and show measurable inference $/Mtok advantage vs. AWS Bedrock / Fireworks / Baseten in published benchmarks?
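The thresholds in Q1 and Q2 can be written as simple checks for weekly tracking. This is a rough sketch: the function names and the "no signal" label are hypothetical, and the only inputs taken from this profile are the $3.3B valuation and the 50% / 25% cut-offs.

```python
FEB_2025_VALUATION_B = 3.3  # $B, Series B valuation cited in this profile


def multiple_of_series_b(new_valuation_b: float) -> float:
    """Q1: new valuation expressed as a multiple of the Feb-2025 mark."""
    return new_valuation_b / FEB_2025_VALUATION_B


def classify_api_share(api_share: float) -> str:
    """Q2: map per-token API revenue share (0..1) to the signal named in the question."""
    if not 0.0 <= api_share <= 1.0:
        raise ValueError("api_share must be a fraction between 0 and 1")
    if api_share > 0.50:
        return "developer PLG winning"
    if api_share < 0.25:
        return "cluster-dominant"
    return "no signal"  # hypothetical label for the 25-50% band


print(round(multiple_of_series_b(10.0), 2))  # multiple required to reach $10B
print(classify_api_share(0.35))              # midpoint of the ~30-40% estimate
```

On these numbers, crossing $10B requires roughly a 3x step-up, above the 2x+ (about $6.6B) reported by DCD, and the current ~30–40% API share sits between the two Q2 thresholds.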
08 WEEKLY TIMELINE
0 signals · TIMELINE EMPTY · WEEK 0
Auto-feed fetcher ships Month 2. Manual diffs via Friday script start Week 2.
09 EVIDENCE
- S-6-001 Together blog: "$305M Series B" (Feb 20 2025): explicit round size, lead investors, $3.3B valuation
- S-6-004 SiliconANGLE (Sep 9 2025): self-service GPU infrastructure launch + ARR figures ($300M annualized)
- S-6-003 Together.ai/products page (fetched 2026-04-20): enumerates 4 product surfaces (inference, fine-tuning, compute/clusters, sandbox/storage)
- S-6-006 GitHub togethercomputer/RedPajama-Data + arXiv 2411.12372: open-source dataset lineage verified
- S-6-008 DCD (Sep 2025): Together seeking ~$1B round (unverified close)
10 CONFIDENCE & UNKNOWNS
Confidence: high on funding, products, customers, revenue trajectory (multiple independent primary + tier-1 press sources). Medium on exact headcount (355 vs 335 vs 287 across trackers; rapid growth = stale data). Unverified: Series C close date/size/valuation; exact GPU installed base; YouTube handle; RSS URL; GB200 cluster customer count. Nous Research confusion avoided: Together is the compute cloud, not the model lab.