DeepInfra
Purpose-built cloud platform for high-throughput, low-latency AI inference at production scale.
AI
📍 Palo Alto, California
Founded 2022
Current Valuation
Private
as of 2026-Q2
PRIVATE
DeepInfra operates a vertically-integrated inference cloud built specifically for production-scale AI workloads, owning and operating GPU infrastructure across eight U.S. data centers. The company raised $107M Series B in May 2026 co-led by 500 Global and Georges Harik with participation from NVIDIA, Felicis, Samsung Next, and Supermicro. DeepInfra serves 150+ open-source models through OpenAI-compatible APIs and is an early NVIDIA Blackwell + Dynamo deployment partner.
Company Profile
Growth
Token processing volume grew 25x since Series A
Last Round
Series B — $107M (May 2026) · Lead: 500 Global
Founders & Key People
Nikola BorisovYessen Kanapin
Investors
500 Global · Georges Harik · A.Capital Ventures · Crescent Cove · Felicis · NVIDIA · Peak6 · Samsung Next · Supermicro · Upper90
Products
- Inference API (150+ open-source models)
- DeepCluster (managed GPU clusters)
- DeepStart
- OpenAI-compatible endpoints
Competitors
Together AI · Fireworks AI · Replicate · Modal · Anyscale · Groq
AI InfrastructureInferenceOpen Source ModelsGPU CloudNVIDIA Ecosystem
Private-company numbers are not real-time. Reflects publicly disclosed valuations from press releases, news reports, and tender offers as of 2026-Q2. Refreshed quarterly.