test/alternative-layout-model

DeepInfra
Throughput: 22.3 TPS Latency: 0.75s Uptime: 99.7%
Input: $0.05/1M tokens Output: $0.12/1M tokens

Lambda Labs

Throughput: 18.9 TPS Latency: 0.95s Uptime: 98.5%
Input: $0.06/1M tokens Output: $0.14/1M tokens

Together AI

Throughput: 25.1 TPS Latency: 0.62s Uptime: 99.9%

Fireworks

Uptime: 97.8%

Pricing available on request

Other providers like Anthropic and OpenAI may be available through different endpoints.

Performance varies by provider: some reach 30+ tokens per second (TPS), while others prioritize quality over speed.
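
To compare providers on more than one number at a time, the listed figures can be combined into a rough per-request estimate. The sketch below is illustrative only: the `ProviderStats` class and both helper functions are hypothetical names, and it assumes that the listed latency is the delay before the first token and that the listed throughput holds steady for the whole response. Only the two providers that list both pricing and throughput above are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderStats:
    """Figures copied from the provider listing (hypothetical container)."""
    name: str
    throughput_tps: float    # output tokens per second
    latency_s: float         # assumed to be time to first token, in seconds
    input_usd_per_m: float   # USD per 1M input tokens
    output_usd_per_m: float  # USD per 1M output tokens


def estimate_cost_usd(p: ProviderStats, input_tokens: int, output_tokens: int) -> float:
    """Token counts times the per-1M-token prices."""
    total = input_tokens * p.input_usd_per_m + output_tokens * p.output_usd_per_m
    return total / 1_000_000


def estimate_duration_s(p: ProviderStats, output_tokens: int) -> float:
    """Assumed model: initial latency plus generation time at constant throughput."""
    return p.latency_s + output_tokens / p.throughput_tps


deepinfra = ProviderStats("DeepInfra", 22.3, 0.75, 0.05, 0.12)
lambda_labs = ProviderStats("Lambda Labs", 18.9, 0.95, 0.06, 0.14)

# Example request: 10,000 input tokens and 2,000 output tokens.
for provider in (deepinfra, lambda_labs):
    cost = estimate_cost_usd(provider, 10_000, 2_000)
    duration = estimate_duration_s(provider, 2_000)
    print(f"{provider.name}: ~${cost:.5f}, ~{duration:.1f}s")
```

Real response times depend on load, prompt length, and how each provider measures latency, so these estimates are best used for relative comparison rather than as guarantees.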