qwen/qwen-2.5-coder-32b-instruct

A 32B-parameter, instruction-tuned coding model from the Qwen 2.5 Coder series.

Available Providers

Provider   Model                             Quantization  Context  Max Output  Throughput  Latency  Uptime  Input Price  Output Price
DeepInfra  qwen/qwen-2.5-coder-32b-instruct  fp8           33K      16K         15.2 TPS    0.85s    99.5%   $0.060000    $0.150000
Lambda     qwen/qwen-2.5-coder-32b-instruct  bf16          33K      33K         12.8 TPS    1.20s    98.9%   $0.070000    $0.160000
Together   qwen/qwen-2.5-coder-32b-instruct  int8          33K      8K          18.5 TPS    0.65s    99.8%   $0.055000    $0.140000
Fireworks  qwen/qwen-2.5-coder-32b-instruct  fp16          33K      16K         -           -        97.2%   $0.080000    $0.180000
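
One way to use the provider listing above is to filter on uptime and latency and then pick the cheapest remaining provider. The sketch below is illustrative only: the `Provider` class and the `cheapest` helper are hypothetical names, ranking by the sum of input and output price is an assumption rather than anything the listing prescribes, and the values missing for Fireworks are kept as `None` instead of being guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    quantization: str
    throughput_tps: Optional[float]  # None where the listing gives no value
    latency_s: Optional[float]       # None where the listing gives no value
    uptime_pct: float
    input_price: float               # in the listing's own price unit
    output_price: float


# Values copied from the provider listing.
PROVIDERS = [
    Provider("DeepInfra", "fp8", 15.2, 0.85, 99.5, 0.060, 0.150),
    Provider("Lambda", "bf16", 12.8, 1.20, 98.9, 0.070, 0.160),
    Provider("Together", "int8", 18.5, 0.65, 99.8, 0.055, 0.140),
    Provider("Fireworks", "fp16", None, None, 97.2, 0.080, 0.180),
]


def cheapest(
    providers: List[Provider],
    min_uptime_pct: float = 0.0,
    max_latency_s: Optional[float] = None,
) -> Optional[Provider]:
    """Return the lowest-priced provider that passes the filters, or None.

    Assumption: "lowest-priced" means the smallest input + output price sum.
    A provider with no listed latency fails any latency filter.
    """
    candidates = []
    for p in providers:
        if p.uptime_pct < min_uptime_pct:
            continue
        if max_latency_s is not None:
            if p.latency_s is None or p.latency_s > max_latency_s:
                continue
        candidates.append(p)
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.input_price + p.output_price)
```

For example, `cheapest(PROVIDERS, min_uptime_pct=99.0, max_latency_s=1.0)` considers only DeepInfra and Together. A different weighting of input and output price may suit workloads that are heavily skewed toward one or the other.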

Model Details

Context Length: 32,768 tokens

Architecture: Transformer
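
The context length and a provider's maximum output together bound how long a completion can be. The sketch below assumes that the prompt and the completion share the 32,768-token context window, and that a listed cap such as "16K" means 16,384 tokens; neither is stated in the listing. The function name `max_completion_tokens` is hypothetical.

```python
CONTEXT_LENGTH = 32_768  # tokens, from the model details


def max_completion_tokens(prompt_tokens: int, provider_max_output: int) -> int:
    """Largest completion that fits, given the prompt size and provider cap.

    Assumption: prompt and completion tokens share one context window.
    """
    remaining = CONTEXT_LENGTH - prompt_tokens
    if remaining <= 0:
        return 0  # the prompt alone fills or exceeds the window
    return min(remaining, provider_max_output)
```

With a 30,000-token prompt and a 16,384-token provider cap, the window is the binding limit; with a 1,000-token prompt, the provider cap is.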