Model Catalog

14 items

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

Text Generation
TGI
Accelerated Text Generation Inference
GPU 4x Nvidia L4
$ 3.8

beethogedeon/gte-Qwen2-7B-instruct-Q4_K_M-GGUF

Sentence Embeddings
Llama.cpp
Accelerated llama.cpp
Q4_K_M
GPU 1x Nvidia T4
$ 0.5

Qwen/Qwen2.5-14B-Instruct

Text Generation
TGI
Accelerated Text Generation Inference
GPU 2x Nvidia A100
$ 8

Qwen/Qwen2.5-14B-Instruct-1M

Text Generation
TGI
Accelerated Text Generation Inference
GPU 2x Nvidia A100
$ 8

Qwen/Qwen2.5-72B-Instruct

Text Generation
TGI
Accelerated Text Generation Inference
GPU 4x Nvidia A100
$ 16

Qwen/Qwen2.5-7B-Instruct

Text Generation
TGI
Accelerated Text Generation Inference
GPU 1x Nvidia L40S
$ 1.8

Qwen/Qwen2.5-Coder-14B-Instruct

Text Generation
TGI
Accelerated Text Generation Inference
GPU 2x Nvidia A100
$ 8

Qwen/Qwen2.5-Coder-32B-Instruct

Text Generation
TGI
Accelerated Text Generation Inference
GPU 2x Nvidia A100
$ 8

bartowski/Qwen2.5-Coder-32B-Instruct-GGUF

Text Generation
Llama.cpp
Accelerated llama.cpp
Q4_K_M
GPU 1x Nvidia L4
$ 0.8

Qwen/Qwen2.5-Coder-7B-Instruct

Text Generation
TGI
Accelerated Text Generation Inference
GPU 1x Nvidia L40S
$ 1.8

Qwen/Qwen2.5-Math-72B-Instruct

Text Generation
TGI
Accelerated Text Generation Inference
GPU 4x Nvidia A100
$ 16

Qwen/Qwen2.5-Math-7B-Instruct

Text Generation
TGI
Accelerated Text Generation Inference
GPU 1x Nvidia L40S
$ 1.8

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text
TGI
Accelerated Text Generation Inference
GPU 1x Nvidia L40S
$ 1.8

Qwen/QwQ-32B

Text Generation
TGI
Accelerated Text Generation Inference
GPU 4x Nvidia L4
$ 3.8