Model Catalog

25 items
ggml-org/gemma-4-26B-A4B-it-GGUF

Image-Text-to-Text
Llama.cpp
Accelerated llama.cpp
Q4_K_M
GPU 1x Nvidia L4
$ 0.8
google/gemma-4-31B-it

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 2x Nvidia H200
$ 10
google/gemma-4-26B-A4B-it

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia H200
$ 5
unsloth/Qwen3.5-9B-GGUF

Image-Text-to-Text
Llama.cpp
Accelerated llama.cpp
Q4_K_M
GPU 1x Nvidia T4
$ 0.5
unsloth/Qwen3.5-35B-A3B-GGUF

Image-Text-to-Text
Llama.cpp
Accelerated llama.cpp
Q4_K_M
GPU 1x Nvidia L4
$ 0.8
unsloth/Qwen3.5-397B-A17B-GGUF

Image-Text-to-Text
Llama.cpp
Accelerated llama.cpp
MXFP4_MOE
GPU 2x Nvidia H200
$ 10
allenai/olmOCR-2-7B-1025-FP8

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L40S
$ 1.8
Qwen/Qwen3-VL-30B-A3B-Thinking

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 2x Nvidia A100
$ 5
Qwen/Qwen3-VL-8B-Instruct

Deployed 153 times
Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia A100
$ 2.5
lightonai/LightOnOCR-1B-1025

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L4
$ 0.8
deepseek-ai/DeepSeek-OCR

Deployed 198 times
Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L4
$ 0.8
zai-org/GLM-4.1V-9B-Thinking

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L40S
$ 1.8
ServiceNow-AI/Apriel-1.5-15b-Thinker

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia H200
$ 5
rednote-hilab/dots.ocr

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L4
$ 0.8
nanonets/Nanonets-OCR-s

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L40S
$ 1.8
ggml-org/InternVL3-14B-Instruct-GGUF

Image-Text-to-Text
Llama.cpp
Accelerated llama.cpp
Q8_0
GPU 1x Nvidia L4
$ 0.8
ggml-org/Qwen2.5-VL-3B-Instruct-GGUF

Image-Text-to-Text
Llama.cpp
Accelerated llama.cpp
Q8_0
GPU 1x Nvidia T4
$ 0.5
ggml-org/Qwen2.5-VL-7B-Instruct-GGUF

Image-Text-to-Text
Llama.cpp
Accelerated llama.cpp
Q8_0
GPU 1x Nvidia T4
$ 0.5
ggml-org/SmolVLM2-2.2B-Instruct-GGUF

Image-Text-to-Text
Llama.cpp
Accelerated llama.cpp
F16
GPU 1x Nvidia T4
$ 0.5
Qwen/Qwen2.5-VL-7B-Instruct

Deployed 281 times
Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia A100
$ 2.5
google/paligemma2-10b-mix-448

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L40S
$ 1.8
google/paligemma2-10b-mix-224

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L40S
$ 1.8
google/paligemma2-3b-mix-448

Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L4
$ 0.8
google/gemma-3-12b-it

Deployed 253 times
Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia L40S
$ 1.8
google/gemma-3-27b-it

Deployed 301 times
Image-Text-to-Text
vLLM
Accelerated vLLM
GPU 1x Nvidia A100
$ 2.5