Back to Catalog
Create Endpoint
deepseek-ai

DeepSeek-R1-Distill-Qwen-32B

Catalog model officially supported by Inference Endpoints.

This model is from our Model Catalog, and comes with an optimized configuration. Deployment has been verified by Hugging Face.

/
$3.80 / h
per running replica

Contact us if you'd like to request a custom solution or instance type.

Nvidia L4
4x GPUs · 96 GB 47x vCPUs · 185 GB
$3.8 / h
available
Hardware should be compatible with the selected model.
  • Only you can access your endpoint, using a Hugging Face Token generated from your personal account.
Number of replicas
Automatically scale the number of replicas within Min and Max based on compute usage. Min is always 0 if Scale-To-Zero is active.
More options
Autoscaling Strategy
Control what type of trigger will cause your Endpoint to scale up.
Container Arguments
Arguments passed to the container entrypoint.
optional
Container Command
Command executed in the container.
optional
Default Env
Environment variables that will be provided to your container during deployment.
Secret Env
Same as Default, but people with access to this endpoint will not be able to read these values after creation.
VPC Config
Check to activate and configure AWS PrivateLink