meta-llama/Llama-3.1-70B-Instruct

Catalog model officially supported by Inference Endpoints.

This model is from our Model Catalog, and comes with an optimized configuration. Deployment has been verified by Hugging Face.

$8.30 / h
per running replica
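
Because the price is quoted per running replica per hour, a rough cost estimate is the hourly rate multiplied by the number of replicas and the hours they run. The sketch below uses the listed $8.30 rate. The function name `estimate_cost` is hypothetical, and the calculation assumes cost scales linearly with running time, which may not match the actual billing granularity.

```python
HOURLY_RATE_USD = 8.30  # listed price per running replica


def estimate_cost(hours: float, replicas: int = 1,
                  rate: float = HOURLY_RATE_USD) -> float:
    """Estimate cost in USD, assuming linear billing (an assumption)."""
    if hours < 0 or replicas < 0:
        raise ValueError("hours and replicas must be non-negative")
    return round(rate * replicas * hours, 2)


if __name__ == "__main__":
    # One replica running for a full day.
    print(estimate_cost(24))
    # Two replicas running for a 30-day month.
    print(estimate_cost(24 * 30, replicas=2))
```

Replicas that are scaled down are not "running", so an estimate like this is most useful as an upper bound for an always-on deployment.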

Contact us if you'd like to request a custom solution or instance type.

Nvidia L40S
4x GPUs · 192 GB · 47x vCPUs · 380 GB
$8.30 / h
available
Hardware should be compatible with the selected model.
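
As a rough illustration of what compatibility means here, the sketch below compares the approximate size of the model weights with the 192 GB of GPU memory listed for this instance. It assumes about 70 billion parameters (taken from the model name) stored at 2 bytes each (16-bit weights). It ignores the KV cache, activations, and runtime overhead, so it is a lower bound rather than a sizing guarantee. The function names are hypothetical.

```python
def weights_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate weight size in GB (1 GB taken as 1e9 bytes)."""
    return params_billion * bytes_per_param


def fits_in_memory(params_billion: float, gpu_memory_gb: float,
                   bytes_per_param: float = 2.0) -> bool:
    """True if the weights alone fit; ignores KV cache and overhead."""
    return weights_gb(params_billion, bytes_per_param) <= gpu_memory_gb


if __name__ == "__main__":
    total_gpu_memory_gb = 192  # 4x GPUs, as listed for this instance
    print(weights_gb(70))
    print(fits_in_memory(70, total_gpu_memory_gb))
```

Whatever memory remains after the weights are loaded is what the serving stack has left for the KV cache, which in turn limits context length and batch size.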
  • Only you can access your endpoint, using a Hugging Face Token generated from your personal account.
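
Access with a personal token typically means sending it as a bearer token in the `Authorization` header. The sketch below builds such a request using only the Python standard library. The environment variable names `ENDPOINT_URL` and `HF_TOKEN`, the helper `build_request`, and the `{"inputs": ...}` payload shape are assumptions for illustration. The real URL and request schema depend on the deployed endpoint.

```python
import json
import os
import urllib.request


def build_request(endpoint_url: str, token: str,
                  prompt: str) -> urllib.request.Request:
    """Build an authenticated POST request (payload shape is assumed)."""
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical variable names; set them to your own endpoint and token.
    url = os.environ.get("ENDPOINT_URL")
    token = os.environ.get("HF_TOKEN")
    if url and token:
        request = build_request(url, token, "Hello")
        with urllib.request.urlopen(request) as response:
            print(response.read().decode("utf-8"))
```

Reading the token from the environment keeps it out of source code, which matters because anyone holding the token can call the endpoint.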