Get Started With NVIDIA Triton

Find the right license to deploy, run, and scale AI for any application on any platform.

NVIDIA Triton Licensing Options

GitHub

For individuals looking to access the Triton Inference Server open-source code for development.

NVIDIA NGC

For individuals looking to access free Triton Inference Server containers for development.

NVIDIA AI Enterprise

For enterprises looking to purchase Triton for production.

Features

• NVIDIA Triton™ Inference Server
• Custom builds (Windows, NVIDIA® Jetson™), PyTriton
• Prebuilt Docker container (version dependencies: CUDA®, framework)
• Triton Management Service (model orchestration for large-scale deployments)
• AI Workflows and reference architectures for common AI use cases
• Workload and infrastructure management features
• Business-standard support, including:
  • Unlimited technical support cases accepted via the customer portal and phone 24/7
  • Escalation support during local business hours (9:00 a.m.–5:00 p.m., Monday–Friday)
  • Timely resolution provided by NVIDIA experts and engineers
  • Security fixes and priority notifications
  • Production branches that ensure API stability
  • Three years of long-term support
 
• Hands-on NVIDIA LaunchPad labs

FAQs

What is NVIDIA Triton Inference Server?

NVIDIA Triton Inference Server, or Triton for short, is open-source inference-serving software. It lets teams deploy, run, and scale AI models from any framework (TensorFlow, NVIDIA TensorRT™, PyTorch, ONNX, XGBoost, Python, custom, and more) on any GPU- or CPU-based infrastructure (cloud, data center, or edge). For more information, please visit the Triton webpage.
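For a concrete sense of how a client talks to a running Triton server, here is a minimal sketch using the Triton Python HTTP client (pip install tritonclient[http]). The server address, model name, tensor names, and shapes below are placeholder assumptions for illustration, not part of any particular deployment.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server assumed to be running locally on its default HTTP port.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Placeholder model and tensor names -- replace with the names in your model repository.
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("input__0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)
requested_output = httpclient.InferRequestedOutput("output__0")

# Send the request and read the response back as a NumPy array.
result = client.infer(model_name="my_model", inputs=[infer_input], outputs=[requested_output])
print(result.as_numpy("output__0").shape)
```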

What is Triton Model Analyzer?

Triton Model Analyzer is an offline tool for optimizing inference deployment configurations (batch size, number of model instances, etc.) for throughput, latency, and/or memory constraints on the target GPU or CPU. It supports analysis of a single model, model ensembles, and multiple concurrent models.
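Model Analyzer automates measurements of this kind. Purely as a rough sketch of the sweep it performs (the real tool also varies batch size, instance count, and dynamic batching settings, and produces detailed reports), the snippet below times requests against a locally running server at a few client concurrency levels; the model name, tensor name, and request count are placeholder assumptions.

```python
import time

import numpy as np
import tritonclient.http as httpclient

MODEL = "my_model"           # placeholder model name
INPUT_NAME = "input__0"      # placeholder tensor name
REQUESTS = 64                # requests to send per concurrency level

data = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Sweep the number of in-flight requests and report rough throughput per level.
for concurrency in (1, 2, 4, 8):
    client = httpclient.InferenceServerClient(url="localhost:8000", concurrency=concurrency)
    infer_input = httpclient.InferInput(INPUT_NAME, list(data.shape), "FP32")
    infer_input.set_data_from_numpy(data)

    start = time.perf_counter()
    # async_infer queues requests; up to `concurrency` of them are in flight at once.
    pending = [client.async_infer(MODEL, inputs=[infer_input]) for _ in range(REQUESTS)]
    for request in pending:
        request.get_result()  # block until each response has arrived
    elapsed = time.perf_counter() - start

    print(f"concurrency={concurrency}: {REQUESTS / elapsed:.1f} infer/s "
          f"({elapsed:.2f} s for {REQUESTS} requests)")
    client.close()
```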

How can I get enterprise support for Triton?

Triton is included with NVIDIA AI Enterprise, an end-to-end AI software platform with enterprise-grade support, security, stability, and manageability. NVIDIA AI Enterprise includes Business Standard Support, which provides access to NVIDIA AI experts, customer training, knowledge base resources, and more. Additional enterprise support and services are also available, including business-critical support, a dedicated technical account manager, training, and professional services. For more information, please visit the Enterprise Support and Services User Guide.

Can I try Triton in NVIDIA LaunchPad?

Yes, there are several labs that use Triton in NVIDIA LaunchPad.

What is NVIDIA LaunchPad?

NVIDIA LaunchPad is a program that provides users with short-term access to enterprise NVIDIA hardware and software via a web browser. Select from a large catalog of hands-on labs to experience solutions for use cases ranging from AI and data science to 3D design and infrastructure optimization. Enterprises can immediately tap into the necessary hardware and software stacks on private hosted infrastructure.

Stay up to date on the latest AI inference news from NVIDIA.

Contact Us