NGC Catalog

NeMo Microservices

Description
This collection contains NeMo microservices that provide a comprehensive suite of features to build an end-to-end platform for fine-tuning, evaluating, and serving large language models (LLMs) on your Kubernetes cluster.
Curator: NVIDIA
Modified: June 11, 2025

Overview

NVIDIA NeMo microservices provide a modular platform for building and deploying AI workflows on Kubernetes, on premises or in the cloud. They let you leverage proprietary data and continuously optimize AI applications through a data flywheel architecture.

Core Microservices

  • NVIDIA NeMo Customizer: Facilitates the fine-tuning of large language models (LLMs) using supervised and parameter-efficient fine-tuning techniques.
  • NVIDIA NeMo Evaluator: Provides comprehensive evaluation capabilities for LLMs, supporting academic benchmarks, custom automated evaluations, and LLM-as-a-Judge approaches.
  • NVIDIA NeMo Guardrails: Adds safety checks and content moderation to LLM endpoints, protecting against hallucinations, harmful content, and security vulnerabilities.
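To make the fine-tuning workflow concrete, the sketch below assembles a customization job request such as one might submit to NeMo Customizer. The service address, endpoint path, config name, and field names are assumptions for illustration only; consult the NeMo Customizer API reference for the actual schema.

```python
import json

# Assumed in-cluster service address; not part of the source document.
CUSTOMIZER_URL = "http://nemo-customizer:8000/v1/customization/jobs"

# Hypothetical job body combining supervised fine-tuning (SFT) with a
# parameter-efficient method (LoRA), the two technique families the
# Customizer description mentions. Field names are illustrative.
job_request = {
    "config": "meta/llama-3.1-8b-instruct",        # assumed model config name
    "dataset": {"name": "my-dataset", "namespace": "default"},
    "hyperparameters": {
        "training_type": "sft",                     # supervised fine-tuning
        "finetuning_type": "lora",                  # parameter-efficient
        "epochs": 2,
        "batch_size": 8,
        "learning_rate": 1e-4,
    },
}

# Serialize the request; in a cluster one would POST this to CUSTOMIZER_URL
# with an HTTP client, e.g. requests.post(CUSTOMIZER_URL, json=job_request).
body = json.dumps(job_request, indent=2)
print(body)
```

The point of the sketch is the shape of the request, not the transport: the job names a base model config, a dataset registered with the platform, and the fine-tuning technique to apply.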

Platform Component Microservices

  • NVIDIA NeMo Data Store: Serves as the default file storage solution for the NeMo microservices platform, exposing APIs compatible with the Hugging Face Hub client (HfApi).
  • NVIDIA NeMo Entity Store: Provides tools to manage and organize general entities such as namespaces, projects, datasets, and models.
  • NVIDIA NeMo Deployment Management: Provides an API to deploy NIM for LLMs on a Kubernetes cluster and manage them through the NIM Operator microservice.
  • NVIDIA NeMo NIM Proxy: Provides a single endpoint for running inference against all deployed NIM for LLMs.
  • NVIDIA NeMo Operator: Manages custom resource definitions (CRDs) for NeMo Customizer fine-tuning jobs.
  • DGX Cloud Admission Controller: Enables multi-node training requirements for NeMo Customizer jobs through a mutating admission webhook.
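NIM for LLMs serves OpenAI-compatible endpoints, and NIM Proxy fronts every deployed model behind one address. The sketch below builds such a chat-completion request using only the standard library; the proxy hostname and model name are assumptions, so substitute the values from your own cluster.

```python
import json
import urllib.request

# Assumed in-cluster NIM Proxy address; replace with your deployment's.
NIM_PROXY_URL = "http://nemo-nim-proxy:8000/v1/chat/completions"

# OpenAI-compatible chat-completion body; the model field selects which
# deployed NIM handles the request (model name is illustrative).
payload = {
    "model": "meta/llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "Summarize what a data flywheel is."}],
    "max_tokens": 128,
}

# Build the POST request; inside the cluster one would send it with
#   with urllib.request.urlopen(request) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
request = urllib.request.Request(
    NIM_PROXY_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(request.full_url)
```

Because the endpoint follows the OpenAI API shape, existing OpenAI client libraries can also be pointed at the proxy by overriding their base URL.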

Helm Charts

  • Use the NeMo microservices platform Helm chart to install all microservices and their dependencies. This parent Helm chart simplifies the deployment process by bundling all individual Helm charts of the NeMo microservices into one Helm chart.
  • Use the individual Helm charts if you prefer to install specific microservices:
    • NeMo Customizer Helm Chart
    • NeMo Evaluator Helm Chart
    • NeMo Guardrails Helm Chart
  • You can find all other relevant Helm charts and files under the artifacts tab.
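A minimal install of the parent chart might look like the following. The chart name, repository URL, and namespace are assumptions based on typical NGC Helm usage; confirm the exact chart reference under the artifacts tab before running anything.

```shell
# Authenticate Helm against the NGC registry.
# '$oauthtoken' is the literal username NGC expects; the password is your NGC API key.
export NGC_API_KEY=<your-ngc-api-key>
helm repo add nemo-microservices https://helm.ngc.nvidia.com/nvidia/nemo-microservices \
  --username='$oauthtoken' --password="$NGC_API_KEY"
helm repo update

# Install the parent chart (name assumed for illustration) into its own namespace;
# it bundles the individual NeMo microservice charts as dependencies.
helm install nemo nemo-microservices/nemo-microservices-helm-chart \
  --namespace nemo --create-namespace
```

To install only a specific microservice, point `helm install` at the individual chart (for example, the NeMo Customizer chart) instead of the parent chart.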

Documentation

  • For more information about the NeMo microservices, visit the NeMo microservices documentation.
  • For microservice-specific documentation, visit the following links:
    • Fine-tuning with NeMo Customizer
    • Evaluating with NeMo Evaluator
    • Guardrailing with NeMo Guardrails

Note: Use, distribution, or deployment of these microservices in production requires an NVIDIA AI Enterprise License.

Resources

  • Learn more about NVIDIA NeMo
  • NeMo Customizer Developer Page
  • NeMo Evaluator Developer Page
  • NeMo Guardrails Developer Page
  • Blog: Enhance Your AI Agent with Data Flywheels Using NVIDIA NeMo Microservices
  • Video: Customizing AI Agents for Tool Calling with NVIDIA NeMo Microservices

GOVERNING TERMS

The software and materials are governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products.