LLM Techniques

Jun 02, 2025
Scaling to Millions of Tokens with Efficient Long-Context LLM Training
The evolution of large language models (LLMs) has been marked by significant advancements in their ability to process and generate text. Among these...
7 MIN READ

May 27, 2025
Advanced Optimization Strategies for LLM Training on NVIDIA Grace Hopper
In the previous post, Profiling LLM Training Workflows on NVIDIA Grace Hopper, we explored the importance of profiling large language model (LLM) training...
10 MIN READ

May 23, 2025
Stream Smarter and Safer: Learn How NVIDIA NeMo Guardrails Enhances LLM Output Streaming
LLM streaming sends a model's response incrementally in real time, token by token, as it's being generated. The output streaming capability has evolved...
8 MIN READ

Feb 25, 2025
Defining LLM Red Teaming
There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to...
10 MIN READ

Feb 25, 2025
Agentic Autonomy Levels and Security
Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable...
14 MIN READ

Feb 12, 2025
LLM Model Pruning and Knowledge Distillation with NVIDIA NeMo Framework
Model pruning and knowledge distillation are powerful, cost-effective strategies for obtaining smaller language models from an initial larger sibling...
10 MIN READ

Jan 29, 2025
Mastering LLM Techniques: Evaluation
Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and...
12 MIN READ

Jan 16, 2025
Continued Pretraining of State-of-the-Art LLMs for Sovereign AI and Regulated Industries with iGenius and NVIDIA DGX Cloud
In recent years, large language models (LLMs) have achieved extraordinary progress in areas such as reasoning, code generation, machine translation, and...
17 MIN READ

Jan 09, 2025
Announcing Nemotron-CC: A Trillion-Token English Language Dataset for LLM Pretraining
NVIDIA is excited to announce the release of Nemotron-CC, a 6.3-trillion-token English language Common Crawl dataset for pretraining highly accurate large...
4 MIN READ

Dec 17, 2024
Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner
Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact,...
5 MIN READ

Dec 17, 2024
Develop Multilingual and Cross-Lingual Information Retrieval Systems with Efficient Data Storage
Efficient text retrieval is critical for a broad range of information retrieval applications such as search, question answering, semantic textual similarity,...
8 MIN READ

Dec 16, 2024
Insights, Techniques, and Evaluation for LLM-Driven Knowledge Graphs
Data is the lifeblood of modern enterprises, fueling everything from innovation to strategic decision making. However, as organizations amass ever-growing...
15 MIN READ

Nov 13, 2024
Mastering LLM Techniques: Text Data Processing
Training and customizing LLMs for high accuracy is fraught with challenges, primarily due to their dependency on high-quality data. Poor data quality and...
14 MIN READ

Nov 12, 2024
Spotlight: Dataloop Accelerates Multimodal Data Preparation Pipelines for LLMs with NVIDIA NIM
In the rapidly evolving landscape of AI, the preparation of high-quality datasets for large language models (LLMs) has become a critical challenge. It directly...
11 MIN READ

Oct 28, 2024
An Introduction to Model Merging for LLMs
One challenge organizations face when customizing large language models (LLMs) is the need to run multiple experiments, which produces only one useful model....
10 MIN READ

Oct 24, 2024
Augmenting Security Operations Centers with Accelerated Alert Triage and LLM Agents Using NVIDIA Morpheus
Every day, security operation center (SOC) analysts receive an overwhelming amount of incoming security alerts. To ensure the continued safety of their...
7 MIN READ