Building A Data Center Digital Twin With Omniverse and Air

The document discusses building a digital twin of a data center with NVIDIA Omniverse and NVIDIA Air. It describes the challenges of complex AI infrastructure and how simulation can help with design, deployment, and ongoing operations. Examples of simulations shown include cable layouts, temperature modeling, and heat sink design.


S42302

BUILDING A DATA CENTER DIGITAL TWIN WITH NVIDIA OMNIVERSE AND NVIDIA AIR
AMIT KATZ AND MARC HAMILTON
THE CHALLENGE OF AI INFRASTRUCTURE
Enterprise AI requires time, expertise and the right approach to architecture

DESIGN COMPLEXITY: Ensuring predictable performance that scales
LENGTHY DEPLOYMENT: Procuring, installing, integrating multiple technologies
ON-GOING OPERATIONS: Training and ramping day-to-day production
DATA CENTER DESIGN IS AN EXTREMELY COMPLEX TEAM SPORT
TODAY NO TOOL SPANS COMPONENT TO FULL DATA CENTER

LARGE TEAMS WITH DIVERSE SKILLS: Component design (heatsink, etc.), server design, rack layout, enclosure layout, building, CFD, cooling, power, etc.
MANY VENDORS, MANY TOOLS: Design teams are plagued by often-incompatible tools, which cause tedious import-export, mistakes, and lost time. Design artifacts aren't re-used in operations.
RISE OF IMPORTANCE OF CFD: Ever-rising power densities and a new focus on energy efficiency bring new design challenges to the data center.
COMPLEX 8-RAIL CABLE LAYOUT

CAMBRIDGE-1 SIMULATION OF UNDERFLOOR PRESSURE

SIMULATION OF TEMPERATURE 4 FEET ABOVE RAISED FLOOR

60s – Temperature Plot 4 ft. Above Raised Floor


Mean Temperature: 95.1°F (excluding cold aisle)
Max Temperature: 110.1°F
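
For readers who work in SI units, the two statistics above can be converted with the standard Fahrenheit-to-Celsius formula. This is only a convenience sketch; the function name is an illustrative choice and nothing here comes from the simulation tooling itself.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Values quoted on the slide (plot 4 ft. above the raised floor).
mean_c = fahrenheit_to_celsius(95.1)
max_c = fahrenheit_to_celsius(110.1)

print(f"Mean: {mean_c:.1f} °C, Max: {max_c:.1f} °C")
# prints "Mean: 35.1 °C, Max: 43.4 °C"
```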
NVIDIA DGX SUPERPOD CFD SIMULATION

CAMBRIDGE-1 IN OMNIVERSE

NVIDIA OMNIVERSE ENTERPRISE

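Inputs: Simulation Data, Observation Data, Model Layer Templates, Initial & Boundary Conditions, Differential Equations, Parameterized Geometry

Differential equations shown:
$$\rho \frac{Dh}{Dt} = \frac{Dp}{Dt} + \nabla \cdot (k \nabla T) + \Phi$$
$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho u) = 0$$

Gradient terms shown in the network diagram: $\nabla x$, $\nabla y$

MODULUS
Data Preparation Module | Neural Physics Model Compiler | Neural Physics Training Engine
TensorFlow/PyTorch | NVIDIA AI stack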
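
To make the role of the differential equations on this slide concrete, the following is a minimal sketch of how a PDE can act as a training signal: a candidate field is scored by how far it is from satisfying the continuity equation. It is not Modulus code and uses no Modulus API. The one-dimensional simplification, the finite-difference derivatives, and every function name are assumptions made for illustration (a neural physics framework would typically use automatic differentiation on a network's outputs instead).

```python
import math


def continuity_residual(rho, u, x, t, h=1e-4):
    """Central-difference estimate of d(rho)/dt + d(rho*u)/dx at (x, t).

    rho and u are callables taking (x, t). A field that satisfies the
    1D continuity equation gives a residual close to zero.
    """
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)

    def flux(xx):
        return rho(xx, t) * u(xx, t)

    dflux_dx = (flux(x + h) - flux(x - h)) / (2 * h)
    return drho_dt + dflux_dx


def mean_squared_residual(rho, u, points, h=1e-4):
    """Average squared residual over sample points; usable as a physics loss."""
    total = sum(continuity_residual(rho, u, x, t, h) ** 2 for x, t in points)
    return total / len(points)


# Hypothetical test fields: a density wave carried at constant speed C
# satisfies the equation; the same wave moving the wrong way does not.
C = 2.0


def velocity(x, t):
    return C


def rho_consistent(x, t):
    return 1.5 + math.sin(x - C * t)


def rho_inconsistent(x, t):
    return 1.5 + math.sin(x + C * t)


sample_points = [(0.5 * i, 0.1 * j) for i in range(8) for j in range(4)]
loss_consistent = mean_squared_residual(rho_consistent, velocity, sample_points)
loss_inconsistent = mean_squared_residual(rho_inconsistent, velocity, sample_points)
print(loss_consistent, loss_inconsistent)
```

The consistent field scores a loss near zero and the inconsistent one does not, which is the property a training engine can minimize alongside data and boundary-condition terms.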



NVIDIA MODULUS - NVSWITCH HEAT SINK
Validation/Verification CFD Solvers vs Modulus/SimNet

PARAMETERIZED DGX-A100 NVSWITCH HEAT SINK
10,000x faster using Modulus

Computational Times (10 parameters, 3 values per parameter)

Modulus (Training Time): 10,800 V100 GPU hrs.
Traditional Solver (OpenFOAM): 18.4 Million CPU hrs. (59,049 separate runs, each 26 wall hours on 12 CPU cores)
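
The traditional-solver figures on this slide can be reproduced with a few lines of arithmetic, assuming the sweep is a full factorial over all parameter combinations (which matches the 59,049 runs quoted). The variable names below are illustrative only.

```python
# Figures quoted on the slide.
PARAMETERS = 10
VALUES_PER_PARAMETER = 3
WALL_HOURS_PER_RUN = 26
CPU_CORES_PER_RUN = 12
MODULUS_TRAINING_GPU_HOURS = 10_800

# Full factorial sweep: every combination of parameter values is one run.
runs = VALUES_PER_PARAMETER ** PARAMETERS

# Each run occupies 12 cores for 26 wall-clock hours.
cpu_hours = runs * WALL_HOURS_PER_RUN * CPU_CORES_PER_RUN

# Raw ratio of solver CPU hours to Modulus training GPU hours.
hour_ratio = cpu_hours / MODULUS_TRAINING_GPU_HOURS

print(f"runs = {runs:,}")            # prints "runs = 59,049"
print(f"cpu_hours = {cpu_hours:,}")  # prints "cpu_hours = 18,423,288"
print(f"hour_ratio = {hour_ratio:.0f}")
```

The run count and the 18.4 million CPU hours follow directly from the quoted inputs. The raw CPU-hour to GPU-hour ratio comes out near 1,700, so the 10,000x headline is not derivable from these numbers alone and presumably rests on a basis the slide does not state.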


NVIDIA BASE COMMAND


Connecting Real World and Digital Twin

Proven solutions already used within NVIDIA

• Base Command Manager
• Part of every new DGX SuperPOD deployment
• Infrastructure management for IT
• Scheduling, resource utilization, analytics, etc.

Dashboard / Analytics | Infrastructure Monitoring | Resource allocation
DGX SuperPOD
Base Command Manager: Deployment | Provisioning | Monitoring | Logging & Alerting | SLURM
IT/DevOps

Features:
• Provisioning & lifecycle management
• Monitoring & Telemetry
• Logging & Alerting
• SLURM scheduler
NVIDIA AIR

OMNIVERSE SIMULATES THE WORLD

Architecture, Engineering, and Construction
Media, Entertainment, and Game Development
Product Design and Manufacturing
Scientific Visualization
Robotics
Autonomous Vehicles


DATA CENTER DIGITAL TWINS
Food for thought

Foster + Partners simulates the bridge before pouring concrete

Daimler validates the AV software stack before driving a mile

Why would a data center be different?


DATA CENTER DIGITAL TWINS
Food for thought

Structural Models

AV Models

Data Center Models


THE JOURNEY TO A SIMULATED DATA CENTER
Network Operations Have Changed

1990: CLI commands in the production environment
2000: Test-to-production pipelines
2020: Virtual to Physical networks from NVIDIA Air
Tomorrow: Digital Twin for E2E DC Validation
The Future: DC Recommendation Engine
DATA CENTER SIMULATION
Leveraging the combined power of Omniverse and NVIDIA Air

Architecture, Engineering, and Construction
Data Center: Space, power, cooling and cabling
Courtesy of WPP

Data Center Facility Design | Data Center Network Design | Change Management
NVIDIA AIR
Platform for Simulating and Emulating Data Center Infrastructure

[Diagram: NVIDIA AIR PLATFORM. Labels recoverable from the figure:]
Users: DevOps, Architecture, Engineering, Operator
Access: API, WebUI
Digital Twin: Virtualized Network of VMs (NOS VM, FW, ASIC)
Hardware in the Loop: SPECTRUM
Real Compute, Network, Storage Endpoints
NVIDIA OFFERINGS (Internal APIs): BLUEFIELD, EGX, DGX
CERTIFIED PARTNER OFFERINGS (Gateway, 3rd Party APIs): FIREWALL, SDN, STORAGE, AI
Outbound Connectivity | Inbound Connectivity
DEPLOYMENT LIFE CYCLE
From POC to Decommission
DAY 0: PLAN
DAY 1: BUILD
DAY 2: MAINTAIN

DevOps change loop between the Digital Twin and the Physical DC (CI/CD):
1. Apply change (Digital Twin)
2. Validate change
3. Deploy (export change / configuration to the Physical Twin)

Use cases:
Training & Education
Presales PoC Labs
Interop Testing
Automation Development
Provisioning Process Development
Change Validation
Troubleshooting Assistance
New Personnel Onboarding
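
The apply, validate, deploy loop on this slide can be sketched as a small gate: a change reaches the physical configuration only after it has been applied to a copy (the twin) and passed validation. This is a toy model, not the NVIDIA Air API. The dictionary-based configuration, the MTU rule and its bounds, the device name, and all function names are hypothetical and chosen only for illustration.

```python
import copy


def apply_change(config, change):
    """Return a new config with the change merged in; the input is not mutated."""
    updated = copy.deepcopy(config)
    for device, settings in change.items():
        updated.setdefault(device, {}).update(settings)
    return updated


def validate(config):
    """Hypothetical check: every device needs an MTU between 1280 and 9216."""
    errors = []
    for device, settings in config.items():
        mtu = settings.get("mtu")
        if mtu is None or not 1280 <= mtu <= 9216:
            errors.append(f"{device}: invalid mtu {mtu!r}")
    return errors


def roll_out(physical, change):
    """Apply to the twin, validate, and deploy only if validation passes."""
    twin = apply_change(physical, change)      # 1. apply change to the digital twin
    errors = validate(twin)                    # 2. validate change on the twin
    if errors:
        return physical, errors                # reject: physical config untouched
    return apply_change(physical, change), []  # 3. deploy to the physical side


physical = {"leaf01": {"mtu": 1500}}
deployed, errors_ok = roll_out(physical, {"leaf01": {"mtu": 9216}})
rejected, errors_bad = roll_out(physical, {"leaf01": {"mtu": 100}})
print(deployed, errors_ok)
print(rejected, errors_bad)
```

Keeping `apply_change` free of side effects is the design choice that matters here: the same function serves both the twin and the physical side, and a rejected change can never leak into the production configuration.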
NVIDIA AIR
Solving the Hardest Challenges Through Cloud Agility With On-Prem Economics

END TO END SIMULATION: Full stack workflow modeling
HIGH FIDELITY: Real world hardware validation
OPEN ECOSYSTEM: Testing software stack interoperability
PUBLIC CLOUD PLATFORM: Availability and accessibility of test tools
SEE YOU IN THE SIMULATION

EXPLORE OMNIVERSE ENTERPRISE
GET ACCESS TO A FREE OMNIVERSE TRIAL
DEVELOP ON OMNIVERSE
EXPLORE NVIDIA AIR
JUMP START YOUR NETWORK AUTOMATION
DIVE INTO THE NVIDIA AIR DOCS
