High-Density AI Rack Design Overview

EcoStruxure™ Reference Design 108 outlines a Tier III data center design optimized for high-density AI clusters, specifically using NVIDIA's GB200 NVL72 racks, with a total IT capacity of 7392 kW. The design includes advanced facility power and cooling systems, featuring liquid cooling and redundant configurations to ensure efficiency and reliability. It also emphasizes the integration of lifecycle software for planning and operational management to support the unique demands of AI workloads.


[EcoStruxure™ Reference Design 108]

7392 kW, Tier III, IEC, Chilled Water, Liquid-Cooled AI Clusters
Design Overview

Data Center IT Capacity: 7392 kW
Target Availability: Tier III
Annualized PUE at 100% Load: Paris 1.10; Singapore 1.17
Racks and Density: Total racks 96; networking racks up to 22 kW/rack; AI racks up to 132 kW/rack
Data Center Overall Space: 3509 m2
Regional Voltage and Frequency: 400 V, 50 Hz

Introduction
High-density AI clusters and liquid cooling bring new challenges to data center design. Schneider Electric's data center reference designs help shorten the planning process by providing validated, proven, and documented data center physical infrastructure designs to address such challenges. This design focuses on the deployment of high-density AI clusters, specifically NVIDIA's GB200 NVL72 racks, in a single IT room. The IT room is purpose-built and optimized for three NVIDIA 1152-GPU DGX SuperPOD GB200 clusters using liquid-to-liquid CDUs and high temperature chillers. Facility power and cooling design are optimized for capital cost, efficiency, and reliability.
Reference Design 108 includes information on four technical areas: facility power, facility cooling, IT space, and lifecycle software. These areas represent the integrated systems required to meet the design's specifications provided in this overview document.
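The IT capacity figure is consistent with the rack counts and densities above; as a quick check (rack counts from the IT Room section of this document):

$$48 \times 132\ \text{kW} + 48 \times 22\ \text{kW} = 6336\ \text{kW} + 1056\ \text{kW} = 7392\ \text{kW}$$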

About this Design

• IT space and power distribution designed to accommodate AI clusters with density up to 132 kW per rack
• Design optimized to support liquid-cooled racks, with liquid-to-liquid coolant distribution units (CDUs) and extra high temperature chillers
• Chilled water systems optimized for high water temperatures using Uniflair FWCV fan walls and Uniflair XRAF air-cooled packaged chillers
• Redundant design for increased availability and concurrent maintainability


Facility Power
Facility Power Block Diagram
[IT power block diagram: four utility feeds (A–D), each with a 3 MVA MV/LV transformer and a 3 MW standby genset switched through an ATS into LV switchboards A–D. Each switchboard feeds three UPS modules (UPS 1A–3A through 1D–3D) and an LV distribution section; the four powertrains together serve three 2.4 MW AI IT PODs.]
[Mechanical powertrain diagrams: Paris uses two 2.5 MVA MV/LV transformers with 2 MW gensets; Singapore uses two 2 MVA MV/LV transformers with 1.6 MW gensets. Each side (A/B) passes through an ATS to a mechanical switchboard and mechanical UPS, with LV distribution serving the fan walls and chillers, and the CDUs and pumps.]

The facility power system supplies power to all components within the data center. In this concurrently maintainable electrical design, power to the IT rooms is supplied through four 3 MW powertrains. The four powertrains provide 3+1 distributed redundant UPS power to the IT space, backed up by diesel generators.
Each powertrain consists of a 5000 A Okken main switchboard feeding three 1000 kW Galaxy VXL UPSs in parallel, with 5 minutes of runtime, and a 5000 A Okken distribution section. At this loading, the Galaxy VXL UPSs' overload capacity can manage the electrical design point (EDP) power peak of NVIDIA's NVL72 racks. Downstream, these powertrains feed Canalis KS 800 A busways that power the IT racks with 3+1 redundancy. Separately, two powertrains feed the fan walls and chillers with 2N redundant power, each powertrain sized at 2.5 MVA for Paris, France, or 2 MVA for Singapore. They also feed a 200 kW Galaxy VL UPS that provides critical power to the liquid-to-liquid coolant distribution units (CDUs) and the facility water system pumps.
The facility power system is designed to support integrated peripheral devices such as fire panels, access control systems, and environmental monitoring and control devices. Power meters in the electrical path monitor power quality and allow for predictive maintenance and diagnostics of the system. These meters also integrate with EcoStruxure™ Power Operation.
Every component in this design is built and tested to the applicable IEC standards. Further design details, such as dimensions, schematics, and equipment lists, are available in the engineering package.
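As a quick sanity check of the 3+1 distributed redundant sizing, a minimal sketch (all values from this overview) confirms that the three surviving powertrains can carry the full IT load if one fails:

```python
# Sketch: verify 3+1 distributed redundant UPS sizing (values from this overview).
POWERTRAINS = 4
UPS_PER_POWERTRAIN = 3
UPS_MODULE_KW = 1000          # Galaxy VXL module rating
IT_LOAD_KW = 7392             # total design IT load

per_powertrain_kw = UPS_PER_POWERTRAIN * UPS_MODULE_KW        # 3000 kW
surviving_kw = (POWERTRAINS - 1) * per_powertrain_kw          # 9000 kW with one powertrain down
assert surviving_kw >= IT_LOAD_KW, "IT load would exceed surviving UPS capacity"
print(f"Capacity with one powertrain out: {surviving_kw} kW >= IT load {IT_LOAD_KW} kW")
```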

Facility Power Attributes

Name                                          Value                         Unit
Total facility peak power (IT and cooling)    11000                         kW
Total amps (IT main bus, each)                5000                          A
Input voltage (IT main bus)                   400                           V
Switchboard kAIC (IT main bus)                50                            kA
Generator redundancy (IT main bus)            Distributed redundant
IT power path                                 3+1
IT space UPS capacity, per powertrain         3000                          kW
IT space UPS redundancy                       Distributed redundant
IT space UPS runtime @ rated load             5                             minutes
IT space UPS output voltage                   400                           V
Total amps (facility cooling bus, each)       3200                          A
Input voltage (facility cooling bus)          400                           V
Switchboard kAIC (facility cooling bus)       66 (Paris); 50 (Singapore)    kA
Generator redundancy (facility cooling bus)   2N
Facility cooling UPS capacity                 200                           kW
Facility cooling UPS redundancy               2N
Facility cooling UPS runtime @ rated load     5                             minutes

Design Options
This reference design can be modified as follows without a significant effect on the design's performance attributes:
• Provision for load bank
• Change UPS battery type & runtime
• Add/remove/change standby generators (location & tank size)

Facility Cooling
[Facility Cooling Block Diagrams]
The facility cooling design features a dual-path piping system optimized for data center efficiency. A chilled water loop integrates Uniflair XRAF chillers, with free cooling capabilities, to deliver 23°C chilled water to fan walls in an N+1 configuration. This lower-temperature water loop is required to handle the air-cooling needs of the data center. A separate high temperature water loop supplies 37°C water to Motivair liquid-to-liquid CDUs used to cool NVIDIA's GB200 NVL72 liquid-cooled racks. Outdoor heat rejection for this loop is supported by Uniflair EHT XRAF chillers.
An integrated thermal storage system provides 5 minutes of continuous cooling in case of power outage, to allow the chillers to restart. The CDUs and facility pumps are on UPS power. More information on the fan wall and CDU cooling architecture is detailed in the IT Room section of this document.
This design is instrumented to work with EcoStruxure™ IT Expert and AVEVA Unified Operations Center.
Further design details, such as dimensions, schematics, and equipment lists, are available in the engineering package.
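As a rough plausibility check on the combined 56.8 m3 storage volume, a minimal sketch (assuming the full 10 K loop design ΔT is usable and standard water properties, neither of which is stated in this overview) compares stored cooling energy against 5 minutes of peak load:

```python
# Sketch: thermal ride-through check for the combined storage tanks.
# Assumptions (not from the datasheet): full 10 K loop delta-T is usable,
# water density 997 kg/m^3, specific heat 4186 J/(kg*K).
TANK_M3 = 56.8
RHO = 997.0              # kg/m^3
CP = 4186.0              # J/(kg*K)
DELTA_T = 10.0           # K, matching the 23->33 C and 37->47 C loop design
RIDE_THROUGH_S = 5 * 60
PEAK_COOLING_KW = 7730   # Paris max cooling capacity

stored_mj = TANK_M3 * RHO * CP * DELTA_T / 1e6        # ~2371 MJ stored
demand_mj = PEAK_COOLING_KW * RIDE_THROUGH_S / 1e3    # ~2319 MJ over 5 minutes
print(f"Stored: {stored_mj:.0f} MJ, 5-min demand: {demand_mj:.0f} MJ")
```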

Facility Cooling Attributes

Name                                 Value                           Unit
Total max cooling capacity           7730 (Paris); 8423 (Singapore)  kW
Input voltage                        400                             V
Heat rejection medium                Water
Chiller redundancy                   N+1
Outdoor heat exchange                Packaged chiller with available free cooling
Chiller CW supply temperature        23                              °C
Chiller CW return temperature        33                              °C
HT chiller CW supply temperature     37                              °C
HT chiller CW return temperature     47                              °C
Combined* storage tank size          56.8                            m3
Ride-through time                    5                               minutes
Outdoor ambient temperature range    -9.6 to 39.3                    °C
Economizer type                      Water-side
*Summation of both facility water systems

Design Options
This reference design can be modified as follows without a significant effect on the design's performance attributes:
• Change storage tank size
• Integrate dry coolers with adiabatic assist to further optimize electricity use


IT Room
[IT Room Diagrams]
The IT room features forty-eight 132 kW liquid-cooled IT racks, modeled after NVIDIA's GB200 NVL72, and forty-eight 22 kW air-cooled networking racks, organized into three pods. Each pod consists of two rows of eight 132 kW IT racks and eight 22 kW networking racks, with the 132 kW IT racks bookended by an empty rack (if additional rack space is required) and the networking racks. The liquid-cooled racks remove 87% of their heat via liquid, while the remaining 13% requires air cooling. Each pod is deployed with the Prefabricated Modular EcoStruxure™ Pod Data Center to provide a 2 m wide ducted hot-aisle containment for proper airflow, busway and cabling support, and TCS piping. Ducted hot aisles and a common ceiling plenum return hot air to the fan walls for cooling.
Six Uniflair FWCV chilled water fan walls supply conditioned air to the IT room in an N+1 configuration. Three Motivair liquid-to-liquid CDUs provide precise liquid cooling to each pod with N+1 redundancy. A redundant piping system across the IT room provides an alternate path for chilled water in case of cooling equipment failure or maintenance.
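The 87/13 heat split determines how the IT load divides between the CDUs and the fan walls; a minimal sketch of that arithmetic (split and rack figures from this overview; water-like coolant properties are an assumption):

```python
# Sketch: split the IT load between liquid (CDU) and air (fan wall) paths.
AI_RACKS, AI_KW = 48, 132
NET_RACKS, NET_KW = 48, 22
LIQUID_FRACTION = 0.87          # per this overview, for the liquid-cooled racks

ai_kw = AI_RACKS * AI_KW                            # 6336 kW
net_kw = NET_RACKS * NET_KW                         # 1056 kW
liquid_kw = ai_kw * LIQUID_FRACTION                 # ~5512 kW to the CDUs
air_kw = ai_kw * (1 - LIQUID_FRACTION) + net_kw     # ~1880 kW to the fan walls

# Approximate facility-water flow for the CDU loop at the 10 K design delta-T
# (37 -> 47 C), assuming water-like coolant (cp ~ 4.186 kJ/(kg*K)).
flow_kg_s = liquid_kw / (4.186 * 10)                # ~132 kg/s, roughly 132 L/s
print(f"Liquid: {liquid_kw:.0f} kW, air: {air_kw:.0f} kW, CDU loop flow ~{flow_kg_s:.0f} kg/s")
```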
The 22 kW networking racks are NetShelter SX racks configured with 1+1 63 A NetShelter Rack PDU Advanced rack-mount power distribution units (rPDUs). The 132 kW AI racks are NetShelter Open Architecture MGX racks configured with 6+2 33 kW NetShelter Open Architecture Power Shelves. Each row is powered by four Canalis KS 800 A busways, providing A-, B-, C-, and D-side power to the row. Each 132 kW AI rack has a pair of 63 A feeds coming from a tap-off unit on each busway, for a total of eight power feeds per rack from the busways, one for each power shelf. Each 22 kW networking rack has a pair of 63 A feeds from two of the four 800 A busways for 2N redundant feeds. Each tap-off unit can be configured to house up to two 63 A ComPacT NSXm circuit breakers with associated auxiliaries (e.g., shunt trip for leak detection) and an Acti9 iEM3000 energy meter.
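As a sketch of why eight 63 A feeds and 6+2 power shelves comfortably cover a 132 kW rack (three-phase 400 V feeds and unity power factor are assumptions here, not datasheet statements):

```python
import math

# Sketch: per-rack power-feed capacity for the 132 kW AI racks.
# Assumptions (not stated in the datasheet): three-phase 400 V feeds, PF = 1.0.
FEED_A, V_LL, FEEDS_PER_RACK = 63, 400, 8
per_feed_kva = math.sqrt(3) * V_LL * FEED_A / 1000      # ~43.6 kVA per feed

SHELVES, SHELF_KW, REDUNDANT = 8, 33, 2
firm_shelf_kw = (SHELVES - REDUNDANT) * SHELF_KW        # 198 kW with 2 shelves out

print(f"Per feed: {per_feed_kva:.1f} kVA, total: {FEEDS_PER_RACK * per_feed_kva:.0f} kVA")
print(f"Firm shelf capacity (6 of 8): {firm_shelf_kw} kW >= 132 kW nameplate")
```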

IT Room Attributes

Name                                   Value              Unit
IT load                                7392               kW
Supply voltage to IT                   400                V
Rack power feed redundancy             Redundant
Number of 132 kW liquid-cooled racks   48                 racks
Number of 22 kW networking racks       48                 racks
IT floor space                         510                m2
CRAC/CRAH type                         Fan wall
CRAC/CRAH redundancy                   N+1
CRAC/CRAH supply air temperature       28                 °C
CW supply temperature                  23                 °C
CW return temperature                  33                 °C
Containment type                       Ducted hot aisle
CDU type                               L2L
CDU redundancy                         N+1
CDU CW supply temperature              37                 °C
CDU CW return temperature              47                 °C
TCS loop supply temperature            40                 °C
TCS loop return temperature            50                 °C

Design Options
This reference design can be modified as follows without a significant effect on the design's performance attributes:
• Use Uniflair FXCV fan walls
• Use Uniflair CPOR1000A liquid-to-liquid CDUs
• CRAHs can be selected instead of fan walls
• Variations in AI cluster configuration
• Use NetShelter Aisle Containment

Lifecycle Software
High-density AI clusters push the limits of data center facility infrastructure, so it is critical to leverage advanced planning and operation tools to ensure safe and reliable operations.
Planning & Design
Electrical Safety and Reliability: Due to the high amount of power supplied to an AI cluster, design specifications such as available fault current, arc flash hazards, and breaker selectivity must be analyzed in the design phase. Applications like Ecodial and ETAP simulate the electrical design and reduce the chance of costly mistakes or, even worse, injury.
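For instance, the prospective fault current at a main switchboard must stay below its kAIC rating (50 kA on the IT main bus in this design). A simplified transformer-only estimate, using a hypothetical 10% effective per-unit impedance (transformer plus upstream network; a real study in Ecodial or ETAP models the full network):

$$I_{sc} \approx \frac{S_{tx}}{\sqrt{3}\, V_{LL}\, Z_{pu}} = \frac{3\ \text{MVA}}{\sqrt{3} \times 400\ \text{V} \times 0.10} \approx 43\ \text{kA} < 50\ \text{kA}$$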
Cooling: AI clusters are pushing the limits of what can be done with air cooling. Modeling the IT space with computational fluid dynamics (CFD) helps spot issues including high-pressure areas, rack recirculation, and hot spots. This is especially true when retrofitting an existing data center with an AI cluster. Schneider Electric's EcoStruxure™ IT Design CFD can quickly model airflow, allowing rapid iteration to find the best design and layout.
Operations
EcoStruxure™ is Schneider Electric's open, interoperable, integrated Internet of Things (IoT)-enabled system architecture and platform. It consists of three layers: connected products; edge control; and applications, analytics, and services.
EcoStruxure™ Data Center is a combination of three domains of EcoStruxure™: Power, Building, and IT. Each domain is focused on a subsystem of the data center: power, cooling, and IT, respectively. Combined, these three domains reduce risks, increase efficiencies, and speed operations across the entire facility.

• EcoStruxure™ Power monitors power quality and generates alerts while protecting and controlling the electrical distribution system of the data center from the MV level to the LV level. It provides monitoring and alerting on any device and uses predictive analytics for increased safety, availability, and efficiency while lowering maintenance costs.
• EcoStruxure™ Building controls cooling effectively while driving reliability, efficiency, and safety of building management, security, and fire systems. It performs data analytics on assets, energy use, and operational performance.
• EcoStruxure™ IT makes IT infrastructure more reliable and efficient while simplifying management by offering complete visibility, alerting, and modeling tools. It receives data and delivers alerts, predictive analytics, and system advice on any device to optimize availability and efficiency in the IT space.

There are several options for supervisory visibility and control. AVEVA Unified Operations Center can provide visibility at a site or across an entire enterprise.
Visit EcoStruxure™ for Data Center for more details.
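These platforms automate the kind of device-level polling sketched below; a minimal, hypothetical Modbus example using the pymodbus library (the IP address, unit ID, register address, and scaling are illustrative assumptions, not the actual iEM3000 register map):

```python
from pymodbus.client import ModbusTcpClient

# Hypothetical sketch: poll an energy meter over Modbus TCP.
# The IP address, unit ID, register address, and word layout below are
# illustrative assumptions, not the actual iEM3000 register map.
METER_IP, UNIT_ID = "192.0.2.10", 1
ACTIVE_POWER_REG = 3000          # hypothetical holding-register address

client = ModbusTcpClient(METER_IP)
if client.connect():
    rr = client.read_holding_registers(ACTIVE_POWER_REG, count=2, slave=UNIT_ID)
    if not rr.isError():
        raw = (rr.registers[0] << 16) | rr.registers[1]   # assume 32-bit value
        print(f"Active power (raw register value): {raw}")
    client.close()
```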


Design Attributes

OVERVIEW                                      Value                            Unit
Target availability                           Tier III
Annualized PUE at 100% load                   1.10 (Paris); 1.17 (Singapore)
Data center IT capacity                       7392                             kW
Data center overall space                     3509                             m2
Maximum rack power density                    132                              kW/rack

FACILITY POWER                                Value                            Unit
Total facility peak power (IT and cooling)    11000                            kW
Total amps (IT main bus, each)                5000                             A
Input voltage (IT main bus)                   400                              V
Switchboard kAIC (IT main bus)                50                               kA
Generator redundancy (IT main bus)            Distributed redundant
IT power path                                 3+1
IT space UPS capacity, per powertrain         3000                             kW
IT space UPS redundancy                       Distributed redundant
IT space UPS runtime @ rated load             5                                minutes
IT space UPS output voltage                   400                              V
Total amps (facility cooling bus, each)       3200                             A
Input voltage (facility cooling bus)          400                              V
Switchboard kAIC (facility cooling bus)       66 (Paris); 50 (Singapore)       kA
Generator redundancy (facility cooling bus)   2N

FACILITY COOLING                              Value                            Unit
Total max cooling capacity                    7730 (Paris); 8423 (Singapore)   kW
Input voltage                                 400                              V
Heat rejection medium                         Water
Chiller redundancy                            N+1
Outdoor heat exchange                         Packaged chiller with available free cooling
Chiller CW supply temperature                 23                               °C
Chiller CW return temperature                 33                               °C
HT chiller CW supply temperature              37                               °C
HT chiller CW return temperature              47                               °C
Combined* storage tank size                   56.8                             m3
Ride-through time                             5                                minutes
Outdoor ambient temperature range             -9.6 to 39.3                     °C
Economizer type                               Water-side
*Summation of both facility water systems


Design Attributes continued

IT SPACE                                Value              Unit
IT load                                 7392               kW
Supply voltage to IT                    400                V
Maximum rack power density              132                kW/rack
Number of racks                         96                 racks
IT floor space                          510                m2
Rack power feed redundancy              Redundant
CRAC/CRAH redundancy                    N+1
Containment type                        Ducted hot aisle
CDU type                                L2L
CDU redundancy                          N+1
CRAC/CRAH supply air temperature        28                 °C
CW supply temperature                   23                 °C
CW return temperature                   33                 °C
CDU CW supply temperature               37                 °C
CDU CW return temperature               47                 °C
TCS loop supply temperature             40                 °C
TCS loop return temperature             50                 °C


Schneider Electric Life-Cycle Services

Life Cycle Services
Plan: What are my options?
Install: How do I install and commission?
Operate: How do I operate and maintain?
Optimize: How do I optimize?
Renew: How do I renew my solution?

1. Team of over 7,000 trained specialists covering every phase and system in the data center
2. Standardized, documented, and validated methodology leveraging automation tools and repeatable processes developed over 45 years
3. Complete portfolio of services to solve your technical or business challenge, simplify your life, and reduce costs

Get more information for this design:

Engineering Package
Every reference design is built with technical documentation for engineers and project managers. This includes engineering schematics (CAD, PDF), floor layouts, equipment lists containing all the components used in the design, and 3D images showing real-world illustrations of our reference designs.
Documentation is available in multiple formats to suit the needs of both engineers and managers working on data center projects. The engineering package of this design can be downloaded here.

[Package contents: 3D spatial views, floor layouts, one-line schematics, bill of materials]

Click here to download the Engineering Package
Email referencedesigns@[Link] for further assistance

Document Number RD108DS Revision 3