Titan: Early Experience with the Titan System at Oak Ridge National Laboratory
Buddy Bland, Project Director, Oak Ridge Leadership Computing Facility
November 13, 2012
Office of Science
ORNL's Titan Hybrid System: Cray XK7 with AMD Opteron and NVIDIA Tesla processors
4,352 ft² (404 m²)
SYSTEM SPECIFICATIONS:
- Peak performance of 27.1 PF (24.5 PF from GPUs + 2.6 PF from CPUs)
- 18,688 compute nodes, each with a 16-core AMD Opteron CPU (32 GB) and an NVIDIA Tesla K20x GPU (6 GB)
- 512 service and I/O nodes
- 200 cabinets
- 710 TB total system memory
- Cray Gemini 3D torus interconnect
- 8.9 MW peak power, 8.3 MW average
The x86 processor provides fast single-thread performance for control and communications
AMD Opteron 6274
16 cores, 141 GFLOPs peak
GPUs are designed for extreme parallelism, performance & power efficiency
NVIDIA Tesla K20x
14 Streaming Multiprocessors
2,688 CUDA cores
1.31 TFLOPs peak (double precision)
6 GB GDDR5 memory
HPL: >2.0 GFLOPs per watt (Titan full-system measured power)
Cray XK7 Compute Node
XK7 Compute Node Characteristics
AMD Opteron 6274 16-core processor @ 141 GF
NVIDIA Tesla K20x @ 1,311 GF
Host memory: 32 GB 1600 MHz DDR3
Tesla K20x memory: 6 GB GDDR5
Gemini high-speed interconnect
Slide courtesy of Cray, Inc.
Titan: Cray XK7 System
Compute node: 1.45 TF, 38 GB
Board: 4 compute nodes, 5.8 TF, 152 GB
Cabinet: 24 boards, 96 nodes, 139 TF, 3.6 TB
System: 200 cabinets, 18,688 nodes, 27 PF, 710 TB
Why GPUs? High Performance and Power Efficiency on a Path to Exascale
- Hierarchical parallelism: improves scalability of applications by exposing more parallelism through code refactoring and source-code directives
- Heterogeneous multi-core processor architecture: use the right type of processor for each task
- Data locality: keep the data near the processing; the GPU has high bandwidth to its local memory for rapid access and a large internal cache
- Explicit data management: explicitly manage data movement between CPU and GPU memories (a minimal sketch follows below)
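The following is a minimal sketch, not taken from the talk, of what explicit data management can look like with OpenACC on an XK7 node: a data region keeps the arrays resident in GPU memory across two kernels instead of copying them between CPU and GPU for every loop. The function and array names are illustrative assumptions.

/* Illustrative sketch: explicit CPU-GPU data management with an OpenACC
 * data region; names and sizes are assumptions, not from the talk.      */
void jacobi_step(double *restrict u, double *restrict unew, int n)
{
    /* copy(u):      transfer u to the GPU on entry, back to the host on exit
     * create(unew): allocate unew only in GPU memory, never copied          */
    #pragma acc data copy(u[0:n]) create(unew[0:n])
    {
        /* First kernel: compute the update on the GPU. */
        #pragma acc parallel loop
        for (int i = 1; i < n - 1; i++)
            unew[i] = 0.5 * (u[i - 1] + u[i + 1]);

        /* Second kernel: write the update back into u, still on the GPU;
         * no host/device traffic occurs between the two loops.            */
        #pragma acc parallel loop
        for (int i = 1; i < n - 1; i++)
            u[i] = unew[i];
    }   /* u is copied back to host memory only here */
}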
Hybrid Programming Model
On Jaguar, with 299,008 cores, we were seeing the limits of single-level MPI scaling for most applications. To take advantage of the vastly greater parallelism in Titan, users need to use hierarchical parallelism in their codes (see the sketch below):
- Distributed memory: MPI, SHMEM, PGAS
- Node local: OpenMP, pthreads, local MPI communicators
- Within threads: vector constructs on the GPU, libraries, OpenACC
These are the same types of constructs needed on all multi-PFLOPS computers to scale to the full size of the systems!
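The sketch below is an illustrative example, not code from the talk, of the first two levels: MPI ranks across nodes plus OpenMP threads within a node (the innermost GPU level is shown in the OpenACC sketches elsewhere in these notes). All names and sizes are assumptions.

/* Illustrative two-level hierarchical parallelism: MPI across nodes,
 * OpenMP across the cores of one node.  Build with an MPI compiler
 * wrapper with OpenMP enabled (details depend on the programming
 * environment).                                                       */
#include <mpi.h>
#include <stdio.h>

#define N 1000000

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);                  /* distributed-memory level */
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    static double x[N];
    for (int i = 0; i < N; i++)
        x[i] = rank + 1.0;                   /* rank-local data */

    double node_sum = 0.0;
    /* Node-local level: OpenMP threads split this loop on one node. */
    #pragma omp parallel for reduction(+:node_sum)
    for (int i = 0; i < N; i++)
        node_sum += x[i];

    /* Combine the per-rank results across the whole machine. */
    double global_sum;
    MPI_Allreduce(&node_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %g\n", global_sum);

    MPI_Finalize();
    return 0;
}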
How do you program these nodes?
Compilers
- OpenACC is a set of compiler directives that lets the user express hierarchical parallelism in the source code so that the compiler can generate parallel code for the target platform, be it GPU, MIC, or vector SIMD on a CPU (see the sketch after this list)
- The Cray compiler supports XK7 nodes and is OpenACC compatible
- The CAPS HMPP compiler supports C, C++, and Fortran compilation for heterogeneous nodes, with OpenACC support
- The PGI compiler supports OpenACC and CUDA Fortran
Tools
- The Allinea DDT debugger scales to full system size and, with ORNL support, will be able to debug heterogeneous (x86/GPU) applications
- ORNL has worked with the Vampir team at TUD to add support for profiling codes on heterogeneous nodes
- CrayPAT and Cray Apprentice support XK6 programming
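As a concrete illustration of the directive approach, here is a minimal OpenACC sketch (an assumption for these notes, not code shown in the talk): the annotated loop can be compiled into a GPU kernel by an OpenACC compiler, or built as ordinary CPU code when the directives are ignored.

/* Illustrative OpenACC directive on a simple daxpy-style loop;
 * function and variable names are assumptions.                    */
void daxpy(int n, double a, const double *restrict x, double *restrict y)
{
    /* The OpenACC compiler generates the accelerator kernel and the
     * host/device data transfers described by the copy clauses.     */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}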
Early Science Applications on Titan
Material Science (WL-LSMS)
Role of material disorder, statistics, and fluctuations in nanoscale materials and systems.
Climate Change (CAM-SE)
Answer questions about specific climate change adaptation and mitigation scenarios; realistically represent features like precipitation patterns/statistics and tropical storms.
Biofuels (LAMMPS)
A multiple capability molecular dynamics code.
Astrophysics (NRDF)
Radiation transport critical to astrophysics, laser fusion, combustion, atmospheric dynamics, and medical imaging.
Combustion (S3D)
Combustion simulations to enable the next generation of diesel/biofuels to burn more efficiently.
Nuclear Energy (Denovo)
Unprecedented high-fidelity radiation transport calculations that can be used in a variety of nuclear energy and technology applications.
How Effective are GPUs on Scalable Applications?
OLCF-3 Early Science codes: very early performance measurements on Titan, comparing XK7 (with K20x) vs. XE6
- Cray XK7: K20x GPU plus AMD 6274 CPU
- Cray XE6: dual AMD 6274 CPUs, no GPU
- Cray XK6 w/o GPU: single AMD 6274, no GPU
Application: performance ratio (XK7 vs. XE6), comments
- S3D: 1.8 (turbulent combustion; 6% of Jaguar workload)
- Denovo sweep: 3.8 (sweep kernel of 3D neutron transport for nuclear reactors; 2% of Jaguar workload)
- LAMMPS: 7.4, mixed precision (high-performance molecular dynamics; 1% of Jaguar workload)
- WL-LSMS: 3.8 (statistical mechanics of magnetic materials; 2% of Jaguar workload; 2009 Gordon Bell Prize winner)
- CAM-SE: 1.8, estimate (community atmosphere model; 1% of Jaguar workload)
Questions? [email protected]
Want to join our team? ORNL is hiring. Contact us at http://jobs.ornl.gov
The research and activities described in this presentation were performed using the resources of the National Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.