Hardware Accelerators for Machine Learning
(CS 217)
Stanford University, Winter 2020
This course provides in-depth coverage of the architectural techniques used to design accelerators for training and inference in machine learning systems. It covers classical ML algorithms such as linear regression and support vector machines, as well as DNN models such as convolutional and recurrent neural nets. We will consider both training and inference for these models and discuss the impact of parameters such as batch size, precision, sparsity, and compression on model accuracy. We will cover the design of accelerators for ML model inference and training. Students will become familiar with hardware implementation techniques that use parallelism, locality, and low precision to implement the core computational kernels of ML. To design energy-efficient accelerators, students will develop the intuition to make trade-offs between ML model parameters and hardware implementation techniques. Students will read recent research papers and complete a design project.
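To make these kernel-level trade-offs concrete, here is a minimal illustrative sketch (not part of the course materials) of the kind of GEMM kernel the course studies: blocking for locality, int8 inputs with int32 accumulation for low precision, and float32 only at the boundaries. The function names, block size, and matrix sizes are arbitrary choices for this example.

```python
# Illustrative sketch only: a blocked int8 matrix multiply showing the
# locality and low-precision techniques mentioned above. All names and
# the block size are arbitrary example choices.
import numpy as np

def quantize_int8(x):
    """Map a float32 tensor to int8 plus a per-tensor scale factor."""
    scale = np.abs(x).max() / 127.0
    return np.round(x / scale).astype(np.int8), scale

def blocked_int8_gemm(a, b, block=64):
    """C = A @ B computed tile-by-tile in int8, accumulated in int32.

    Blocking keeps one tile of each operand in fast memory (cache, or
    on-chip SRAM on an accelerator), improving locality; int8 inputs
    with int32 accumulation mirror common inference datapaths.
    """
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    m, k = qa.shape
    _, n = qb.shape
    c = np.zeros((m, n), dtype=np.int32)
    for i in range(0, m, block):
        for j in range(0, n, block):
            for p in range(0, k, block):
                # Cast to int32 before multiplying so int8 products
                # cannot overflow during accumulation.
                c[i:i+block, j:j+block] += (
                    qa[i:i+block, p:p+block].astype(np.int32)
                    @ qb[p:p+block, j:j+block].astype(np.int32)
                )
    return c.astype(np.float32) * (sa * sb)  # dequantize the result

a = np.random.randn(256, 256).astype(np.float32)
b = np.random.randn(256, 256).astype(np.float32)
err = np.abs(blocked_int8_gemm(a, b) - a @ b).max()
print(f"max abs error vs float32 GEMM: {err:.3f}")
```

The printed error gives a feel for the precision-versus-accuracy trade-off; on a real accelerator the same tiling structure maps to on-chip buffers and wide accumulators.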
Instructors and office hours:
- Ardavan Pedram (Office Hours TBA)
- Kunle Olukotun (Office Hours TBA)
This class meets Tuesday and Thursday from 10:30 - 11:50 AM in Huang Engineering Center 18.
Teaching Assistants
- Nathan Zhang (Office Hours TBA)
Class Information
Funding for this research/activity was partially provided by the National Science Foundation
Division of Computing and Communication Foundations under award number 1563113.
Schedule
| Lecture | Date | Topic | Reading Assignments | Spatial Assignment |
| --- | --- | --- | --- | --- |
| 1 | 1/07/2020 | Introduction, Software 2.0 | Role of hardware accelerators in the post-Dennard and Moore era | |
| 2 | 1/09/2020 | Kian Katanforoosh: Classical ML algorithms: regression, SVMs | Is Dark Silicon Useful?; Hennessy & Patterson, Chapter 7.1-7.2 | |
| 3 | 1/14/2020 | Linear algebra fundamentals and accelerating linear algebra; BLAS operations; 20th-century techniques: systolic arrays, MIMDs, CGRAs | Why Systolic Architectures?; Anatomy of High Performance GEMM; Dark Memory | Linear Algebra Accelerators |
| 4 | 1/16/2020 | Introduction to Spatial: Codesign Tradeoffs | Analyzing Performance and Energy with Aladdin; Spatial | Spatial |
| 5 | 1/21/2020 | MLPs and CNNs: Inference | Efficient Processing of DNNs; NVIDIA Tesla V100 | |
| 6 | 1/23/2020 | Evaluating performance, energy efficiency, parallelism, locality, memory hierarchy, roofline model (see the sketch after this table); Luigi Nardi: Design Space Optimization with Spatial | Roofline Model; Google TPU | |
| 7 | 1/28/2020 | Boris Ginsburg: Generalization and Regularization of Training | Caterpillar; Optimizing Gradient Descent | |
| 8 | 1/30/2020 | Azalia Mirhoseini: Reinforcement Learning for Hardware Design | A Beginner's Guide to RL; Resource Management with DRL | |
| 9 | 2/04/2020 | Fanny Nina Paravecino: Catapult and Brainwave | Catapult; Brainwave | |
| 10 | 2/06/2020 | Amir Gholami: Quantized Deep Learning | SqueezeNext; CNN Inference Accelerators | |
| 11 | 2/11/2020 | Tze Meng Low: Fast Implementation of Deep Learning Kernels | Systematic Approach to Blocking; High Performance Zero-Memory Overhead Direct Convolutions | |
| 12 | 2/13/2020 | Guest lecture: Paulius Micikevicius, GPU Design Tradeoffs for Deep Learning and MLPerf | Mixed Precision Training; NVIDIA Mixed Precision; Training With 8-bit Floating Point; NVIDIA Volta | |
| 13 | 2/18/2020 | Guest lecture: Cliff Young, Neural Networks Have Rebooted Computing: What Should We Reboot Next? | DawnBench; MLPerf | |
| 14 | 2/20/2020 | Mohammad Shoeybi: Accelerating Natural Language Processing | GNMT; BERT | |
| 15 | 2/25/2020 | Mikhail Smelyanskiy: AI at Facebook Datacenter Scale | ML @ Facebook | |
| 16 | 2/27/2020 | Assignment 1 feedback and midterm discussion | | |
| 17 | 3/03/2020 | Boris Ginsburg: Large Scale Training | Revisiting Small Batch Training for Neural Networks; Large Batch Training of Convolutional Networks; Deep Learning at Supercomputer Scale; Deep Gradient Compression | |
| 18 | 3/05/2020 | Sparsity in Deep Learning | EIE; Campfire | |
| 19 | 3/10/2020 | Machine Learning Systems and Software Stack | Taxonomy of Accelerator Architectures; ML Systems Stuck in a Rut | |
| 20 | 3/12/2020 | TBD | | |
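As a preview of the roofline analysis covered in Lecture 6 (the sketch referenced in the table above), here is a small illustrative calculation, not course material. The hardware numbers are approximate public figures for a V100-class GPU, used only as placeholders; the model says attainable throughput is min(peak compute, operational intensity x memory bandwidth).

```python
# Illustrative roofline sketch: attainable FLOP/s is capped either by peak
# compute or by memory bandwidth times operational intensity (FLOPs/byte).
PEAK_FLOPS = 15.7e12   # placeholder: roughly V100 fp32 peak, FLOP/s
PEAK_BW = 900e9        # placeholder: roughly V100 HBM2 bandwidth, bytes/s

def roofline(op_intensity):
    """Attainable FLOP/s for a kernel with the given FLOPs-per-byte."""
    return min(PEAK_FLOPS, op_intensity * PEAK_BW)

# A kernel is memory-bound below the "ridge point" and compute-bound above.
ridge = PEAK_FLOPS / PEAK_BW  # about 17.4 FLOPs/byte for these numbers
for oi in (1, 4, 16, 64):
    bound = "memory" if oi < ridge else "compute"
    print(f"OI={oi:>3} FLOPs/byte -> {roofline(oi)/1e12:5.1f} TFLOP/s ({bound}-bound)")
```

The ridge point explains why low-precision and blocked kernels matter: both raise operational intensity, pushing kernels from the memory-bound into the compute-bound region.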
Guest Lectures
Kian Katanforoosh, [Link] and Stanford University
From Machine Learning to Deep Learning: a computational transition
Thursday January 9, 2020
Luigi Nardi, Lund University and Stanford University
Design Space Optimization with Spatial
Thursday January 23, 2020
Boris Ginsburg, NVIDIA
Generalization and Regularization of Training
Tuesday January 28, 2020
Azalia Mirhoseini, Google Brain
Reinforcement Learning and Hardware Design
Thursday January 30, 2020
Fanny Nina Paravecino, Microsoft Research
Real-Time AI at Cloud Scale with Project Brainwave
Tuesday February 4, 2020
Amir Gholami, UC Berkeley
Precision and Quantized Training for Deep Learning
Thursday February 6, 2020
Tze Meng Low, Carnegie Mellon University
Fast Implementation of Deep Learning Kernels
Tuesday February 11, 2020
Paulius Micikevicius, NVIDIA
GPU Design Tradeoffs for Deep Learning and MLPerf
Thursday February 13, 2020
Cliff Young, Google
Neural Networks Have Rebooted Computing: What Should We Reboot Next?
Tuesday February 18, 2020
Mohammad Shoeybi, NVIDIA
Natural Language Processing
Thursday February 20, 2020
Mikhail Smelyanskiy, Facebook
AI at Facebook Datacenter Scale
Tuesday February 25, 2020
Boris Ginsburg, NVIDIA
Large Batch Training of Convolutional Networks
Tuesday March 3, 2020
Lecture Notes (Fall 2018)
Related Stanford Courses
CS230
CS231n
STATS 385
Reading list and other resources
Basic information about deep learning
Cheat sheet – things that everyone needs to know
Blogs
Grading