Weekly Schedule
| Week 1 1/12-1/16 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Course Introduction | Welcome; Reading: B. Dally et al., Domain-Specific Hardware Accelerators, Communications of the ACM, 2020 |
| Weds | Technology Trends Review | Review from Chapter 1 Computer Architecture: A Quantitative Approach, Hennessy and Patterson |
| Fri | Moore’s Law, Dennard Scaling Review | Reading: Cramming more components onto integrated circuits |
| Week 2 1/19-1/23 | Lecture Topic | Assignment |
| Mon | Martin Luther King Day | |
| Weds | Roofline Model | [Roofline: an insightful visual performance model for multicore architectures](https://dl.acm.org/doi/10.1145/1498765.1498785) S. Williams et al. |
| Fri | Performance Measuring Recap | |
| Week 3 1/26-1/30 | Lecture Topic | Assignment |
| Mon | General Purpose Processors and the Virtuous Cycle | Reading: N. Thompson, S. Spanuth, The decline of computers as a general purpose technology; A. Fuchs, D. Wentzlaff, The Accelerator Wall: Limits of Chip Specialization |
| Weds | General Purpose Processors and the Virtuous Cycle | Reading: N. Thompson, S. Spanuth, The decline of computers as a general purpose technology; A. Fuchs, D. Wentzlaff, The Accelerator Wall: Limits of Chip Specialization |
| Fri | Machine Learning Boot Camp | Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4 Optional: Implications of Makimoto’s Wave T. Makimoto |
| Week 4 2/2-2/6 | Lecture Topic | Assignment |
| Mon | Machine Learning Boot Camp MLPs | Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4 Optional: Implications of Makimoto’s Wave T. Makimoto |
| Weds | Machine Learning Boot Camp - Systolic Arrays | Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4 Optional: Implications of Makimoto’s Wave T. Makimoto |
| Fri | Machine Learning Boot Camp CNNs | Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4 Optional: Implications of Makimoto’s Wave T. Makimoto |
| Week 5 2/9-2/13 | Lecture Topic | Assignment |
| Mon | Computational Power and AI | Read Computational Power and AI |
| Weds | Computational Power and AI | Read Computational Power and AI |
| Fri | In-Datacenter Performance Analysis of a Tensor Processing Unit | In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al. |
| Week 6 2/16-2/20 | Lecture Topic | Assignment |
| Mon | In-Datacenter Performance Analysis of a Tensor Processing Unit | In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al. |
| Weds | In-Datacenter Performance Analysis of a Tensor Processing Unit | In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al. |
| Fri | Processor In Memory Architectures | |
| Week 7 2/23-2/27 | Lecture Topic | Assignment |
| Mon | Processor In Memory Architectures | |
| Weds | AUP-ZU3 | AMD Vivado Install; AUP-ZU3 reference manual |
| Fri | Paper Selections | |
| Week 8 3/2-3/6 | Lecture Topic | Assignment |
| Mon | ||
| Weds | ||
| Fri | ||
| Week 9 3/9-3/13 | Lecture Topic | Assignment |
| Mon | ||
| Weds | ||
| Fri | ||
| Week 10 3/16-3/20 | Lecture Topic | Assignment |
| Mon | ||
| Weds | ||
| Fri | ||
| Week 11 3/23-3/27 | Lecture Topic | Assignment |
| Mon | Spring Break Yay! | |
| Weds | Spring Break Yay! | |
| Fri | Spring Break Yay! | |
| Week 12 3/30-4/3 | Lecture Topic | Assignment |
| Mon | ||
| Weds | ||
| Fri | ||
| Week 13 4/6-4/10 | Lecture Topic | Assignment |
| Mon | ||
| Weds | ||
| Fri | ||
| Week 14 4/13-4/17 | Lecture Topic | Assignment |
| Mon | ||
| Weds | ||
| Fri | ||
| Week 15 4/20-4/24 | Lecture Topic | Assignment |
| Mon | ||
| Weds | ||
| Fri | ||
| Week 16 4/27-5/1 | Lecture Topic | Assignment |
| Mon | ||
| Weds | ||
| Fri | Reading Day | All Done! |
| Final | 10:15pm - 12:15pm | |
Cerebras Architecture Deep Dive: First Look Inside the Hardware/Software Co-Design for Deep Learning IEEE Micro Volume: 43, Issue: 3, May-June 2023
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing Proceedings of the 47th ACM/IEEE International Symposium on Computer Architecture (ISCA) Valencia, Spain, 2020, pp. 790-803, doi: 10.1109/ISCA45697.2020.00070.
Emerging NVMs
UPMEM | Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud, G. Oliveira et al., extended and updated version of a paper published in IEEE Micro, pp. 1-14, 29 Aug. 2022.
Memory-Centric Computing with SK hynix’s Domain-Specific Memory, Y. Kwon et al., IEEE Hot Chips 35 Symposium (HCS), 2023
Y. Chen and M. S. Abdelfattah, “BRAMAC: Compute-in-BRAM Architectures for Multiply-Accumulate on FPGAs,” 2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Marina Del Rey, CA, USA, 2023, pp. 52-62, doi: 10.1109/FCCM57271.2023.00015.
The true Processor in Memory Accelerator, F. Devaux, IEEE Hot Chips 31 Symposium (HCS), 2019
J.D. Kendall, S. Kumar The building blocks of a brain-inspired computer Appl. Phys. Rev. 1 March 2020; 7 (1): 011305.
Tutorial on Memory-Centric Computing
K. Asifuzzaman et al. A Survey on processing-in-memory techniques: Advances and challenges
In Memory Intelligence Tim Finkbeiner et al., IEEE Micro Volume: 37, Issue: 4, 2017
N. Jouppi et al. Ten Lessons From Three Generations Shaped Google’s TPUv4i Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021.
Dennis Abts et al. Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), 2020
Dennis Abts et al. A Software-defined Tensor Streaming Multiprocessor for Large-scale Machine Learning Proceedings of the 49th Annual International Symposium on Computer Architecture (ISCA), 2022.
Jin Hyun Kim Aquabolt-XL: Samsung HBM2-PIM with in-memory processing for ML accelerators and beyond IEEE Hot Chips 33, 2021
S. Lee et al. Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021.