Weekly Schedule

Week 1 1/12-1/16	Lecture Topic	Assignment
Mon	Course Introduction	Welcome. Reading: B. Dally et al., Domain-Specific Hardware Accelerators, Comm. of the ACM, 2020
Weds	Technology Trends Review	Review from Chapter 1 Computer Architecture: A Quantitative Approach, Hennessy and Patterson
Fri	Moore’s Law, Dennard Scaling Review	Reading: Cramming more components onto integrated circuits
Week 2 1/19-1/23	Lecture Topic	Assignment
Mon	Martin Luther King Day
Weds	Roofline Model	Reading: S. Williams et al., Roofline: an insightful visual performance model for multicore architectures (https://dl.acm.org/doi/10.1145/1498765.1498785)
Fri	Performance Measuring Recap
Week 3 1/26-1/30	Lecture Topic	Assignment
Mon	General Purpose Processors and the Virtuous Cycle	Reading: N. Thompson, S. Spanuth, The decline of computers as a general purpose technology; A. Fuchs, D. Wentzlaff, The Accelerator Wall: Limits of Chip Specialization
Weds	General Purpose Processors and the Virtuous Cycle	Reading: N. Thompson, S. Spanuth, The decline of computers as a general purpose technology; A. Fuchs, D. Wentzlaff, The Accelerator Wall: Limits of Chip Specialization
Fri	Machine Learning Boot Camp	Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4; Optional: T. Makimoto, Implications of Makimoto’s Wave
Week 4 2/2-2/6	Lecture Topic	Assignment
Mon	Machine Learning Boot Camp - MLPs	Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4; Optional: T. Makimoto, Implications of Makimoto’s Wave
Weds	Machine Learning Boot Camp - Systolic Arrays	Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4; Optional: T. Makimoto, Implications of Makimoto’s Wave
Fri	Machine Learning Boot Camp - CNNs	Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4; Optional: T. Makimoto, Implications of Makimoto’s Wave
Week 5 2/9-2/13	Lecture Topic	Assignment
Mon	Computational Power and AI	Read Computational Power and AI
Weds	Computational Power and AI	Read Computational Power and AI
Fri	In-Datacenter Performance Analysis of a Tensor Processing Unit	In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al.
Week 6 2/16-2/20	Lecture Topic	Assignment
Mon	In-Datacenter Performance Analysis of a Tensor Processing Unit	In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al.
Weds	In-Datacenter Performance Analysis of a Tensor Processing Unit	In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al.
Fri	Processor In Memory Architectures
Week 7 2/23-2/27	Lecture Topic	Assignment
Mon	Processor In Memory Architectures
Weds	AUP-ZU3	AMD Vivado install; AUP-ZU3 reference manual
Fri	Paper Selections
Week 8 3/2-3/6	Lecture Topic	Assignment
Mon	Presentation Assignments
Weds	How to give a bad presentation
Fri	Emerging NVMs
Week 9 3/9-3/13	Lecture Topic	Assignment
Mon
Weds
Fri
Week 10 3/16-3/20	Lecture Topic	Assignment
Mon	Group #1: Randall, Pretha	Cerebras Architecture Deep Dive: First Look Inside the Hardware/Software Co-Design for Deep Learning, IEEE Micro, Volume 43, Issue 3, May-June 2023
Weds	Group #2: Parker, Braylon	N. Jouppi et al., Ten Lessons From Three Generations Shaped Google’s TPUv4i, Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021
Fri	Group #3: Hunter, Chance	Jack Kendall, Suhas Kumar, The building blocks of a brain-inspired computer, Applied Physics Reviews, Volume 7, Issue 1, March 2020
Week 11 3/23-3/27	Lecture Topic	Assignment
Mon	Spring Break Yah!
Weds	Spring Break Yah!
Fri	Spring Break Yah!
Week 12 3/30-4/3	Lecture Topic	Assignment
Mon	Laboratory discussion	Bring in your boards
Weds	Group #4: Ryan, Joseph, Phillip	F. Devaux, The true Processor in Memory Accelerator, IEEE Hot Chips 31 Symposium (HCS), 2019
Fri	Group #5: Chase, Jacob	Y. Chen and M. S. Abdelfattah, “BRAMAC: Compute-in-BRAM Architectures for Multiply-Accumulate on FPGAs,” 2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Marina Del Rey, CA, USA, 2023, pp. 52-62, doi: 10.1109/FCCM57271.2023.00015
Week 13 4/6-4/10	Lecture Topic	Assignment
Mon	Group #6: Christian, Max	Dennis Abts et al., Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads, Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), 2020
Weds	Group #7: McKyala, Nathan, Christian	K. Asifuzzaman et al., A Survey on processing-in-memory techniques: Advances and challenges
Fri	Final Project Discussion
Week 14 4/13-4/17	Lecture Topic	Assignment
Mon	Individual Group Preparation
Weds	Group #1: Ryan, Pretha, Phillip	J. Gómez-Luna et al., Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-In-Memory Hardware, Proceedings of the 12th International Green and Sustainable Computing Conference (IGSC), 2021
Fri	Group #2: Chase, Jacob	A. Arora et al., CoMeFa: Compute in Memory Blocks for FPGAs, Proceedings of the 30th IEEE International Symposium on Field-Programmable Custom Computing Machines, May 15-18, 2022
Week 15 4/20-4/24	Lecture Topic	Assignment
Mon	Group #3: Randall, Joseph	A. Rico et al., AMD XDNA NPU in Ryzen AI Processors, IEEE Micro, Nov-Dec 2024
Weds	Group #4: MyKala, Nathan, Christian	S. Lee et al., Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology, Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021.
Fri	Group #5: Christian, Max	Dennis Abts et al., A Software-defined Tensor Streaming Multiprocessor for Large-scale Machine Learning, Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021.
Week 16 4/27-5/1	Lecture Topic	Assignment
Mon	Group #6: Hunter, Chance	G. Oliveira et al., Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud, extended and updated version of a paper published in IEEE Micro, pp. 1-14, 29 Aug. 2022.
Weds	Group #7: Parker, Braylon	Tim Finkbeiner et al., In Memory Intelligence, IEEE Micro, Volume 37, Issue 4, 2017
Fri	Reading Day	All Done!
Final	Monday May 4th 10:15am - 12:15pm

S. Lee et al., Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology, Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021
G. Kim et al., SK Hynix AI-Specific Computing Memory Solution: From AiM Device to Heterogeneous AiMX-xPU System for Comprehensive LLM Inference
A. Gajjar et al., Azure-Lily: An FPGA Architecture with Analog IMC Engines for Efficient AI, ACM Transactions on Architecture and Code Optimization, Volume 23, Issue 1, Article No. 29, Pages 1-26
X. Wang et al., Compute Capable Block RAMs for Efficient Deep Learning Acceleration on FPGAs, Proceedings of the 29th IEEE International Symposium on Field-Programmable Custom Computing Machines, 2021
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing, Proceedings of the 47th ACM/IEEE International Symposium on Computer Architecture (ISCA), Valencia, Spain, 2020, pp. 790-803, doi: 10.1109/ISCA45697.2020.00070.
Y. Kwon et al., Memory-Centric Computing with SK hynix’s Domain-Specific Memory, IEEE Hot Chips 35 Symposium (HCS), 2023
J.D. Kendall, S. Kumar, The building blocks of a brain-inspired computer, Appl. Phys. Rev. 1 March 2020; 7 (1): 011305.
Tutorial on Memory-Centric Computing
G. Oliveira et al., Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud, extended and updated version of a paper published in IEEE Micro, pp. 1-14, 29 Aug. 2022.
Jin Hyun Kim, Aquabolt-XL: Samsung HBM2-PIM with in-memory processing for ML accelerators and beyond, IEEE Hot Chips 33, 2021