
Weekly Schedule

| Week 1 1/12-1/16 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Course Introduction | Welcome; B. Dally et al., Domain-Specific Hardware Accelerators, Comm. of the ACM, 2020 |
| Weds | Technology Trends Review | Review from Chapter 1, Computer Architecture: A Quantitative Approach, Hennessy and Patterson |
| Fri | Moore’s Law, Dennard Scaling Review | Reading: Cramming more components onto integrated circuits |

| Week 2 1/19-1/23 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Martin Luther King Day | |
| Weds | Roofline Model | [Roofline: an insightful visual performance model for multicore architectures](https://dl.acm.org/doi/10.1145/1498765.1498785), S. Williams et al. |
| Fri | Performance Measuring Recap | |

| Week 3 1/26-1/30 | Lecture Topic | Assignment |
|---|---|---|
| Mon | General Purpose Processors and the Virtuous Cycle | Reading: N. Thompson, S. Spanuth, The decline of computers as a general purpose technology; A. Fuchs, D. Wentzlaff, The Accelerator Wall: Limits of Chip Specialization |
| Weds | General Purpose Processors and the Virtuous Cycle | Reading: N. Thompson, S. Spanuth, The decline of computers as a general purpose technology; A. Fuchs, D. Wentzlaff, The Accelerator Wall: Limits of Chip Specialization |
| Fri | Machine Learning Boot Camp | Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4; Optional: Implications of Makimoto’s Wave, T. Makimoto |

| Week 4 2/2-2/6 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Machine Learning Boot Camp - MLPs | Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4; Optional: Implications of Makimoto’s Wave, T. Makimoto |
| Weds | Machine Learning Boot Camp - Systolic Arrays | Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4; Optional: Implications of Makimoto’s Wave, T. Makimoto |
| Fri | Machine Learning Boot Camp - CNNs | Reading: H&P Comp Arch: a Quant. Approach Ch 7.3-4; Optional: Implications of Makimoto’s Wave, T. Makimoto |

| Week 5 2/9-2/13 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Computational Power and AI | Read Computational Power and AI |
| Weds | Computational Power and AI | Read Computational Power and AI |
| Fri | In-Datacenter Performance Analysis of a Tensor Processing Unit | In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al. |

| Week 6 2/16-2/20 | Lecture Topic | Assignment |
|---|---|---|
| Mon | In-Datacenter Performance Analysis of a Tensor Processing Unit | In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al. |
| Weds | In-Datacenter Performance Analysis of a Tensor Processing Unit | In-Datacenter Performance Analysis of a Tensor Processing Unit, N. Jouppi et al. |
| Fri | Processor In Memory Architectures | |

| Week 7 2/23-2/27 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Processor In Memory Architectures | |
| Weds | AUP-ZU3 | AMD Vivado Install; AUP-ZU3 ref manual |
| Fri | Paper Selections | |

| Week 8 3/2-3/6 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Presentation Assignments | |
| Weds | How to give a bad presentation | |
| Fri | Emerging NVMs | |

| Week 9 3/9-3/13 | Lecture Topic | Assignment |
|---|---|---|
| Mon | | |
| Weds | | |
| Fri | | |

| Week 10 3/16-3/20 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Group #1: Randall, Pretha | Cerebras Architecture Deep Dive: First Look Inside the Hardware/Software Co-Design for Deep Learning, IEEE Micro, Volume: 43, Issue: 3, May-June 2023 |
| Weds | Group #2: Parker, Braylon | N. Jouppi et al., Ten Lessons From Three Generations Shaped Google’s TPUv4i, Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021 |
| Fri | Group #3: Hunter, Chance | Jack Kendall, Suhas Kumar, The building blocks of a brain-inspired computer, Applied Physics Reviews, Volume 7, Issue 1, March 2020 |

| Week 11 3/23-3/27 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Spring Break Yah! | |
| Weds | Spring Break Yah! | |
| Fri | Spring Break Yah! | |

| Week 12 3/30-4/3 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Laboratory discussion | Bring in your boards |
| Weds | Group #4: Ryan, Joseph, Phillip | The true Processor in Memory Accelerator, F. Devaux, IEEE Hot Chips 31 Symposium (HCS) 2019 |
| Fri | Group #5: Chase, Jacob | Y. Chen and M. S. Abdelfattah, “BRAMAC: Compute-in-BRAM Architectures for Multiply-Accumulate on FPGAs,” 2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Marina Del Rey, CA, USA, 2023, pp. 52-62, doi: 10.1109/FCCM57271.2023.00015 |

| Week 13 4/6-4/10 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Group #6: Christian, Max | Dennis Abts et al., Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads, Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), 2020 |
| Weds | Group #7: McKyala, Nathan, Christian | K. Asifuzzaman et al., A Survey on processing-in-memory techniques: Advances and challenges |
| Fri | Final Project Discussion | |

| Week 14 4/13-4/17 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Individual Group Preparation | |
| Weds | Group #1: Ryan, Pretha, Phillip | J. Gomez et al., Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-In-Memory Hardware, Proceedings of the 12th International Green and Sustainable Computing Conference (IGSC), 2021 |
| Fri | Group #2: Chase, Jacob | A. Arora et al., CoMeFa: Compute in Memory Blocks for FPGAs, Proceedings of the 30th IEEE International Symposium on Field-Programmable Custom Computing Machines, May 15-18, 2022 |

| Week 15 4/20-4/24 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Group #3: Randall, Joseph | A. Rico et al., AMD XDNA NPU in Ryzen AI Processors, IEEE Micro, Nov-Dec 2024 |
| Weds | Group #4: MyKala, Nathan, Christian | S. Lee et al., Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology, Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021 |
| Fri | Group #5: Christian, Max | Dennis Abts et al., A Software-defined Tensor Streaming Multiprocessor for Large-scale Machine Learning, Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021 |

| Week 16 4/27-5/1 | Lecture Topic | Assignment |
|---|---|---|
| Mon | Group #6: Hunter, Chance | Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud, G. Oliveira et al., extended and updated version of a paper published in IEEE Micro, pp. 1-14, 29 Aug. 2022 |
| Weds | Group #7: Parker, Braylon | In Memory Intelligence, Tim Finkbeiner et al., IEEE Micro, Volume: 37, Issue: 4, 2017 |
| Fri | Reading Day | All Done! |
| Final | Monday May 4th 10:15am - 12:15pm | |

S. Lee et al., Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology, Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), 2021
G. Kim et al., SK Hynix AI-Specific Computing Memory Solution: From AiM Device to Heterogeneous AiMX-xPU System for Comprehensive LLM Inference
A. Gajjar et al., Azure-Lily: An FPGA Architecture with Analog IMC Engines for Efficient AI, ACM Transactions on Architecture and Code Optimization, Volume 23, Issue 1, Article No.: 29, Pages 1 - 26
X. Wang et al., Compute Capable Block RAMs for Efficient Deep Learning Acceleration on FPGAs, Proceedings of the 29th IEEE International Symposium on Field-Programmable Custom Computing Machines, 2021
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing Proceedings of the 47th ACM/IEEE International Symposium on Computer Architecture (ISCA) Valencia, Spain, 2020, pp. 790-803, doi: 10.1109/ISCA45697.2020.00070.
Y. Kwon et al., Memory-Centric Computing with SK hynix’s Domain-Specific Memory, IEEE Hot Chips 35 Symposium (HCS) 2023
J.D. Kendall, S. Kumar The building blocks of a brain-inspired computer Appl. Phys. Rev. 1 March 2020; 7 (1): 011305.
Tutorial on Memory-Centric Computing
G. Oliveira et al., Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud, extended and updated version of a paper published in IEEE Micro, pp. 1-14, 29 Aug. 2022.
Jin Hyun Kim, Aquabolt-XL: Samsung HBM2-PIM with in-memory processing for ML accelerators and beyond, IEEE Hot Chips 33, 2021