CS194: Engineering Parallel Software

Kurt Keutzer and Tim Mattson

Fall 2013

From cell phones to cloud computing, parallel processors are the computing platform of the future. This course will enable students to design, implement, optimize, and verify programs that run on parallel processors. Our approach reflects our view that a well-designed software architecture is key to designing parallel software, and that design patterns and a pattern language are key to software architecture. The course will use this pattern language as the basis for describing how to design, implement, verify, and optimize parallel programs. Following this approach, we will introduce each of the major patterns used in developing the high-level architecture of a program. These eight structural and thirteen computational patterns may be found at: http://parlab.eecs.berkeley.edu/wiki/patterns/patterns.

We also recognize that writing efficient parallel programs requires insight into the hardware architecture of contemporary parallel processors, as well as an understanding of how to write efficient code in general. As a result, a significant amount of time in the course will be spent on these topics as well.

Other lectures and laboratories of the course will focus on implementation using contemporary parallel programming languages, verification of parallel software using invariants and testing, and performance tuning and optimization.

Course Work and Grading

The course consists of twice-weekly lectures and once-weekly lab sessions. For the first two thirds of the course, there will be a series of programming assignments.  There will be two take-home examinations during the first two thirds of the course.

Course Projects

The final third of the course will be an open-ended course project. Quad-core cell phones will be among the acceptable platforms. Students will create their own projects in teams of 4-6.

Course Staff

Professor: Kurt Keutzer

Guest Lecturer: Tim Mattson, Intel

TAs: Patrick Li

Recommended Course Textbook

Patterns for Parallel Programming, T. Mattson, B. Sanders, B. Massingill, Addison Wesley, 2005. (This text is being revised with this course in mind.)

Week     Date         What            Topic
Week 1   Tues 8/27    No class
         Wed 8/28     No class
         Thurs 8/29   Lecture 1       First Lecture: Intro, Background, Course Objectives and Course Projects Video Games –Keutzer
         Fri 8/30     No class

Week 2   Tues 9/3     Lecture 2       A programmer’s introduction to parallel computing: Amdahl’s law, Concurrency vs. Parallelism, and the jargon of parallel computing. Getting started with OpenMP and Pthreads. –Mattson
         Wed 9/4      Discussion 1    Intro to the Lab Environment. Assign Intro_1: Matrix multiplication with OpenMP and pthreads
         Thurs 9/5    Lecture 3       Parallel programming on shared memory computers: complete the introduction to OpenMP and pthreads. Along the way, address granularity, parallel overhead, load balancing, and weak vs. strong scaling. –Mattson
         Fri 9/6      No class

Week 3   Tues 9/10    Lecture 4       Shared Memory Concurrency Issues: Races, Livelock, Deadlock, Dining Philosophers –Mattson
         Wed 9/11     Discussion 2    C++ for Java/C Programmers; Working with OpenMP and Pthreads.
         Thurs 9/12   Lecture 5       Software Architecture Overview: Overview of Computational and Structural Patterns –Keutzer
         Fri 9/13     Intro_1 due

Week 4   Tues 9/17    Lecture 6       Sequential Processor Performance: Notions of performance: Insufficiency of Big-O, Matrix-Multiply Example; Pipelining, Superscalar, etc.; Compiler Optimizations; Processor “Speed of Light” –Keutzer/Allen
         Wed 9/18     Discussion 3    pthreads examples. Assign Intro_2
         Thurs 9/19   Lecture 7       Memory System Performance: Caches, Cache Hierarchies, benchmarking; Optimizing Matrix Multiplication –Keutzer
         Fri 9/20

Week 5   Tues 9/24    Lecture 8       Optimizing Matrix Multiply, Introduction to Parallel Processor Architectures: Multi-Core, Cache Coherence, Memory Consistency; SIMD / SIMT; Vectors; NUMA –Mattson
         Wed 9/25     Discussion 4    Interactive session exploring performance issues in pthreads and OpenMP. They’ve been thinking about cache coherence … let’s use the discussion section to bring up a discussion of vectorization.
         Thurs 9/26   Lecture 9       Data Parallelism –Mattson
         Fri 9/27     Intro_2 due

Week 6   Tues 10/1    Lecture 10      Introduction to CUDA –Li
         Wed 10/2     Discussion 5    Assign MP1: Discuss the CUDA and OpenCL environments in the lab. Help students, where appropriate, install them on their own laptops. Project teams finalized.
         Thurs 10/3   Lecture 11      CUDA and OpenCL continued –Li

Week 7   Tues 10/8    Lecture 12      The Roofline Model –Keutzer
         Wed 10/9     Discussion 6    CUDA and data parallel programming.
         Thurs 10/10  Lecture 13      Distributed Memory Systems, Supercomputing, and MPI –Mattson
         Sun          MP1 due

Week 8   Tues 10/15   Midterm review  Announce final project details and review for midterm
         Wed 10/16    Discussion 7    Midterm review/Project proposals due
         Thurs 10/17  MIDTERM

Week 9   Tues 10/22   Lecture 14      Design patterns, pattern languages, PLPP overview –not Kurt Keutzer
         Wed 10/23    Discussion 8    Discuss some exemplar projects from the past.
         Thurs 10/24  Lecture 15      PLPP algorithm structure and supporting structures –Keutzer

Week 10  Tues 10/29   Lecture 16      Structural patterns and software architecture –Keutzer
         Wed 10/30    Discussion 9    Project meetings: show up with evidence of work! Q and A session on GPGPU programming
         Thurs 10/31  Lecture 17      Graph algorithms, dynamic programming, and speech recognition –Keutzer
         Fri 11/1     MP2 due

Week 11  Tues 11/5    Lecture 18      Speech, part 2 –Keutzer, Li
         Wed 11/6     Discussion 10   Project meetings: show up with evidence of work!
         Thurs 11/7   Lecture 19      Sparse linear algebra and image contour detection –Li

Week 12  Tues 11/12   Lecture 20      Principal component analysis and 3D reconstruction –Li
         Wed 11/13    Discussion 11   Project meetings: show up with evidence of work!
         Thurs 11/14  Lecture 21      Object recognition –Li

Week 13  Tues 11/19   Lecture 22      Optimization patterns –Li
         Wed 11/20    Discussion 12   Project meetings: show up with evidence of work!
         Thurs 11/21  Lecture 23      Future of parallel computing –Keutzer

Week 14  Tues 11/26   Lecture 24      Use class time to talk about projects

Week 15  Tues 12/3    Lecture 25      Project presentations
         Wed 12/4     Discussion 13   Final exam review
         Thurs 12/5   Lecture 26      Project presentations

Final Exam:
Final project write-up due date: December 6th, 2013
Grades:
35% Assignments (MPs)
30% Midterm and final exam
25% Final project
10% Attendance and class participation