Principles of Parallel Programming
Academic Year 2025/2026 - Teacher: EVA SCIACCA
Expected Learning Outcomes
Knowledge and Understanding:
Students will acquire adequate communication and language skills to describe clearly and coherently the issues involved in the design and analysis of parallel algorithms. They will be able to present the main technical and conceptual solutions of parallel programming, even to non-specialists, using a communicative register appropriate to the context.
Course Structure
Lessons will take place in person, in a traditional lecture format. The instructor will present the theoretical content with the support of slides and the blackboard. Active student participation will be encouraged through questions and moments of classroom discussion.
The course will also include laboratory activities. Lectures will introduce theoretical concepts, while the lab will enable their practical application through exercises and case studies.
Should the course be delivered in blended or online mode, the necessary adjustments may be introduced with respect to the above description, in order to comply with the program outlined in the syllabus.
Required Prerequisites
Attendance of Lessons
Detailed Course Content
The course Principles of Parallel Programming provides a structured introduction to the main methodologies for the design and implementation of parallel algorithms, with an approach oriented toward computer science and programming. It is intended for students who already have a basic knowledge of programming and of computer architecture principles. The course explores the fundamental concepts of High-Performance Computing (HPC), illustrating strategies to optimize code execution on modern architectures.
Particular attention is devoted to the analysis of parallelism models, the identification of computational bottlenecks, and the application of programming techniques for both shared and distributed memory. Students learn to develop parallel programs using the OpenMP and MPI standards and to evaluate their performance on real systems. The approach balances theoretical aspects with practical experimentation, keeping formal references accessible while emphasizing algorithmic intuition.
Textbook Information
[1] Hager, Georg, and Gerhard Wellein. Introduction to high performance computing for scientists and engineers. CRC Press, 2010.
[2] OpenMP: Multi-platform Shared-Memory Parallel Programming in C/C++ and Fortran - https://www.openmp.org/
[3] Open MPI implementation of the Message Passing Interface - https://www.open-mpi.org/
Course Planning
# | Subjects | Text References |
---|---|---|
1 | Introduction to High Performance Computing (HPC) | Lecturer's notes |
2 | Modern Architectures for Parallel Computing: Processors and HPC Systems | Chap. 1 + 4 from [1] |
3 | Software Stack for HPC Systems | Chap. 4 from [1] / Lecturer's notes |
4 | Parallelism Models and Scalability | Chap. 5 from [1] |
5 | Code Optimization: Introduction and Basic Concepts | Chap. 2 + 3 from [1] |
6 | Guidelines for Code Optimization and Memory Management | Chap. 3 from [1] |
7 | Techniques for Optimized Compilation and Profiling | Chap. 2 from [1] |
8 | Parallel Programming on Shared Memory: Fundamentals | Chap. 6 from [1] |
9 | OpenMP: Directives for Structured Parallelism | Chap. 6 from [1] and documents from [2] |
10 | OpenMP: Synchronization, Sections, and Workload Balancing | Chap. 6 from [1] and documents from [2] |
11 | OpenMP: Shared and Private Data, Scheduling | Chap. 6 from [1] and documents from [2] |
12 | Parallel Programming on Distributed Memory: Fundamentals | Chap. 9 from [1] |
13 | Introduction to MPI and Its Communication Model | Chap. 9 from [1] |
14 | Point-to-Point Communication in MPI | Chap. 9 from [1] and documents from [3] |
15 | Collective Communication and Process Synchronization in MPI | Chap. 9 from [1] and documents from [3] |
16 | Concepts of Topology and Efficient Communication in MPI | Chap. 9 from [1] and documents from [3] |
17 | Principles of Hybrid Programming with MPI and OpenMP | Chap. 11 from [1] |
18 | MPI/OpenMP: Context and Hierarchical Management | Lecturer's notes |
Learning Assessment
Learning Assessment Procedures
The course exam is divided into two parts: an initial written test and a subsequent project evaluation.
These tests may take place remotely, should conditions require it. The project evaluation may be held on the same day as the written test or within a few days thereafter.
The purpose of the exam is to thoroughly assess the student’s preparation, analytical and reasoning skills regarding the topics covered during the course, as well as the appropriateness of the technical language used.
The project evaluation complements the written test and contributes inseparably to the determination of the final grade. It is not an opportunity to increase the score, but a necessary component of the overall evaluation of the student's preparation.
The following criteria will generally be used for awarding the final grade:
Fail: the student has not acquired the basic concepts and is unable to carry out the exercises.
18–23: the student demonstrates a minimal grasp of the basic concepts; their ability to explain and connect contents is modest, but they can solve simple exercises.
24–27: the student demonstrates good command of the course content; their ability to explain and connect contents is good, and they solve exercises with few errors.
28–30 with honors: the student has acquired all the course content and is able to present it comprehensively and connect it with critical insight; they solve the exercises completely and without errors.
Students with disabilities and/or specific learning disorders (SLD) must contact, well in advance of the exam date, the lecturer, the CInAP contact person for the DMI (Prof. Daniele), and CInAP to notify their intention to take the exam with the appropriate compensatory measures.
To participate in the final exam, students must register on the SmartEdu portal. For any technical problems related to registration, they should contact the Teaching Office.
Examples of frequently asked questions and/or exercises
What is the recommended strategy to implement a global sum across all ranks in MPI while minimizing synchronization overhead, and when would you prefer a nonblocking variant?
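A minimal sketch of one possible answer, assuming the standard MPI-3 interface: the collective MPI_Allreduce computes the global sum in a single call, while its nonblocking counterpart MPI_Iallreduce is preferable when independent work can be overlapped with the reduction.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = (double)(rank + 1);  /* each rank's contribution */
    double sum = 0.0;

    /* Blocking collective: every rank receives the complete sum. */
    MPI_Allreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    /* Nonblocking variant (MPI-3): start the reduction, do independent
       work, then wait. Useful to hide the synchronization cost. */
    double sum_nb = 0.0;
    MPI_Request req;
    MPI_Iallreduce(&local, &sum_nb, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);
    /* ... computation not depending on sum_nb could run here ... */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    if (rank == 0)
        printf("sum = %.1f, nonblocking sum = %.1f\n", sum, sum_nb);
    MPI_Finalize();
    return 0;
}
```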
Explain the difference between point-to-point blocking and nonblocking communication in MPI, and describe a common pattern to avoid deadlocks when exchanging halos in a stencil computation.
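One common deadlock-free pattern is MPI_Sendrecv, which pairs each send with the matching receive so that no ordering of posted sends can block. The sketch below assumes a 1-D domain with one ghost cell per side; the function name exchange_halos and the array layout are illustrative.

```c
#include <mpi.h>

/* 1-D halo exchange: u[0] and u[n-1] are ghost cells, u[1] and u[n-2]
   are the owned boundary values. Boundary ranks talk to MPI_PROC_NULL,
   which turns the corresponding send/receive into a no-op. */
void exchange_halos(double *u, int n, MPI_Comm comm) {
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Send right boundary to the right neighbour, receive left ghost. */
    MPI_Sendrecv(&u[n - 2], 1, MPI_DOUBLE, right, 0,
                 &u[0],     1, MPI_DOUBLE, left,  0,
                 comm, MPI_STATUS_IGNORE);
    /* Send left boundary to the left neighbour, receive right ghost. */
    MPI_Sendrecv(&u[1],     1, MPI_DOUBLE, left,  1,
                 &u[n - 1], 1, MPI_DOUBLE, right, 1,
                 comm, MPI_STATUS_IGNORE);
}
```

An equivalent nonblocking formulation posts MPI_Irecv/MPI_Isend for both neighbours and completes them with MPI_Waitall, which additionally allows communication to overlap with computation on the interior points.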
What is the recommended OpenMP way to parallelize a sum over an array while avoiding races and minimizing contention?
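A minimal sketch of the idiomatic answer: the reduction clause gives every thread a private partial sum and combines the partials once at the end, so there is no data race and no per-iteration contention on a shared variable.

```c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const long n = 1000000;
    double *a = malloc(n * sizeof *a);
    for (long i = 0; i < n; i++) a[i] = 1.0;  /* sample data */

    double sum = 0.0;
    /* Each thread accumulates into a private copy of sum; the private
       copies are combined once when the parallel loop ends. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += a[i];

    printf("sum = %.1f\n", sum);  /* expected: 1000000.0 */
    free(a);
    return 0;
}
```

By contrast, guarding the update with #pragma omp atomic would be race-free but serializes every addition, and an unprotected shared sum is a data race.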
In shared-memory systems, what is the difference between UMA and NUMA?
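In a UMA (Uniform Memory Access) system every core sees the same latency and bandwidth to all of main memory, whereas in a NUMA (Non-Uniform Memory Access) system memory is physically attached to individual sockets, so accesses to the local node are faster than to remote ones. One practical consequence, sketched below under the common first-touch page placement policy (the function name numa_friendly is illustrative), is that data should be initialized by the same threads that later use it.

```c
/* First-touch sketch: most operating systems place a page on the NUMA
   node of the thread that first writes it. Initializing with the same
   static schedule as the compute loop keeps later accesses node-local. */
void numa_friendly(double *a, long n, double s) {
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; i++)
        a[i] = 0.0;           /* first touch places the page locally */

    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; i++)
        a[i] = s * (double)i; /* same schedule: mostly local accesses */
}
```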