Computational Genomics
Module Genomics Data Analysis

Academic Year 2025/2026 - Teacher: Marco FICHERA

Expected Learning Outcomes

Knowledge and Understanding: By the end of the course, students will have acquired solid theoretical knowledge of the main bioinformatics methods for genomic data analysis, including high-throughput sequencing (NGS), variant identification, gene expression analysis, and structural bioinformatics. They will understand the basis of the algorithms, their applications, and the methodological and computational limitations in the context of genomics.

Applying Knowledge and Understanding: Students will be able to use specific bioinformatics software for the analysis and interpretation of genomic data. They will be able to implement computational pipelines for genetic variant detection, transcriptomic analysis, and functional interpretation of generated data, applying these skills to real cases and datasets.

Making Judgements: Students will develop the ability to critically evaluate the results obtained from genomic analyses, selecting appropriate methods based on the type of available data and the objectives of the study. They will also be able to identify potential technical and interpretative limitations of the results obtained.

Communication Skills: Students will acquire skills in effectively communicating results derived from genomic analyses, using appropriate scientific-technical language both in written form (reports and scientific articles) and orally (presentations and scientific discussions).

Learning Skills: Students will develop the ability to autonomously update their knowledge in computational genomics, consulting scientific literature, online resources, and databases to deepen their expertise and remain current with methodological and technological developments.

Course Structure

The course will be held in person, combining traditional lectures with practical computer lab activities. The instructor's presentation of the theory will be supported by slides, guided tutorials, and hands-on exercises on real cases. Active student participation will be encouraged through discussion and problem-solving sessions in class. If the course is delivered in a blended or remote format, necessary adjustments may be made to meet the syllabus objectives.

Required Prerequisites

Basic knowledge of molecular biology, genetics, and statistics. Familiarity with the Linux environment is strongly recommended.

Attendance of Lessons

Regular attendance is highly recommended for a thorough understanding of the topics and methodologies presented.

Detailed Course Content

Course Content

After a brief review of fundamental medical genetics concepts to ensure a common theoretical foundation, the course will integrate theory and practice in computational genomics. Topics will include algorithmic approaches for high-throughput sequencing (WES, WGS, RNA-seq) using short and long reads, including sequence quality control and mapping to the reference genome.

Particular attention will be given to genetic variant identification and annotation, including Bayesian variant calling to estimate error probabilities and the reliability of calls, as well as in silico prediction of functional impact using predictive tools and genomic databases. Students will learn advanced techniques for differential gene expression analysis via RNA-seq and the associated bioinformatics pipelines.
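As an illustration of the idea behind Bayesian variant calling mentioned above, the following is a minimal sketch, not the course's reference implementation. It assumes a diploid, biallelic site, independent reads, and base-calling errors spread evenly over the three other bases. The function name, the genotype labels (RR, RA, AA), and the default priors are hypothetical choices made for this example only.

```python
import math

def genotype_posteriors(bases, quals, ref, alt, priors=None):
    """Posterior probability of genotypes RR, RA, AA at one site.

    bases: observed base calls covering the site (e.g. ["A", "G", ...])
    quals: Phred quality score of each base call
    ref, alt: reference and alternate alleles
    priors: prior probability per genotype (illustrative defaults below)
    """
    if priors is None:
        # Hypothetical priors: most sites are homozygous reference.
        priors = {"RR": 0.998, "RA": 0.001, "AA": 0.001}
    # Expected fraction of alt-carrying reads under each genotype.
    alt_fraction = {"RR": 0.0, "RA": 0.5, "AA": 1.0}

    log_post = {}
    for genotype, f in alt_fraction.items():
        log_p = math.log(priors[genotype])
        for base, q in zip(bases, quals):
            # Phred score -> error probability; capped at 0.75, the value
            # at which all four bases are equally likely (avoids log(0)).
            e = min(10 ** (-q / 10), 0.75)
            if base == ref:
                p = (1 - f) * (1 - e) + f * (e / 3)
            elif base == alt:
                p = f * (1 - e) + (1 - f) * (e / 3)
            else:
                # A third base can only arise from an error.
                p = e / 3
            log_p += math.log(p)
        log_post[genotype] = log_p

    # Normalise in log space for numerical stability.
    top = max(log_post.values())
    unnorm = {g: math.exp(v - top) for g, v in log_post.items()}
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}

# Usage: five reference and five alternate reads, all at Q30.
post = genotype_posteriors(list("AAAAAGGGGG"), [30] * 10, ref="A", alt="G")
best = max(post, key=post.get)
```

With a balanced mix of reference and alternate reads at high quality, the heterozygous genotype receives nearly all of the posterior mass. Production variant callers add further terms, such as mapping quality and population-level priors, that this sketch omits.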

Bioinformatics techniques for epigenomic data analysis will also be explored, with a focus on DNA methylation and its correlation with gene expression regulation.

Textbook Information

Reference Texts

1. Bioinformatics: A Practical Guide to Next Generation Sequencing Data Analysis (Chapman & Hall/CRC Computational Biology Series)

2. Computational Exome and Genome Analysis (Chapman & Hall/CRC Computational Biology Series)

3. Clinical Genomics, 2nd edition (Academic Press)

4. Lecture notes provided by the instructor

Course Planning

No.  Subjects                                                  Text References
1    Introduction to Medical Genetics
2    Introduction to computational genomics
3    Genetic and genomic variants
4    Genomic Databases
5    Principles of NGS and related Technologies
6    Sequence assembly and mapping
7    Variant identification and annotation
8    Functional Interpretation of variants
9    Gene expression analysis
10   Methods for structural variant detection
11   Differential and functional transcriptomic data analysis
12   Computational epigenomics

Learning Assessment

Learning Assessment Procedures

The assessment will consist of an oral exam with theoretical and practical questions on analytical pipelines and the interpretation of computational results. The evaluation aims to assess the student’s reasoning and critical analysis skills as well as the appropriateness of the technical language used.

The exam may be conducted remotely if necessary.

Grading Criteria:

  • Fail: Student has not acquired basic concepts and cannot perform exercises.

  • 18-23: Student demonstrates minimal mastery of basic concepts; presentation and linking of contents are modest; able to solve simple exercises.

  • 24-27: Student demonstrates good mastery of course content; presentation and linking of contents are good; exercises are solved with few errors.

  • 28-30 cum laude: Student has mastered all course content; presents it fully with critical insight; solves exercises completely and without errors.

Students with disabilities and/or learning disorders should contact the instructor and the CInAP representative (Prof. Daniele) sufficiently in advance to communicate their intention to take the exam using appropriate compensatory measures.

Examples of frequently asked questions and/or exercises

  1. Describe the main steps for Bayesian variant calling from high-throughput sequencing data.

  2. How would you assess the reliability of a variant identified through RNA-seq? What are the main factors affecting result quality?

  3. What are the main methodological differences between short-read and long-read sequencing, and in which application scenarios is each advantageous?

  4. What is the Phred quality score and how does it influence the reliability of called bases in sequencing data?

Note: These questions are indicative; actual exam questions may differ significantly.
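
As background for question 4, the Phred quality score Q relates to the estimated base-calling error probability P by Q = -10 log10(P), so Q20 corresponds to 1 error in 100 and Q30 to 1 in 1000. The sketch below is illustrative only. The function names are hypothetical, and the ASCII offset of 33 assumes the Sanger / Illumina 1.8+ FASTQ encoding.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred score to the estimated base-calling error probability."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a Phred score."""
    return -10 * math.log10(p)

def decode_fastq_quality(quality_string, offset=33):
    """Decode a FASTQ quality string into a list of Phred scores.

    offset=33 is an assumption (Phred+33 encoding); older Illumina
    files used an offset of 64.
    """
    return [ord(ch) - offset for ch in quality_string]

# Usage: decode a short quality string and list per-base error probabilities.
scores = decode_fastq_quality("I5!")          # [40, 20, 0]
error_probs = [phred_to_error_prob(q) for q in scores]
```

A base with Q0 carries no information (error probability 1), which is why low-quality bases are commonly trimmed or down-weighted before variant calling.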
