Bioinformatic Foundations
Academic Year 2025/2026 - Teacher: ALFREDO PULVIRENTI
Expected Learning Outcomes
The general learning objectives of the course, expressed as expected learning outcomes, are:
1. Knowledge and Understanding: The course aims to provide foundational knowledge and skills for the analysis, representation, and organization of bioinformatics data.
2. Applying Knowledge and Understanding: The student will acquire knowledge of models and algorithms for bioinformatics data analysis, such as sequence alignment and comparison, analysis of nucleic acid and protein structures, workflow construction, and analysis reproducibility.
3. Making Judgements: Through concrete examples and case studies, the student will be able to independently develop solutions to specific problems related to bioinformatics data analysis. The final part of the course will focus on case studies that allow students to apply the skills acquired.
4. Communication Skills: The student will acquire the necessary communication skills and appropriate use of technical language in the general field of bioinformatics data analysis.
5. Learning Skills: The course aims to provide students with the theoretical and practical methodologies needed to independently address and solve new problems encountered during professional activities. To this end, various topics will be presented by involving students in finding possible solutions to real-world problems using benchmarks from literature and case studies.
Course Structure
If teaching is delivered in mixed mode or remotely, the arrangements described here may change as needed in order to cover the programme planned and outlined in this syllabus.
Required Prerequisites
- Programming
- Data structures
Attendance of Lessons
Attendance is mandatory.
Slides will be made available by the instructor to aid lesson comprehension.
Note: Slides are not a substitute for study. To fully understand the course concepts, students should study the provided materials and the textbook, and complete the exercises.
Detailed Course Content
Introduction to Bioinformatics
· Course objectives, structure, and assessment methods
· Overview of bioinformatics: definition, applications
· Types of non-omics biological data: sequences, structures, interactions
· Introduction to the program and tools to be used in the course
Fundamentals of Probability, Statistics, Inference, Statistical Tests
· Basic concepts of probability
· Discrete and continuous probability distributions
· Random variables and statistical independence
· Bayes’ Theorem with bioinformatics applications
· Descriptive statistics for biological data
· Hypothesis testing, p-value, type I and II errors
· Common statistical tests (t-test, chi-squared, ANOVA)
· Regression and correlation models
· Visualization: histograms, boxplots
· Concept of statistical vs biological significance
· Practical examples using biological data
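As an illustration of the Bayes' Theorem topic listed above, here is a minimal Python sketch of a posterior probability calculation. The scenario (a predictor that flags a variant as pathogenic) and all three numeric inputs are hypothetical values chosen for illustration; they are not taken from the course materials.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(condition | positive result) using Bayes' theorem.

    prior       : P(condition)
    sensitivity : P(positive | condition)
    specificity : P(negative | no condition)
    """
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs: 1% of variants are pathogenic, and the predictor
# has 99% sensitivity and 95% specificity.
p = posterior(prior=0.01, sensitivity=0.99, specificity=0.95)
print(round(p, 3))  # 0.167
```

Even with a sensitive predictor, a low prior keeps the posterior small, which is one way to see the difference between statistical and biological significance.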
Introduction to R for Bioinformatics Analysis
· Data structures: vectors, data frames, lists
· Basic functions
· Use of the tidyverse packages
· Bioinformatics packages in R (Bioconductor)
· Statistical analysis in R
· Graph creation in R: histograms, scatterplots, boxplots, heatmaps
· Brief intro to ggplot2
Introduction to Python and Biopython
· Fundamentals of Biopython
· Sequence manipulation and access to biological databases
· Sequence transcription and translation
· GC-content calculation, reverse complement
· Parsing of annotations and biological features
· First homework assignment using Python or R
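The sequence operations named in this unit (GC-content, reverse complement, transcription) can be sketched in plain Python as follows. This is a minimal sketch using only the standard library, assuming unambiguous DNA over the alphabet A, C, G, T; the function names are illustrative, and Biopython offers its own equivalents that the course may use instead.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Complement every base, then reverse the resulting string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def transcribe(seq):
    """Coding-strand DNA to mRNA: replace T with U."""
    return seq.upper().replace("T", "U")


dna = "ATGC"
print(gc_content(dna))          # 0.5
print(reverse_complement(dna))  # GCAT
print(transcribe(dna))          # AUGC
```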
Representation of Biological Sequences
· File formats for sequences (FASTA, FASTQ, GenBank)
· Properties of nucleotide and protein sequences
· Biological databases
· Importing and manipulating sequences
Sequence Alignment I
· Basic concepts of sequence similarity: similarity, identity, and homology
· Local vs global alignment
· Global alignment algorithms (Needleman-Wunsch)
· Substitution matrices (PAM, BLOSUM)
· Alignment evaluation
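For the global alignment topic above, the following is a minimal sketch of the Needleman-Wunsch dynamic programming recurrence that returns only the optimal score (no traceback). The scoring values (match +1, mismatch -1, linear gap penalty -2) are illustrative assumptions; in practice a substitution matrix such as PAM or BLOSUM would replace the match/mismatch constants.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b (linear gap cost)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and row: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "AGT"))  # 1: three matches, one gap
```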
Sequence Alignment II
· Local alignment algorithms (Smith-Waterman)
· Multiple sequence alignment
· Alignment programs (BLAST, CLUSTAL)
· Practical alignment exercises in Python and R
· MSA interpretation and profile construction
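For the local alignment topic above, here is a minimal sketch of the Smith-Waterman score computation. It differs from global alignment in that every cell is floored at zero and the answer is the maximum over the whole matrix. The scoring values (match +2, mismatch -1, linear gap -2) are illustrative assumptions, and only the score is returned, not the aligned substrings.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b (linear gap cost)."""
    best = 0
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + sub,   # extend diagonally
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# The shared substring "ACG" gives three matches.
print(smith_waterman_score("TTTACGTTT", "GGGACGGGG"))  # 6
```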
Pattern Search in Sequences
· Exact and approximate pattern matching
· Pattern search algorithms (Boyer-Moore, Knuth-Morris-Pratt)
· Hidden Markov Models (HMM) for sequences
· Applications in bioinformatics
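As an illustration of exact pattern matching, here is a minimal sketch of the Knuth-Morris-Pratt algorithm that reports every (possibly overlapping) occurrence of a motif in a sequence. The function names and the example sequence are illustrative, not taken from the course materials.

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return hits


print(kmp_search("GATATATGCATATACTT", "ATAT"))  # [1, 3, 9]
```

The failure table lets the scan resume without re-reading text characters, so the search runs in time linear in the combined length of text and pattern.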
Molecular Phylogeny I
· Basics of molecular evolution
· Construction of phylogenetic trees
· Distance and parsimony methods
· Phylogenetic analysis software
Molecular Phylogeny II
· Maximum likelihood in phylogeny
· Bayesian inference
· Interpretation of phylogenetic results
· Applications in bioinformatics
Textbook Information
Recommended text: "Fondamenti di bioinformatica"
Authors: Manuela Helmer Citterich, Fabrizio Ferrè, Giulio Pavesi, Graziano Pesole, Chiara Romualdi
Publisher: Zanichelli (2018)
Other recommended texts:
· “Bioinformatics”
Authors: Andreas D. Baxevanis, Gary D. Bader, David S. Wishart
Publisher: Wiley (2020)
· “R Bioinformatics Cookbook: Utilize R packages for bioinformatics, genomics, data science, and machine learning”
Author: Dan MacLean
Publisher: Packt Publishing (2023)
· “Mastering Python for Bioinformatics: How to Write Flexible, Documented, Tested Python Code for Research Computing”
Author: Ken Youens-Clark
Publisher: O'Reilly Media (2021)
· “Bioinformatica: Dalla sequenza alla struttura delle proteine”
Authors: Stefano Pascarella, Alessandro Paiardini
Publisher: Zanichelli (2011)
Additional resources will be indicated by the instructor through slides used in class.
Course Planning
| | Subjects | Text References |
|---|---|---|
| 1 | Introduction to Bioinformatics | |
| 2 | Fundamentals of Probability, Statistics, Inference, Statistical Tests | |
| 3 | Introduction to R for Bioinformatics Analysis | |
| 4 | Introduction to Python and Biopython | |
| 5 | Representation of Biological Sequences | |
| 6 | Sequence Alignment I | |
| 7 | Sequence Alignment II | |
| 8 | Pattern Search in Sequences | |
| 9 | Molecular Phylogeny I | |
| 10 | Molecular Phylogeny II | |
Learning Assessment
Learning Assessment Procedures
The final exam consists of a written test and an oral interview in which a project, agreed upon between student and instructor, will be discussed.
· The written test and the oral interview will each be graded out of 30. The final grade is a weighted average:
o Written test: 25% of the final grade
o Oral exam/project: 75% of the final grade
· The written test includes a theory question on course topics, where the student must demonstrate comprehensive understanding.
· Minimum passing score for written test: 18/30. Students must pass the written test to access the oral exam.
· The written exam can be reviewed with the instructor at any time.
· Minimum score to pass the full exam: 18/30.
· The project must be completed within one month of passing the written test. It can be arranged at any time.
· If the student refuses the written test score, the project score is retained for the entire academic year. If the final grade is refused, both the written test and the project must be retaken.
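The weighting rule stated above (written test 25%, oral exam/project 75%, written test minimum 18/30) can be sketched as a short calculation. This is an illustrative sketch only: the function name is hypothetical, and the syllabus does not say how the result is rounded, so no rounding is applied here.

```python
def final_grade(written, project, w_written=0.25, w_project=0.75):
    """Weighted final grade out of 30, or None if the written test
    is below the 18/30 threshold required to access the oral exam."""
    if written < 18:
        return None
    return w_written * written + w_project * project


# Example scores chosen for illustration.
print(final_grade(24, 28))  # 27.0  (0.25 * 24 + 0.75 * 28)
print(final_grade(17, 30))  # None  (written test not passed)
```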
Exam logistics (time and location) will be announced through official university channels.
Notes:
· Use of any hardware (calculators, tablets, smartphones, headphones, etc.) or personal documents during the written exam is prohibited.
· Exam registration via the university student portal is mandatory.
· Late registration via email is not allowed. Without registration, the exam result cannot be recorded.
· Remote exams may be offered if necessary.
Examples of frequently asked questions and/or exercises
Examples of written exam questions will be provided during lessons.