KFSCIS Associate Professor Fahad Saeed has received the prestigious National Institutes of Health (NIH) Maximizing Investigators’ Research Award (MIRA). The US$1.75 million award is a single-PI grant that will fund Dr. Saeed and his lab for the next five years.
The project, entitled “Machine Learning Models for Big Data Omics,” proposes to design and develop machine-learning models that can search for “overlooked” proteins hidden in mass spectrometry data. The current algorithmic infrastructure for deducing peptides from mass spectrometry data can confidently identify only 25% of the (abundant protein) data; the rest is wasted because of the inefficiency of the computational techniques used, states Prof. Saeed in a recent proposal. Non-abundant peptides are presumed to be as important as their abundant counterparts and are often associated with human ailments, including rare diseases, neurological disorders, and cancers.
“The overall objective of Saeed Lab (https://saeedlab.cs.fiu.edu/) using this MIRA mechanism is to design and develop robust, reliable, and generalizable machine-learning models for peptide deduction from MS data from omics experiments,” stated Dr. Saeed. This work will fill key gaps in computational infrastructure, and filling them will lead to superior computational techniques capable of inferring both abundant and non-abundant peptides.
Dr. Saeed explained that the general strategy “will involve design and development of generative models, self-learning models, biologically inspired models, and methods to infer uncertainty quantification.” In addition, his lab will focus on two key gaps in the adaptation of ML models, which it plans to fill by developing ML-ready workflows and easy-to-use software infrastructure that scientists can use.
Dr. Saeed and his team believe this effort, supported through the MIRA grant mechanism, will fill a critical gap in our scientific understanding and enhance our ability to deduce novel peptides from complex biological communities using the proposed machine-learning models. Such comprehensive and systematic analysis of thousands of proteins, with the promise of discovering new biomarkers for various disease conditions and of better understanding human systems biology, will be a highly impactful outcome.
Further details can be found at https://reporter.nih.gov/project-details/10842826.