ELIXIR AI PROTEOMICS

Advancing proteomics through AI and high-performance computing

ELIXIR AI PROTEOMICS leverages European high-performance computing (HPC) resources to pioneer AI-driven methodologies for analysing mass spectrometry (MS) proteomics data, with the goal of setting a new standard in proteome data analysis. The project seeks both to enhance the scientific understanding of complex biological data and to promote the widespread adoption of these technologies among life scientists. It has three main objectives:

1. Establish an AI-driven HPC framework for MS proteomics

Using advanced AI algorithms and GPU computing, the project aims to enhance the interpretation of large-scale MS proteomics data and to develop open-source tools that make these methods usable. It will focus on creating and optimising algorithms that can efficiently process and analyse massive datasets using the computational resources of EuroHPC.

2. Assess the HPC framework on different types of biological samples

The project will conduct a series of evaluation studies using gold-standard spike-in MS proteomics data and data from diverse types of biological samples. As part of its collaboration with EMBL-EBI, it will use the PRIDE Archive, an ELIXIR Core Data Resource for proteomics. The project will also promote the use of global data standards to ensure that data and research are FAIR (findable, accessible, interoperable and reusable).

3. Utilise the HPC framework to discover new biomarkers for type 1 diabetes

The project will analyse data from human serum and plasma samples to identify new candidate protein markers that correlate with the progression of type 1 diabetes (T1D). In addition, it will cross-reference its findings with publicly available data from the PRIDE Archive to validate the specificity of the markers relative to healthy individuals and to individuals with other autoimmune diseases or metabolic disorders.

Through this project, CSC, in collaboration with the University of Turku, plays a key role in advancing high-performance computing on LUMI. A key practical implementation is the integration of proteomics datasets from EMBL-EBI (the ELIXIR EMBL node) into HPC environments, enabling more efficient data processing and analysis. The project has significant potential to improve human health by facilitating efficient analysis and re-analysis of proteomics data and their integration with other digital health data. Its open and FAIR data policy creates new opportunities and supports the digital transition in biomedicine. The project also has educational impact by spreading supercomputing know-how, and it influences international developments through its collaborations.

This project has received funding from the Research Council of Finland under funding decision No 364899.