ImmuneFM: Pre-training Foundation Model from Cytometry Data for Immunology Research

Authors

  • Sirui Ding, Sanchita Bhattacharya, Atul J. Butte

Keywords:

Pre-training, Cytometry Data, Immunology Research

Abstract

Immunology is a core field of biomedicine that plays an important role in oncology, vaccines, infection, and other areas. As the volume of immunology data grows and artificial intelligence techniques mature, there is a need for data-driven AI methods in the field. However, data from different immunology studies are very hard to integrate into an AI-ready dataset because no common standard exists. Moreover, data from independent immunology studies lack enough labels to train supervised models. Motivated by these challenges, we curate a large-scale, AI-ready cytometry dataset for immunology from the publicly available ImmPort portal. We design a framework to pre-train a foundation model, ImmuneFM, on this cytometry dataset. ImmuneFM can be applied to a wide range of downstream tasks in immune-related diseases by fine-tuning on a limited number of labeled samples. Experimental results on eight downstream tasks demonstrate the superior performance of ImmuneFM compared with baseline deep learning and traditional methods.

Published

2025-12-01

Section

Articles