
SCimilarity

US researchers have developed a deep metric learning framework called SCimilarity to address the challenges of analyzing and querying single-cell RNA sequencing (scRNA-seq) data across diverse studies and conditions. A report published in Nature describes how the model enables efficient searches of over 23.4 million human cell profiles, spanning 412 studies and representing a wide array of tissues, diseases, and experimental contexts.

SCimilarity organizes data so that cells with similar gene activity patterns are grouped together, making it easier to identify connections between them. The system uses deep metric learning to balance sensitivity and consistency across different datasets. It was trained on data from 7.9 million cells, enabling it to recognize patterns in new, unseen datasets while working reliably across various research platforms.
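
For readers curious about what "metric learning" means in practice, the sketch below shows a triplet loss, one common objective for training a model to place similar items close together. It is an illustration only: the article does not state which loss SCimilarity uses, and the two-dimensional vectors and cell labels here are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    The loss is zero once the negative example is at least `margin`
    farther from the anchor than the positive example is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings; real cell embeddings have many more dimensions.
macrophage_a = [0.0, 0.0]
macrophage_b = [0.0, 1.0]  # same cell type as the anchor
fibroblast = [3.0, 4.0]    # different cell type

# Similar cells are already close and dissimilar cells far apart: no penalty.
well_separated = triplet_loss(macrophage_a, macrophage_b, fibroblast)

# Roles swapped, so the "similar" cell is the distant one: large penalty.
poorly_separated = triplet_loss(macrophage_a, fibroblast, macrophage_b)
```

Minimizing a loss of this kind over many triplets pushes cells of the same type together and cells of different types apart, which is the behavior the paragraph above describes.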

To test SCimilarity, the researchers used macrophage and fibroblast profiles from interstitial lung disease (ILD) as queries. The model identified similar cells across datasets in a matter of seconds, revealing shared states in fibrotic diseases, COVID-19, and various cancers. Notably, SCimilarity demonstrated its ability to differentiate closely related cell types, outperforming other computational tools in precision and speed. For instance, querying macrophages associated with fibrosis showed that these cells are present in fibrotic lung diseases and certain cancers like pancreatic ductal adenocarcinoma.
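
A query of this kind can be pictured as a nearest-neighbor search over cell embeddings. The sketch below is a minimal illustration, not SCimilarity's actual interface: the function names, labels, and three-dimensional vectors are all hypothetical, and cosine similarity is assumed as the similarity measure. A search over millions of cells would typically rely on an approximate nearest-neighbor index rather than this exhaustive scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, reference, k=2):
    """Return the labels of the k reference cells most similar to the query."""
    scored = [(cosine_similarity(query, emb), label) for label, emb in reference]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]


# Hypothetical reference atlas: (label, embedding) pairs.
reference = [
    ("ILD macrophage", [0.9, 0.1, 0.0]),
    ("tumor macrophage", [0.8, 0.3, 0.1]),
    ("fibroblast", [0.1, 0.9, 0.2]),
    ("T cell", [0.0, 0.1, 0.95]),
]

# Hypothetical query profile, already embedded by the model.
query = [1.0, 0.2, 0.0]
hits = top_k_similar(query, reference, k=2)
```

With these toy vectors, the two macrophage entries rank highest, mirroring how a fibrosis-associated macrophage query might surface similar cells from both lung disease and cancer datasets.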

In addition to in vivo searches, SCimilarity successfully identified an ex vivo model that mimics fibrosis-associated macrophages. Using public datasets, it pinpointed a 3D hydrogel culture system that replicated these cell states in vitro; the finding was later validated through experimental replication, highlighting the model’s ability to bridge the gap between observational studies and experimental biology.

By enabling scalable and interpretable cell queries, SCimilarity could become a foundational tool in single-cell research, with applications ranging from discovering new cell states to understanding disease mechanisms.

The open-source framework provides researchers with a powerful resource to accelerate insights from the growing Human Cell Atlas. Future enhancements may expand its capabilities, supporting the integration of even more diverse data types and biological contexts.

About the Author
Helen Bristow

Combining my dual backgrounds in science and communications to bring you compelling content in your speciality.