Research

We are passionate about developing computational techniques that accelerate the discovery of new medicines, and we work across all disease areas, from cancer and psychiatric disease to rare disorders and infectious disease.

A major focus is image-based profiling: we invented CellProfiler and the Cell Painting assay, and we are pioneering novel computational methods that use advanced machine learning, including deep learning, to identify morphological patterns in cell populations. We use this rich, image-based information to identify patterns resulting from chemical or genetic perturbations, to probe the causes and cures for disease.

Here we provide a broad overview, but check out our recent papers for our latest research directions!

Applications in Image-based Profiling / Cell Painting

High-throughput imaging experiments generate extremely large, multidimensional data sets with quantifiable phenotypic information for every individual cell. Using machine learning, including deep learning, we mine this rich, latent information to identify patterns resulting from chemical or genetic perturbations to probe the causes and cures for various diseases. For example:

  • Predicting how new chemical compounds act in cells
  • Identifying and classifying toxicity of compounds destined for clinical trials
  • Identifying differences in cell structure between patient cells affected by bipolar disorder or schizophrenia
  • Discerning the functional impact of gene variants associated with human disease
  • Identifying gene function in genome-scale pooled Cell Painting studies
  • Check out our Recent papers for more!
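As a rough illustration of what "mining" per-cell measurements can look like, the sketch below collapses per-cell feature vectors into one median profile per perturbation and then compares profiles by Pearson correlation. It is a minimal toy under stated assumptions, not the lab's pipeline: the function names (`aggregate_profile`, `pearson`, `correlation_matrix`), the three-feature vectors, and the perturbation labels are all hypothetical.

```python
from math import sqrt
from statistics import median


def aggregate_profile(cells):
    """Collapse per-cell feature vectors into one median profile."""
    n_features = len(cells[0])
    return [median(cell[i] for cell in cells) for i in range(n_features)]


def pearson(a, b):
    """Pearson correlation between two equal-length feature vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlation_matrix(profiles):
    """Pairwise correlations between named profiles, keyed by (name, name)."""
    names = list(profiles)
    return {(p, q): pearson(profiles[p], profiles[q]) for p in names for q in names}


# Hypothetical toy data: three cells per perturbation, three features per cell.
cells_by_perturbation = {
    "compound_A": [[1.0, 2.0, 3.0], [1.2, 2.1, 2.9], [0.9, 1.8, 3.2]],
    "compound_B": [[1.1, 2.2, 3.1], [1.0, 2.0, 3.0], [1.3, 2.4, 3.3]],
    "control": [[3.0, 2.0, 1.0], [2.8, 2.1, 1.1], [3.1, 1.9, 0.9]],
}

profiles = {
    name: aggregate_profile(cells)
    for name, cells in cells_by_perturbation.items()
}
matrix = correlation_matrix(profiles)

for (p, q), r in matrix.items():
    print(f"{p:>10} vs {q:<10} r = {r:+.3f}")
```

In this toy data the two compounds have strongly correlated profiles while the control is anti-correlated with both. Real profiles typically have hundreds to thousands of features and need normalization and feature selection before this kind of comparison is meaningful.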

We invented the Cell Painting assay in order to carry out high-throughput image-based profiling experiments.

We led the JUMP-Cell Painting Consortium to create the world's largest public Cell Painting dataset of chemical and genetic perturbations.

We are co-leading the OASIS Consortium to create a first-of-its-kind benchmark dataset to predict chemical toxicity from Cell Painting, transcriptomics, and proteomics data in multiple liver cell models.

We are leading the VISTA Consortium to rapidly identify existing drugs that may reverse the impact of diseases caused by protein mislocalization.

Novel Methods in Image-based Profiling

We develop computational methods to extract deeper insights from biological image data. For example:

Representation Learning: We develop approaches for transforming complex cellular images into meaningful numerical representations. Examples include our specialized Cell Painting CNN, the InfoAlign framework for molecular representation learning, and a self-supervised contrastive learning method that captures cell heterogeneity.

Multi-modal Data Integration: We create methods that combine data from different sources. Examples include integrating chemical structures with phenotypic profiles to enhance bioactivity prediction, and the MOTIVE benchmark and dataset for drug-target interaction prediction.
 
Evaluation & Quality Control: We build frameworks to assess data quality and method performance. Examples include an information retrieval-based approach for evaluating profile strength and similarity, and benchmarking solutions for batch correction in image-based profiling datasets.
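To make the idea of information retrieval-based evaluation concrete, here is a minimal sketch, assuming profiles are plain feature vectors ranked by cosine similarity and scored with average precision. This is a generic textbook formulation rather than the lab's published method or software; `cosine_similarity`, `average_precision`, and the toy vectors are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_precision(query, candidates):
    """Average precision of retrieving matching profiles for one query.

    candidates is a list of (profile, is_match) pairs, where is_match marks
    profiles that should be retrieved (for example, replicates of the query).
    """
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_match) in enumerate(ranked, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


# Hypothetical toy example: two true matches, one non-match ranked between them.
query = [1.0, 0.0]
candidates = [
    ([0.9, 0.1], True),   # very similar to the query, ranked first
    ([0.8, 0.6], False),  # non-match, ranked second
    ([0.0, 1.0], True),   # match that is dissimilar, ranked last
]

ap = average_precision(query, candidates)
print(f"average precision = {ap:.3f}")
```

A score near 1 means matching profiles rank above non-matching ones; averaging this score over many queries gives a mean average precision for the whole dataset.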
 
Note: This is a 2025 snapshot of our methodological work, which evolves rapidly. Please see our Papers page for our latest research directions.

Impact on human health

Our research has yielded discoveries in several translational projects, some of which have already had a direct impact on the treatment of disease. For example, we invented the CellProfiler project and maintained it through 2021; it has been cited in over 20,000 papers. Discoveries made using CellProfiler have even led to clinical trials in humans and directly improved patient outcomes [more details]. Our lab's freely available image-based profiling strategies and assays have been adopted by several startups and pharmaceutical companies, one of which (Recursion) now has many drugs in clinical trials. Cell Painting is used widely in academia and in dozens of pharma companies and biotechs.

Community Organizing / Open Source

We helped create academic societies to bring the community together (SBI2 and CytoData). We brought the bioimage software community together in a single online forum (forum.image.sc), now with over 300,000 posts. We organize public data resources: the BBBC (now maintained by the Cimini lab) and the Cell Painting Gallery. We organize public data challenges (Data Science Bowl). We launched and led the NIH Center for Open Bioimage Analysis (COBA) until 2024, to serve the cell biology community’s growing need for sophisticated software for light microscopy image analysis.

Past Research Areas:

CellProfiler and other bioimage analysis software

We launched and led the open-source CellProfiler project for 18 years. It is beloved by tens of thousands of biologists around the world. Since 2021, the Cimini lab has led development (CellProfiler site).

 

CellProfiler Analyst: Machine learning for high-content screens

We launched and led the open-source CellProfiler Analyst project for 16 years. It helps biologists train supervised machine learning algorithms to identify complex phenotypes in high-throughput microscopy experiments. Since 2021, the Cimini lab has led development (CellProfiler Analyst site) and is working on Piximi to enable this kind of analysis using deep learning in a web browser.

Quantifying dynamic phenotypes

Many biological questions can only be investigated by collecting time-lapse movies. We have analyzed these images to identify, for example, novel cell cycle landmarks and motor protein regulators. We have also integrated these data with flow cytometry data to quantify unusual cell cycle outcomes.

Imaging flow cytometry

Imaging flow cytometry combines the high-throughput nature of flow cytometry with the high-resolution nature of fluorescence microscopy. For each experimental sample, it yields hundreds of thousands of images of individual cells. We developed methods to mine these large datasets [NSF project page] and demonstrated several novel applications where label-free imaging could replace customized fluorescence biomarkers.

Co-culture systems

In co-cultured cell systems, two or more cell types are grown together in order to maintain more native physiological functions, enabling experiments that test genetic and chemical perturbations in a more realistic environment. We developed image analysis approaches to extract information from fluorescence microscopy images of these cell systems, enabling experiments in liver regeneration and hepatotoxicity [NSF CAREER project page], and demonstrated several novel applications where label-free imaging could replace customized fluorescence biomarkers.

Quantifying C. elegans

The worm C. elegans can be robotically prepared and imaged and is an effective model to probe a variety of biological questions that require whole animals rather than isolated cells. We developed sorely needed C. elegans analysis algorithms and validated them in specific large-scale experiments to identify regulators of fat metabolism and pathogen infection.