Sema4 Research

Sema4 is a patient-centered health intelligence company dedicated to advancing healthcare through data-driven insights. Our cutting-edge research is essential to this mission.

Our accomplished research team is led by Sema4’s Founder and CEO, Eric Schadt, Ph.D., a world-renowned expert on constructing predictive models of disease. We publish regularly in peer-reviewed, high-impact journals and collaborate extensively with health systems, clinicians, and pharmaceutical companies to deliver patient-centered, clinically relevant insights that drive precision medicine.

By the numbers


More than 180 peer-reviewed research papers published since 2017

h-index of 132 for our Founder and CEO, Eric Schadt, Ph.D.

PhDs and MDs currently employed at Sema4

large next generation sequencing (NGS) panels run

Our research

Sema4’s research concentrates on structuring multidimensional data collected from our advanced genomic tests and real-world data sources into clinical insights. We use this structured data and our proprietary technology platforms, including bioinformatic pipelines and integrative predictive modeling, to discover disease mechanisms, identify clinically actionable biomarkers, and create products that support patient care, from information-driven genomic tests to digital tools.

Much of our research is powered by Centrellis®, Sema4’s innovative health intelligence platform, which enables us to generate a more complete understanding of disease and wellness and deliver data-driven insights to our clinical collaborators to help drive better health decisions.

Since 2017, Sema4’s scientists have published more than 180 papers in peer-reviewed, high-impact journals. We also regularly present our research at national and international conferences. For example, click here to review our abstracts from ASCO 2021. Our research focuses on five interlinked areas:

We develop, test, and maintain big data solutions to transform, aggregate, and abstract data into high-quality formats that are optimized for query and analysis. We then apply advanced technologies, including artificial intelligence and natural language processing (NLP), to these data to extract meaningful predictive information, which furthers our understanding of disease and wellness. Click here to see some of our recent data science & engineering publications.
Centrellis enables us to aggregate, abstract, and structure real-world data, such as the data contained in electronic health records (EHRs). We run this unstructured data through multiple pipelines leveraging machine learning-enabled NLP, augmented as needed by human annotators, to extract information and knowledge. Our multiscale, integrative strategy then allows us to connect the processed EHR data with complex biological data from many sources, including the genome, proteome, and transcriptome, and conduct real-world evidence studies. Click here to see some of our recent real-world evidence publications.
Sema4 has developed methodologies to integrate diverse multi-omics data, including genomic, transcriptomic, and proteomic data, into causal probabilistic network models. These machine learning-based models help us to understand disease processes and identify key biomarkers through advanced network analysis. Our scientists have also pioneered the use of DNA variation information to statistically infer causal relationships among any number of traits that have common genetic variance components. We can then systematically apply these causal relationships to traits to infer probabilistic causal network structures that we can mine for a broad range of discoveries. Click here to see some of our recent network modeling publications.
We are continually designing, developing, and improving assays across a range of sequencing technologies. One example is our pharmacogenomics (PGx) research, which focuses on developing tests that identify medically actionable, clinically relevant genetic variants associated with drug response. Such tests can help clinicians make more informed treatment decisions. Click here to see some of our recent molecular profiling publications.
Informed by years of experience in patient-centered care, we build digital tools and solutions to enable our clinician partners and their patients to engage with complex data and insights via friendly user interfaces. These technologies support mobile health research, medical record integration, and patient input into the research process, enabling multidimensional clinical insights.

Click the links below to see some of our recent publications by disease area:

Precision oncology
Reproductive health & rare disease

See our complete publication list here.

Our People

Sema4’s Founder and CEO, Eric Schadt, Ph.D., is a world-renowned expert on constructing predictive models of disease that link molecular biology to physiology to enable clinical medicine. He has published more than 450 peer-reviewed papers in leading journals and has an h-index of 132. In addition, Dr. Schadt has contributed to discoveries relating to the genetic basis of common human diseases, such as cancer, diabetes, obesity, and Alzheimer’s disease, and has received many awards, including the Thomson Reuters World’s Most Influential Scientific Minds Award.

Our research team, led by Dr. Schadt, includes world leaders in data science, machine learning, network modeling, and genomics. We currently employ more than 160 Ph.D.-level scientists, in addition to physician-scientists and certified technicians. Our ongoing collaborations with scientists and clinicians at Mount Sinai and other healthcare systems keep Sema4’s research patient-centered and clinically relevant.

We’re proud to welcome Gustavo Stolovitzky, Ph.D., as Chief Science Officer of Sema4. Dr. Stolovitzky is a globally acclaimed expert in computational biology, disease modeling, and nano-biotechnology, with over 25 years of experience in high-throughput data analysis for biology and the application of technology to solve biomedical problems. He will lead the advancement of Sema4’s strategic research direction to enable healthcare systems to deliver cutting-edge translational medicine to providers and patients.

Ready to join the Sema4 team? Click here to apply for open scientific positions.

Our Partnerships

As a research partner, Sema4 can support a range of studies, from optimizing biomarker discovery to leveraging our biorepository services and accelerating clinical trial enrollment. We can bring a range of capabilities to partnerships, including:

A comprehensive portfolio of sequencing solutions for oncology, women’s health, rare disease, and immunology.
Assistance with in-depth interpretation of generated sequencing data. Our precision medicine experts work with partners to interpret results, including de novo biomarker signature discovery, and create predictive network models.
Expert structuring and analysis of large data sets, and industry-leading bioinformatics pipelines. Using machine learning and natural language processing, we draw insights from structured data at the individual and cohort level and make these insights accessible to partners through our digital tools.

We currently partner with numerous pharmaceutical companies, major health systems, research consortia, clinicians, and advocacy groups.

Interested in partnering with us? Fill out the contact form below for more information.


Contact Us Today

Please reach out to our team by filling out the form below.