Industrialized drug discovery requires an iterative experiment machine.

Our core platform is a continuous, iterative loop of "biology and bits": wet lab biology experiments are executed automatically, and results are analyzed by machine learning models in the cloud. Cycle after cycle, and at ever-greater scale, our system gets smarter.

A drug discovery platform that reliably drives decision-making based on five million images of human cells every week requires built-for-purpose components.

Let’s look under the hood at Recursion’s internal tools:


ReChem:

These software tools are designed to enable the selection and design of chemical compounds to evaluate in our robotic laboratory. They can be applied throughout testing cycles and draw on a deep integration of computational chemistry, machine learning and previous screening results from our platform.


ReScreen:

These tools provide a streamlined workflow for planning large, complex experiments that span hundreds of thousands of microwells and require precisely planned combinations of reagents. Notably, ReScreen generates the scientific and statistical experiment variables that enable quantitative machine learning analysis, and it integrates automatically with our laboratory robotics.
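As an illustration of the kind of planning such a tool automates, here is a minimal, hypothetical plate-layout sketch. The function name, plate format, and parameters are assumptions for illustration, not ReScreen's actual API; the key idea is that randomizing condition-to-well assignments guards against positional bias in the readout.

```python
import itertools
import random

def plan_plates(compounds, concentrations, replicates=4,
                wells_per_plate=384, seed=0):
    """Assign every compound x concentration combination (with replicates)
    to randomized well positions, spilling onto new plates as needed."""
    conditions = [
        (compound, dose)
        for compound, dose in itertools.product(compounds, concentrations)
        for _ in range(replicates)
    ]
    rng = random.Random(seed)   # fixed seed so the layout is reproducible
    rng.shuffle(conditions)     # randomize to avoid positional/edge bias
    layout = {}
    for i, condition in enumerate(conditions):
        plate, well = divmod(i, wells_per_plate)
        layout[(plate, well)] = condition
    return layout

layout = plan_plates(["cmpd-A", "cmpd-B"], [0.1, 1.0, 10.0])
print(len(layout))  # 24 wells: 2 compounds x 3 doses x 4 replicates
```

A real planner would also reserve control wells on every plate and balance conditions across plates, but the shape of the problem is the same.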


ReRun:

Our robotic automation hardware and software let us screen hundreds of thousands of drug compounds and cellular disease models — such as models for rare genetic disorders, infectious disease, immuno-oncology and inflammation — quickly switching between experiments and allowing rapid follow-on experiments based on the latest results.


Representation Learning:

We create digital mathematical signatures, or Phenoprints, for each biological condition tested in ReScreen and ReRun. Representation learning is designed to allow us to quantitatively compute high-dimensional representations of our human cell images. Ultimately, relationships between Phenoprints help uncover potential new drugs and novel biological relationships.
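As a toy illustration of how such high-dimensional representations can be compared (a common convention, not necessarily Recursion's actual metric), cosine similarity measures how closely two embedding vectors point in the same direction:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "Phenoprints"; real ones are high-dimensional
# embeddings learned from cell images.
condition_a = [0.9, 0.1, 0.4]
condition_b = [0.8, 0.2, 0.5]
print(round(cosine_similarity(condition_a, condition_b), 3))  # ≈ 0.985
```

Two conditions with a similarity near 1 produce near-identical cellular phenotypes, which is the kind of relationship that can suggest a shared mechanism.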


ReAnalyze:

Our ReAnalyze tools build on data from ReRun and are designed to use data science to compute the effectiveness of each drug compound in our assays, as well as any unintended effects (such as potential off-target liabilities). ReAnalyze helps drug discovery scientists rapidly home in on the most promising drug compounds and generates details and visualizations to drive unbiased decision-making.


A suite of machine learning solutions to model drug compound relationships using Phenoprints, chemical structure and pharmacological properties. These predictions feed back into ReChem to inform which compounds we screen next, with results evaluated through ReAnalyze, helping complete the iterative loop.
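To illustrate the general idea of combining phenotypic and structural signals (the scoring function and weights here are assumptions for illustration, not Recursion's model), one common approach blends phenotypic similarity with a structural measure such as the Tanimoto coefficient over chemical fingerprint bits:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def combined_score(pheno_sim, struct_sim, w_pheno=0.7):
    """Weighted blend of phenotypic and structural similarity.
    The 0.7/0.3 weighting is illustrative only."""
    return w_pheno * pheno_sim + (1 - w_pheno) * struct_sim

# Toy fingerprints: indices of set bits in a hypothetical bit vector.
fp_query = {1, 4, 9, 23}
fp_candidate = {1, 4, 23, 57}
print(round(combined_score(0.92, tanimoto(fp_query, fp_candidate)), 3))  # 0.824
```

A high combined score flags a candidate worth screening next: it both looks structurally related and produces a similar cellular phenotype.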

Our platform is robust, but what matters most is that it delivers for our own team.
Here’s what we’re generating from our platform:

Proprietary Dataset:

Recursion’s massive, proprietary dataset of human cellular images grows automatically over time, ensuring that new questions and new hypotheses can be tested without conducting new experiments. This increases the speed of experimentation while also improving the quality of those hypotheses over time.


Phenoprints:

Unique mathematical identifiers, or “blueprints,” for cell states, disease models and drug compound effects that can be quantitatively compared and contrasted with other Phenoprints to drive drug discovery insights.

Intelligence Reports:

Reports that expose novel insights about drug compounds, including both desired and undesired effects on human cells, and Phenoprint and structural similarities to other drug compounds.


We have discovered novel and repurposed drug compounds on our platform.

See our Pipeline for more.

Data-First Discovery

Data is central to everything we do. At Recursion, we validate the unbiased, data-first discoveries from our platform in gold-standard traditional wet lab settings. That information is then fed back into the platform for a virtuous cycle of continuous learning about both the biology and the platform itself.  

In order to generate the highest quality, fit-for-purpose dataset in the industry we focus on:

Data reliability and relatability:

Generating our own quality-controlled data, fit for the purpose of machine learning, minimizes experimental noise and ensures we can compare any new data point on our platform to existing data, whether days or years old.
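One common way to make measurements comparable across plates and batches collected at different times is to normalize each well against control wells from the same plate. This is a minimal sketch of that idea (illustrative only, not Recursion's actual pipeline):

```python
import statistics

def normalize_to_controls(well_features, control_features):
    """Z-score each feature of a well against the same plate's control wells,
    so values from different plates (and different days) share a scale."""
    normalized = []
    for i, value in enumerate(well_features):
        ctrl = [c[i] for c in control_features]
        mu = statistics.mean(ctrl)
        sigma = statistics.stdev(ctrl)
        normalized.append((value - mu) / sigma)
    return normalized

# Three control wells, two features each (toy numbers).
controls = [[10.0, 5.0], [12.0, 7.0], [11.0, 6.0]]
print(normalize_to_controls([13.0, 6.0], controls))  # [2.0, 0.0]
```

After normalization, "2.0" means two control standard deviations above the plate's own baseline, a statement that holds regardless of when the plate was imaged.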

Generalized assay framework for broad biology:

In a core experimental setup, we induce disease states in many different cell types and screen them alongside healthy cells using specific fluorescent probes. By applying potential drug compounds to the diseased cells, we can identify signals of experimental efficacy and "rescue" of diseased cells to a healthy state, as well as identify signals of potential side-effects.
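The "rescue" idea above can be sketched numerically: if a treated profile moves from the diseased profile toward the healthy one, the fraction of that shift recovered is a simple efficacy signal. This is a toy illustration of the concept, not the platform's actual scoring:

```python
def rescue_score(diseased, healthy, treated):
    """Fraction of the disease-to-healthy shift recovered by a treatment,
    computed by projecting the treated profile onto the diseased-to-healthy
    axis: 0 means still diseased, 1 means fully rescued."""
    axis = [h - d for h, d in zip(healthy, diseased)]
    shift = [t - d for t, d in zip(treated, diseased)]
    axis_len_sq = sum(a * a for a in axis)
    return sum(a * s for a, s in zip(axis, shift)) / axis_len_sq

# Toy 2-feature profiles; real profiles are high-dimensional.
diseased = [0.0, 0.0]
healthy = [1.0, 1.0]
print(rescue_score(diseased, healthy, [0.8, 0.8]))  # 0.8: mostly rescued
```

The same projection flags side-effect signals: a treated profile that moves far off the diseased-to-healthy axis is changing the cells in ways unrelated to the intended rescue.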

Scale and scale again through automation and innovation:

We continually evolve our capacity and capabilities while driving down cost per data point, with throughput increases of up to 10-fold per year. We automate as much as non-humanly possible. Software engineers work closely with screening technicians and data scientists to create a platform with speed and efficiency that keeps pace with our discoveries.