The Business of Digital Twins in Drug Development

Market landscape, business models, and entry points — Post 6 (final) in the Digital Twins for Vaccine Trials series

digital twin, vaccine, clinical trial, business, market, pharma

Author: Jong-Hoon Kim

Published: April 22, 2026

1 What this series has built

Over five technical posts we built a complete digital-twin pipeline for vaccine clinical trials:

Post	Topic	What you can do with it
Overview	Conceptual map	Orient stakeholders, frame regulatory conversations
1	Within-host ODE	Model individual antibody kinetics for any vaccine
2	Correlate of protection	Translate titres into protection probabilities
3	Virtual patient cohorts	Generate realistic populations via Latin hypercube sampling (LHS)
4	Synthetic control arms	Replace or reduce placebo groups; Bayesian borrowing
5	Adaptive trial simulation	Compute power, optimise design, propagate uncertainty

This final post steps back from the code and asks: what is this worth commercially, who is already doing it, and where could a new player add value?

2 Why this is a genuine business opportunity

2.1 The structural problem

Drug development is expensive, slow, and increasingly difficult. The cost of bringing a new drug to market has risen to an estimated $2–3 billion (including failure costs), driven largely by Phase 2 and 3 clinical trials. Vaccine trials for infectious diseases face additional structural pressures:

Epidemic timing: a trial designed around an outbreak that wanes before enrolment completes accrues too few infection events and loses statistical power (Post 5).
Placebo ethics: randomising participants to placebo during an active outbreak is increasingly unacceptable to ethics committees, patients, and regulators.
Paediatric and rare-disease challenges: small eligible populations make traditional randomisation impractical (1).
Booster decisions: regulatory agencies now expect efficacy predictions for boosters before Phase 3 data are available — which requires a model.

Digital twins do not eliminate these problems. But they provide tools to make trials faster, smaller, more ethical, and more informative — which translates directly into commercial value.
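The epidemic-timing pressure listed above can be made concrete with a rough power calculation. The sketch below is not taken from Post 5: it uses a textbook normal approximation for comparing two proportions, tests against a null of zero efficacy (real vaccine trials usually test against a stricter lower bound), and the arm size, vaccine efficacy, and attack rates are all hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# ASSUMPTIONS: hypothetical design values, not from any real trial.
N_PER_ARM = 5_000                            # participants per arm
VE = 0.60                                    # true vaccine efficacy
ATTACK_RATES = (0.02, 0.01, 0.005, 0.002)    # placebo-arm attack rates


def power_two_proportions(p_placebo: float, ve: float,
                          n_per_arm: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing attack rates.

    Uses the normal approximation for two independent proportions,
    with a pooled variance under the null of no vaccine effect.
    """
    p_vax = p_placebo * (1.0 - ve)
    p_bar = 0.5 * (p_placebo + p_vax)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se_null = sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_arm)
    se_alt = sqrt((p_placebo * (1.0 - p_placebo)
                   + p_vax * (1.0 - p_vax)) / n_per_arm)
    z = (abs(p_placebo - p_vax) - z_alpha * se_null) / se_alt
    return NormalDist().cdf(z)


powers = [power_two_proportions(p, VE, N_PER_ARM) for p in ATTACK_RATES]
for p, pw in zip(ATTACK_RATES, powers):
    print(f"placebo attack rate {p:.1%}: approximate power {pw:.2f}")
```

With the design held fixed, power falls as the placebo-arm attack rate falls, which is the mechanism behind the epidemic-timing risk: the trial does not change, but the outbreak supplies fewer events.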

2.2 Market size signals

The broader “clinical trial simulation” and “model-informed drug development” (MIDD) market is difficult to isolate — it is embedded within pharmacometrics, biostatistics, and CRO services — but observable signals suggest it is large and growing:

The FDA’s MIDD paired-meeting program received over 100 meeting requests in its first three years, across oncology, rare disease, and immunology.
EMA’s qualification opinion for PROCOVA (2022) was the first formal regulatory endorsement of a digital-twin-derived covariate adjustment method, a strong commercial signal for the concept.
ICH M15 (2024 draft) creates global harmonised standards — the kind of infrastructure that precedes large market expansion.
Investor activity: companies like Unlearn.AI have raised more than $50M, and established CROs and modelling vendors (IQVIA, Certara, Simulations Plus) have made acquisitions in the modelling space.

3 The market landscape

3.1 Segment 1: Software platforms for MIDD

Who: Certara, Simulations Plus (GastroPlus), MathWorks (SimBiology), Pumas AI, Rosa & Co

What they sell: Software that enables pharmacometricians inside pharma companies to build, run, and submit MIDD models. Revenue model: software licenses, consulting, and training.

Competitive moat: Deep regulatory relationships; data packages that support FDA/EMA submissions; existing integration with clinical data management systems (EDC, eClinical).

Weakness: Requires the pharma client to have in-house pharmacometricians to use the software; poor accessibility for small biotechs.

3.2 Segment 2: Digital twin as a service (DTaaS)

Who: Unlearn.AI (the current category leader for synthetic control arms), InSilico Medicine, Intelligencia AI

What they sell: A complete synthetic control arm service. The sponsor provides participant baseline data; the vendor runs their model and delivers a validated synthetic placebo cohort. Revenue model: per-trial fees (typically $500K–$2M per trial depending on indication complexity), with potential milestone payments.

What Unlearn.AI specifically does: They use a hierarchical Bayesian model trained on historical trial data for a given indication. The model predicts what each vaccinated (or treated) participant’s outcome would have been on placebo. Their platform has received breakthrough qualification interest from FDA in CNS and oncology. Their approach is predominantly statistical, not mechanistic — which is faster to train but less generalisable to novel pathogens.

The gap Unlearn.AI does not fill: They do not have mechanistic ODE-based models for novel vaccine platforms (e.g., mRNA vaccines against emerging pathogens). Their historical-data approach requires abundant historical placebo data to train on — which does not exist for pandemic vaccines.

3.3 Segment 3: Academic and government modelling groups

Who: Imperial College MRC Centre for Global Infectious Disease Analysis, Johns Hopkins Bloomberg School of Public Health, Seattle Children’s Vaccine & Infectious Disease Division, NIH-funded modelling consortia

What they do: Publish the methodological foundations (the papers cited throughout this series). Often provide free software (R packages, GitHub repositories) that commercial players then productise.

Business relevance: Partnership with academic groups is often how a new commercial player builds credibility and accesses data. Many successful MIDD companies began as academic spinouts.

3.4 Segment 4: CRO services

Who: IQVIA (Model-Informed Drug Development practice), PPD (now Thermo Fisher), Covance (now Labcorp Drug Development), Quantitative Solutions (a Parexel company)

What they sell: Integrated modelling and clinical execution. A pharma sponsor outsources not just the model but also the trial design, regulatory filing strategy, and pharmacometrics team. Revenue model: project-based consulting at $500–$2000/hour for senior pharmacometricians, embedded in multi-year CRO contracts.

Competitive moat: One-stop shop for sponsors who want to offload all trial complexity; deep regulatory filing experience.

4 Business models in detail

4.1 Model 1: Platform software

You build a validated, reusable computational infrastructure (virtual patient engines, correlate-of-protection fitting pipelines, adaptive trial simulators) and license it to pharma companies. Revenue is recurring (subscription or license), predictable, and scales without linear headcount growth.

Challenges: Requires deep regulatory trust; pharma companies are risk-averse and prefer validated, established tools. Sales cycles are 12–24 months. FDA/EMA submission credibility requires regulatory precedent.

Example economics: A specialised MIDD software platform serving 10 mid-size pharma companies at a $500K/year license earns $5M in annual recurring revenue (ARR). Consulting at $300/hour × 2000 hours/year adds $600K, for a total of about $5.6M/year. This is a realistic early-stage target.
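The arithmetic in the example above is easy to turn into a small sensitivity check. The sketch below reuses the license fee, consulting rate, and hours quoted in the example. The alternative client counts are hypothetical, and none of the figures are market data.

```python
# Inputs from the example above; illustrative only, not market data.
LICENSE_FEE_USD = 500_000      # annual license per client
N_CLIENTS = 10                 # mid-size pharma clients
CONSULTING_RATE_USD = 300      # per hour
CONSULTING_HOURS = 2_000       # per year


def platform_revenue(n_clients: int, license_fee: float,
                     rate: float, hours: float) -> dict:
    """Return annual recurring, consulting, and total revenue in USD."""
    arr = n_clients * license_fee
    consulting = rate * hours
    return {"arr": arr, "consulting": consulting, "total": arr + consulting}


base = platform_revenue(N_CLIENTS, LICENSE_FEE_USD,
                        CONSULTING_RATE_USD, CONSULTING_HOURS)
print(base)

# Sensitivity: total revenue for hypothetical numbers of licensed clients.
for n in (3, 5, 10, 15):
    total = platform_revenue(n, LICENSE_FEE_USD,
                             CONSULTING_RATE_USD, CONSULTING_HOURS)["total"]
    print(f"{n:>2} clients -> ${total / 1e6:.1f}M")
```

Because license revenue dominates, the total is far more sensitive to the number of signed clients than to consulting hours, which is why the 12–24 month sales cycle matters so much for this model.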

4.2 Model 2: Per-trial service

You develop expertise in a specific therapeutic area (e.g., pandemic vaccines, paediatric infectious disease) and deliver synthetic control arm analyses as a service. Each trial is a project; pricing is milestone-based.

Challenges: High initial investment to build validated models before your first customer. Regulatory precedent must be established either with FDA or EMA. Requires a team including both computational scientists and regulatory affairs specialists.

Example economics: 3–5 trials/year at $1M–$2M each = $3M–$10M revenue. Margin depends on team size; a lean 10-person team can achieve 40–50% gross margins once model infrastructure is amortised.
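A rough way to see how the margin depends on trial volume is sketched below. The revenue range comes from the example above. The fully loaded cost per person is a hypothetical assumption, and treating team cost as the only cost is a deliberate simplification.

```python
from itertools import product

TEAM_SIZE = 10
COST_PER_PERSON_USD = 250_000   # ASSUMPTION: hypothetical fully loaded cost


def gross_margin(revenue: float, cost: float) -> float:
    """Margin as a fraction of revenue; negative when cost exceeds revenue."""
    return (revenue - cost) / revenue


def revenue_for_margin(cost: float, target_margin: float) -> float:
    """Revenue needed to reach a target margin at a fixed cost."""
    return cost / (1.0 - target_margin)


team_cost = TEAM_SIZE * COST_PER_PERSON_USD

# Margin across the range of trial counts and prices from the example.
for trials, price in product((3, 4, 5), (1_000_000, 2_000_000)):
    revenue = trials * price
    print(f"{trials} trials at ${price / 1e6:.0f}M: "
          f"revenue ${revenue / 1e6:.0f}M, "
          f"margin {gross_margin(revenue, team_cost):.0%}")

# Revenue required to hit the 40-50% margin band at this assumed cost.
for target in (0.40, 0.50):
    needed = revenue_for_margin(team_cost, target)
    print(f"{target:.0%} margin needs about ${needed / 1e6:.2f}M revenue")
```

Under this assumed cost base, the 40–50% band corresponds to revenue in the lower half of the $3M–$10M range, so the outcome depends heavily on whether the team closes three trials or five.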

4.3 Model 3: Data + model network

You build or aggregate a data network — longitudinal immunogenicity datasets from multiple trials — and use it to train and validate models that no single sponsor could build alone. Participants (pharma companies) contribute anonymised data and receive validated model access in return.

Example: a consortium of five vaccine manufacturers contributes mRNA vaccine Phase 1/2 immunogenicity data, jointly trains a hybrid mechanistic-statistical virtual patient cohort (VPC) model, and jointly validates it against Phase 3 outcomes. Each manufacturer gets a more accurate model than it could build alone; the consortium, acting as a neutral party, runs the service for regulators.

Challenges: Data sharing between competitors is legally and commercially complex. Requires a neutral convener (academic institution, standards body, or foundation). This model has succeeded in genomics (TCGA) and oncology biomarkers (ORIEN) — vaccine immunology may be next.

5 Where a new entrant can compete

Given the landscape above, a realistic entry strategy for a well-positioned new player (such as a quantitative infectious-disease modelling group) would focus on the mechanistic gap:

Most commercial digital twin players use statistical models. The mechanistic ODE-based approach built in this series — within-host dynamics, biologically interpretable parameters, extrapolation to novel antigens — is not yet well-represented commercially.

Specific niches where a mechanistic player has defensible advantages:

Novel pandemic vaccines: when historical data does not exist (e.g., a new coronavirus), statistical approaches cannot be trained. Mechanistic models calibrated from first principles can still generate predictions.
Variant bridging: predicting VE against a novel variant from immunogenicity data against the original strain requires a mechanistic model of cross-neutralisation. Statistical models cannot bridge across antigenically distinct variants.
Paediatric and elderly sub-populations: immunosenescence and immature immune systems alter model parameters systematically. A mechanistic model parameterised by age can extrapolate to new populations; a statistical model trained only on adults cannot (1).
Combination vaccines and novel platforms: predicting the immunological interaction of two antigens, or the kinetics of a novel lipid-nanoparticle delivery system, requires mechanism — not statistics.

6 The regulatory path is the moat

The single biggest barrier to entry in this market is regulatory acceptance — convincing FDA and EMA that a model is fit for the intended purpose. This is not primarily a scientific problem; it is a relationship and precedent-building problem.

The sequence of regulatory milestones that builds the moat:

Publish methodology in peer-reviewed journals (builds scientific credibility; establishes priority).
Submit a MIDD briefing document for a Phase 2 trial where the model informs dose selection (a low-risk use that yields FDA feedback without putting an approval decision at stake).
Pre-specify a synthetic control analysis as a secondary endpoint in a Phase 3 trial (if the analysis succeeds, it is a precedent; if it fails, you learn without staking the approval).
Seek formal qualification (FDA Biomarker Qualification program, EMA qualification opinion) for a specific use case in a specific indication. This is the gold standard and takes 3–5 years, but once achieved, it is a durable competitive asset.

The companies that achieve step 4 first in each therapeutic area will capture the majority of the commercial opportunity. The race is already underway in oncology (Unlearn.AI, Pentara, Immunetrics). For infectious-disease vaccines, it is still early.

7 Risk factors for the business

Scientific risk: If a virtual patient cohort fails to predict Phase 3 outcomes in a high-profile trial, it will set back the entire field — not just one company. Regulatory trust, once lost, takes years to rebuild.

Regulatory risk: FDA or EMA guidance could tighten rather than loosen requirements for synthetic control arms. The 2024–2025 window of regulatory experimentation could close.

Competition from large CROs: IQVIA and Certara have capital, client relationships, and regulatory credibility. If they build or acquire mechanistic modelling capabilities, they can bundle MIDD services into existing CRO contracts, making it very hard for a standalone player to compete on clinical execution.

Data access: The mechanistic advantage requires calibration data, meaning real longitudinal immunogenicity datasets. Accessing these from pharma companies requires trust, legal frameworks, and often financial compensation. Early-stage companies face a chicken-and-egg problem: they cannot calibrate their models without data, and they struggle to obtain data without a calibrated model to show.

8 A realistic 3-year roadmap

For a quantitative infectious disease research group looking to enter this space:

Year 1: Establish scientific credibility

Publish mechanistic virtual patient cohort (VPC) methodology for mRNA vaccines in a high-impact journal.
Release open-source R/Julia tooling (e.g., a validated VPC package submitted to CRAN).
Establish one industry partnership (a co-development agreement with a mid-size vaccine company) that provides access to real immunogenicity data.

Year 2: Build regulatory track record

Submit a MIDD briefing package to FDA for an ongoing Phase 2 trial (with industry partner).
Pre-specify a synthetic control secondary analysis in the Phase 2 protocol.
Present at FDA advisory committees or ISOP/PAGE conferences to build profile.

Year 3: Commercialise

Deliver first commercial per-trial synthetic control analysis.
Begin data consortium discussions with two or three non-competing vaccine manufacturers.
Raise seed or Series A funding based on regulatory precedent and consortium commitments.

9 Closing thoughts

Digital twins for clinical trials are not a technology looking for a problem. The problems are well defined (expensive trials, ethically fraught placebo arms, sparse paediatric data, epidemic unpredictability), and the regulatory framework is actively being built to accommodate model-based solutions. What is missing is the combination of mechanistic rigour, regulatory engagement, and commercial execution.

The five technical posts in this series cover the mechanistic rigour: an ODE model of antibody kinetics, a correlate of protection, a virtual patient cohort, a synthetic control arm, and an adaptive trial simulator. Anyone with a quantitative background in infectious disease modelling and access to immunogenicity data has the scientific foundation to compete in this market.

The regulatory engagement and commercial execution are harder but not mysterious: they require patience, the willingness to build slowly through partnership rather than competition, and a long-term view of what regulatory precedent is worth.

The market is early. The science is ready. The timing is good.

10 Series index

This post concludes the Digital Twins for Vaccine Trials series. The full series:

Digital Twins in Clinical Trials — overview
Within-Host Dynamics of Vaccine-Induced Immunity (Post 1)
Correlates of Protection (Post 2)
Building Virtual Patient Cohorts (Post 3)
Synthetic Control Arms (Post 4)
Adaptive Trial Design with Digital Twins (Post 5)
The Business of Digital Twins in Drug Development (Post 6, this post)

References

1. Akbarialiabad H, et al. Enhancing randomized clinical trials with digital twins. npj Systems Biology and Applications. 2025;11:110. doi:10.1038/s41540-025-00592-0