Phase 1 of 6
Scoping & Target / Trial Definition
Define the therapeutic target, discovery modality, trial design, and IP-sovereignty constraints before any GPU-heavy inference is deployed.
Discovery Program & Target
Identify discovery modality in scope
Why This Matters
Modality choice drives the entire compute footprint — small molecule design can run on a single H100 per campaign, while multi-chain antibody design or protein-protein interaction modulators require AlphaFold 3-class multi-complex inference with much larger memory envelopes. Over a dozen AI-discovered or AI-designed molecules have entered clinical trials by early 2026, and the distribution is heavily skewed toward small molecules and antibodies — not because the other modalities do not work, but because tooling maturity and benchmark coverage are still thinner outside those two classes.
Note prompts:
+ Which modalities have the strongest internal medicinal chemistry / biologics expertise to consume AI-generated candidates?
+ Do we have validated assays in place for the modalities we are planning to generate, or will assay development itself be the bottleneck?
+ Are we aligning modality choice with an existing target portfolio, or exploring new target space opportunistically?
Select the molecule classes the program will design or triage computationally.
Select all that apply
Define target validation status
Why This Matters
Roughly 90% of drugs that enter Phase I fail to reach approval, and the largest single attributable cause is target biology that did not translate — not chemistry. AI-designed molecules fail in the clinic for the same reasons non-AI molecules do, and sinking high-cost generative campaigns into an unvalidated target compounds risk rather than reducing it. Insilico Medicine's INS018_055 (advancing through Phase II by early 2026) was directed at TNIK, a target with supporting human genetic evidence — the methodology paired AI chemistry with a defensible biology hypothesis.
Note prompts:
+ What is the human genetic evidence for this target, and has it been reproduced?
+ If the target is novel, have we funded a target validation workstream alongside the AI chemistry effort?
+ Who in biology signs off on target readiness before computational design begins?
Computational discovery is only as good as the biology underneath it — confirm where the target sits on the validation curve.
Single choice
Specify therapeutic area and indication
Indication choice determines patient-matching feasibility, trial design complexity, and regulatory pathway.
Select all that apply
Define program stage and IND timeline
Why This Matters
The NPV of one year of IND acceleration on a peak-revenue drug is estimated at $300–$500M by BCG / MIT NEWDIGS. Insilico's INS018_055 reportedly reached IND in roughly 18 months versus a 4–5 year industry average, compressing the generative-chemistry advantage into a measurable NPV line. Programs should explicitly budget the AI spend against the NPV of the compression it delivers at the current stage — the marginal return is very different at hit-finding versus IND-enabling.
Note prompts:
+ What is our measured cycle time at the current stage, and what would a 30% compression be worth in NPV?
+ Which stage has the largest internal-versus-external cycle-time gap today?
+ Do we track AI-program impact against a non-AI control arm to prove the lift is real?
Where the program sits on the discovery-to-IND arc determines which models add value now.
Single choice
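The budgeting logic above can be sketched numerically. The $300–$500M-per-year figure is the BCG / MIT NEWDIGS estimate cited in this item; every other number below (stage length, compression rate) is an illustrative assumption, not program data, and the linear scaling of NPV with months saved is a simplification.

```python
# Rough-cut NPV of cycle-time compression at a given program stage.
# npv_per_year defaults to the midpoint of the $300-500M/year range
# cited above for one year of IND acceleration; all other inputs are
# hypothetical placeholders to be replaced with measured cycle times.

def compression_value(stage_months: float,
                      compression: float,
                      npv_per_year: float = 400e6) -> float:
    """NPV of compressing one stage, assuming acceleration value
    scales linearly with months saved (a simplification)."""
    months_saved = stage_months * compression
    return npv_per_year * (months_saved / 12.0)

# Example: an assumed 18-month stage with a 30% compression.
value = compression_value(stage_months=18, compression=0.30)
print(f"Months saved: {18 * 0.30:.1f}, NPV: ${value / 1e6:.0f}M")  # ~$180M
```

Running the same calculation per stage makes the point in the text concrete: the marginal return on AI spend differs sharply between hit-finding and IND-enabling work because the months at stake and the compression achievable differ.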
Identify IP sovereignty and compound library exposure
Why This Matters
Major pharma AI platform deals signed with Isomorphic Labs, Insilico, Exscientia, and others are publicly valued in the multi-billion-dollar range — which is a market-level revealed preference for treating AI-accessible discovery data as a strategic asset rather than a fungible input. The IP exposure question is therefore a board-level decision, not an IT procurement one. Model IP ownership is also legally unsettled across research consortia, and cloud inference makes the provenance question materially harder to defend later.
Note prompts:
+ Has legal / IP signed off on every third-party AI API that touches SMILES or assay results?
+ Do we have a data-use agreement that bars use of our inputs for vendor model training, and is it enforceable?
+ If the current vendor were acquired tomorrow, what happens to the derived model weights trained partly on our data?
Proprietary compound libraries and assay data represent accumulated R&D value that cannot be shared with a third-party training pool.
Select all that apply
Trinidy — Public-cloud LLM and generative-chemistry APIs send prompts, structures, or SMILES to a third-party tenant — with few exceptions, they do not contractually guarantee exclusion from future training data. Trinidy runs generative chemistry, protein-structure inference, and literature LLMs entirely inside the sponsor's infrastructure — compound libraries, SMILES, and model weights never leave the perimeter.
Confirm regulatory jurisdiction and cross-border constraints
Map the jurisdictions whose trial data and submissions the program will produce.
Select all that apply
Deployment topology for inference
Select the physical deployment target for protein-structure, generative chemistry, and literature LLM inference.
Single choice
Trinidy — AlphaFold 3-class inference and generative chemistry require H100 / B200-class GPUs with large memory envelopes — and the data crossing the inference boundary is the program's core IP. Trinidy is the on-premises substrate for GPU-local inference: Isomorphic-grade protein inference and de novo chemistry inside the sponsor's own data center.
Clinical Trial Design & Matching Scope
Define trial protocol and phase
Why This Matters
Trial matching difficulty scales non-linearly with eligibility-criterion count. Oncology Phase II / III protocols routinely exceed 30–50 inclusion/exclusion criteria, many of which require parsing unstructured physician notes rather than structured EHR fields — which is exactly where LLM-based matching (versus structured-query matching) earns its 10–20x speedup versus manual chart review. Adaptive and basket designs add a second layer because eligibility can change mid-trial, and matching pipelines must be re-runnable.
Note prompts:
+ How many eligibility criteria in our current protocols require reading unstructured notes versus structured EHR fields?
+ Do we have a retrospective cohort we can use to validate matching precision before going live?
+ Is the matching system designed to re-evaluate eligibility if protocol amendments change criteria mid-trial?
Specify the trial phase, design class, and patient population for AI-assisted matching.
Single choice
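The re-runnability requirement above, that matching must be repeatable when an amendment changes eligibility mid-trial, reduces to versioning the criteria set and recording which version each match was evaluated against. The sketch below illustrates that bookkeeping; the record shape and function names are hypothetical, not any real system's API.

```python
# Sketch: versioned eligibility so matching is re-runnable after
# protocol amendments. Each match record stores the criteria version
# it was evaluated under; an amendment bumps the current version and
# only patients matched under older versions need re-evaluation.

def needs_reevaluation(match_record: dict, current_version: int) -> bool:
    """True if this match predates the current criteria version."""
    return match_record["criteria_version"] < current_version

matches = [
    {"patient_id": "p1", "criteria_version": 2},  # matched pre-amendment
    {"patient_id": "p2", "criteria_version": 3},  # matched post-amendment
]
stale = [m["patient_id"] for m in matches if needs_reevaluation(m, 3)]
print(stale)  # only the pre-amendment match is re-run
```

The same versioning also supports adaptive and basket designs, where per-arm criteria can diverge over the life of the trial.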
Trial matching scope — EHR integration
Define the EHR scope and PHI exposure surface for patient matching.
Select all that apply
Trinidy — Patient matching against EHR data is PHI processing under HIPAA regardless of the downstream trial purpose — and LLM-based matching means the full unstructured note enters the model context. Trinidy keeps the matching LLM inside the health system's own HIPAA perimeter — no PHI crosses to a shared-tenancy cloud API.
Define trial matching precision and recall targets
Why This Matters
A JAMA Network Open meta-analysis of 28 AI-assisted recruitment trials showed median enrollment timelines reduced from roughly 18.7 to 10.4 months — but that speedup depends on the matching system earning clinician trust. A low-precision system generates enough false-eligible candidates to burn out reviewing clinicians, at which point they ignore the AI queue entirely and the speedup collapses. The tiered approach — auto-advance high-confidence matches, route ambiguous ones to clinician adjudication — is what most successful deployments converge on.
Note prompts:
+ What is the clinician review capacity in hours per week, and does our precision target respect it?
+ Have we quantified the cost of a missed-eligible patient versus a false-eligible candidate in our indication?
+ How is clinician feedback fed back into matching thresholds so the system improves over time?
AI trial matching must balance missed-eligible (under-enrollment) against false-eligible (clinician review burden).
Single choice
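The tiered approach described above (auto-advance high-confidence matches, route ambiguous ones to clinician adjudication, discard the rest) can be sketched as a simple threshold router. The thresholds here are illustrative; in practice they would be calibrated against a retrospective cohort and adjusted as clinician feedback accumulates.

```python
# Sketch of tiered match routing by model confidence. Thresholds are
# illustrative assumptions, not validated operating points.

from dataclasses import dataclass

@dataclass
class Match:
    patient_id: str
    score: float  # matching model's eligibility confidence, 0..1

def route(matches, auto_threshold=0.90, review_threshold=0.50):
    """Split candidates into auto-advance, clinician-review, and
    rejected queues. Lowering review_threshold trades clinician
    review burden for fewer missed-eligible patients."""
    auto, review, rejected = [], [], []
    for m in matches:
        if m.score >= auto_threshold:
            auto.append(m)
        elif m.score >= review_threshold:
            review.append(m)
        else:
            rejected.append(m)
    return auto, review, rejected

candidates = [Match("p1", 0.95), Match("p2", 0.72), Match("p3", 0.31)]
auto, review, rejected = route(candidates)
print([m.patient_id for m in auto], [m.patient_id for m in review])
```

The two thresholds map directly onto the trade-off in this item: `review_threshold` governs missed-eligible risk (recall), while `auto_threshold` governs how much of the queue bypasses clinician review (precision of the auto-advance tier).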