Phase 1 of 6
Scoping & Multi-Site Governance
Define the research question, participating sites, data-sovereignty constraints, and IRB/DGC posture that will govern every downstream architectural choice.
Research Question & Consortium Shape
Define the federated research question and study type
Why This Matters
The study type fundamentally shapes the federation pattern — a cohort feasibility analysis needs only aggregate counts (federated query, no model training), while a federated imaging classifier needs gradient exchange, secure aggregation, and dedicated GPU capacity at each site. Conflating the two produces months of wasted infrastructure build. NIH N3C demonstrated that rare-disease feasibility analyses that took 14–18 months under traditional DUAs can be compressed to 2–6 weeks when the right federation pattern is chosen up front.
Note prompts:
+ Is the question answerable with federated query only, or does it require federated model training?
+ What is the minimum cohort size across sites that makes the study statistically viable?
+ Have we written the protocol before choosing the federation technology, or are we letting the platform dictate the protocol?
Confirm the clinical or translational question the federated model will answer.
Select all that apply
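To make the federated-query-vs-model-training distinction concrete, here is a minimal Python sketch of a count-only feasibility query: each site computes a single aggregate inside its own perimeter, and the coordinator sees only per-site counts with small cells suppressed. The site names, example records, and the threshold of 11 are illustrative assumptions, not the behavior of any specific platform.

```python
# Illustrative sketch of a count-only federated feasibility query.
# The small-cell threshold of 11 is an assumed policy choice.
SMALL_CELL_THRESHOLD = 11  # suppress counts below this to limit re-identification risk

def site_cohort_count(records, inclusion):
    """Runs inside the site perimeter; only a single integer leaves the site."""
    return sum(1 for r in records if inclusion(r))

def aggregate(counts):
    """Coordinator sees per-site counts only, with small cells suppressed."""
    safe = {site: (n if n >= SMALL_CELL_THRESHOLD else None)
            for site, n in counts.items()}
    total = sum(n for n in safe.values() if n is not None)
    return safe, total

# Hypothetical site data: diagnosis code E11, inclusion criterion age >= 40.
site_a = [{"age": a, "dx": "E11"}
          for a in (45, 52, 63, 70, 38, 55, 61, 49, 58, 66, 71, 44)]
site_b = [{"age": a, "dx": "E11"} for a in (30, 41)]

counts = {
    "site_a": site_cohort_count(site_a, lambda r: r["age"] >= 40),
    "site_b": site_cohort_count(site_b, lambda r: r["age"] >= 40),
}
print(aggregate(counts))  # site_b falls below the threshold and is suppressed
```

Nothing in this pattern requires gradient exchange, secure aggregation, or site-local GPUs — which is the point of choosing the federation pattern before building infrastructure.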
Inventory participating sites and their class
List every site in the federation and its institutional type.
Select all that apply
Select the federation topology
Why This Matters
Star topology is operationally simplest and matches the N3C and Rhino Health defaults, but it concentrates trust in the aggregator operator — which matters when participants include international partners or competing AMCs. Hierarchical topologies are common when regional coordinating centers already exist (CTSA hubs, PCORnet CDRNs). The topology decision drives the secure-aggregation protocol choice, the legal contracting surface, and where NIST AI RMF governance responsibilities land.
Note prompts:
+ Who operates the central aggregator, and is every participating site comfortable with that operator as a trust root?
+ Do we have a regional coordinating center that naturally becomes a sub-aggregator?
+ Does our topology support adding a new site mid-study without re-contracting all existing sites?
Decide whether the federation is star, hierarchical, peer-to-peer, or hybrid.
Single choice
Trinidy — In every topology, NEXUS OS runs the full inference and gradient stack inside each institution's perimeter — the aggregator never sees raw PHI, and each site retains cryptographic control of what leaves the building.
Define data sovereignty and residency constraints per site
Why This Matters
Data sovereignty for federated studies is not a single regulation — it is the intersection of HIPAA, GDPR Article 9, state laws that go beyond HIPAA (notably Texas HB 300 and California's CMIA), and institution-specific data-use agreements. GDPR Article 89 research derogations have been enacted by 22 EU member states with non-trivial differences, and the European Health Data Space Regulation adopted in 2024 layers additional structure on top. A single contradictory residency rule at one site can invalidate the federation for every other site.
Note prompts:
+ Has legal review confirmed that gradient exchange — not just PHI — is permissible under each site's DUA and BAA?
+ Have we mapped each site against GDPR Article 89 derogations in its member state if any EU site is participating?
+ Does any participating site have residency rules stricter than HIPAA that will bound the architecture?
Map the jurisdictional constraints that govern where PHI may be processed.
Select all that apply
Trinidy — NEXUS OS keeps training, gradient computation, and inference entirely within each institution's existing network boundary — zero raw-PHI egress, even when the aggregator is in a different country.
Confirm IRB strategy — single IRB vs. site-local IRBs
Why This Matters
The 2018 Common Rule revision (45 CFR 46.114) mandates a single IRB of record for most federally funded multi-site research — but the rule has carve-outs (tribal law, international sites, state law) that reintroduce local review. Federated architectures do not automatically simplify IRB review because the novelty — gradient exchange, differential privacy, secure aggregation — often triggers a full board review at each site rather than the expected expedited path. Budgeting 3–6 months for IRB alignment is realistic; 2 weeks is not.
Note prompts:
+ Has the sIRB (or each local IRB) reviewed the federation protocol, or only the underlying research question?
+ Do the IRBs understand that gradient exchange is a form of minimum-necessary disclosure that may still need risk review?
+ Which sites' IRBs have prior experience with federated or privacy-preserving protocols we can leverage?
Decide whether the study runs under a single IRB of record or parallel site-local IRBs.
Single choice
Map consent posture for each data class
Why This Matters
Consent posture determines what features a site can contribute — a site working under a waiver of authorization may not be able to supply free-text clinical notes, while a site with broad research consent may. HIPAA's 2002 research-use carve-outs (164.512(i) waivers, 164.514(e) limited datasets, and Safe Harbor / Expert Determination de-identification under 164.514(b)) are the primary pathways, and most federated studies mix them across sites. Mis-labeling the pathway at any one site creates a privacy incident under HIPAA and a GDPR Article 9 violation for EU sites.
Note prompts:
+ For each site, which HIPAA research pathway is the legal basis — and has the DPO or privacy officer signed off in writing?
+ Do any sites require re-consent for the specific federated analysis, even if broad consent is on file?
+ Are we mixing Safe Harbor and Expert Determination across sites, and does our harmonization plan handle the asymmetry?
Confirm the legal basis under which each participating site may use data for this study.
Select all that apply
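One lightweight way to track the per-site legal basis and surface the Safe Harbor / Expert Determination asymmetry the prompts ask about is a simple mapping plus a couple of checks. The site names and pathway assignments below are hypothetical; the CFR citations come from the text above.

```python
# Known HIPAA research pathways from the checklist text
# (45 CFR 164.512(i), 164.514(e), 164.514(b)).
PATHWAYS = {
    "waiver_164_512_i",                # IRB/privacy-board waiver of authorization
    "limited_dataset_164_514_e",       # limited data set under a DUA
    "safe_harbor_164_514_b",           # de-identification, Safe Harbor method
    "expert_determination_164_514_b",  # de-identification, Expert Determination
}

# Hypothetical per-site legal bases — a real study fills this from signed
# privacy-officer attestations, not assumptions.
site_basis = {
    "site_a": "safe_harbor_164_514_b",
    "site_b": "expert_determination_164_514_b",
    "site_c": "waiver_164_512_i",
}

# Flag any site with an unrecognized (i.e., unreviewed) pathway.
unknown = {s: p for s, p in site_basis.items() if p not in PATHWAYS}

# Flag mixed de-identification methods: this asymmetry needs a harmonization plan.
deid_methods = {p for p in site_basis.values() if p.endswith("164_514_b")}
mixed_deid = len(deid_methods) > 1

print(unknown, mixed_deid)
```

Running checks like this per site before contracting keeps the pathway question from being discovered mid-study.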
Define data use agreements and BAAs across the federation
Why This Matters
A federation with N sites has up to N*(N-1)/2 bilateral contracting relationships, most of which get collapsed into site-to-aggregator instruments — but only if the aggregator is contractually a business associate of every covered-entity site. Missing a single BAA turns routine gradient exchange into an unpermitted disclosure. Counsel at each site must independently confirm the contracting surface, not rely on the consortium lead's attestation.
Note prompts:
+ Is there a signed BAA between the aggregator operator and every US covered-entity site?
+ For site-to-site exchanges (hierarchical topologies), is there a DUA template approved across the consortium?
+ What is the termination and data-destruction clause if a site withdraws mid-study?
Confirm the contracting surface — every site pair or every site-to-aggregator link needs appropriate instruments.
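The contracting arithmetic above can be sanity-checked in a few lines: the functions below compare the full pairwise surface N*(N-1)/2 against the star alternative of one instrument per site-to-aggregator link. This is illustrative arithmetic only, not legal guidance.

```python
def pairwise_instruments(n_sites: int) -> int:
    """Bilateral DUAs if every site pair contracts directly: N*(N-1)/2."""
    return n_sites * (n_sites - 1) // 2

def star_instruments(n_sites: int) -> int:
    """One BAA/DUA per site with the aggregator in a star topology."""
    return n_sites

for n in (5, 10, 20):
    print(f"{n} sites: pairwise={pairwise_instruments(n)}, star={star_instruments(n)}")
```

At 20 sites the pairwise surface is 190 instruments versus 20 — which is why most federations collapse contracting into site-to-aggregator instruments, provided the aggregator holds a BAA with every covered-entity site.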
Determine EU AI Act classification for the federated model
Why This Matters
The EU AI Act (Regulation 2024/1689) entered into force in August 2024 with phased applicability — prohibited practices from February 2025, high-risk obligations for Annex III systems from August 2026. A federated model used for patient-facing diagnostic or triage decisions is almost always high-risk and inherits obligations for risk management, data governance, technical documentation, human oversight, and post-market monitoring. Research-only exemptions apply narrowly: as soon as the model is placed on the market or used in service provision, the research exemption lapses.
Note prompts:
+ Is our intended use purely research, or will the federated model eventually be deployed clinically?
+ Who in each EU site's compliance function has signed off on the AI Act classification, and is it documented?
+ Does our governance plan already address the Annex III obligations even if we currently classify as research-only?
Decide whether the federated model falls under the EU AI Act's high-risk provisions.
Single choice
Establish funding source and grant alignment
Confirm the sponsor(s) and any grant-level data management / sharing requirements.
Select all that apply
Define success metrics and statistical power plan
Why This Matters
Federated studies are routinely under-powered because each site contributes less than the nominal consortium total — not every site has every variable, missingness is asymmetric, and DP noise eats into the effective N. A study powered for N=50,000 across 10 sites may deliver an effective N closer to 30,000 once DP and incomplete feature coverage are accounted for. Declaring the primary endpoint, expected effect size, and required N up front — before seeing any site data — is the pre-registration discipline that keeps the study defensible.
Note prompts:
+ Is our required N computed against the effective N after DP noise, or against nominal record counts?
+ Have we documented the minimum detectable effect size and the alpha we will control to?
+ Is the primary endpoint pre-registered before any unblinding?
State the primary endpoint, expected effect size, and required N across the federation.
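A rough sketch of the effective-N discipline described above, assuming a two-proportion primary endpoint and the standard normal-approximation sample-size formula. The per-site counts, feature-coverage fractions, and the flat 0.9 DP utility-loss factor are illustrative assumptions — a real study would derive the DP discount from the chosen mechanism and its epsilon, not a single scalar.

```python
from statistics import NormalDist

def required_n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Sample size per group for a two-sided two-proportion test
    (normal approximation): n = (z_a + z_b)^2 * var / (p1 - p2)^2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_a + z_b) ** 2 * var / (p1 - p2) ** 2

def effective_n(site_counts, feature_coverage, dp_discount):
    """Discount nominal record counts by per-site feature coverage and an
    assumed DP utility-loss factor (both in [0, 1])."""
    return sum(n * c * dp_discount
               for n, c in zip(site_counts, feature_coverage))

# Hypothetical federation: 50,000 nominal records across four sites.
nominal = [10_000, 8_000, 12_000, 20_000]   # records per site
coverage = [1.0, 0.7, 0.9, 0.5]             # fraction with all required features
n_eff = effective_n(nominal, coverage, dp_discount=0.9)
n_req = required_n_per_group(0.10, 0.12)    # detect a 10% vs 12% event rate
print(f"effective N = {n_eff:,.0f}, required per group = {n_req:,.0f}")
```

In this hypothetical, 50,000 nominal records shrink to roughly 33,000 effective ones — the same order of shrinkage the text warns about — which is why the required N must be computed against effective, not nominal, counts.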