Phase 1 of 6
Scoping & Collection Constraints
Define the intelligence requirements, US-persons boundary, non-attributable collection posture, and classification residency that govern every downstream architectural decision.
Intelligence Requirements & Authorities
Identify the authorities under which OSINT collection will operate
Why This Matters
E.O. 12333 (as amended) is the bedrock authority for IC OSINT, and 10 U.S.C. § 467 is the statutory definition DoD uses to distinguish OSINT from publicly available information (PAI) collection outside an intelligence purpose. Conflating these two buckets is the most common compliance failure — PAI collected incidentally by a non-IC component is not OSINT and should not be handled as such. The IC OSINT Strategy 2024-2026 and the DoD OSINT Strategy both expect formal authority mapping before automation is scoped.
Note prompts
+ Have we formally mapped every collection workflow to a specific E.O. 12333 or 10 U.S.C. § 467 authority?
+ Which of our components are IC elements vs. non-IC DoD, and does that change the applicable framework?
+ Is OGC / SJA signed off on the authority mapping before automation goes live?
Required
Confirm the statutory and executive authorities that govern your automated OSINT mission.
Select all that apply
E.O. 12333 (as amended) — IC OSINT collection authority
10 U.S.C. § 467 — DoD OSINT statutory definition
NDAA FY24 Sec. 1323 — OSINT modernization mandate
DoDI 3305.12 — DoD OSINT program (May 2018, updated 2022)
ICD 203 — Analytic Standards (OSINT application)
Service-specific OSINT authority (Army / Navy / AF / USMC / USSF)
Combatant command OSINT authority
Law enforcement OSINT (not IC — separate framework)
Define the intelligence requirements the automation must answer
Why This Matters
Automated OSINT without explicit, prioritized intelligence requirements degrades into undirected data hoarding, which is both legally risky under E.O. 12333 proportionality expectations and operationally useless to the analyst. ICD 203 (Analytic Standards) presumes a documented requirement chain from consumer to collection. What separates an OSINT automation program that survives IG review from one that does not is whether every collection rule traces to a numbered requirement.
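The traceability check described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `CollectionRule` type, the `IR-`/`CR-` identifier formats, and the sample rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class CollectionRule:
    rule_id: str
    description: str
    # Numbered requirement this rule answers; None means nobody mapped it.
    requirement_id: Optional[str]


def untraceable_rules(rules: Iterable[CollectionRule],
                      active_requirements: Set[str]) -> List[str]:
    """Return IDs of rules that do not trace to an active requirement."""
    return [
        r.rule_id
        for r in rules
        if r.requirement_id is None
        or r.requirement_id not in active_requirements
    ]


# Hypothetical data: IR-009 has been retired, so CR-12 should be sunset.
active = {"IR-001", "IR-002"}
rules = [
    CollectionRule("CR-10", "Watch named forum for I&W keywords", "IR-001"),
    CollectionRule("CR-11", "Bulk-follow regional news accounts", None),
    CollectionRule("CR-12", "Monitor procurement portal", "IR-009"),
]
print(untraceable_rules(rules, active))
```

Running a check like this whenever a requirement is added or retired gives one possible trigger for sunsetting collection that no longer has a sponsor.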
Note prompts
+ Which numbered intelligence requirements does our automation answer today?
+ Who signs off on new collection rules and maps them to requirements?
+ Do we have a process for sunsetting collection when a requirement is retired?
Required
Select the primary intelligence requirement categories driving automated collection.
Select all that apply
Indications & Warning (I&W) — early warning indicators
Adversary information operations / disinformation tracking
Foreign military capability development / S&T intelligence
Counter-terrorism / non-state actor tracking
Counter-proliferation (CBRN / missile)
Political instability / civil unrest monitoring
Economic / acquisition / procurement intelligence
Cyber threat actor tracking (dark web, forums)
Humanitarian / disaster / pandemic monitoring
Establish the US persons boundary and filtering strategy
Why This Matters
E.O. 12333 and the attendant AG-approved procedures (DoD Manual 5240.01, AR 381-10, etc.) constrain US-persons collection even when the underlying data is publicly available, if the purpose is intelligence. Automated systems that collect first and filter later create a retention event the moment the data lands — which is precisely the scenario inspectors general and congressional oversight focus on. Filtering at ingest is both compliance and architecture.
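As a rough sketch of what "filtering at ingest" means architecturally, the Python below classifies each item before anything is written and logs the routing decision ahead of the write. The `IngestPipeline` class, its field names, and the toy classifier are assumptions for illustration; a real classifier would be a validated model, and the in-memory lists stand in for real stores.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class IngestPipeline:
    # Hypothetical classifier: returns True for probable US-persons content.
    is_probable_usp: Callable[[str], bool]
    analyst_store: List[Tuple[str, str]] = field(default_factory=list)
    minimization_queue: List[Tuple[str, str]] = field(default_factory=list)
    audit_log: List[Tuple[str, str]] = field(default_factory=list)

    def ingest(self, item_id: str, text: str) -> str:
        # Classify first, so the routing decision is logged before any write.
        flagged = self.is_probable_usp(text)
        route = "minimization" if flagged else "analyst_store"
        self.audit_log.append((item_id, "routed:" + route))
        if flagged:
            self.minimization_queue.append((item_id, text))
        else:
            self.analyst_store.append((item_id, text))
        self.audit_log.append((item_id, "written:" + route))
        return route


# Stand-in classifier for illustration only.
def toy_classifier(text: str) -> bool:
    return "[USP]" in text


pipeline = IngestPipeline(is_probable_usp=toy_classifier)
pipeline.ingest("a1", "Foreign ministry statement on exercises")
pipeline.ingest("a2", "Post mentioning [USP] resident")
```

Because the "routed" entry always precedes the "written" entry for a given item, the audit log itself shows that filtering happened before storage.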
Note prompts
+ Does our filtering happen before storage or after, and is the audit log clear on that sequence?
+ Who has access to US-persons-flagged content pre-minimization, and is it role-gated?
+ Have we validated our US-persons classifier accuracy (recall in particular) against a labeled test set?
Required
Specify how automated collection will avoid, filter, or handle US-persons information.
Single choice
Hard exclusion — automation never collects US-persons content
Filter-and-route — flagged content goes to minimization review
Collect-and-mask — retained with PII masked pending authority
Collect-with-authority — specific IR has US-persons collection authority
Not yet defined
Trinidy
US-persons filtering must happen before content reaches the analyst queue, not as a post-hoc review. Trinidy runs on-prem NLP classifiers that identify probable US-persons references at ingest and route them through minimization workflows, so no US-persons content ever lands in a classified analyst dashboard without authority.
Confirm classified residency requirements for collected data
Required
Select the residency target for raw OSINT, enriched OSINT, and fused OSINT + classified outputs.
Select all that apply
Unclassified / CUI — commercial cloud permitted (DoDI 5200.48)
SECRET — SIPRNet / classified on-prem
TS/SCI — JWICS / SCIF-resident inference
Tearline / cross-domain output to lower classification
Coalition-releasable (FVEY / NATO) downgrade path
Cross-domain transfer via approved CDS only
Define non-attributable collection egress posture
Why This Matters
An OSINT collection stream that egresses from a .mil ASN tells adversaries what you are interested in almost as clearly as a HUMINT tasking would. Managed attribution is both a tradecraft question and an infrastructure one — it requires dedicated commercial transit, persona management, and isolation from the analytical enclave where the collected data is processed. Mixing the two networks defeats the purpose.
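One way to catch accidental mixing of the collection and analytical networks is a configuration check that flags any overlap between their address ranges. The sketch below uses Python's standard `ipaddress` module; the function name and the CIDR values are hypothetical placeholders, not real allocations.

```python
import ipaddress
from typing import Iterable, List, Tuple


def isolation_violations(egress_cidrs: Iterable[str],
                         enclave_cidrs: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (egress, enclave) pairs whose address ranges overlap."""
    enclave_nets = [(c, ipaddress.ip_network(c)) for c in enclave_cidrs]
    violations = []
    for egress in egress_cidrs:
        egress_net = ipaddress.ip_network(egress)
        for enclave, enclave_net in enclave_nets:
            # overlaps() only compares networks of the same IP version.
            if (egress_net.version == enclave_net.version
                    and egress_net.overlaps(enclave_net)):
                violations.append((egress, enclave))
    return violations


# Placeholder ranges: a documentation prefix and a private prefix.
clean = isolation_violations(["198.51.100.0/24"], ["10.20.0.0/16"])
mixed = isolation_violations(["10.20.5.0/24"], ["10.20.0.0/16"])
print(clean, mixed)
```

An address-range check covers only one part of the attribution surface; user-agent and TLS fingerprint exposure need separate review.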
Note prompts
+ Does our automated collection egress from commercial transit, or from a government ASN?
+ Who owns the attribution surface (CIO, operations, J39) and are they in the architecture review?
+ Have we red-teamed our managed attribution posture in the last 12 months?
Required
Specify how automated collection will avoid attribution to the sponsoring agency.
Single choice
Fully non-attributable — commercial ISP / managed attribution
Partially non-attributable — obfuscated but government-linked
Attributable — collection from .mil / .gov egress
Mixed — attributable for passive, non-attributable for active
Not yet defined
Trinidy
Non-attributable collection requires egress isolation: no IP address, user-agent, or TLS fingerprint should trace back to a .mil or .gov ASN. Trinidy supports isolated egress through commercial transit with rotating persona profiles, and keeps the downstream inference pipeline on-prem so the analytical signature never leaves the enclave.
Define latency and refresh tolerance per requirement class
Why This Matters
Latency requirements drive architecture here as strongly as they do in a financial system. A real-time I&W pipeline that must alert within 60 seconds of a trigger event cannot share an orchestration path with hourly narrative clustering. Tiered queues and reserved capacity by requirement class are the common answer; treating every feed as real-time wastes inference capacity on content that was never time-sensitive.
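A minimal Python sketch of queue-level tiering follows, assuming one queue per latency tier and a lookup from requirement class to tier. The tier ceilings mirror the freshness options in this checklist (weekly taken as seven days); the `REQUIREMENT_TIER` mapping and the function names are illustrative assumptions.

```python
from collections import deque
from enum import Enum


class Tier(Enum):
    # Value is the maximum tolerated source-to-analyst lag, in seconds.
    REAL_TIME = 60
    NEAR_REAL_TIME = 15 * 60
    HOURLY = 60 * 60
    DAILY = 24 * 60 * 60
    WEEKLY = 7 * 24 * 60 * 60


# Hypothetical mapping from requirement class to latency tier.
REQUIREMENT_TIER = {
    "I&W": Tier.REAL_TIME,
    "IO tracking": Tier.NEAR_REAL_TIME,
    "trend analysis": Tier.HOURLY,
    "S&T tracking": Tier.DAILY,
    "deep narrative analysis": Tier.WEEKLY,
}

# One queue per tier, so slow work can never sit ahead of real-time work.
queues = {tier: deque() for tier in Tier}


def enqueue(requirement_class: str, item: str) -> Tier:
    """Place an item on the queue for its requirement's tier."""
    tier = REQUIREMENT_TIER[requirement_class]
    queues[tier].append(item)
    return tier


def breaches_sla(tier: Tier, observed_lag_seconds: float) -> bool:
    """True when the observed lag exceeds the tier's ceiling."""
    return observed_lag_seconds > tier.value
```

Comparing observed lag against `tier.value` per item is one simple way to confirm that tiering is enforced in the pipeline and not only in documentation.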
Note prompts
+ Have we classified our requirement portfolio by latency tier?
+ Does our architecture enforce the tiering at the queue level, or only in documentation?
+ What is our actual observed lag from source to analyst per tier?
Recommended
Select the freshness SLA that governs each requirement class.
Select all that apply
Real-time (< 60s) — I&W, crisis monitoring
Near-real-time (1-15 min) — IO tracking, event detection
Hourly — topic classification, trend analysis
Daily — S&T tracking, procurement analysis
Weekly or on-demand — deep narrative analysis
Scope commercial platform Terms of Service compliance
Why This Matters
Large-scale automated collection from Twitter/X, Telegram, Meta, VK, and similar platforms frequently violates the platform ToS and exposes the sponsoring agency to legal challenge. The hiQ Labs v. LinkedIn case history continues to evolve and does not cleanly protect IC collection. Authorized commercial firehose access (e.g., X API, Dataminr, Babel Street) is the defensible path; direct scraping against ToS is not, even when technically achievable.
Note prompts
+ Have we documented which platforms we collect from via authorized API vs. scraping?
+ Is OGC / SJA in the loop on each platform category, and is the legal memo current?
+ Do we have a vendor strategy (Dataminr, Babel Street, Recorded Future) for authorized firehose access?
Required
Confirm legal review of automated collection against platform ToS.
Confirm cross-border and foreign-persons collection posture
Why This Matters
Section 702 of FISA is a SIGINT authority, not an OSINT authority. Automating OSINT collection on foreign targets does not require or invoke 702, and conflating the two is a common scoping error that invites both legal and oversight risk. Conversely, GDPR and UK GDPR do apply to EU/UK persons whose content is collected from US territory for intelligence purposes where the data handler has a nexus with the EU/UK. Separately, right-of-publicity laws vary by US state and can constrain automated use of celebrity or public-figure imagery.
Note prompts
+ Is our legal memo clear that 702 is out of scope for OSINT automation?
+ Have we mapped our collection targets to GDPR / UK GDPR applicability?
+ Do we handle right-of-publicity risk for high-profile individuals in our collection?
Recommended
Select the jurisdictions where collection is authorized and where GDPR or local law constraints apply.
Select all that apply
EU persons — GDPR constraints apply when collected from US territory
UK persons — UK GDPR constraints
FVEY partner nations — bilateral arrangements
Adversary states — E.O. 12333 collection authorized
Right-of-publicity variations (US state law) for celebrity / public-figure content
Section 702 FISA boundary — not an OSINT authority (do not conflate)
Deployment Environment
Specify the deployment topology for inference and fusion
Required
Select the physical/logical deployment target for OSINT inference and classified fusion.
Single choice
Unclassified ingest + classified on-prem fusion (diode)
Fully classified on-prem (JWICS / SIPR SCIF)
Hybrid — commercial cloud for ingest, on-prem for fusion
IL5 / IL6 government cloud (Azure Gov, AWS GovCloud, Oracle Gov)
Coalition-shared infrastructure (FVEY / NATO)
Trinidy
Commercial OSINT ingest is unclassified, but the moment it is correlated with a classified selector the entire fusion stack inherits the higher classification. Trinidy keeps the fusion, enrichment, and analyst-facing layers inside the SCIF / SIPR enclave; commercial feeds terminate at a one-way diode ingest into the classified side.
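The inheritance rule for fused outputs amounts to a high-water mark: the result carries the highest classification of anything that went into it. Below is a minimal Python sketch under a deliberately simplified assumption of a single linear ordering; real markings also carry compartments, caveats, and releasability that a single integer cannot express. The `Level` enum and `fused_level` function are hypothetical names.

```python
from enum import IntEnum


class Level(IntEnum):
    # Simplified linear ordering, for illustration only.
    UNCLASSIFIED = 0
    CUI = 1
    SECRET = 2
    TS_SCI = 3


def fused_level(*inputs: Level) -> Level:
    """Return the high-water mark across all inputs to a fusion step."""
    if not inputs:
        raise ValueError("a fusion step needs at least one input")
    return max(inputs)


# An unclassified feed correlated with a SECRET selector yields SECRET.
result = fused_level(Level.UNCLASSIFIED, Level.SECRET)
print(result.name)
```

Computing the level at the fusion step, and not per source, is what forces the enrichment and analyst-facing layers onto the classified side.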
Confirm JWCC / SEWP contract vehicle for vendor procurement
Why This Matters
JWCC is the primary vehicle for procuring OpenAI, Anthropic, and Google Gemini-class commercial LLMs inside DoD, and Babel Street operates under IDIQ Multiple-Award Contracts that are frequently already in place at IC elements. Procurement vehicle choice drives cybersecurity impact level (IL4/5/6), contract ceiling, and timeline — programs that pick the vehicle late find themselves re-architecting to match what they can actually buy.
Recommended
Identify the procurement vehicle for commercial AI and OSINT services.
Select all that apply
JWCC (Joint Warfighting Cloud Capability) — AWS / Azure / Google / Oracle
SEWP VI / NITAAC — hardware and software
GSA IT Schedule 70 / MAS
Babel Street IDIQ Multiple-Award Contracts
In-Q-Tel strategic investment pipeline
Agency-specific OTA / CSO