Phase 1 of 6
Scoping & Case Load Profile
Before any agent runs, define the alert queue shape, analyst tiers, investigation SLAs, and SAR filing cadence that will govern orchestration design and human-in-the-loop checkpoints.
Alert Volume & Queue Profile
Baseline daily alert volume and queue composition
Why This Matters
A mid-tier bank running 5M daily transactions at a 1.5% false positive rate generates roughly 75,000 unnecessary alerts per day, and analysts can only review 30–50 per hour — the arithmetic of the queue is what drives the agentic case for pre-investigation. The right orchestration design for 500 alerts/day is almost nothing like the right design for 75,000/day, so sizing the queue honestly is the first decision, not a later optimization. Institutions that under-report alert volume typically do so because they have already silently closed large portions of the queue without investigation.
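The queue arithmetic above can be made concrete with a quick sizing calculation. All inputs here are the illustrative figures from the text (5M transactions, 1.5% false positive rate, the 30–50/hour review ceiling), not institutional data, and the shift-hours assumption is ours:

```python
# Back-of-envelope queue sizing from the figures in the text.
# Every input is an illustrative assumption, not institutional data.

DAILY_TRANSACTIONS = 5_000_000    # mid-tier bank example
FALSE_POSITIVE_RATE = 0.015       # 1.5% of transactions alert unnecessarily
REVIEWS_PER_ANALYST_HOUR = 40     # midpoint of the 30-50/hour ceiling
SHIFT_HOURS = 7                   # assumed productive review hours per analyst-day

daily_false_positives = DAILY_TRANSACTIONS * FALSE_POSITIVE_RATE
analyst_days_needed = daily_false_positives / (REVIEWS_PER_ANALYST_HOUR * SHIFT_HOURS)

print(f"False-positive alerts/day: {daily_false_positives:,.0f}")
print(f"Analyst-days just to clear them: {analyst_days_needed:.0f}")
```

Re-running this with your own volume and false-positive rate is the honest sizing exercise this item asks for; the 268 analyst-days implied by the example is what makes agentic pre-investigation an arithmetic necessity rather than a nicety at that scale.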
Note prompts
+ What is our true daily alert volume, including alerts auto-closed without human review?
+ What percentage of our alert queue has never received a substantive investigation?
+ Do we measure analyst throughput in alerts/hour, and how close are we to the 30–50/hour ceiling?

Select the approximate daily transaction-monitoring alert volume that will enter the agentic investigation workflow.
Single choice
Define target SAR filing volume and cadence
Why This Matters
The SAR is not an optional artifact — it is the primary regulatory deliverable the agent must help produce, and the 30-day initial filing deadline (with 90-day continuing cycles) is non-negotiable under BSA 31 CFR 1020.320. Cyber-SARs specifically grew 30% year-over-year as AI-enabled attacks accelerated, and institutions that have grown volume without agent support typically slip on the 30-day clock first. Dimensioning the SAR workload up front is what lets you design an agent that meets the clock at your actual institutional scale.
Note prompts
+ What was our SAR filing volume in the last two calendar years, and what was the YoY delta?
+ How many SARs did we file late in the last 12 months, and what was the root cause in each case?
+ Do we separately track continuing-activity SAR cadence, or is it folded into the same queue?

FinCEN received 4.105 million SARs in 2025 (+7.99% YoY), with bank/savings/credit union filings at 2.193 million. Baseline your institution's share.
Single choice
Define analyst tiers and case ownership model
Why This Matters
The agent does not replace a tier — it changes the ratio between tiers, and the institution must decide which tier absorbs the capacity gain before the deployment goes live. McKinsey's 20+ agents per human supervision model implies near-complete elimination of L1 and redeployment of L2 against more complex cases, which is an organizational decision, not a technical one. An agent deployed without a target-state tier model tends to produce throughput that analysts immediately re-absorb into deeper per-case review rather than clearing the queue.
Note prompts
+ Which tier's headcount is held constant, which grows, and which shrinks post-deployment?
+ Has HR / people ops been brought into the tiering conversation, or is it purely an ops design?
+ Does our BSA Officer approve the target-state tier ratio, given they sign every SAR?

Select the investigation tiering model the agent must integrate with.
Select all that apply
Set investigation SLA and 30-day SAR clock posture
Why This Matters
Running an internal SLA that equals the regulatory ceiling leaves no room for any operational disruption — a single analyst sick day or a system outage can put filings past 30 days, and FinCEN exam findings on late filings are both frequent and publicly visible. Institutions that hit the 30-day window reliably almost always run internal SLAs at 14–21 days to absorb real-world friction, and agentic case assembly is what makes that tighter SLA economically possible at the current staffing footprint.
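The buffer logic above can be made concrete by deriving both deadlines from the detection date. The 21-day internal SLA is the upper end of the 14–21 day band cited in the text, and the detection date is an arbitrary illustrative example:

```python
from datetime import date, timedelta

REGULATORY_CEILING_DAYS = 30   # BSA initial SAR filing window, per 31 CFR 1020.320
INTERNAL_SLA_DAYS = 21         # upper end of the 14-21 day band discussed above

def sar_deadlines(detection: date) -> tuple[date, date]:
    """Return (internal SLA deadline, regulatory filing deadline) for a case."""
    return (detection + timedelta(days=INTERNAL_SLA_DAYS),
            detection + timedelta(days=REGULATORY_CEILING_DAYS))

internal, regulatory = sar_deadlines(date(2025, 3, 3))
buffer_days = (regulatory - internal).days
print(internal, regulatory, buffer_days)
```

The nine days of slack between the two deadlines is what absorbs the sick days and outages the text describes; an internal SLA set at the regulatory ceiling makes `buffer_days` zero by construction.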
Note prompts
+ What was our median case age at SAR filing in the last quarter?
+ How many filings went past 30 days in the last year, and what did the remediation cost?
+ Is our SLA expressed per case or as a queue-level percentile?

BSA requires initial SAR filing within 30 calendar days of detection. Define your internal SLA relative to that regulatory ceiling.
Single choice
Capture FRAML convergence posture
Why This Matters
Siloed fraud and AML systems create blind spots — a money mule network visible in AML transaction data is invisible to the fraud team's case file, and an agent that can only see one side reproduces that blind spot programmatically. Hawk AI's 2024 survey found 77% of FRAML respondents expect >$1M savings within 5 years and 50% of early adopters have already saved >$5M. The convergence decision shapes every downstream data source and graph traversal choice, so it has to be made before agent scope is finalized rather than after.
Note prompts
+ Is our target-state agent scope fraud-only, AML-only, or converged FRAML?
+ Do we currently have a technical path for the agent to traverse both fraud and AML data stores?
+ Who owns the convergence P&L — fraud, AML, or a unified financial crime function?

93% of US mid-market banks are pursuing fraud + AML convergence (Hawk AI 2024). Define where your program sits.
Single choice
Specify agent latency and streaming UX targets
Why This Matters
The observable difference between an agentic tool analysts adopt and one they abandon is whether output starts appearing on the screen inside the first half-second. A 15–25 second batch wait is perceptually indistinguishable from a frozen UI, and analysts who have to wait through it tend to alt-tab back to their existing 8–12 systems and never return. Token streaming is not a UX nicety; it is the reason the agent ever sees a second case from the same analyst.
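The perceived-latency gap can be sketched with a simple onset model. The token count and decode throughput are illustrative assumptions; the first-token latency is taken from the 200–400ms band cited in this item:

```python
# Perceived onset: when does the analyst first see output?
# Streaming shows the first token immediately; batch waits for the full completion.
TOKENS = 800                  # assumed length of a case-assembly summary
TOKENS_PER_SECOND = 40        # assumed decode throughput of the serving stack
FIRST_TOKEN_LATENCY_S = 0.25  # 250 ms, inside the 200-400 ms band above

streaming_onset_s = FIRST_TOKEN_LATENCY_S
batch_onset_s = FIRST_TOKEN_LATENCY_S + TOKENS / TOKENS_PER_SECOND

print(f"streaming onset: {streaming_onset_s * 1000:.0f} ms")
print(f"batch onset: {batch_onset_s:.1f} s")
```

Under these assumptions the batch onset lands at roughly 20 seconds, inside the 15–25 second "frozen UI" band the text describes, while streaming onset stays under the half-second adoption threshold.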
Note prompts
+ What is our current streaming onset on the proposed agent runtime, measured under load?
+ Have we piloted streaming vs. batch with real analysts and measured return usage?
+ Is our target streaming budget realistic given our chosen LLM and hardware?

Token streaming onset at 200–400ms vs. 15–25 second batch delivery is the difference between analyst adoption and abandonment.
Single choice
Trinidy — NEXUS OS inference serving supports token streaming natively — case assembly appears on the analyst screen within 200–400ms of agent initiation rather than after a full batch completion. On-premises token streaming is required for adoption at scale; cloud round-trip jitter alone puts streaming onset past the abandonment threshold.
Confirm case data residency and sovereignty constraints
Investigation data includes PII, transaction history, and unfiled SAR narratives — residency and sovereignty rules often foreclose cloud-hosted agent runtimes.
Select all that apply
Trinidy — Goldman Sachs uses Anthropic Claude via sovereign deployment for fraud triage and compliance — the model reasons over proprietary transaction data, customer records, and investigation histories that never leave the firm's infrastructure. Trinidy is the equivalent substrate for institutions that cannot route case data to a public cloud endpoint.
Define agent autonomy level and human-in-the-loop checkpoints
Why This Matters
McKinsey documents agentic productivity uplifts of 200–2,000% under supervision models where one human oversees 20+ AI agents on triage, evidence gathering, and narrative drafting — but those numbers assume the autonomy level matches the institution's regulatory risk appetite, not just the technical capability. Regulated workflows typically require deterministic graph-based orchestration (LangGraph) with explicit human-in-the-loop breakpoints at SAR filing decisions; purely conversational autonomy (AutoGen) is better suited to hypothesis generation upstream of the filing decision.
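The deterministic-graph-with-breakpoint pattern described above can be sketched in plain Python, independent of any specific framework. The stage names and the `require_human` flag are illustrative assumptions; the only load-bearing idea is that the filing stage cannot execute without an explicit human approval recorded against it:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    run: Callable[[dict], dict]
    require_human: bool = False   # hard breakpoint: agent may not proceed alone

# Hypothetical investigation stages for illustration only.
def triage(case): return {**case, "triage": "escalate"}
def gather_evidence(case): return {**case, "evidence": ["txn history", "KYC file"]}
def draft_narrative(case): return {**case, "narrative": "draft SAR narrative"}
def file_sar(case): return {**case, "filed": True}

PIPELINE = [
    Stage("triage", triage),
    Stage("evidence", gather_evidence),
    Stage("narrative", draft_narrative),
    Stage("file_sar", file_sar, require_human=True),  # filing is always human-approved
]

def run_case(case: dict, approvals: set[str]) -> dict:
    """Advance a case through the pipeline, parking it at any unapproved breakpoint."""
    for stage in PIPELINE:
        if stage.require_human and stage.name not in approvals:
            case["halted_at"] = stage.name   # park for analyst sign-off; resume later
            return case
        case = stage.run(case)
    return case

# Without approval the case parks before filing; with approval, the SAR files.
parked = run_case({"id": "C-1"}, approvals=set())
filed = run_case({"id": "C-1"}, approvals={"file_sar"})
```

The point of the sketch is auditability: the parked case carries an explicit `halted_at` record, which is exactly the artifact an examiner would want when asking whether a SAR was "robot-filed" or human-approved.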
Note prompts
+ At what investigation step does our current design require mandatory human approval?
+ Have we documented the autonomy level per typology, or is it uniform across all cases?
+ How will we prove to an examiner that a SAR filed under supervised autonomy is not a "robot-filed" SAR?

Select the autonomy posture for the agent across investigation stages.
Single choice