Phase 1 of 6
Scoping & Latency Constraints
Define the channels, time-to-first-token budget, language coverage, PII surface, and regulatory footprint that will govern every architectural decision for the conversational AI stack.
Channels & Interaction Surface
Identify channels the assistant must serve
Why This Matters
Channel selection materially changes the latency envelope, the PII surface, and the hallucination risk profile. Voice channels demand sub-800ms time-to-first-token to preserve conversational cadence, while in-app chat tolerates 1–2 seconds before users perceive lag. Bank of America's Erica runs across mobile, voice, and web with a 48-second average interaction — a design point only reachable when channel-specific budgets are set explicitly. The most common mistake is treating every channel as a single assistant; the latency and compliance envelope differ by channel, not by use case.
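The per-channel envelopes above can be captured as an explicit budget table so no channel silently inherits another's constraints. A minimal sketch, assuming hypothetical channel names and illustrative figures taken from the text (sub-800ms for voice, 1–2s for in-app chat):

```python
# Hypothetical per-channel budget table; channel names and the "pii_surface"
# labels are illustrative assumptions, not a standard taxonomy.
CHANNEL_BUDGETS = {
    "voice":       {"ttft_ms_p95": 800,  "pii_surface": "high",   "human_handoff": True},
    "in_app_chat": {"ttft_ms_p95": 1500, "pii_surface": "high",   "human_handoff": True},
    "web_chat":    {"ttft_ms_p95": 2000, "pii_surface": "medium", "human_handoff": True},
}

def budget_for(channel: str) -> dict:
    """Look up the latency/compliance envelope for a channel; fail loudly on unknown surfaces."""
    try:
        return CHANNEL_BUDGETS[channel]
    except KeyError:
        raise ValueError(f"no budget defined for channel {channel!r}")
```

Failing loudly on an unknown channel enforces the point above: every surface gets its own explicit budget before it goes live, rather than defaulting to a shared one.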
Note prompts
+ Which channels share enough context to justify a single assistant versus channel-specific tuning?
+ Have we inventoried the p95 latency budget for each channel before selecting a model?
+ Who owns the channel-by-channel handoff to a human agent when the assistant cannot resolve?

Confirm every surface on which the chatbot or virtual assistant will answer customer queries.
Select all that apply
Define time-to-first-token (TTFT) SLA by channel
Why This Matters
Perceived latency in conversational AI is dominated by time-to-first-token, not end-to-end generation time — users judge the assistant as "fast" based on how quickly it starts responding. TensorRT-LLM on H100 delivers ~100ms TTFT at 64 concurrent requests, and vLLM v0.6.0 cut TTFT by 5× versus prior releases — the serving substrate now matters as much as the model size. A cache-hit path through semantic caching returns in 5–20ms, while a cache miss requires the full LLM inference — so the effective TTFT is the blended average and hinges on cache hit rate.
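The blended-TTFT point above can be made concrete with a weighted average over the hit and miss paths. A minimal sketch; the 70% hit rate and the 10ms/300ms path latencies in the usage comment are illustrative assumptions, not measurements:

```python
def blended_ttft_ms(hit_rate: float, hit_ms: float, miss_ms: float) -> float:
    """Effective TTFT as the hit-rate-weighted average of cache-hit and cache-miss paths."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be in [0, 1]")
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms

# Illustrative: a 70% semantic-cache hit rate, 10ms hits, 300ms misses
# gives 0.7*10 + 0.3*300 = 97ms effective mean TTFT.
```

Note this is a mean, not a p95: at a 70% hit rate the p95 request still lands on the miss path, which is why the prompts above ask for hit and miss paths to be measured separately.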
Note prompts
+ What is our current p95 TTFT by channel, and where is the hot spot — retrieval, serving, or network?
+ Have we measured TTFT separately for cache-hit versus cache-miss paths?
+ What does the assistant do when TTFT is breached — keep streaming, time out, or hand off?

Select the TTFT budget your conversational stack must hold at p95 under peak load.
Single choice
Trinidy — Cloud-routed LLM inference consumes 100–300ms of network round-trip before a single token is produced — often half of the perceived latency budget. Trinidy collocates the serving tier with the semantic cache and RAG retriever, keeping TTFT predictable even during traffic spikes.
Define end-to-end response completion SLA
Specify the p95 full-response latency target distinct from TTFT.
Single choice
Specify language and dialect coverage
Why This Matters
Wells Fargo has publicly reported that Spanish accounts for more than 80% of Fargo's non-English usage — language coverage is not a nice-to-have, it is a primary product decision. Hallucination rates and safety-tuning quality differ materially across languages in frontier models, and many guardrail evaluations are English-only. Shipping a chatbot that is fluent in English and unreliable in Spanish creates a measurable fair-lending exposure in addition to a CX problem.
Note prompts
+ What is our customer base's language distribution, and does our assistant match it?
+ Do we evaluate hallucination and guardrail performance per language, or only in English?
+ Is our RAG corpus available in every supported language, or is non-English a translation-only surface?

Confirm language support, with particular attention to Spanish and other high-volume non-English segments.
Select all that apply
Map the PII surface entering LLM context
Why This Matters
The CFPB June 2023 Issue Spotlight specifically named account numbers, transaction histories, SSNs, beneficiary designations, and health-related financial data (HSAs/FSAs) as categories of sensitive PII that financial chatbots routinely put into LLM context — every one of which triggers GLBA Safeguards Rule and CFPA obligations. Cloud-routed LLM inference means that context becomes a third-party processor relationship, not just an engineering choice. Wells Fargo's solution — voice input locally transcribed, SLM scrubs PII, only anonymized text reaches the external LLM — is the architectural pattern that allowed 11.5× interaction growth without compliance exposure.
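The scrub-before-send pattern described above can be sketched as a local anonymization stage that runs before any text crosses the perimeter. This is a simplified regex stand-in for the SLM scrubber the text describes; a production system would use a trained NER/SLM model, and the patterns here are illustrative assumptions that will miss many PII forms:

```python
import re

# Illustrative patterns only -- regexes alone are not a compliant PII scrubber.
PII_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),          # SSN-shaped strings
    (re.compile(r"\b\d{12,19}\b"), "[ACCOUNT_NUMBER]"),       # account/card-length digit runs
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),  # email addresses
]

def scrub(text: str) -> str:
    """Replace detected PII with placeholders so only anonymized text leaves the perimeter."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

The architectural point is the placement, not the patterns: scrubbing happens inside the institution's perimeter, so the external LLM provider only ever receives placeholder tokens.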
Note prompts
+ Which PII categories reach our LLM context today, and which reach a third-party LLM provider?
+ Do we have a local PII scrubbing / anonymization layer, or does raw customer text hit the LLM directly?
+ Have we mapped the LLM provider relationship against GLBA service-provider requirements?

Inventory every PII category that may be placed into the LLM context window.
Select all that apply
Confirm data residency and cross-border constraints
Map conversational context and retrieval corpora to jurisdictional constraints before architecture is finalized.
Select all that apply
Trinidy — GLBA Safeguards, CCPA/CPRA, BIPA (for voice), and EU GDPR all press against cloud-hosted LLM serving. Trinidy keeps the RAG index, PII scrubbing, and audit logging entirely within the institution's perimeter — no cross-border flow of customer dialogue for any interaction.
Define scope of consumer-facing statutory rights handling
Why This Matters
The CFPB August 2024 guidance explicitly stated that AI chatbot errors that fail to recognize a consumer's invocation of statutory rights — Reg E dispute notices being the canonical example — may constitute UDAAP violations, with no "AI error" defense available. Reg E starts a regulatory clock (10 business days to investigate, 45 days to resolve); a chatbot that confidently answers a dispute question without triggering the Reg E process has created a regulatory liability, not an ops issue. Statutory-rights recognition must be a first-class routing decision, not a side effect of intent classification.
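"First-class routing decision" can be sketched as a deterministic gate that runs before any LLM generation: if a protected statutory notice is detected, the utterance is routed to a compliant workflow instead of a free-form answer. The intent names and trigger phrases below are illustrative assumptions; a real system would use a trained classifier with these intents as explicit labels:

```python
# Hypothetical pre-LLM gate; phrase lists are illustrative, not exhaustive.
STATUTORY_INTENTS = {
    "reg_e_dispute":       ["unauthorized transaction", "dispute this charge",
                            "didn't make this payment"],
    "reg_z_billing_error": ["billing error", "wrong amount on my statement"],
    "fcra_dispute":        ["credit report error", "dispute my credit report"],
}

def route(utterance: str) -> str:
    """Return a statutory workflow name if a protected right is invoked, else 'llm_answer'."""
    lowered = utterance.lower()
    for intent, phrases in STATUTORY_INTENTS.items():
        if any(phrase in lowered for phrase in phrases):
            # A detection here should also be timestamped and logged for regulatory
            # audit, per the prompts below -- it starts the Reg E clock.
            return intent
    return "llm_answer"
```

The design choice is that the gate is deterministic and runs first: a confident LLM answer can never pre-empt the statutory workflow, which is exactly the failure mode the guidance above describes.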
Note prompts
+ Does our intent classifier have first-class intents for every protected statutory notice a consumer might give?
+ When a statutory right is invoked, does the assistant hand off to a compliant workflow rather than attempting to answer?
+ Do we log the moment a statutory-rights intent was detected for regulatory audit?

Specify how the assistant recognizes and routes consumer invocations of statutory rights (Reg E dispute, Reg Z billing error, FCRA, etc.).
Select all that apply
Specify deployment topology for the serving plane
Select the physical/logical deployment target for the LLM serving tier and the RAG retriever.
Single choice
Trinidy — For PII residency and sub-second TTFT, cloud-API-only serving is economically and regulatorily fragile at Wells Fargo / BofA scale. Trinidy provides on-prem vLLM / TensorRT-LLM serving with the semantic cache and RAG index collocated — the entire hot path stays inside the institution's perimeter.