Phase 1 of 6
Scoping & NOC Use-Case Constraints
Define the NOC personas, query classes, latency budget, and data-residency boundary that will govern every subsequent architectural decision for the GenAI assistant.
NOC Surface & Personas
Identify NOC personas served by the GenAI copilot
Why This Matters
Persona directly dictates prompt framing, tool access, and the groundedness bar — a Tier-1 operator needs a short, prescriptive runbook step while a Tier-3 architect needs citation density and willingness to say "I do not have enough evidence." Mixing personas behind a single system prompt is the fastest way to make the copilot useless to everyone. Public case studies from AT&T Network Copilot and Nokia AVA both report that persona-specific prompt design, not model size, drove adoption.
Note prompts
+ Which personas are in scope for v1 and which will be deferred to v2?
+ Have we interviewed 3–5 engineers per persona to capture the 20 most common queries they run today?
+ What is our escape-hatch when the assistant is uncertain — escalate to human, cite source, refuse to answer?
Confirm which engineering roles will query the assistant in production — each persona has a different tolerance for latency, depth, and confidence thresholds.
Select all that apply
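The persona split above can be made concrete as a per-persona response profile keyed off the authenticated user. This is a minimal sketch: the persona names, token budgets, citation counts, and prompt wording are illustrative assumptions, not values from any real deployment.

```python
# Illustrative persona profiles. Names, budgets, and prompts are assumptions
# for sketching purposes only, not values from a production copilot.
from dataclasses import dataclass

@dataclass(frozen=True)
class PersonaProfile:
    system_prompt: str        # persona-specific framing, never shared across tiers
    max_response_tokens: int  # Tier-1 wants a short, prescriptive step
    min_citations: int        # Tier-3 expects citation density
    may_refuse: bool          # "not enough evidence" is a valid answer

PROFILES = {
    "tier1_operator": PersonaProfile(
        system_prompt="Answer with the single next runbook step. Be prescriptive.",
        max_response_tokens=150, min_citations=1, may_refuse=True),
    "tier3_architect": PersonaProfile(
        system_prompt="Answer with cited evidence; say so if evidence is insufficient.",
        max_response_tokens=800, min_citations=3, may_refuse=True),
}

def profile_for(persona: str) -> PersonaProfile:
    # Fail closed: an unknown persona gets no default system prompt.
    if persona not in PROFILES:
        raise KeyError(f"no prompt profile defined for persona {persona!r}")
    return PROFILES[persona]
```

Keeping deferred v2 personas out of `PROFILES` entirely, rather than giving them a generic fallback prompt, is one way to enforce the v1/v2 scoping decision in code.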
Classify in-scope query types
Why This Matters
Answer-generating tasks (runbook drafting, RCA hypothesis) carry hallucination risk and need strict RAG grounding, while retrieval-heavy tasks (KB lookup, incident search) are closer to semantic search and can tolerate a lighter model. Mixing them behind one endpoint without query-class routing causes the copilot to over-generate on lookup tasks and under-ground on generative tasks — the worst of both. A query-class router is a first-order design decision, not an afterthought.
Note prompts
+ Which query classes carry real operational risk if the copilot is wrong, and which are low-stakes?
+ Have we defined a refuse-to-answer class for out-of-scope queries (e.g. security, billing, HR)?
+ Are runbook-generation responses gated through a human-in-the-loop before automation fires?
Select the query classes the assistant is expected to answer in v1. Each class has different retrieval, grounding, and safety requirements.
Select all that apply
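The query-class router described above can be sketched as a classify-then-route step in front of the endpoint. A production router would use a trained classifier; the keyword rules and class names here are illustrative assumptions that only demonstrate the routing decision itself.

```python
# Minimal query-class router sketch. Keyword matching stands in for a real
# classifier; the point is the three-way split, not the classification rule.
def classify(query: str) -> str:
    q = query.lower()
    # Refuse-to-answer class: security, billing, HR are out of remit.
    if any(w in q for w in ("password", "invoice", "salary")):
        return "out_of_scope"
    # Answer-generating classes (runbook drafting, RCA) need strict grounding.
    if any(w in q for w in ("draft", "write", "why", "root cause")):
        return "generative"
    # Everything else is retrieval-heavy (KB lookup, incident search).
    return "retrieval"

def route(query: str) -> str:
    cls = classify(query)
    if cls == "out_of_scope":
        return "refuse"
    return "grounded_generation" if cls == "generative" else "semantic_search"
```

Routing lookup queries to `semantic_search` avoids the over-generation problem the text describes: a lighter retrieval path never invents prose around a KB hit.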
Establish query-response latency SLO
Why This Matters
NOC operators abandon tools that show no feedback within ~3 seconds during an active incident — the copilot needs time-to-first-token under 1.5s even if the full response runs longer. Longer-running RCA queries are acceptable when a streamed response shows visible progress, but only if the streamed tokens are already grounded. Fixing the SLO before the architecture is chosen prevents the all-too-common pattern of shipping a copilot that is technically brilliant and operationally unused.
Note prompts
+ What is our measured p95 time-to-first-token today on comparable internal RAG endpoints?
+ Have we validated that our vector store can return top-k under 100ms at NOC-scale corpus size?
+ Is there a visible streaming UX so engineers see progress before the full answer arrives?
Select the p95 time-to-first-token and p95 full-response budget the copilot must hold under peak NOC load.
Single choice
Trinidy — Public LLM APIs routinely show p99 time-to-first-token excursions of 5–15 seconds under load, which is unusable during a P1 incident. Trinidy hosts the model on-node with predictable latency characteristics — 2–15 second responses hold under burst NOC traffic.
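Measuring time-to-first-token against the 1.5s budget is straightforward once responses stream: time how long the first chunk takes, then take the p95 over many samples. This sketch uses a simulated stream with made-up delays in place of a real completion call.

```python
# Sketch: measure p95 time-to-first-token on a streaming endpoint.
# `fake_stream` stands in for a real streamed completion; the 10ms delay
# is a placeholder, not a measured value.
import time
import statistics

def fake_stream(delay_s: float):
    time.sleep(delay_s)          # stand-in for model queueing + prefill
    yield "first-token"
    yield "rest-of-answer"

def time_to_first_token(stream) -> float:
    start = time.perf_counter()
    next(iter(stream))           # block until the first token arrives
    return time.perf_counter() - start

samples = [time_to_first_token(fake_stream(0.01)) for _ in range(20)]
p95 = statistics.quantiles(samples, n=100)[94]   # 95th percentile cut point
assert p95 < 1.5, "SLO breach: p95 time-to-first-token above 1.5s budget"
```

Sampling under synthetic burst load, not idle conditions, is what makes the number comparable to the "peak NOC load" wording of the SLO.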
Define MTTR / productivity target and measurement plan
Why This Matters
Published operator benchmarks cluster in a narrow range — Telia/Ericsson at 38% MTTR reduction, Nokia/Elisa at 67% Tier-1 auto-resolution, Gartner CSP hype cycle projecting 25–40% MTTR improvement for early-production deployments. Anchoring the business target to these externally validated numbers makes the program defensible to finance and credible to the board when intermediate metrics are still noisy. Without an explicit target, the rollout tends to be evaluated on anecdotes rather than A/B measurement.
Note prompts
+ What is our measured MTTR baseline per incident severity today, and how reliable is it?
+ Who owns the P&L line for NOC productivity and do they agree to the target?
+ Do we have an A/B or shadow-cohort plan, or will we infer impact from before/after trends?
Quantify the business outcome the copilot is accountable for — without it, rollout becomes a feature demo, not a program.
Single choice
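The target-setting arithmetic is simple once a per-severity baseline exists: pick a point in the externally cited 25–40% improvement range and apply it to the measured baseline. The per-severity sample values below are illustrative, not real incident data.

```python
# Sketch: turn a measured MTTR baseline into a defensible target using the
# conservative end of the cited 25-40% improvement range. Baseline values
# below are illustrative placeholders, not measurements.
import statistics

baseline_minutes = {            # MTTR samples per incident severity
    "P1": [42, 55, 38, 61, 47],
    "P2": [120, 95, 140, 110],
}

def mttr_target(samples, reduction=0.25):
    baseline = statistics.median(samples)    # median resists outlier incidents
    return baseline, baseline * (1 - reduction)

p1_baseline, p1_target = mttr_target(baseline_minutes["P1"])
```

Using the median rather than the mean keeps one marathon P1 from distorting the baseline, which matters when the A/B or shadow-cohort comparison is still noisy.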
Confirm data residency and sovereignty constraints
Map where network event data, topology, and customer identifiers may legally be processed — the answer determines cloud vs. private deployment.
Select all that apply
Trinidy — GDPR, UK GDPR, and most operator licence conditions create tension with cross-border cloud LLM APIs. Trinidy hosts the full RAG pipeline — retrieval, generation, and session context — inside the operator's own perimeter, so network topology and alarm data never cross a jurisdictional boundary.
Specify deployment topology for the GenAI stack
Select the physical/logical target for model hosting, retrieval, and session state.
Single choice
Map integration with existing OSS / BSS / SMO estate
Why This Matters
The GenAI copilot does not replace OSS/BSS — it is a retrieval-and-reasoning layer on top of existing authoritative systems, and that relationship has to be explicit. Attempting to replicate OSS state inside the vector store creates two sources of truth and eventually silent drift between them. TMF AI/ML guidance and the O-RAN Alliance WG2 architecture both treat the GenAI layer as an overlay that calls into AI/ML functions and SMO APIs rather than owning them.
Note prompts
+ Which authoritative system owns each data class the copilot cites, and is that ownership documented?
+ Do we plan to embed the copilot inside an OSS console or run it as a standalone surface?
+ Have we scoped which vendor platforms expose a stable retrieval API vs. which need a bespoke connector?
Select the operator platforms the copilot must query or embed within. These are the authoritative data sources — not the corpus itself.
Select all that apply
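The overlay relationship described above, where the copilot calls the authoritative system at answer time instead of replicating its state into the vector store, can be sketched as a call-through connector. `InventoryAPI` and its method names are hypothetical stand-ins, not a real OSS interface.

```python
# Sketch of the overlay pattern: fetch live state from the authoritative
# system at answer time rather than caching topology in the vector store,
# so there is exactly one source of truth and no silent drift.
class InventoryAPI:
    """Hypothetical stand-in for an authoritative OSS inventory service."""
    def __init__(self):
        self._cells = {"cell-001": {"status": "active", "site": "LON-3"}}

    def get_cell(self, cell_id):
        return self._cells.get(cell_id)

class CopilotRetriever:
    def __init__(self, inventory: InventoryAPI):
        self.inventory = inventory   # call through; never copy topology locally

    def cite_cell(self, cell_id: str) -> str:
        record = self.inventory.get_cell(cell_id)   # live authoritative read
        if record is None:
            return f"no authoritative record for {cell_id}; refusing to guess"
        return f"{cell_id}: {record['status']} at {record['site']} (source: inventory)"
```

The vector store then holds only the unstructured corpus (runbooks, KB articles), while anything the copilot cites as current network state comes from a live read like this one.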
Define refusal, escalation, and out-of-scope behavior
Why This Matters
Refusal behavior is where copilots earn trust — engineers will forgive "I don't have enough evidence to answer that" but not a confident fabrication of a cell identifier or an alarm code. Every public deployment that reports high engineer-satisfaction scores also reports investing heavily in refusal UX, not just accuracy. Without an explicit refusal path, groundedness gates (Phase 4) have nowhere to route a failed check.
Note prompts
+ Is refusal a first-class UX state, or does the copilot always emit an answer?
+ When confidence is low, do we surface retrieved evidence so the engineer is not blocked?
+ Is there a measured rate of refusal we target, or is every refusal treated as a failure?
Specify what the copilot does when it cannot ground an answer, when confidence is low, or when the query is outside its remit.
Select all that apply
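Refusal as a first-class response state can be sketched as a three-way gate: escalate out-of-remit queries, refuse but surface evidence when grounding is weak, answer otherwise. The 0.7 grounding threshold and the `Answer` shape are illustrative assumptions, not a calibrated value.

```python
# Sketch: refusal and escalation as first-class response states, so a failed
# groundedness check has somewhere to route. Threshold is an assumption.
from dataclasses import dataclass, field

@dataclass
class Answer:
    state: str                      # "answer" | "refuse" | "escalate"
    text: str = ""
    evidence: list = field(default_factory=list)

def respond(query_in_scope: bool, grounding_score: float, retrieved: list) -> Answer:
    if not query_in_scope:
        return Answer("escalate", "Outside my remit; routing to a human.")
    if grounding_score < 0.7:
        # Low confidence: refuse the generated answer, but surface the
        # retrieved evidence so the engineer is not blocked.
        return Answer("refuse",
                      "I do not have enough evidence to answer that.",
                      evidence=retrieved)
    return Answer("answer", "Grounded answer goes here.", evidence=retrieved)
```

Because refusals still carry `evidence`, the UX can render "here is what I found" instead of a dead end, which is the refusal pattern the satisfaction-score deployments describe.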