Phase 1 of 6
Scoping & Tenant SLA Constraints
Define the tenant mix, latency envelope, sovereignty posture, and commercial metering surface that govern every architectural decision for a multi-tenant edge inference platform.
Tenant Profile & Commercial Surface
Identify tenant archetypes in scope for the edge inference platform
Why This Matters
Tenant archetype dictates isolation posture, SLA envelope, and commercial model — an ISV running a shared-tenancy vision model has nothing in common operationally with a FedRAMP-regulated enterprise tenant requiring dedicated silicon. Forcing both onto a single isolation design is the fastest way to end up unable to sell to either. STL Partners' 2024 operator survey found that 61% of tier-1 operators with active edge-AI-as-a-service offerings segment tenants by isolation class from day one.
Note prompts:
+ Which tenant archetypes account for our first $1M of ARR and have we designed the isolation boundary for them specifically?
+ Do we have different commercial SKUs per archetype or one blended offer that will satisfy neither extreme?
+ Who in our organization owns the go-to-market for each archetype — wholesale, enterprise, or developer ecosystem?
Confirm which tenant classes your platform must serve on day one and within the first 12 months.
Select all that apply
Define per-tenant SLA latency envelope
Why This Matters
AWS Wavelength documentation cites sub-10ms round-trip inference inside Verizon's 5G edge versus 60–80ms to the nearest central region — the gap that makes a carrier-hosted offer structurally differentiated. Setting an overly permissive SLA makes the offer indistinguishable from hyperscaler central cloud; setting an overly tight SLA across every tenant forces the most expensive hardware posture on every site. The SLA must be tiered by commercial SKU before deployment topology is fixed.
Note prompts:
+ What is the tightest per-tenant SLA we are willing to contractually commit to, and at what price premium?
+ Have we measured backhaul latency from our tower footprint to the nearest central cloud region?
+ Does our SLA include cold-start or only warm-path inference, and is that difference disclosed in the contract?
Select the P95/P99 inference latency targets the platform must hold under peak multi-tenant load.
Single choice
Trinidy — Routing inference from a tower site back to a central cloud region adds 40–80ms of backhaul before any compute begins, which is incompatible with sub-20ms enterprise SLAs. Trinidy runs tenant inference at the tower / MEC site with deterministic latency, keeping P99 predictable even under noisy-neighbor conditions.
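A minimal sketch of how tiered P95/P99 targets could be validated against peak-hour latency samples before they are contractually committed. The tenant names, sample distributions, and millisecond thresholds below are illustrative assumptions, not measured values or recommended SLAs.

```python
# Sketch: validate per-tenant P95/P99 inference latency against tiered SLA
# targets. Tenants, sample data, and thresholds are illustrative assumptions.
import random

def percentile(samples, pct):
    """Nearest-rank percentile over a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[rank]

# Hypothetical peak-hour latency samples per tenant (ms).
peak_samples = {
    "tenant-a-dedicated": [random.gauss(9, 2) for _ in range(10_000)],
    "tenant-b-shared":    [random.gauss(14, 4) for _ in range(10_000)],
}

# Hypothetical SLA tiers by commercial SKU.
sla_targets_ms = {
    "tenant-a-dedicated": {"p95": 15, "p99": 18},
    "tenant-b-shared":    {"p95": 20, "p99": 25},
}

for tenant, samples in peak_samples.items():
    p95 = percentile(samples, 95)
    p99 = percentile(samples, 99)
    target = sla_targets_ms[tenant]
    ok = p95 <= target["p95"] and p99 <= target["p99"]
    print(f"{tenant}: P95={p95:.1f}ms P99={p99:.1f}ms "
          f"(targets {target['p95']}/{target['p99']}ms) -> {'PASS' if ok else 'FAIL'}")
```

In practice the samples would come from the platform's own request telemetry at peak multi-tenant load, measured per tenant and per SKU tier rather than simulated.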
Establish metering accuracy and billing-error tolerance
Why This Matters
Billing-grade metering is a first-order architectural concern, not a late-stage instrumentation task — if per-request counts or per-GPU-hour attribution drift by more than a few percent, tenant disputes consume more operations time than the revenue is worth. Carrier finance, revenue assurance, and enterprise procurement all have existing standards for metering error (typically sub-5%) inherited from connectivity billing. An AI-as-a-Service offer that cannot pass revenue assurance review cannot be commercially launched regardless of how good the inference is.
Note prompts:
+ Have we engaged revenue assurance in the platform design, or is metering a purely engineering concern today?
+ What is our measured per-request metering error on current pilots, and how was it validated?
+ Do we meter both per-request and per-GPU-hour, and are the two independently reconcilable?
Per-request and per-GPU-hour metering must meet commercial billing accuracy standards before launch.
Single choice
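A minimal sketch of the reconciliation check revenue assurance would expect before a billing period closes: two independently produced meters for the same tenant compared against a billing-error tolerance. The 2% tolerance and the counter values are assumptions for illustration.

```python
# Sketch: reconcile two independent metering paths for the same tenant against
# a billing-error tolerance. Tolerance and counter values are assumed figures.

BILLING_ERROR_TOLERANCE = 0.02  # assumed revenue-assurance threshold (2%)

def relative_error(primary: float, secondary: float) -> float:
    """Relative drift between two independent meters for the same tenant."""
    if primary == 0:
        return 0.0 if secondary == 0 else float("inf")
    return abs(primary - secondary) / primary

# Hypothetical period totals from two independent pipelines.
per_request_meter  = {"tenant-a": 1_204_331, "tenant-b": 88_410}  # billing meter
gateway_log_counts = {"tenant-a": 1_198_990, "tenant-b": 88_402}  # independent count

for tenant in per_request_meter:
    drift = relative_error(per_request_meter[tenant], gateway_log_counts[tenant])
    status = "OK to invoice" if drift <= BILLING_ERROR_TOLERANCE else "HOLD: dispute risk"
    print(f"{tenant}: meter drift {drift:.2%} -> {status}")
```

The same pattern applies to per-GPU-hour attribution, with scheduler allocation records reconciled against device-level utilization telemetry.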
Define commercial SKU structure
Why This Matters
SKU structure decides the platform's addressable market more than the technology does — a shared on-demand SKU can serve thousands of small developers, while a dedicated-capacity SKU serves a handful of regulated enterprises at 10x margin. Launching with only one SKU forecloses half of each market. Published pricing of $0.50–$2.00/GPU-hour shared and $5–$20/GPU-hour dedicated is consistent with hyperscaler edge benchmarks and is the reference for commercial plausibility.
Note prompts:
+ Do our launch SKUs cover both shared and dedicated tenants, or only one class?
+ Have we benchmarked our pricing against AWS Wavelength / Azure Private MEC / Google Distributed Cloud Edge equivalents?
+ Is reserved monthly capacity priced per site tier so power-constrained sites are not subsidized by dense ones?
Select the pricing and capacity SKUs the platform must support at launch.
Select all that apply
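A small sketch that sanity-checks a draft SKU sheet against the reference price bands cited above ($0.50–$2.00/GPU-hour shared, $5–$20/GPU-hour dedicated). The SKU names and prices are hypothetical.

```python
# Sketch: check draft launch SKUs against the reference price bands cited
# above. SKU names and prices are illustrative assumptions.

REFERENCE_BANDS = {            # USD per GPU-hour
    "shared":    (0.50, 2.00),
    "dedicated": (5.00, 20.00),
}

draft_skus = [
    {"name": "edge-shared-ondemand",   "class": "shared",    "price_per_gpu_hr": 1.40},
    {"name": "edge-dedicated-monthly", "class": "dedicated", "price_per_gpu_hr": 9.50},
]

for sku in draft_skus:
    low, high = REFERENCE_BANDS[sku["class"]]
    in_band = low <= sku["price_per_gpu_hr"] <= high
    verdict = "within" if in_band else "outside"
    print(f"{sku['name']}: ${sku['price_per_gpu_hr']:.2f}/GPU-hr "
          f"({verdict} {sku['class']} band ${low:.2f}-${high:.2f})")
```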
Map data sovereignty and residency posture per tenant class
Why This Matters
Residency is typically a per-tenant contractual boundary, not a platform-wide one — the same platform must enforce EU residency for one tenant while serving a US tenant from US sites and a FedRAMP tenant from government-boundary-only sites. Architecting residency as a policy assertion rather than as physical pinning invites drift: failover, backup, and logging paths all need to respect the boundary. State-level US requirements under CCPA/CPRA and sovereign-cloud contracts materially complicate US-only architectures.
Note prompts:
+ Is residency enforced architecturally (site pinning, region-scoped keys) or by policy assertion?
+ Have we mapped failover and backup paths against each tenant's residency contract?
+ Does our platform expose per-tenant residency attestation that procurement can validate?
Confirm per-tenant data residency and cross-border constraints before site selection is finalized.
Select all that apply
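A minimal sketch of residency enforced architecturally rather than by policy assertion: every placement decision, including failover and backup paths, is checked against the tenant's contractual boundary before it is taken. The tenants, regions, and site identifiers are hypothetical.

```python
# Sketch: enforce per-tenant residency by checking every placement decision
# (primary, failover, backup, logging) against the tenant's contractual
# boundary. Tenants, regions, and sites are assumed examples.

TENANT_RESIDENCY = {
    "tenant-eu-bank":   {"allowed_regions": {"eu-central", "eu-west"}},
    "tenant-us-retail": {"allowed_regions": {"us-east", "us-west"}},
}

SITE_REGION = {
    "tower-fra-012": "eu-central",
    "mec-par-003":   "eu-west",
    "tower-dal-118": "us-east",
}

class ResidencyViolation(Exception):
    pass

def assert_placement(tenant: str, site: str, path: str) -> None:
    """Reject any placement (primary, failover, backup) outside the boundary."""
    region = SITE_REGION[site]
    allowed = TENANT_RESIDENCY[tenant]["allowed_regions"]
    if region not in allowed:
        raise ResidencyViolation(
            f"{path} placement of {tenant} on {site} ({region}) "
            f"violates residency boundary {sorted(allowed)}")

# Failover paths must pass the same check as primary placement.
assert_placement("tenant-eu-bank", "mec-par-003", path="failover")    # allowed
# assert_placement("tenant-eu-bank", "tower-dal-118", path="backup")  # would raise
```

The same check applied to log shipping and backup targets is what keeps the boundary from drifting after day one, and its decisions can be exported as the per-tenant residency attestation the prompts above ask about.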
Define tenant onboarding SLA
Time from contract signature to first production inference — a differentiator versus hyperscaler edge.
Single choice
Select deployment topology for inference pool
Physical placement of the tenant inference capacity across the carrier footprint.
Single choice
Trinidy — For sub-20ms enterprise SLAs and sovereignty-constrained tenants, central-cloud inference and even regional MEC are physically incompatible with the latency and residency envelope. Trinidy is the on-tower / on-MEC inference substrate with ETSI MEC-aligned lifecycle and O-RAN SMO integration.
Align platform to 3GPP / O-RAN / ETSI MEC reference architecture
Why This Matters
3GPP Release 18 (5G-Advanced) and the in-progress Release 19 explicitly scope AI/ML functions in the 5G system, and O-RAN Alliance has published rApp / xApp frameworks that define how third-party AI plugs into the SMO and RICs. An edge AI-as-a-Service offer that does not map to these references becomes an integration project for every tenant and every carrier partner. TMF Open APIs specifically drive BSS / ordering / SLA interoperability that determines how fast a tenant can actually buy and consume the service.
Note prompts:
+ Which 3GPP release is our 5G core on today and what is the gating item for Release 18/19 AI/ML features?
+ Do we expose O-RAN SMO / RIC integration points, or will every tenant need to build their own integration?
+ Are we publishing TMF Open APIs for ordering, billing, and SLA so procurement can consume the service without bespoke glue?
Confirm which standards the inference platform must conform to for carrier integration.
Select all that apply
Define tenant mix and site density targets
Why This Matters
Tenant density per site is what converts the footprint from a co-location business (one tenant per site, real-estate margins) to a platform business (many tenants per site, software margins). Carriers have 10,000–100,000 tower sites and the economic question is how many tenants each site can profitably host, not how many sites a tenant occupies. STL Partners documented 7x ARPU uplift from edge AI tenants versus connectivity-only tenants — the multiplier depends on tenant density.
Note prompts:
+ What is our target tenant count per site at 18 months, and is the hardware density plan consistent with that?
+ Have we modeled platform economics at target density against co-location economics at one-tenant-per-site?
+ Do our sales and operations plans assume the same density number?
Target number of tenants per site and site coverage across the footprint.
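A back-of-envelope sketch contrasting one-tenant-per-site co-location economics with platform economics at a target tenant density, per the argument above. Every revenue and cost figure is an assumption to be replaced with the operator's own numbers.

```python
# Sketch: compare per-site monthly economics at target tenant density against
# a one-tenant-per-site co-location model. All figures are illustrative
# assumptions, not benchmarks.

def site_margin(tenants_per_site, revenue_per_tenant, fixed_site_cost,
                variable_cost_per_tenant):
    revenue = tenants_per_site * revenue_per_tenant
    cost = fixed_site_cost + tenants_per_site * variable_cost_per_tenant
    margin = revenue - cost
    return margin, (margin / revenue if revenue else 0.0)

# Co-location posture: one tenant per site, real-estate-style pricing.
colo_margin, colo_pct = site_margin(1, 8_000, 5_000, 500)

# Platform posture: many smaller tenants sharing the same fixed site cost.
plat_margin, plat_pct = site_margin(12, 1_500, 5_000, 300)

print(f"co-location (1 tenant/site): ${colo_margin:,.0f}/mo ({colo_pct:.0%} margin)")
print(f"platform (12 tenants/site):  ${plat_margin:,.0f}/mo ({plat_pct:.0%} margin)")
```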
Confirm power, thermal, and space envelope per site
Why This Matters
Tower and MEC sites have finite power and cooling envelopes that were designed for radio equipment, not dense GPU inference — retrofitting a 10kW site to host a multi-tenant GPU pool is a civil and utility project, not a software deployment. Site survey data (power, cooling, HVAC, floor loading) must drive the hardware catalog rather than the reverse. A platform design that assumes hyperscaler-class power per site will never deploy on the actual footprint.
Note prompts:
+ Have we surveyed power and cooling across every site class in our target footprint?
+ What is the maximum sustained inference power we can draw per site without a utility upgrade?
+ Do we have a tiered hardware catalog that matches each site class, or are we attempting one-size-fits-all?
Per-site power budget, cooling capacity, and cabinet space determine GPU density feasibility.
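A rough sketch converting a surveyed per-site power envelope into a sustainable accelerator count per site class. The site classes, per-card draw, and overhead factors are assumptions a real site survey and hardware catalog would replace.

```python
# Sketch: estimate sustainable accelerator count per site class from the
# surveyed power envelope. Site classes, per-card draw, and overhead factors
# are illustrative assumptions.

SITE_CLASSES_KW = {          # assumed sustained power budget per site class
    "tower-small": 6.0,
    "tower-large": 12.0,
    "mec-aggregation": 40.0,
}

CARD_DRAW_KW = 0.35          # assumed per-accelerator sustained draw
COOLING_OVERHEAD = 1.4       # assumed cooling/conversion multiplier at tower sites
NON_GPU_OVERHEAD_KW = 1.0    # assumed host CPUs, NICs, switching per site

for site, budget_kw in SITE_CLASSES_KW.items():
    usable_kw = budget_kw / COOLING_OVERHEAD - NON_GPU_OVERHEAD_KW
    cards = max(0, int(usable_kw // CARD_DRAW_KW))
    print(f"{site}: ~{cards} accelerators within a {budget_kw:.0f} kW envelope")
```

Running this per site class is what produces the tiered hardware catalog the prompts above ask for, instead of a one-size-fits-all bill of materials.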