A four-layer framework integrating strategic readiness, ROIC decomposition, stage-gate investment, and risk-adjusted scoring. Validated against 8 real-world deployments.
McKinsey found 80%+ of organizations see no EBIT impact from generative AI. MIT reports 95% of pilots deliver no measurable P&L impact. The tools executives use to evaluate AI investments were built for factories and ERP systems, not for projects where costs are uncertain, benefits are indirect, and organizational readiness is the binding constraint.
Each layer answers a different question. Together, they force decision-makers to systematically confront every factor that determines whether an AI project creates or destroys value.
Strategic readiness filter across five dimensions: Scalability, Constraints, Alignment, Leadership, Efficiency. Runs before any financial analysis. Produces Go / No-Go / Not Yet.
Breaks NOPAT and Invested Capital into granular drivers, identifies which levers AI actually moves, and applies the growth constraint: g = ROIC × RR, where RR is the reinvestment rate. A 50% growth claim on 15% ROIC is an inconsistent promise: it would require reinvesting more than 100% of NOPAT.
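The growth constraint is a two-line sanity check. A minimal sketch, using the 50%-growth-on-15%-ROIC figures from the text above:

```python
def required_reinvestment_rate(growth: float, roic: float) -> float:
    """Sustainable growth g = ROIC * RR, so RR = g / ROIC.
    An RR above 1.0 means the plan reinvests more than 100% of NOPAT --
    the growth claim is internally inconsistent."""
    return growth / roic

rr = required_reinvestment_rate(growth=0.50, roic=0.15)
print(f"Required reinvestment rate: {rr:.0%}")  # prints 333% -> inconsistent
```

Any claim that implies an RR above 1.0 fails the consistency test before a single cash-flow projection is built.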
Converts a single large bet into information-revealing smaller bets. Three gates: POC → Pilot → Scale. At each gate, the option to stop preserves capital and captures learning.
Synthesizes SCALE profile, ROIC sustainability, and risk-adjusted discount rate into a single grade (A through C) with specific execution requirements.
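Layer 3's core claim — that the option to stop at each gate has value — can be illustrated with a toy expected-value comparison. Every number below (budget split, gate pass rates, payoff) is a hypothetical chosen only to show the mechanism, not a figure from the framework:

```python
# Hypothetical: a $10M all-in bet vs. the same bet staged through
# POC -> Pilot -> Scale gates, abandoning on any gate failure.
p_success = 0.3   # overall chance the project works (assumed)
payoff    = 30.0  # $M payoff if it works (assumed)

# Single bet: commit everything up front.
ev_all_in = p_success * payoff - 10.0

# Staged: spend 1 at POC, 3 at Pilot, 6 at Scale; each gate reveals
# information (stylized so that failures surface at the early gates).
p_pass_poc, p_pass_pilot = 0.6, 0.5  # assumed gate pass rates
p_scale_success = p_success / (p_pass_poc * p_pass_pilot)  # conditional
ev_staged = (-1.0
             + p_pass_poc * (-3.0
             + p_pass_pilot * (-6.0 + p_scale_success * payoff)))

print(f"all-in EV: {ev_all_in:+.1f}M, staged EV: {ev_staged:+.1f}M")
```

With these assumptions the all-in bet has negative expected value while the staged version is positive — the difference is exactly the capital preserved by stopping early on the failure paths.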
Developed by Professor Gad Allon (Wharton). Three dimensions are tailwinds; two are headwinds. If Leadership or Constraints score poorly, fix those before committing capital.
Do revenues grow faster than costs? Is there urgency to grow, and are strategy, market, and business model configured to support scale? Scaling builds a cushion; costs don't shrink as fast as revenues in a downturn.
What resources do you need versus what you have? Is the mitigation path clear for each gap? Data quality, compliance burden, legacy infrastructure, talent gaps all live here.
Is the value proposition differentiated? Are revenue model, resources, and activities aligned with it? The sharpest question: what are we not good at, and can we use that to do something else?
Right people, right structure, right culture? In B2B, assess both vendor-side and buyer-side readiness. CEO governance of AI is the attribute most correlated with financial returns.
Is there a predictable path to profitability? The Rule of 40 (growth rate + margin ≥ 40%) provides the guardrail. Growth for the sake of growth will kill the firm.
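The Rule-of-40 guardrail from the Efficiency dimension is a one-line check. A minimal sketch, with hypothetical example figures:

```python
def rule_of_40(growth_pct: float, margin_pct: float) -> bool:
    """Efficiency guardrail: growth rate + profit margin >= 40 points."""
    return growth_pct + margin_pct >= 40

print(rule_of_40(35, 10))   # True: 45 points
print(rule_of_40(50, -20))  # False: growth alone isn't enough
```

The second case is the "growth for the sake of growth" failure mode: 50-point growth bought with a -20-point margin fails the guardrail.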
Builds the discount rate from the risk-free rate upward. Each category contributes basis points anchored to empirically observable risk premiums. Range: 12% (all Low) to 36% (all High).
Quality, availability, legal status of required data. ML is acutely sensitive; dirty data produces confidently wrong outputs.
Will the AI achieve the accuracy threshold needed? The binary "does it work?" question. Anchored to pharma R&D discount rates.
Per-unit AI costs vs. per-unit revenue at scale. A feature viable at 10M users can be margin-destructive at 700M.
Change management, process alignment, internal friction. Expanded to 8 sub-factors after Apple analysis exposed gaps in simpler models.
Requirements that increase costs, delay deployment, or restrict functionality. Anchored to Basel III compliance premiums.
Risk of being outpaced by competitors, open-source, or foundation model shifts. Easy to launch, hard to scale.
Lock-in to model providers, cloud vendors, or APIs you don't control. The build-vs-buy tradeoff with Category 2 is the central strategic tension.
Ability to hire, retain, and develop AI-specific talent. Knowledge concentration in key individuals creates fragility.
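One internally consistent parameterization of the buildup across the eight categories above. The base rate and per-category basis points here are assumptions chosen only to reproduce the stated 12%–36% range; the framework itself anchors each category to its own empirical premium, and the category names below paraphrase the eight descriptions above:

```python
RISK_FREE = 0.04  # assumed base rate
BPS = {"Low": 100, "Medium": 250, "High": 400}  # assumed per-category premia

CATEGORIES = [
    "Data", "Model/Technical", "Inference Cost", "Organizational",
    "Regulatory", "Competitive", "Vendor Dependency", "Talent",
]

def discount_rate(ratings: dict) -> float:
    """Build the rate from the risk-free rate upward, one category at a time."""
    return RISK_FREE + sum(BPS[ratings[c]] for c in CATEGORIES) / 10_000

all_low  = discount_rate({c: "Low" for c in CATEGORIES})
all_high = discount_rate({c: "High" for c in CATEGORIES})
print(f"{all_low:.0%} to {all_high:.0%}")  # reproduces the 12% to 36% range
```

Any mix of Low/Medium/High ratings lands between the two endpoints, which is what makes the per-company rates in the scorecard comparable.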
Each axis has a "build" case (full framework application from first principles) and a "validate" case (tests whether the framework generalizes).
AI sold to business buyers. Dominant risks: buyer organizational readiness and regulatory compliance.
Success-driven cannibalization: Agentforce makes reps 3x more efficient, undermining seat-license revenue. The shift to per-action pricing creates its own forecasting uncertainty.
Prove unit economics at the deal level within 18 months (revenue net of fully loaded cost, including inference, must produce positive NOPAT)
Accelerate pricing transition from per-seat to consumption-based Flex Credits before Rule-of-40 deteriorates
Gate investment on buyer cohort retention: if pilot-to-paid conversion falls below 60%, growth is unsustainable
High-touch delivery (forward-deployed engineers) creates durable moat but structural scalability ceiling. Success depends on whether the deployment trough depth is shrinking over time.
Prioritize expansion revenue over new acquisition (existing-account growth has fundamentally better unit economics)
Reduce deployment trough: per-customer deployment cost decline of 10-15% annually required, or scalability ceiling binds in 3-5 years
Accelerate commercial diversification: government (55% revenue) provides stability but limits velocity
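The 10–15% annual cost-decline requirement compounds. A quick check of where per-customer deployment cost lands over the 3–5 year window, using the rates from the bullet above with starting cost normalized to 1.0:

```python
# Per-customer deployment cost trajectory at the required decline rates.
for annual_decline in (0.10, 0.15):
    cost, trajectory = 1.0, []
    for year in range(1, 6):
        cost *= 1 - annual_decline  # compound the annual decline
        trajectory.append(f"yr{year}: {cost:.2f}")
    print(f"{annual_decline:.0%} decline -> " + ", ".join(trajectory))
```

At 10% the cost roughly halves only by year 5 (0.59 of the original); at 15% it reaches 0.44. Anything slower and high-touch delivery cost stays near parity with new-deal volume, which is exactly when the scalability ceiling binds.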
AI embedded in consumer experiences. Dominant risks: inference cost scaling and competitive obsolescence.
In B2C, cost management architecture (model efficiency, edge caching, feature gating) is arguably more important than user experience. Solving cost structure is harder than solving the product.
Validate AI DJ and prompted playlists produce positive unit economics at full MAU scale; gate compute-intensive features to Premium if free-tier unsustainable
Protect proprietary data moat: decade of behavioral data (600B monthly events) is the competitive advantage
Monitor content royalty dynamics: if label renegotiations shift rates upward by 3-5pp, margin structure changes materially
Hybrid build-buy model: Birdbrain (proprietary, low dependency) + GPT-4 (frontier but vendor-dependent). Competitive threat from general-purpose AI as free tutor requires monitoring.
Expand Max tier to all major language pairs within 12 months (highest-ARPU product)
Develop LLM provider optionality within 24 months: serve from 2+ providers to reduce single-vendor dependency
Manage free-tier inference costs: energy system (2-3 lessons/day limit) is correct; monitor conversion vs. attrition
AI serving both sides of a marketplace. Dominant risks: merchant/supply sophistication and inference scaling.
Platform-specific tension: AI creates genuine merchant value but Shopify monetizes through commissions, not AI tools. The gap between value creation and value capture is the central challenge.
Establish AI-to-GMV attribution: track whether AI tool adoption correlates with merchant GMV growth
Validate Agentic Storefronts: move from POC to pilot within 12 months; if buyers don't engage with AI chat, deprioritize
Invest in AI tool simplification for long-tail merchants: adoption among long tail drives aggregate GMV
Bifurcated recommendation: the core matching/pricing AI is the most mature and proven case in the Platform analysis; the AV transition is an existential bet with enormous upside but uncertain timing and disintermediation risk.
Continue optimizing matching/pricing algorithms (each 1% in driver utilization compounds across 13.6B annual trips)
Gate AV capital on unit economics proof in 3+ metropolitan markets
Protect against disintermediation: ensure AV partnerships preserve Uber's demand-side aggregation role
AI deployed on your own operations. Dominant risks: model/technical (build) or vendor dependency (buy).
Internal AI carries lower aggregate risk than B2B/B2C/Platform because it eliminates customer adoption uncertainty, competitive displacement, and vendor dependency. Risk concentrates in regulatory compliance.
Maintain human-in-the-loop governance as use cases expand from 450 to 1,000 (a single governance failure exceeds cumulative savings)
Monitor skill dependency in workforce reallocation: 4% ops-to-client shift requires structured capability development
Track diminishing marginal returns: next 550 use cases must be prioritized by ROIC contribution, not deployment ease
Build-vs-buy at highest stakes: the 350bp risk differential between JPMorgan (build) and NatWest (buy) quantifies the architectural tradeoff. NatWest's approach is correct for its scale but creates a dependency premium.
Develop multi-vendor optionality: evaluate alternative LLM providers for 2+ of 275 use cases (leverage, not replacement)
Impose rigorous prioritization on 275-project pipeline: require Gate 1 business case before resources (enthusiasm outpaces discipline)
Accelerate AWS/Accenture cloud migration: legacy infrastructure is the binding bottleneck on every downstream AI project
Three inputs determine the grade: SCALE profile, ROIC sustainability, and risk-adjusted discount rate. Bands overlap intentionally to allow judgment.
4+ tailwinds, proven unit economics, discount rate below 21%. Revenue demonstrably outpaces costs. Monitor gating conditions.
3-4 tailwinds, 1-2 material headwinds, rate 21-23%. Revenue outpacing costs but unit economics partially unproven.
3+ tailwinds, significant but understood headwinds, rate 22-24%. ROIC positive but key sensitivity unresolved.
3 tailwinds, material headwinds, rate 23-26%. Revenue growing but cost sustainability unproven. Address structural risks first.
2 tailwinds, significant headwinds, rate above 24%. Returns positive but constrained. Manage concentration risk.
Fewer than 2 tailwinds or critical headwind unresolved. ROIC negative or deeply uncertain. Fix readiness gaps first.
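Because the bands overlap by design, the mapping is not a lookup table. A sketch of how the tailwind count and discount rate narrow the field: the function returns every band a profile falls into rather than a single grade, leaving the final call to judgment. Thresholds are taken from the bands above; the positional "Band" labels and the example profiles are illustrative assumptions, and the qualitative conditions (unit economics proven, sensitivities resolved) are deliberately not encoded:

```python
# Bands as (label, predicate over tailwind count t and discount rate r);
# "Band 1" is the strongest, mirroring the A-through-C ordering above.
BANDS = [
    ("Band 1", lambda t, r: t >= 4 and r < 0.21),
    ("Band 2", lambda t, r: 3 <= t <= 4 and 0.21 <= r <= 0.23),
    ("Band 3", lambda t, r: t >= 3 and 0.22 <= r <= 0.24),
    ("Band 4", lambda t, r: t == 3 and 0.23 <= r <= 0.26),
    ("Band 5", lambda t, r: t == 2 and r > 0.24),
    ("Band 6", lambda t, r: t < 2),
]

def candidate_bands(tailwinds: int, rate: float) -> list:
    """Overlapping bands mean judgment: return every band that matches."""
    return [name for name, test in BANDS if test(tailwinds, rate)]

print(candidate_bands(4, 0.1925))  # single match at a 19.25% rate
print(candidate_bands(3, 0.235))   # the intentional overlap: two matches
```

The second call is the interesting case: a 3-tailwind profile at 23.5% sits in two bands at once, which is precisely where the "specific execution requirements" attached to each grade do the deciding.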
All 8 companies scored, graded, and assessed.
| Company | Category | Grade | Discount Rate | Verdict | Top Execution Priority |
|---|---|---|---|---|---|
| Spotify | B2C Build | A− | 19.25% | Invest aggressively | Manage inference scaling; protect data moat |
| JPMorgan | Internal Build | A− | 20.0% | Invest aggressively | Maintain governance at velocity; prioritize by ROIC |
| Duolingo | B2C Validate | B+ | 21.5% | Yes, with vendor management | Expand Max tier; develop LLM provider optionality |
| Shopify | Platform Build | B | 21.25% | Conditionally yes | Establish AI-to-GMV attribution |
| Uber | Platform Validate | B | 22.75% | Yes for core; conditional for AV | Optimize core algorithms; gate AV on unit economics |
| Salesforce | B2B Build | B− | 23.5% | Pursue cautiously | Prove unit economics; accelerate consumption pricing |
| Palantir | B2B Validate | B− | 25.5% | Pursue selectively | Prioritize expansion; reduce deployment trough |
| NatWest | Internal Validate | C+ | 24.25% | Invest steadily + diversify | Multi-vendor optionality; accelerate cloud migration |
The framework is open. The methodology is documented. The question is whether you'll use it before or after the money is spent.