Wharton Research — WEMBA 50

Should you invest in AI?
And if so, how?

A four-layer framework integrating strategic readiness, ROIC decomposition, stage-gate investment, and risk-adjusted scoring. Validated against 8 real-world deployments.

80%
of organizations see no EBIT impact
4
Analytical layers
8
Company cases
8
Risk categories
Explore the Framework ↓
The Problem

$1.8 trillion in AI spending by 2030.
Most of it won't work.

McKinsey found 80%+ of organizations see no EBIT impact from generative AI. MIT reports 95% of pilots deliver no measurable P&L impact. The tools executives use to evaluate AI investments were built for factories and ERP systems, not for projects where costs are uncertain, benefits are indirect, and organizational readiness is the binding constraint.

The Solution

Four layers. One decision.

Each layer answers a different question. Together, they force decision-makers to systematically confront every factor that determines whether an AI project creates or destroys value.

0

SCALE Assessment

"Should we invest in AI at all?"

Strategic readiness filter across five dimensions: Scalability, Constraints, Alignment, Leadership, Efficiency. Runs before any financial analysis. Produces Go / No-Go / Not Yet.

1

ROIC Decomposition

"Where does AI create or destroy value?"

Breaks NOPAT and Invested Capital into granular drivers. Identifies which levers AI affects and applies the growth constraint: g = ROIC × RR. A 50% growth claim on 15% ROIC is an inconsistent promise.
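The growth constraint above can be made concrete with a few lines of arithmetic. This is a minimal sketch of the g = ROIC × RR check, where RR is the reinvestment rate (capped at 1.0 when a firm reinvests all of NOPAT); the figures are the illustrative ones from the text, not case data.

```python
# Growth sustainability check: g = ROIC x RR (reinvestment rate).
# With RR capped at 1.0, ROIC bounds the self-funded growth rate.

def max_self_funded_growth(roic: float) -> float:
    """Maximum growth fundable from operations: g = ROIC x RR, with RR <= 1."""
    return roic * 1.0

def required_reinvestment_rate(growth: float, roic: float) -> float:
    """RR needed to hit a growth target at a given ROIC: RR = g / ROIC."""
    return growth / roic

# A 50% growth claim on 15% ROIC implies RR = 0.50 / 0.15 = 3.33:
# the firm would have to reinvest 333% of NOPAT -- an inconsistent promise.
print(f"Required RR: {required_reinvestment_rate(0.50, 0.15):.2f}")        # 3.33
print(f"Max self-funded growth at 15% ROIC: {max_self_funded_growth(0.15):.0%}")  # 15%
```

Any growth claim above the ROIC ceiling requires external capital, which the decomposition then forces into the Invested Capital side of the analysis.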

2

Stage-Gate Investment

"How do we phase spending?"

Converts a single large bet into information-revealing smaller bets. Three gates: POC → Pilot → Scale. At each gate, the option to stop preserves capital and captures learning.
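The value of the stop option can be sketched in expectation. The stage costs and gate pass-probabilities below are hypothetical illustrations, not figures from the framework; the point is only that staged commitment caps expected spend well below the all-at-once total.

```python
# Stage-gate vs. all-at-once: expected capital at risk.
# Stage costs ($M) and gate pass-probabilities are hypothetical.

STAGES = [
    ("POC",    1.0, 0.60),   # (name, cost, probability the gate passes)
    ("Pilot",  5.0, 0.50),
    ("Scale", 40.0, None),   # final stage: no further gate
]

def expected_staged_spend(stages) -> float:
    """Expected spend when each failed gate stops all later spending."""
    expected, p_reached = 0.0, 1.0
    for _name, cost, p_pass in stages:
        expected += p_reached * cost      # spend this stage only if reached
        if p_pass is not None:
            p_reached *= p_pass           # continue only if the gate passes
    return expected

all_at_once = sum(cost for _n, cost, _p in STAGES)   # 46.0 committed upfront
staged = expected_staged_spend(STAGES)               # 1 + 0.6*5 + 0.3*40 = 16.0
print(f"Committed upfront: ${all_at_once:.1f}M; expected staged: ${staged:.1f}M")
```

Under these assumed probabilities, gating cuts expected capital at risk from $46M to $16M while still funding every project that keeps clearing its gates.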

3

Decision Scorecard

"What's the integrated recommendation?"

Synthesizes SCALE profile, ROIC sustainability, and risk-adjusted discount rate into a single grade (A through C) with specific execution requirements.

Layer 0

The SCALE Assessment

Developed by Professor Gad Allon (Wharton). Three dimensions are tailwinds; two are headwinds. If Leadership or Constraints score poorly, fix those before committing capital.

S — Scalability

Do revenues grow faster than costs? Is there urgency to grow, and are strategy, market, and business model configured to support scale? Scaling builds a cushion; costs don't shrink as fast as revenues in a downturn.

C — Constraints

What resources do you need versus what you have? Is the mitigation path clear for each gap? Data quality, compliance burden, legacy infrastructure, talent gaps all live here.

A — Alignment

Is the value proposition differentiated? Are revenue model, resources, and activities aligned with it? The sharpest question: what are we not good at, and can we use that to do something else?

L — Leadership

Right people, right structure, right culture? In B2B, assess both vendor-side and buyer-side readiness. CEO governance of AI is the attribute most correlated with financial returns.

E — Efficiency

Is there a predictable path to profitability? The Rule of 40 (growth rate + margin ≥ 40%) provides the guardrail. Growth for the sake of growth will kill the firm.
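The five dimensions above can be wired into a simple screening sketch. The 1-to-5 scoring scale and the cutoffs are assumptions for illustration, not the framework's calibrated thresholds; the one rule taken directly from the text is that a poor Leadership or Constraints score returns "Not Yet" before any financial analysis, and that Efficiency is guarded by the Rule of 40.

```python
# SCALE screen sketch. Scores run 1 (poor) to 5 (strong); the scale and
# cutoffs are hypothetical. Leadership and Constraints act as headwinds.

def rule_of_40(growth_pct: float, margin_pct: float) -> bool:
    """Efficiency guardrail: growth rate + margin >= 40 percentage points."""
    return growth_pct + margin_pct >= 40.0

def scale_verdict(ratings: dict) -> str:
    """ratings: dimension name -> score 1..5."""
    if ratings["Leadership"] <= 2 or ratings["Constraints"] <= 2:
        return "Not Yet"   # fix the headwinds before committing capital
    if all(ratings[d] >= 3 for d in ("Scalability", "Alignment", "Efficiency")):
        return "Go"
    return "No-Go"

print(scale_verdict({"Scalability": 4, "Constraints": 3, "Alignment": 4,
                     "Leadership": 5, "Efficiency": 3}))   # Go
print(scale_verdict({"Scalability": 5, "Constraints": 2, "Alignment": 4,
                     "Leadership": 4, "Efficiency": 4}))   # Not Yet
print(rule_of_40(30.0, 15.0))   # 45 points clears the guardrail: True
```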

Risk Adjustment

Eight categories. One discount rate.

Builds the discount rate from the risk-free rate upward. Each category contributes basis points anchored to empirically observable risk premiums. Range: 12% (all Low) to 36% (all High).
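The build-up can be sketched as a sum of per-category premiums over the risk-free rate. The 4% risk-free rate and the per-level basis-point weights below are assumptions chosen only to reproduce the stated 12%–36% range; the framework's actual per-category anchors may differ.

```python
# Build-up discount rate: risk-free rate + one premium per risk category.
# ASSUMED weights: Low = 100bp, Medium = 250bp, High = 400bp per category,
# over a 4% risk-free rate, so 8 x Low = 12% and 8 x High = 36%.

RISK_FREE = 0.04
PREMIUM_BP = {"Low": 100, "Medium": 250, "High": 400}

CATEGORIES = [
    "Data", "Model & Technical", "Inference Cost Scaling",
    "Organizational Readiness", "Regulatory & Compliance",
    "Competitive & Obsolescence", "Vendor Dependency", "Talent & Capability",
]

def discount_rate(ratings: dict) -> float:
    """Sum the risk-free rate and each category's premium (bp -> decimal)."""
    return RISK_FREE + sum(PREMIUM_BP[ratings[c]] for c in CATEGORIES) / 10_000

all_low  = discount_rate({c: "Low" for c in CATEGORIES})
all_high = discount_rate({c: "High" for c in CATEGORIES})
print(f"Range: {all_low:.0%} to {all_high:.0%}")   # Range: 12% to 36%
```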

1

Data Risk

Quality, availability, legal status of required data. ML is acutely sensitive; dirty data produces confidently wrong outputs.

2

Model & Technical Risk

Will the AI achieve the accuracy threshold needed? The binary "does it work?" question. Anchored to pharma R&D discount rates.

3

Inference Cost Scaling

Per-unit AI costs vs. per-unit revenue at scale. A feature viable at 10M users can be margin-destructive at 700M.

4

Organizational Readiness

Change management, process alignment, internal friction. Expanded to 8 sub-factors after Apple analysis exposed gaps in simpler models.

5

Regulatory & Compliance

Requirements that increase costs, delay deployment, or restrict functionality. Anchored to Basel III compliance premiums.

6

Competitive & Obsolescence

Risk of being outpaced by competitors, open-source, or foundation model shifts. Easy to launch, hard to scale.

7

Vendor Dependency

Lock-in to model providers, cloud vendors, or APIs you don't control. The build-vs-buy tradeoff with Category 2 is the central strategic tension.

8

Talent & Capability

Ability to hire, retain, and develop AI-specific talent. Knowledge concentration in key individuals creates fragility.

Validation

8 companies. 4 deployment axes. Real verdicts.

Each axis has a "build" case (full framework application from first principles) and a "validate" case (tests whether the framework generalizes).

💼

B2B — Enterprise AI Products

AI sold to business buyers. Dominant risks: buyer organizational readiness and regulatory compliance.

Salesforce
Agentforce — AI agents in CRM
Build
B−
Grade
23.5%
Discount Rate
~8%
Base ROI
⚠️ Pursue cautiously
S Scalability A Alignment E Efficiency C Constraints L Leadership*
Key Insight

Success-driven cannibalization: Agentforce makes reps 3x more efficient, undermining seat-license revenue. The shift to per-action pricing creates its own forecasting uncertainty.

Critical Execution Requirements
1

Prove unit economics at deal level within 18 months (fully loaded cost including inference must produce positive NOPAT)

2

Accelerate pricing transition from per-seat to consumption-based Flex Credits before Rule-of-40 deteriorates

3

Gate investment on buyer cohort retention: if pilot-to-paid conversion falls below 60%, growth is unsustainable

Palantir
AIP — Forward-deployed AI platform
Validate
B−
Grade
25.5%
Discount Rate
>115%
Net $ Retention
⚠️ Pursue selectively — expansion over acquisition
A Alignment E Efficiency S Scalability* C Constraints L Leadership*
Key Insight

High-touch delivery (forward-deployed engineers) creates durable moat but structural scalability ceiling. Success depends on whether the deployment trough depth is shrinking over time.

Critical Execution Requirements
1

Prioritize expansion revenue over new acquisition (existing-account growth has fundamentally better unit economics)

2

Reduce deployment trough: per-customer deployment cost decline of 10-15% annually required, or scalability ceiling binds in 3-5 years

3

Accelerate commercial diversification: government (55% revenue) provides stability but limits velocity

👤

B2C — Consumer AI Products

AI embedded in consumer experiences. Dominant risks: inference cost scaling and competitive obsolescence.

Spotify
Recommendation Engine — AI DJ, Discover Weekly
Build
A−
Grade
19.25%
Discount Rate
600B
Events/Month
Invest aggressively — unambiguously yes
S Scalability A Alignment E Efficiency L Leadership C Constraints
Key Insight

In B2C, cost management architecture (model efficiency, edge caching, feature gating) is arguably more important than user experience. Solving cost structure is harder than solving the product.

Critical Execution Requirements
1

Validate AI DJ and prompted playlists produce positive unit economics at full MAU scale; gate compute-intensive features to Premium if free-tier unsustainable

2

Protect proprietary data moat: decade of behavioral data (600B monthly events) is the competitive advantage

3

Monitor content royalty dynamics: if label renegotiations shift rates upward by 3-5pp, margin structure changes materially

Duolingo
Birdbrain + Max Tier — Adaptive AI learning
Validate
B+
Grade
21.5%
Discount Rate
~80
Rule of 40
Yes, with vendor dependency mitigation
S Scalability A Alignment L Leadership E Efficiency C Constraints
Key Insight

Hybrid build-buy model: Birdbrain (proprietary, low dependency) + GPT-4 (frontier but vendor-dependent). Competitive threat from general-purpose AI as free tutor requires monitoring.

Critical Execution Requirements
1

Expand Max tier to all major language pairs within 12 months (highest-ARPU product)

2

Develop LLM provider optionality within 24 months: serve from 2+ providers to reduce single-vendor dependency

3

Manage free-tier inference costs: energy system (2-3 lessons/day limit) is correct; monitor conversion vs. attrition

🌐

Platform — Marketplace AI

AI serving both sides of a marketplace. Dominant risks: merchant/supply sophistication and inference scaling.

Shopify
Shopify Magic — AI tools for merchants
Build
B
Grade
21.25%
Discount Rate
15-20h
Saved/Wk
⚠️ Conditionally yes — invest with value-capture validation
A Alignment S Scalability C Constraints E Efficiency* L Leadership*
Key Insight

Platform-specific tension: AI creates genuine merchant value but Shopify monetizes through commissions, not AI tools. The gap between value creation and value capture is the central challenge.

Critical Execution Requirements
1

Establish AI-to-GMV attribution: track whether AI tool adoption correlates with merchant GMV growth

2

Validate Agentic Storefronts: move from POC to pilot within 12 months; if buyers don't engage with AI chat, deprioritize

3

Invest in AI tool simplification for long-tail merchants: adoption among long tail drives aggregate GMV

Uber
Core AI + AV Transition — Matching, pricing, autonomy
Validate
B
Grade
22.75%
Discount Rate
30M
Predictions/Min
⚠️ Yes for core AI; conditionally yes for AV
S Scalability A Alignment L Leadership E Efficiency C Constraints
Key Insight

Bifurcated recommendation: core AI is the most mature and proven in the Platform analysis; AV transition is an existential bet with enormous upside but timing uncertainty and disintermediation risk.

Critical Execution Requirements
1

Continue optimizing matching/pricing algorithms (each 1% gain in driver utilization compounds across 13.6B annual trips)

2

Gate AV capital on unit economics proof in 3+ metropolitan markets

3

Protect against disintermediation: ensure AV partnerships preserve Uber's demand-side aggregation role

🏢

Internal — Enterprise Self-Deployment

AI deployed on your own operations. Dominant risks: model/technical (build) or vendor dependency (buy).

JPMorgan Chase
COiN + LLM Suite — 450 use cases, 200K daily users
Build
A−
Grade
20.0%
Discount Rate
$2B
Annual Value
Invest aggressively — unambiguously yes
S Scalability A Alignment L Leadership E Efficiency C Constraints
Key Insight

Internal AI carries lower aggregate risk than B2B/B2C/Platform because it eliminates customer adoption uncertainty, competitive displacement, and vendor dependency. Risk concentrates in regulatory compliance.

Critical Execution Requirements
1

Maintain human-in-the-loop governance as use cases expand from 450 to 1,000 (a single governance failure could exceed cumulative savings)

2

Monitor skill dependency in workforce reallocation: 4% ops-to-client shift requires structured capability development

3

Track diminishing marginal returns: next 550 use cases must be prioritized by ROIC contribution, not deployment ease

NatWest
Cora+ & AI Suite — Fraud, dev tools, 275 projects
Validate
C+
Grade
24.25%
Discount Rate
135%
Scam Detection
Invest steadily with vendor diversification
A Alignment L Leadership* S Scalability* C Constraints E Efficiency*
Key Insight

Build-vs-buy at highest stakes: the 425bp discount-rate differential between JPMorgan (build, 20.0%) and NatWest (buy, 24.25%) quantifies the architectural tradeoff. NatWest's approach is correct for its scale but creates a dependency premium.

Critical Execution Requirements
1

Develop multi-vendor optionality: evaluate alternative LLM providers for 2+ of 275 use cases (leverage, not replacement)

2

Impose rigorous prioritization on 275-project pipeline: require Gate 1 business case before resources (enthusiasm outpaces discipline)

3

Accelerate AWS/Accenture cloud migration: legacy infrastructure is the binding bottleneck on every downstream AI project

Grading System

The Decision Scorecard

Three inputs determine the grade: SCALE profile, ROIC sustainability, and risk-adjusted discount rate. Bands overlap intentionally to allow judgment.

A

Invest Aggressively

4+ tailwinds, proven unit economics, discount rate below 21%. Revenue demonstrably outpaces costs. Monitor gating conditions.

B+

Invest with Mitigation

3-4 tailwinds, 1-2 material headwinds, rate 21-23%. Revenue outpacing costs but unit economics partially unproven.

B

Invest Selectively

3+ tailwinds, significant but understood headwinds, rate 22-24%. ROIC positive but key sensitivity unresolved.

B−

Invest Cautiously

3 tailwinds, material headwinds, rate 23-26%. Revenue growing but cost sustainability unproven. Address structural risks first.

C+

Invest Steadily + Diversify

2 tailwinds, significant headwinds, rate above 24%. Returns positive but constrained. Manage concentration risk.

C

Delay or Restructure

Fewer than 2 tailwinds or critical headwind unresolved. ROIC negative or deeply uncertain. Fix readiness gaps first.
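The bands above can be sketched as a mapping from tailwind count and discount rate to a grade. Because the bands overlap deliberately to leave room for judgment, any deterministic function is a simplification; this sketch resolves overlaps by testing stricter grades first, so borderline profiles land in the more favorable band.

```python
# Simplified scorecard: (tailwind count, risk-adjusted discount rate in %)
# -> grade band. The framework's bands overlap by design; this resolves
# ties by checking stricter grades first.

def grade(tailwinds: int, rate_pct: float) -> str:
    if tailwinds >= 4 and rate_pct < 21:
        return "A"    # invest aggressively
    if tailwinds >= 3 and rate_pct <= 23:
        return "B+"   # invest with mitigation
    if tailwinds >= 3 and rate_pct <= 24:
        return "B"    # invest selectively
    if tailwinds >= 3 and rate_pct <= 26:
        return "B-"   # invest cautiously
    if tailwinds >= 2:
        return "C+"   # invest steadily + diversify
    return "C"        # delay or restructure

print(grade(4, 19.25))   # A  (Spotify-like profile)
print(grade(3, 25.5))    # B-
print(grade(1, 30.0))    # C
```

Note that the real scorecard also weighs ROIC sustainability and execution requirements, which is precisely the judgment the overlapping bands are meant to preserve.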

All Results

The full verdict table

All 8 companies scored, graded, and assessed.

Company | Category | Case | Grade | Discount Rate | Verdict | Top Execution Priority
Spotify | B2C | Build | A− | 19.25% | Invest aggressively | Manage inference scaling; protect data moat
JPMorgan | Internal | Build | A− | 20.0% | Invest aggressively | Maintain governance at velocity; prioritize by ROIC
Duolingo | B2C | Validate | B+ | 21.5% | Yes, with vendor management | Expand Max tier; develop LLM provider optionality
Shopify | Platform | Build | B | 21.25% | Conditionally yes | Establish AI-to-GMV attribution
Uber | Platform | Validate | B | 22.75% | Yes for core; conditional for AV | Optimize core algorithms; gate AV on unit economics
Salesforce | B2B | Build | B− | 23.5% | Pursue cautiously | Prove unit economics; accelerate consumption pricing
Palantir | B2B | Validate | B− | 25.5% | Pursue selectively | Prioritize expansion; reduce deployment trough
NatWest | Internal | Validate | C+ | 24.25% | Invest steadily + diversify | Multi-vendor optionality; accelerate cloud migration
What's Next

Run this analysis on your AI investments.

The framework is open. The methodology is documented. The question is whether you'll use it before or after the money is spent.

Back to Framework ↑