Wharton Research — WEMBA 50

Should you invest in AI?
And if so, how?

A four-layer framework integrating strategic readiness, ROIC decomposition, stage-gate investment, and risk-adjusted scoring. Validated against 8 real-world deployments.

80%
of organizations see no EBIT impact
4
Analytical layers
8
Company cases
8
Risk categories
Explore the Framework ↓
The Problem

$1.8 trillion in AI spending by 2030.
Most of it won't work.

McKinsey found 80%+ of organizations see no EBIT impact from generative AI. MIT reports 95% of pilots deliver no measurable P&L impact. The tools executives use to evaluate AI investments were built for factories and ERP systems, not for projects where costs are uncertain, benefits are indirect, and organizational readiness is the binding constraint.

The Solution

Four layers. One decision.

Each layer answers a different question. Together, they force decision-makers to systematically confront every factor that determines whether an AI project creates or destroys value.

0

SCALE Assessment

"Should we invest in AI at all?"

Strategic readiness filter across five dimensions: Scalability, Constraints, Alignment, Leadership, Efficiency. Runs before any financial analysis. Produces Go / No-Go / Not Yet.

1

ROIC Decomposition

"Where does AI create or destroy value?"

Breaks NOPAT and Invested Capital into granular drivers. Identifies which levers AI affects and applies the growth constraint: g = ROIC × RR. A 50% growth claim on 15% ROIC is an inconsistent promise.
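The growth constraint above can be made concrete with a few lines of arithmetic. This is a minimal sketch of the g = ROIC × RR check, where RR is the reinvestment rate (capped at 1.0 when a firm reinvests all of NOPAT); the figures are the illustrative ones from the text, not case data.

```python
# Growth sustainability check: g = ROIC x RR (reinvestment rate).
# With RR capped at 1.0, ROIC bounds the self-funded growth rate.

def max_self_funded_growth(roic: float) -> float:
    """Maximum growth fundable from operations: g = ROIC x RR, with RR <= 1."""
    return roic * 1.0

def required_reinvestment_rate(growth: float, roic: float) -> float:
    """RR needed to hit a growth target at a given ROIC: RR = g / ROIC."""
    return growth / roic

# A 50% growth claim on 15% ROIC implies RR = 0.50 / 0.15 = 3.33:
# the firm would have to reinvest 333% of NOPAT -- an inconsistent promise.
print(f"Required RR: {required_reinvestment_rate(0.50, 0.15):.2f}")        # 3.33
print(f"Max self-funded growth at 15% ROIC: {max_self_funded_growth(0.15):.0%}")  # 15%
```

Any growth claim above the ROIC ceiling requires external capital, which the decomposition then forces into the Invested Capital side of the analysis.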

2

Stage-Gate Investment

"How do we phase spending?"

Converts a single large bet into information-revealing smaller bets. Three gates: POC → Pilot → Scale. At each gate, the option to stop preserves capital and captures learning.
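The value of the stop option can be sketched in expectation. The stage costs and gate pass-probabilities below are hypothetical illustrations, not figures from the framework; the point is only that staged commitment caps expected spend well below the all-at-once total.

```python
# Stage-gate vs. all-at-once: expected capital at risk.
# Stage costs ($M) and gate pass-probabilities are hypothetical.

STAGES = [
    ("POC",    1.0, 0.60),   # (name, cost, probability the gate passes)
    ("Pilot",  5.0, 0.50),
    ("Scale", 40.0, None),   # final stage: no further gate
]

def expected_staged_spend(stages) -> float:
    """Expected spend when each failed gate stops all later spending."""
    expected, p_reached = 0.0, 1.0
    for _name, cost, p_pass in stages:
        expected += p_reached * cost      # spend this stage only if reached
        if p_pass is not None:
            p_reached *= p_pass           # continue only if the gate passes
    return expected

all_at_once = sum(cost for _n, cost, _p in STAGES)   # 46.0 committed upfront
staged = expected_staged_spend(STAGES)               # 1 + 0.6*5 + 0.3*40 = 16.0
print(f"Committed upfront: ${all_at_once:.1f}M; expected staged: ${staged:.1f}M")
```

Under these assumed probabilities, gating cuts expected capital at risk from $46M to $16M while still funding every project that keeps clearing its gates.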

3

Decision Scorecard

"What's the integrated recommendation?"

Synthesizes SCALE profile, ROIC sustainability, and risk-adjusted discount rate into a single grade (A through C) with specific execution requirements.

Layer 0

The SCALE Assessment

Developed by Professor Gad Allon (Wharton). Three dimensions are tailwinds; two are headwinds. If Leadership or Constraints score poorly, fix those before committing capital.

S — Scalability

Do revenues grow faster than costs? Is there urgency to grow, and are strategy, market, and business model configured to support scale? Scaling builds a cushion; costs don't shrink as fast as revenues in a downturn.

C — Constraints

What resources do you need versus what you have? Is the mitigation path clear for each gap? Data quality, compliance burden, legacy infrastructure, talent gaps all live here.

A — Alignment

Is the value proposition differentiated? Are revenue model, resources, and activities aligned with it? The sharpest question: what are we not good at, and can we use that to do something else?

L — Leadership

Right people, right structure, right culture? In B2B, assess both vendor-side and buyer-side readiness. CEO governance of AI is the attribute most correlated with financial returns.

E — Efficiency

Is there a predictable path to profitability? The Rule of 40 (growth rate + margin ≥ 40%) provides the guardrail. Growth for the sake of growth will kill the firm.
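The five dimensions above can be wired into a simple screening sketch. The 1-to-5 scoring scale and the cutoffs are assumptions for illustration, not the framework's calibrated thresholds; the one rule taken directly from the text is that a poor Leadership or Constraints score returns "Not Yet" before any financial analysis, and that Efficiency is guarded by the Rule of 40.

```python
# SCALE screen sketch. Scores run 1 (poor) to 5 (strong); the scale and
# cutoffs are hypothetical. Leadership and Constraints act as headwinds.

def rule_of_40(growth_pct: float, margin_pct: float) -> bool:
    """Efficiency guardrail: growth rate + margin >= 40 percentage points."""
    return growth_pct + margin_pct >= 40.0

def scale_verdict(ratings: dict) -> str:
    """ratings: dimension name -> score 1..5."""
    if ratings["Leadership"] <= 2 or ratings["Constraints"] <= 2:
        return "Not Yet"   # fix the headwinds before committing capital
    if all(ratings[d] >= 3 for d in ("Scalability", "Alignment", "Efficiency")):
        return "Go"
    return "No-Go"

print(scale_verdict({"Scalability": 4, "Constraints": 3, "Alignment": 4,
                     "Leadership": 5, "Efficiency": 3}))   # Go
print(scale_verdict({"Scalability": 5, "Constraints": 2, "Alignment": 4,
                     "Leadership": 4, "Efficiency": 4}))   # Not Yet
print(rule_of_40(30.0, 15.0))   # 45 points clears the guardrail: True
```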

Risk Adjustment

Eight categories. One discount rate.

Builds the discount rate from the risk-free rate upward. Each category contributes basis points anchored to empirically observable risk premiums. Range: 12% (all Low) to 36% (all High).
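The build-up can be sketched as a sum of per-category premiums over the risk-free rate. The 4% risk-free rate and the per-level basis-point weights below are assumptions chosen only to reproduce the stated 12%–36% range; the framework's actual per-category anchors may differ.

```python
# Build-up discount rate: risk-free rate + one premium per risk category.
# ASSUMED weights: Low = 100bp, Medium = 250bp, High = 400bp per category,
# over a 4% risk-free rate, so 8 x Low = 12% and 8 x High = 36%.

RISK_FREE = 0.04
PREMIUM_BP = {"Low": 100, "Medium": 250, "High": 400}

CATEGORIES = [
    "Data", "Model & Technical", "Inference Cost Scaling",
    "Organizational Readiness", "Regulatory & Compliance",
    "Competitive & Obsolescence", "Vendor Dependency", "Talent & Capability",
]

def discount_rate(ratings: dict) -> float:
    """Sum the risk-free rate and each category's premium (bp -> decimal)."""
    return RISK_FREE + sum(PREMIUM_BP[ratings[c]] for c in CATEGORIES) / 10_000

all_low  = discount_rate({c: "Low" for c in CATEGORIES})
all_high = discount_rate({c: "High" for c in CATEGORIES})
print(f"Range: {all_low:.0%} to {all_high:.0%}")   # Range: 12% to 36%
```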

1

Data Risk

Quality, availability, legal status of required data. ML is acutely sensitive; dirty data produces confidently wrong outputs.

2

Model & Technical Risk

Will the AI achieve the accuracy threshold needed? The binary "does it work?" question. Anchored to pharma R&D discount rates.

3

Inference Cost Scaling

Per-unit AI costs vs. per-unit revenue at scale. A feature viable at 10M users can be margin-destructive at 700M.

4

Organizational Readiness

Change management, process alignment, internal friction. Expanded to 8 sub-factors after Apple analysis exposed gaps in simpler models.

5

Regulatory & Compliance

Requirements that increase costs, delay deployment, or restrict functionality. Anchored to Basel III compliance premiums.

6

Competitive & Obsolescence

Risk of being outpaced by competitors, open-source, or foundation model shifts. Easy to launch, hard to scale.

7

Vendor Dependency

Lock-in to model providers, cloud vendors, or APIs you don't control. The build-vs-buy tradeoff with Category 2 is the central strategic tension.

8

Talent & Capability

Ability to hire, retain, and develop AI-specific talent. Knowledge concentration in key individuals creates fragility.

Validation

8 companies. 4 deployment axes. Real verdicts.

Each axis has a "build" case (full framework application from first principles) and a "validate" case (tests whether the framework generalizes).

💼

B2B — Enterprise AI Products

AI sold to business buyers. Dominant risks: buyer organizational readiness and regulatory compliance.

Salesforce
Agentforce — AI agents in CRM
Build
B−
Grade
23.5%
Discount Rate
~8%
Base ROI
⚠️ Pursue cautiously
S Scalability A Alignment E Efficiency C Constraints L Leadership*
Key Insight

Success-driven cannibalization: Agentforce makes reps 3x more efficient, undermining seat-license revenue. The shift to per-action pricing creates its own forecasting uncertainty.

Critical Execution Requirements
1

Prove unit economics at deal level within 18 months (fully loaded cost including inference must produce positive NOPAT)

2

Accelerate pricing transition from per-seat to consumption-based Flex Credits before Rule-of-40 deteriorates

3

Gate investment on buyer cohort retention: if pilot-to-paid conversion falls below 60%, growth is unsustainable

Palantir
AIP — Forward-deployed AI platform
Validate
B−
Grade
25.5%
Discount Rate
>115%
Net $ Retention
⚠️ Pursue selectively — expansion over acquisition
A Alignment E Efficiency S Scalability* C Constraints L Leadership*
Key Insight

High-touch delivery (forward-deployed engineers) creates durable moat but structural scalability ceiling. Success depends on whether the deployment trough depth is shrinking over time.

Critical Execution Requirements
1

Prioritize expansion revenue over new acquisition (existing-account growth has fundamentally better unit economics)

2

Reduce deployment trough: per-customer deployment cost decline of 10-15% annually required, or scalability ceiling binds in 3-5 years

3

Accelerate commercial diversification: government (55% revenue) provides stability but limits velocity

👤

B2C — Consumer AI Products

AI embedded in consumer experiences. Dominant risks: inference cost scaling and competitive obsolescence.

Spotify
Recommendation Engine — AI DJ, Discover Weekly
Build
A−
Grade
19.25%
Discount Rate
600B
Events/Month
Invest aggressively — unambiguously yes
S Scalability A Alignment E Efficiency L Leadership C Constraints
Key Insight

In B2C, cost management architecture (model efficiency, edge caching, feature gating) is arguably more important than user experience. Solving cost structure is harder than solving the product.

Critical Execution Requirements
1

Validate AI DJ and prompted playlists produce positive unit economics at full MAU scale; gate compute-intensive features to Premium if free-tier unsustainable

2

Protect proprietary data moat: decade of behavioral data (600B monthly events) is the competitive advantage

3

Monitor content royalty dynamics: if label renegotiations shift rates upward by 3-5pp, margin structure changes materially

Duolingo
Birdbrain + Max Tier — Adaptive AI learning
Validate
B+
Grade
21.5%
Discount Rate
~80
Rule of 40
Yes, with vendor dependency mitigation
S Scalability A Alignment L Leadership E Efficiency C Constraints
Key Insight

Hybrid build-buy model: Birdbrain (proprietary, low dependency) + GPT-4 (frontier but vendor-dependent). Competitive threat from general-purpose AI as free tutor requires monitoring.

Critical Execution Requirements
1

Expand Max tier to all major language pairs within 12 months (highest-ARPU product)

2

Develop LLM provider optionality within 24 months: serve from 2+ providers to reduce single-vendor dependency

3

Manage free-tier inference costs: energy system (2-3 lessons/day limit) is correct; monitor conversion vs. attrition

🌐

Platform — Marketplace AI

AI serving both sides of a marketplace. Dominant risks: merchant/supply sophistication and inference scaling.

Shopify
Shopify Magic — AI tools for merchants
Build
B
Grade
21.25%
Discount Rate
15-20h
Saved/Wk
⚠️ Conditionally yes — invest with value-capture validation
A Alignment S Scalability C Constraints E Efficiency* L Leadership*
Key Insight

Platform-specific tension: AI creates genuine merchant value but Shopify monetizes through commissions, not AI tools. The gap between value creation and value capture is the central challenge.

Critical Execution Requirements
1

Establish AI-to-GMV attribution: track whether AI tool adoption correlates with merchant GMV growth

2

Validate Agentic Storefronts: move from POC to pilot within 12 months; if buyers don't engage with AI chat, deprioritize

3

Invest in AI tool simplification for long-tail merchants: adoption among long tail drives aggregate GMV

Uber
Core AI + AV Transition — Matching, pricing, autonomy
Validate
B
Grade
22.75%
Discount Rate
30M
Predictions/Min
⚠️ Yes for core AI; conditionally yes for AV
S Scalability A Alignment L Leadership E Efficiency C Constraints
Key Insight

Bifurcated recommendation: core AI is the most mature and proven in the Platform analysis; AV transition is an existential bet with enormous upside but timing uncertainty and disintermediation risk.

Critical Execution Requirements
1

Continue optimizing matching/pricing algorithms (each 1% gain in driver utilization compounds across 13.6B annual trips)

2

Gate AV capital on unit economics proof in 3+ metropolitan markets

3

Protect against disintermediation: ensure AV partnerships preserve Uber's demand-side aggregation role

🏢

Internal — Enterprise Self-Deployment

AI deployed on your own operations. Dominant risks: model/technical (build) or vendor dependency (buy).

JPMorgan Chase
COiN + LLM Suite — 450 use cases, 200K daily users
Build
A−
Grade
20.0%
Discount Rate
$2B
Annual Value
Invest aggressively — unambiguously yes
S Scalability A Alignment L Leadership E Efficiency C Constraints
Key Insight

Internal AI carries lower aggregate risk than B2B/B2C/Platform because it eliminates customer adoption uncertainty, competitive displacement, and vendor dependency. Risk concentrates in regulatory compliance.

Critical Execution Requirements
1

Maintain human-in-the-loop governance as use cases expand from 450 to 1,000 (a single governance failure could exceed cumulative savings)

2

Monitor skill dependency in workforce reallocation: 4% ops-to-client shift requires structured capability development

3

Track diminishing marginal returns: next 550 use cases must be prioritized by ROIC contribution, not deployment ease

NatWest
Cora+ & AI Suite — Fraud, dev tools, 275 projects
Validate
C+
Grade
24.25%
Discount Rate
135%
Scam Detection
Invest steadily with vendor diversification
A Alignment L Leadership* S Scalability* C Constraints E Efficiency*
Key Insight

Build-vs-buy at highest stakes: the 425bp discount-rate differential between JPMorgan (build, 20.0%) and NatWest (buy, 24.25%) quantifies the architectural tradeoff. NatWest's approach is correct for its scale but creates a dependency premium.

Critical Execution Requirements
1

Develop multi-vendor optionality: evaluate alternative LLM providers for 2+ of 275 use cases (leverage, not replacement)

2

Impose rigorous prioritization on 275-project pipeline: require Gate 1 business case before resources (enthusiasm outpaces discipline)

3

Accelerate AWS/Accenture cloud migration: legacy infrastructure is the binding bottleneck on every downstream AI project

Grading System

The Decision Scorecard

Three inputs determine the grade: SCALE profile, ROIC sustainability, and risk-adjusted discount rate. Bands overlap intentionally to allow judgment.

A

Invest Aggressively

4+ tailwinds, proven unit economics, discount rate below 21%. Revenue demonstrably outpaces costs. Monitor gating conditions.

B+

Invest with Mitigation

3-4 tailwinds, 1-2 material headwinds, rate 21-23%. Revenue outpacing costs but unit economics partially unproven.

B

Invest Selectively

3+ tailwinds, significant but understood headwinds, rate 22-24%. ROIC positive but key sensitivity unresolved.

B−

Invest Cautiously

3 tailwinds, material headwinds, rate 23-26%. Revenue growing but cost sustainability unproven. Address structural risks first.

C+

Invest Steadily + Diversify

2 tailwinds, significant headwinds, rate above 24%. Returns positive but constrained. Manage concentration risk.

C

Delay or Restructure

Fewer than 2 tailwinds or critical headwind unresolved. ROIC negative or deeply uncertain. Fix readiness gaps first.
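The bands above can be sketched as a mapping from tailwind count and discount rate to a grade. Because the bands overlap deliberately to leave room for judgment, any deterministic function is a simplification; this sketch resolves overlaps by testing stricter grades first, so borderline profiles land in the more favorable band.

```python
# Simplified scorecard: (tailwind count, risk-adjusted discount rate in %)
# -> grade band. The framework's bands overlap by design; this resolves
# ties by checking stricter grades first.

def grade(tailwinds: int, rate_pct: float) -> str:
    if tailwinds >= 4 and rate_pct < 21:
        return "A"    # invest aggressively
    if tailwinds >= 3 and rate_pct <= 23:
        return "B+"   # invest with mitigation
    if tailwinds >= 3 and rate_pct <= 24:
        return "B"    # invest selectively
    if tailwinds >= 3 and rate_pct <= 26:
        return "B-"   # invest cautiously
    if tailwinds >= 2:
        return "C+"   # invest steadily + diversify
    return "C"        # delay or restructure

print(grade(4, 19.25))   # A  (Spotify-like profile)
print(grade(3, 25.5))    # B-
print(grade(1, 30.0))    # C
```

Note that the real scorecard also weighs ROIC sustainability and execution requirements, which is precisely the judgment the overlapping bands are meant to preserve.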

All Results

The full verdict table

All 8 companies scored, graded, and assessed.

Company | Category | Case | Grade | Discount Rate | Verdict | Top Execution Priority
Spotify | B2C | Build | A− | 19.25% | Invest aggressively | Manage inference scaling; protect data moat
JPMorgan | Internal | Build | A− | 20.0% | Invest aggressively | Maintain governance at velocity; prioritize by ROIC
Duolingo | B2C | Validate | B+ | 21.5% | Yes, with vendor management | Expand Max tier; develop LLM provider optionality
Shopify | Platform | Build | B | 21.25% | Conditionally yes | Establish AI-to-GMV attribution
Uber | Platform | Validate | B | 22.75% | Yes for core; conditional for AV | Optimize core algorithms; gate AV on unit economics
Salesforce | B2B | Build | B− | 23.5% | Pursue cautiously | Prove unit economics; accelerate consumption pricing
Palantir | B2B | Validate | B− | 25.5% | Pursue selectively | Prioritize expansion; reduce deployment trough
NatWest | Internal | Validate | C+ | 24.25% | Invest steadily + diversify | Multi-vendor optionality; accelerate cloud migration
What's Next

Run this analysis on your AI investments.

The framework is open. The methodology is documented. The question is whether you'll use it before or after the money is spent.

Back to Framework ↑