Luca Engel

Books to Blockbusters: Data-Driven Advice for Adaptations

Dec 20, 2024

Cover image generated with OpenAI's GPT-5 model.

TL;DR

At a glance

Problem

Studios must decide whether to adapt a book, and how. Headline revenue is confounded by budget, popularity, genre, and release year, so a naïve comparison of average Bob and Nob revenue is misleading. We need a like-for-like comparison and actionable heuristics for book selection and adaptation fidelity.

Solution overview

A two-part analysis + interactive data story:

  1. Do adaptations outperform? Use propensity score matching to compare Bobs to similar Nobs controlling for key confounders.
  2. Which choices matter? Model revenue with linear regression & random forests; interpret with SHAP. Add book-movie plot similarity (NLP) to quantify fidelity effects.
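
To make step 1 concrete, here is a minimal sketch of propensity score matching in plain Python. It is an illustration under stated assumptions, not the project's actual pipeline: the two features (standardized log budget and popularity), the toy rows, the caliper of 0.2, and the helper names `fit_propensity`, `propensity_scores`, and `match_pairs` are all hypothetical. In practice a library model such as scikit-learn's `LogisticRegression` would likely replace the hand-rolled gradient descent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_propensity(X, treated, lr=0.5, epochs=500):
    """Fit logistic regression P(treated=1 | x) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(X, treated):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - t
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

def propensity_scores(X, w, b):
    """Estimated probability that each film is a Bob given its covariates."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]

def match_pairs(scores, treated, caliper=0.2):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    A Bob is matched to the unused Nob with the closest propensity score,
    and the pair is kept only if the scores differ by at most `caliper`.
    """
    available = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not available:
            continue
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Hypothetical toy data: [standardized log budget, standardized popularity].
X = [[-1.0, -0.5], [-0.8, 0.2], [0.1, -0.1], [0.3, 0.4],
     [0.9, 0.8], [1.2, 0.5], [-0.2, 0.0], [0.6, 0.1]]
treated = [0, 0, 1, 0, 1, 1, 0, 1]  # 1 = Bob, 0 = Nob

w, b = fit_propensity(X, treated)
scores = propensity_scores(X, w, b)
pairs = match_pairs(scores, treated, caliper=0.2)
print(pairs)  # list of (Bob index, Nob index) pairs
```

Matching without replacement keeps each Nob in at most one pair, which makes the paired comparison simpler to interpret, at the cost of discarding Bobs that have no close counterpart.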

Architecture

High-level flow: data cleaning & joins (films + books + summaries) → feature engineering (inflation adjustment, genres, popularity, ratings mix) → matching (PSM) for Bob–Nob pairs → modeling (LR/RF + SHAP) → plot-similarity analysis (book vs movie summaries) → interactive story.
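
The plot-similarity step in the flow above can be sketched as a cosine similarity between TF-IDF vectors of the two summaries. This is only one plausible choice: the text above says "NLP" without naming a technique, so TF-IDF, the smoothed IDF formula, the example summaries, and the helper names `tokenize`, `tfidf_vectors`, and `cosine` are all assumptions made for illustration. Sentence embeddings would be an equally reasonable alternative.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts for term in c)
    # Smoothed IDF so that terms present in every document keep a small weight.
    idf = {t: math.log((1 + n) / (1 + k)) + 1.0 for t, k in df.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical summaries, invented for this example.
book = "A young wizard discovers a hidden school and battles a dark lord."
faithful_movie = "A young wizard attends a hidden school and fights the dark lord."
loose_movie = "A detective chases a thief across the city at night."

book_vec, faithful_vec, loose_vec = tfidf_vectors([book, faithful_movie, loose_movie])
print(cosine(book_vec, faithful_vec))  # higher: the plots share most key terms
print(cosine(book_vec, loose_vec))     # lower: only function words overlap
```

The resulting score can then enter the revenue model as a fidelity feature, so that its contribution can be inspected alongside budget, genre, and popularity.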

Data

Method

Propensity Score Matching (Bobs vs Nobs)

Modeling & Explainability

Adaptation Fidelity (Books ↔ Movies)

Experiments & Results

Matched uplift (Bobs vs Nobs)

Metric | Result
Win rate (Bobs > Nobs) | 58%
Median uplift (revenue diff) | +$10.7M
Significance | p < 0.001
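
For readers who want to see how the three quantities in this table can be derived from matched pairs, here is a small sketch. The revenue figures are invented toy numbers, the function name `matched_uplift` is hypothetical, and the exact two-sided sign test is an assumption: the write-up does not state which test produced the reported p-value, and a Wilcoxon signed-rank test would be another common choice for paired data.

```python
import math
from statistics import median

def matched_uplift(pairs):
    """Summarise matched (bob_revenue, nob_revenue) pairs.

    Returns the share of pairs where the Bob earned more, the median
    revenue difference, and an exact two-sided sign-test p-value.
    """
    diffs = [bob - nob for bob, nob in pairs]
    wins = sum(d > 0 for d in diffs)
    losses = sum(d < 0 for d in diffs)
    n = wins + losses  # ties carry no sign information and are dropped
    k = max(wins, losses)
    # P(at least k successes out of n) under a fair coin, doubled for two sides.
    tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return {
        "win_rate": wins / len(diffs),
        "median_uplift": median(diffs),
        "p_value": min(1.0, 2 * tail),
    }

# Hypothetical inflation-adjusted revenues in $M for five matched pairs.
toy_pairs = [(120, 100), (80, 95), (60, 40), (200, 150), (30, 35)]
summary = matched_uplift(toy_pairs)
print(summary)  # win_rate 0.6, median_uplift 20, p_value 1.0 on this toy input
```

With only five pairs the test has no power, which is why the toy p-value is 1.0; a value as small as the one reported above requires a much larger matched sample.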

Drivers of revenue (modeling snapshot)

Evaluation protocol.

Product & UX

System & Operations


Impact

What I learned

Future Work

References