What is the Most Divisive Film?

Summary

Based on data from Letterboxd, the most divisive film is either Salò, or the 120 Days of Sodom (a gruesome horror film), Triumph of the Will (a Nazi propaganda film), or Wavelength (an experimental film where not much happens). The film which actually has the highest ratings variance is Love on a Leash, which is one of those “so-bad-it’s-good” films.

I’ve seen none of those films, and don’t plan on changing that fact — but the most divisive film I have watched is The Magic Roundabout which I vaguely remember enjoying as a child.

Introduction

The film Megalopolis came out recently. The response is certainly mixed: most people seem to dislike it, but some think it’s a surrealist masterpiece. I haven’t seen it. But it did get me wondering: what is the most divisive film?

Given a distribution of ratings of a film, we can identify divisive films as those with two camps: those who hate the film and give it a low rating, and those who love it and rate it highly. If we have a dataset of film rating distributions, we can quantify this split and rank films by it.

Data

I acquired a dataset of ratings by scraping1 Letterboxd. Each Letterboxd film page (e.g. that for Megalopolis) has a histogram of user ratings, which I was able to process using Python. There are a lot of films on Letterboxd, so I considered only the top 3% most popular films. Letterboxd rates films from ½★ to 5★, but we can map this onto an integer rating from 0 to 9.
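For concreteness, here's a minimal sketch of that mapping (the helper name is my own, not part of the scraper shown later):

```python
# Hypothetical helper: map a Letterboxd star rating (in ½★ steps) to a 0-9 index.
def star_to_index(stars: float) -> int:
    # ½★ → 0, 1★ → 1, 1½★ → 2, ..., 5★ → 9
    return int(round(stars * 2)) - 1
```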

Analysis

Once I had the data, it was just a matter of doing some analysis. I’m still a big fan of Julia for data analysis, so I used that. The main remaining problem was choosing a metric to compare the divisiveness of films. I considered a few options: the variance of the rating distribution, the mean absolute deviation, the kurtosis, the KL divergence, and some bimodality metrics. I had the idea that a good measure of divisiveness would be something like

\[ \text{Divisiveness}(R_i) = \min_{\alpha \geq 1 \vee \beta \geq 1}{\text{KL}(\text{BetaBinomial}(9, \alpha, \beta)||R_i)}, \]

where \(R_i\) is the user rating distribution for film \(i\). This metric seems complicated and possibly costly to compute, and after qualitatively comparing my options, the good old variance came out on top, though the rankings were generally similar between different methods I tried2.
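As a sketch of the winning metric (written in Python here rather than the Julia I actually used; the function name is illustrative), the variance of a 10-bin rating histogram can be computed directly from the counts:

```python
import numpy as np

def rating_variance(hist):
    """Variance of the 0-9 rating implied by a 10-bin count histogram."""
    counts = np.asarray(hist, dtype=float)
    p = counts / counts.sum()    # normalise counts to probabilities
    x = np.arange(len(counts))   # rating values 0..9
    mean = (p * x).sum()
    return (p * (x - mean) ** 2).sum()

# An even split between ½★ (index 0) and 5★ (index 9) maximises the variance:
print(rating_variance([100, 0, 0, 0, 0, 0, 0, 0, 0, 100]))  # → 20.25
```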

I then ranked films by divisiveness, limiting myself to films with more than 7000 reviews. Here’s the resulting divisiveness table:

Film                Year   Rating Variance
Love on a Leash     2011   16.234
Fateful Findings    2013   14.455
Twisted Pair        2018   13.791
Ratatoing           2007   13.344
The Room            2003   13.236
[CENSORED]          1992   12.582
Double Down         2005   12.529
Goat Story          2008   11.185
Llamageddon         2015   10.935
The Amazing Bulk    2012   10.225

I’ll leave the [CENSORED] film as a puzzle for the reader, as the title is too offensive for me to print here. The main theme of this table is “so-bad-it’s-good” meme films. Neil Breen, a titan of the genre, appears thrice on this list. Note that most of these films were released after 2000. The censored 1992 film is also a meme film.

OK, so a very bad film about a woman who falls in love with a man (who is, during the daytime, a dog) stakes a claim on being the most divisive film. But there’s something unsatisfying about this: so-bad-it’s-good films are in a class of their own. I’m looking for films that some people genuinely love while others genuinely hate them.

I took the quick route to this: let’s just look at films made before 2000. Then we get:

Film                              Year   Rating Variance
[CENSORED]                        1992   12.582
Troll 2                           1990   10.188
Samurai Cop                       1991    9.163
Spice World                       1997    7.476
Wavelength                        1967    7.404
Triumph of the Will               1935    7.268
Belle’s Magical World             1998    6.913
Salò, or the 120 Days of Sodom    1975    6.845
Midori                            1992    6.770
Mac and Me                        1998    6.566

Most of these are still meme films, though there are a few interesting exceptions.

Making a final judgement based on eyeballing the ratings histograms for these films, I’d say that it’s a tie between Triumph of the Will and Salò, or the 120 Days of Sodom as most divisive. Though Wavelength takes the variance metric crown for films which are not so-bad-they’re-good, and Love on a Leash has the highest variance in ratings.

Have I watched any of these films? No, and I don’t plan on ever watching any film in either table. The most divisive film I have watched is The Magic Roundabout, at #250 on the list. I watched it as a child and vaguely remember enjoying it.

Megalopolis is currently at #141. Will I go to see Megalopolis? Maybe.

Code

import time
import locale
from tqdm import tqdm
import requests
from bs4 import BeautifulSoup

locale.setlocale(locale.LC_ALL, "en_US.UTF-8")

FILM_IDS = "letterboxd_ids.txt"
SAVE_FILE = "letterboxd.tsv"
RATE_LIMIT = 1.5  # Minimum time between requests (seconds)
TIMEOUT = 60

with open(FILM_IDS, "r", encoding="utf-8") as f:
    film_list = f.readlines()
    film_list = [l.strip() for l in film_list]

failed_ids = []
last_request_time = 0
for film_id in tqdm(film_list):
    histogram_link = (
        f"https://letterboxd.com/csi/film/{film_id}/rating-histogram"
    )
    elapsed_time = time.time() - last_request_time
    try:
        if elapsed_time < RATE_LIMIT:
            time.sleep(RATE_LIMIT - elapsed_time)
        histogram_request = requests.get(histogram_link, timeout=TIMEOUT).text
        last_request_time = time.time()
        histogram_soup = BeautifulSoup(histogram_request, "html.parser")
        rating_histogram = [
            locale.atoi(
                "0" if not h.a else h.a.text.replace("\xa0", " ").split(" ")[0]
            )
            for h in histogram_soup.select("li.rating-histogram-bar")
        ]
        assert len(rating_histogram) > 0
    except Exception:  # Network error or unexpected page structure
        failed_ids.append(film_id)
        continue
    with open(SAVE_FILE, mode="a", encoding="utf-8") as f:
        f.write(film_id + "\t" + "\t".join(map(str, rating_histogram)) + "\n")
print("FAILED:", failed_ids)

  1. I couldn’t find anything forbidding doing this, but I’m also not going to share the scraped dataset just in case. ↩︎

  2. Except kurtosis. The mean absolute deviation and the variance are reasonable metrics here in the sense that they’re maximised by distributions which are evenly split between the maximum and minimum ratings. Due to the way that kurtosis (and more generally, moments of order greater than 2) weighs the residuals, it’s actually optimal to have a mean near the edge of the distribution in order to get longer residuals. This means that films people love (or hate) on average but have more than zero haters (or lovers) rank highly on kurtosis. These films are the opposite of divisive. ↩︎
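The footnote’s point about kurtosis can be checked numerically. Using the 0–9 convention from earlier and illustrative counts (not real films), an evenly split histogram wins on variance, while a lopsided loved-with-a-few-haters histogram wins on kurtosis:

```python
import numpy as np

def hist_moments(hist):
    """Variance and (raw, non-excess) kurtosis implied by a count histogram."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()              # normalise counts to probabilities
    x = np.arange(len(p))        # rating values 0..9
    mean = (p * x).sum()
    var = (p * (x - mean) ** 2).sum()
    kurt = (p * (x - mean) ** 4).sum() / var**2
    return var, kurt

split = [50, 0, 0, 0, 0, 0, 0, 0, 0, 50]    # evenly split: genuinely divisive
skewed = [5, 0, 0, 0, 0, 0, 0, 0, 0, 95]    # loved on average, a few haters

# Variance ranks the split film higher; kurtosis ranks the skewed film higher.
```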