What is the Most Divisive Film?

Summary

Based on data from Letterboxd, the most divisive film is either Salò, or the 120 Days of Sodom (a gruesome horror film), Triumph of the Will (a Nazi propaganda film), or Wavelength (an experimental film where not much happens). The film which actually has the highest ratings variance is Love on a Leash, which is one of those “so-bad-it’s-good” films.

I’ve seen none of those films, and don’t plan on changing that fact — but the most divisive film I have watched is The Magic Roundabout which I vaguely remember enjoying as a child.

Introduction

The film Megalopolis came out recently. The response is certainly mixed: most people seem to dislike it, but some think it’s a surrealist masterpiece. I haven’t seen it. But it did get me wondering: what is the most divisive film?

Given a distribution of ratings of a film, we can identify divisive films as those with two camps: those who hate the film and give it a low rating, and those who love it and rate it highly. If we have a dataset of film rating distributions, we can quantify this split and rank films by it.

Data

I acquired a dataset of ratings by scraping1 Letterboxd. Each Letterboxd film page (e.g. that for Megalopolis) has a histogram of user ratings, which I was able to process using Python. There are a lot of films on Letterboxd, so I considered only the top 3% most popular films. Letterboxd rates films from ½★ to 5★, but we can map this onto an integer rating from 0 to 9.
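For concreteness, here's a minimal sketch of that mapping (the helper name is my own, not part of the scraper shown later):

```python
# Hypothetical helper: map a Letterboxd star rating (in ½★ steps) to a 0-9 index.
def star_to_index(stars: float) -> int:
    # ½★ → 0, 1★ → 1, 1½★ → 2, ..., 5★ → 9
    return int(round(stars * 2)) - 1
```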

Analysis

Once I had the data, it was just a matter of doing some analysis. I’m still a big fan of Julia for data analysis, so I used that. The main remaining problem was choosing a metric to compare the divisiveness of films. I considered a few options: the variance of the rating distribution, the mean absolute deviation, the kurtosis, the KL divergence, and some bimodality metrics. I had the idea that a good measure of divisiveness would be something like

\[ \text{Divisiveness}(R_i) = \min_{\alpha \geq 1 \vee \beta \geq 1}{\text{KL}(\text{BetaBinomial}(9, \alpha, \beta)||R_i)}, \]

where \(R_i\) is the user rating distribution for film \(i\). This metric seems complicated and possibly costly to compute, and after qualitatively comparing my options, the good old variance came out on top, though the rankings were generally similar between different methods I tried2.
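As a sketch of the winning metric (written in Python here rather than the Julia I actually used; the function name is illustrative), the variance of a 10-bin rating histogram can be computed directly from the counts:

```python
import numpy as np

def rating_variance(hist):
    """Variance of the 0-9 rating implied by a 10-bin count histogram."""
    counts = np.asarray(hist, dtype=float)
    p = counts / counts.sum()    # normalise counts to probabilities
    x = np.arange(len(counts))   # rating values 0..9
    mean = (p * x).sum()
    return (p * (x - mean) ** 2).sum()

# An even split between ½★ (index 0) and 5★ (index 9) maximises the variance:
print(rating_variance([100, 0, 0, 0, 0, 0, 0, 0, 0, 100]))  # → 20.25
```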

I then ranked films by divisiveness, limiting myself to films with more than 7000 reviews. Here’s the resulting divisiveness table:

Film                Year   Rating Variance
Love on a Leash     2011   16.234
Fateful Findings    2013   14.455
Twisted Pair        2018   13.791
Ratatoing           2007   13.344
The Room            2003   13.236
[CENSORED]          1992   12.582
Double Down         2005   12.529
Goat Story          2008   11.185
Llamageddon         2015   10.935
The Amazing Bulk    2012   10.225

I’ll leave the [CENSORED] film as a puzzle for the reader, as the title is too offensive for me to print here. The main theme of this table is “so-bad-it’s-good” meme films. Neil Breen, a titan of the genre, appears thrice on this list. Note that most of these films were released after 2000. The censored 1992 film is also a meme film.

OK, so a very bad film about a woman who falls in love with a man (who is, during the daytime, a dog) stakes a claim on being the most divisive film. But there’s something unsatisfying about this: so-bad-it’s-good films are in a class of their own. I’m looking for films that some people genuinely love while others genuinely hate them.

I took the quick route to this: let’s just look at films made before 2000. Then we get:

Film                              Year   Rating Variance
[CENSORED]                        1992   12.582
Troll 2                           1990   10.188
Samurai Cop                       1991    9.163
Spice World                       1997    7.476
Wavelength                        1967    7.404
Triumph of the Will               1935    7.268
Belle’s Magical World             1998    6.913
Salò, or the 120 Days of Sodom    1975    6.845
Midori                            1992    6.770
Mac and Me                        1998    6.566

Most of these are still meme films, though there are a few interesting exceptions.

Making a final judgement based on eyeballing the ratings histograms for these films, I’d say that it’s a tie between Triumph of the Will and Salò, or the 120 Days of Sodom as most divisive. Though Wavelength takes the variance metric crown for films which are not so-bad-they’re-good, and Love on a Leash has the highest variance in ratings.

Have I watched any of these films? No, and I don’t plan on ever watching any film in either table. The most divisive film I have watched is The Magic Roundabout, at #250 on the list. I watched it as a child and vaguely remember enjoying it.

Megalopolis is currently at #141. Will I go to see Megalopolis? Maybe.

Code

import time
import locale
from tqdm import tqdm
import requests
from bs4 import BeautifulSoup

locale.setlocale(locale.LC_ALL, "en_US.UTF-8")

FILM_IDS = "letterboxd_ids.txt"
SAVE_FILE = "letterboxd.tsv"
RATE_LIMIT = 1.5  # Minimum time between requests (seconds)
TIMEOUT = 60

with open(FILM_IDS, "r", encoding="utf-8") as f:
    film_list = f.readlines()
    film_list = [l.strip() for l in film_list]

failed_ids = []
last_request_time = 0
for film_id in tqdm(film_list):
    histogram_link = (
        f"https://letterboxd.com/csi/film/{film_id}/rating-histogram"
    )
    elapsed_time = time.time() - last_request_time
    try:
        if elapsed_time < RATE_LIMIT:
            time.sleep(RATE_LIMIT - elapsed_time)
        histogram_request = requests.get(histogram_link, timeout=TIMEOUT).text
        last_request_time = time.time()
        histogram_soup = BeautifulSoup(histogram_request, "html.parser")
        rating_histogram = [
            locale.atoi(
                "0" if not h.a else h.a.text.replace("\xa0", " ").split(" ")[0]
            )
            for h in histogram_soup.select("li.rating-histogram-bar")
        ]
        assert len(rating_histogram) > 0
    except Exception:  # Network error or unexpected page structure
        failed_ids.append(film_id)
        continue
    with open(SAVE_FILE, mode="a", encoding="utf-8") as f:
        f.write(film_id + "\t" + "\t".join(map(str, rating_histogram)) + "\n")
print("FAILED:", failed_ids)

  1. I couldn’t find anything forbidding doing this, but I’m also not going to share the scraped dataset just in case. ↩︎

  2. Except kurtosis. The mean absolute deviation and the variance are reasonable metrics here in the sense that they’re maximised by distributions which are evenly split between the maximum and minimum ratings. Due to the way that kurtosis (and more generally, moments of order greater than 2) weighs the residuals, it’s actually optimal to have a mean near the edge of the distribution in order to get longer residuals. This means that films people love (or hate) on average but have more than zero haters (or lovers) rank highly on kurtosis. These films are the opposite of divisive. ↩︎
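The footnote’s point about kurtosis can be checked numerically. Using the 0–9 convention from earlier and illustrative counts (not real films), an evenly split histogram wins on variance, while a lopsided loved-with-a-few-haters histogram wins on kurtosis:

```python
import numpy as np

def hist_moments(hist):
    """Variance and (raw, non-excess) kurtosis implied by a count histogram."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()              # normalise counts to probabilities
    x = np.arange(len(p))        # rating values 0..9
    mean = (p * x).sum()
    var = (p * (x - mean) ** 2).sum()
    kurt = (p * (x - mean) ** 4).sum() / var**2
    return var, kurt

split = [50, 0, 0, 0, 0, 0, 0, 0, 0, 50]    # evenly split: genuinely divisive
skewed = [5, 0, 0, 0, 0, 0, 0, 0, 0, 95]    # loved on average, a few haters

# Variance ranks the split film higher; kurtosis ranks the skewed film higher.
```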