How machines can help us discover overlooked films

Feeling like I’d burned through my standard sources for movie recommendations, I recently decided to turn to box office failures. I was seeking out an automated way to explore the world of such movies and find “overlooked” films that are actually very good, but were ignored in theaters and dismissed by critics.

Using Nathan Rabin’s popular “My Year of Flops” series on The AV Club and its follow-up book as a starting point, I designed an algorithm to predict whether a box office failure is actually a film worth seeing. The algorithm examines multiple aspects of a movie’s cultural response to make its prediction – such as applying sentiment analysis to capture the tone of reviews, and measuring whether critics and audiences responded differently to a movie. The output is a list of 100+ movies released over the past decade with a high likelihood of being quality, “overlooked” films.
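To make those two signals concrete, here is a minimal Python sketch of how review tone and a critic/audience gap could be combined into one score. It is an illustration only, not the algorithm described in this post: the `Flop` record, the tiny word lists, the 0-100 score scales, and the equal weighting are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon; a real sentiment model would be far richer.
POSITIVE = {"brilliant", "moving", "underrated", "great", "bold"}
NEGATIVE = {"boring", "mess", "dull", "awful", "bloated"}


@dataclass
class Flop:
    title: str
    critic_score: float    # assumed 0-100 critic aggregate
    audience_score: float  # assumed 0-100 audience aggregate
    reviews: List[str]


def review_sentiment(text: str) -> float:
    """Return a tone score in [-1, 1] based on lexicon word counts."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def overlooked_score(film: Flop) -> float:
    """Blend average review tone with the audience-minus-critic gap."""
    if film.reviews:
        tone = sum(review_sentiment(r) for r in film.reviews) / len(film.reviews)
    else:
        tone = 0.0
    gap = (film.audience_score - film.critic_score) / 100.0
    # Equal weights are an arbitrary choice for this sketch.
    return 0.5 * tone + 0.5 * gap
```

A film that audiences rated well above critics, with reviews that lean warm, ends up with a higher score than one panned by both groups.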

Here’s how it works…


In 1994, Forrest Gump made over $300M at the domestic box office, won six Oscars, and spawned a murderers’ row of pop culture references.

The Shawshank Redemption also came out that year. It had a confusing name, won exactly zero Oscars, and made only $16M in its initial run – an amount outdistanced by House Party 3, Kid ‘n Play’s capstone installment in their “living-situation-oriented festival” trilogy.

Yet flip on TNT on a random Saturday night, and you’re more likely to be greeted by Andy and Red than by Forrest and Jenny.

Because it flopped in theaters, people had to discover Shawshank organically on video. And not only did its reputation grow, but fans felt a sense of personal ownership and evangelism. Nearly everyone I know who’s seen the movie first watched it because of a recommendation, and fiercely loyal IMDb users have even rated it the best movie of all time.

One of the earliest customer reviews for The Shawshank Redemption.


Pilot season

Growing up, I enjoyed writing code and messing around with technology, but my first love was always pop culture: books, film, and TV. So I always thought tech would be a hobby, while my career would involve trying to climb the ladder in the television, music, or movie industry.

Fortunately for me, I happened to grow up at a time when the existing media landscape was undergoing massive upheaval, and when tech companies were shouldering their way into music, books, film, and television in a major way. In the past decade, Apple, Amazon, and DVRs have had as big an impact on how media is created as music labels, publishing houses, and television networks have. And they’ve been able to do so quickly, unconstrained by the decades of legacy and bureaucracy that paralyze many media companies.

I don’t think this change is an unqualified good. But as someone with interest in both camps, I definitely think the change is a fascinating one.

So here, I write about the shore where technology smashes up against creation. I describe tools that help us better understand and analyze creative works, but also the gaps that such technology can’t ever fill. I think about how new tech business models shape the new kinds of art that it’s now possible to create and distribute… for better or worse. And of course, I write about new technology and new art generally, and what they mean to the world at large.