I have a tough time making decisions, especially when it comes to which movie I should watch. I know the general genres I tend to enjoy most. I usually know if I want to watch a more recent movie or perhaps a movie from a few years ago. Sometimes I want to view movies with good ratings; other times I like exploring movies that are highly rated but not as well known. So, how do you go about choosing? Use the Movie Generator.
CHECK OUT THE VIZ HERE
NOTE: Be patient and remember there are over 9 million rows of data at work. If needed, and if you have Tableau, please feel free to download the workbook and explore.
COLLECTING THE DATA
I started with over 20 million rows of movie ratings from the website MovieLens.org. The dataset held over 20,000 movies from January 1995 to March 2015. I wanted to restructure and reorganize this data to develop a really slick visualization. To best handle the large merge, pivot and reorganization of this data, I looked to Alteryx.
One way Alteryx really helped speed up the data manipulation process was the pivot feature. Each movie had a unique ID, and a genre was also associated with each movie. However, the genres were pipe-separated. See example below:
I wanted each movie to be separated by genre. To do this, I first had to create a calculation for each genre:
…and so on.
I excluded all genres that do not apply to the movie (i.e., is[GENRE] = 0).
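For readers without Alteryx, the same idea can be sketched in Python. This is only an illustration of the logic, not the actual workflow: the sample rows, the shortened genre list, and the function names `genre_flags` and `one_row_per_genre` are all made up for the example.

```python
# Hypothetical sample rows: (movie ID, title, pipe-separated genres).
movies = [
    (1, "Movie A", "Adventure|Children|Fantasy"),
    (2, "Movie B", "Mystery|Thriller"),
]

# Illustrative subset of genres; the real dataset has more.
GENRES = ["Adventure", "Children", "Fantasy", "Mystery"]


def genre_flags(genre_field):
    """Build one is[GENRE] flag (1 or 0) per genre from a pipe-separated string."""
    present = set(genre_field.split("|"))
    return {f"is{genre}": int(genre in present) for genre in GENRES}


def one_row_per_genre(rows):
    """Return one (movie ID, title, genre) row for every flag equal to 1."""
    result = []
    for movie_id, title, genre_field in rows:
        flags = genre_flags(genre_field)
        for genre in GENRES:
            # Genres that do not apply (flag = 0) are excluded.
            if flags[f"is{genre}"] == 1:
                result.append((movie_id, title, genre))
    return result


separated = one_row_per_genre(movies)
```

Here "Movie A" ends up on three rows (one each for Adventure, Children and Fantasy), while "Movie B" keeps only its Mystery row because Thriller is not in the illustrative genre list.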
Then, I filtered the data to only include movies that were released in or after the year 2000 and that had at least 250 reviews. I also filtered to users who have reviewed over 250 movies.
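A rough Python sketch of these filters is below. It assumes ratings arrive as (user ID, movie ID, rating) tuples and release years as a dictionary, and it counts reviews on the full data before any filter is applied; the order of operations in the original workflow may have differed. The function and parameter names are hypothetical.

```python
from collections import Counter


def filter_ratings(ratings, release_year, min_year=2000,
                   min_movie_reviews=250, min_user_reviews=250):
    """Keep ratings for recent, well-reviewed movies from active users.

    ratings: list of (user_id, movie_id, rating) tuples.
    release_year: dict mapping movie_id to release year.
    """
    # Count reviews per movie and per user on the unfiltered data (an assumption).
    reviews_per_movie = Counter(movie_id for _, movie_id, _ in ratings)
    reviews_per_user = Counter(user_id for user_id, _, _ in ratings)
    return [
        (user_id, movie_id, rating)
        for user_id, movie_id, rating in ratings
        if release_year[movie_id] >= min_year                 # released in or after 2000
        and reviews_per_movie[movie_id] >= min_movie_reviews  # at least 250 reviews
        and reviews_per_user[user_id] > min_user_reviews      # user reviewed over 250 movies
    ]
```

The thresholds are parameters so the function can be tried on a small sample, for example `filter_ratings(sample, years, min_movie_reviews=2, min_user_reviews=1)`.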
CREATING THE VISUALIZATION
I usually like to draw my idea out on paper before diving into a visualization. Here is my rough draft:
I wanted the user to be able to explore the different genre options first. That is what the first section is meant to do.
Select the pictures on the top to explore the genre. Here we are looking at the adventure genre.
To find the movie that best interests you, follow these steps:
- Choose 3 genres that you generally enjoy watching:
(I tend to like Adventure, Fantasy and Mystery)
- The scatterplot now shows all movies that fall under Adventure, Fantasy or Mystery.
The number of users is the x-axis and the average rating is the y-axis.
The colors are explained in the genre-selection step further down.
- To narrow your search, use the right section. First start by completing the sentence:
“Please show me the movies released before/during/after the year [2000-2014] with a(n) high avg/avg/low avg rating and has been rated by a(n) high avg/avg/low avg number of MovieLens users.”
My example below:
Please show me the movies released AFTER the year 2005 with a high avg rating and has been rated by a low avg number of MovieLens users.
- Now choose which genre you would like to view. For the colors, I used the primary colors. Genre 1 (adventure) is blue, Genre 2 (fantasy) is yellow and Genre 3 (mystery) is red. If a movie’s genre is adventure and a non-selected genre (let’s say children’s), then it will fall under “ADVENTURE”. If a movie is Children’s, Adventure and Fantasy, then it will be categorized as ADVENTURE & FANTASY (green, a mix of the blue genre and the yellow genre). If a movie has all three genres, then it’s grey.
You can select a genre to view all of the movies that fall under that particular genre. Let’s see which movies have all three genres:
- To view the movies that match both the sentence you made and the genre you want, click the button from the sentence step.
One movie is left.
- Let’s explore what movie this is. Click on the circle that is highlighted.
Clicking shows the title, release year, avg rating and user count for this movie since the release date.
- Now, go to the IMDb website to explore the movie even more and watch the trailer.
If you have any questions, ask me on Twitter: @alexduke
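For the curious, the scatterplot axes and the color categories described in the steps above can be sketched in Python. This only illustrates the logic and is not how the Tableau workbook computes it: the function names `scatter_point` and `genre_category`, the input shapes, and the joined label for a movie matching all three genres are assumptions.

```python
from statistics import mean


def scatter_point(ratings_for_movie):
    """Return (number of users, average rating) for one movie.

    ratings_for_movie: list of (user_id, rating) tuples.
    """
    users = {user_id for user_id, _ in ratings_for_movie}
    average = mean(rating for _, rating in ratings_for_movie)
    return len(users), average  # x-axis value, y-axis value


def genre_category(movie_genres, selected):
    """Label a movie by the selected genres it belongs to.

    Non-selected genres (e.g. Children's) are ignored, so an
    Adventure + Children's movie falls under "ADVENTURE".
    Returns None when the movie matches none of the selected genres.
    """
    matches = [genre for genre in selected if genre in movie_genres]
    if not matches:
        return None
    return " & ".join(genre.upper() for genre in matches)


selected = ["Adventure", "Fantasy", "Mystery"]
label = genre_category({"Children", "Adventure", "Fantasy"}, selected)
```

With the three genres from the walkthrough, `label` is `"ADVENTURE & FANTASY"`, which is the green category in the viz.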