If you remember the first day of the 30 Day AI Challenge, I used ChatGPT to discover interesting movies that were hard to find using other sources like IMDB. Now, with Code Interpreter available, I wanted to experiment and take this to a whole new level. What if I could train ChatGPT to understand my movie and TV preferences better…and then analyze thousands of titles to recommend the best ones (kind of…Everything, Everywhere, All At Once?!).
That’s just what I did…and as a result I now have a database of over 32k movie and TV show titles, ranked by my very own custom preference scores!
Here’s how I did it:
Training ChatGPT On My Preferences
The first thing I did was provide ChatGPT data on over 450 different movies and TV shows I had seen and rated personally.
This was uploaded in CSV format and ChatGPT then went about better understanding what I like (see the full video walkthrough below).
It also helped visualize various findings:
Analyzing Over 32k Movie and TV Titles!
Next, it was time to provide ChatGPT with a much larger database of movies and TV shows. For this I started with over 2.5 million titles from IMDB and filtered out everything that had fewer than 1,000 user ratings. With various other tweaks and clean-ups, I finally had a CSV file with about 32,770 titles.
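For anyone who wants to reproduce the filtering step outside of ChatGPT, here is a minimal sketch using only Python's standard library. The column names (`title`, `num_votes`) and the sample rows are made up for illustration; the actual IMDB export uses its own headers, and the other tweaks and clean-ups I did are not shown.

```python
import csv
import io

MIN_VOTES = 1000  # threshold described in the post


def filter_titles(rows, min_votes=MIN_VOTES):
    """Keep only rows whose vote count is at least min_votes."""
    kept = []
    for row in rows:
        try:
            votes = int(row["num_votes"])  # hypothetical column name
        except (KeyError, ValueError):
            continue  # skip rows with missing or non-numeric vote counts
        if votes >= min_votes:
            kept.append(row)
    return kept


# Tiny in-memory stand-in for the real CSV file (invented rows).
sample_csv = """title,num_votes
Popular Film,250000
Obscure Short,12
Borderline Show,1000
Broken Row,N/A
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
filtered = filter_titles(rows)
print([r["title"] for r in filtered])
```

With a real file you would swap `io.StringIO(sample_csv)` for `open("your_file.csv", newline="", encoding="utf-8")`.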
I then had ChatGPT mark the titles I had already seen (using data from the first file) and create our very own custom recommendation score, half based on my unique personal preferences and half based on IMDB ratings and number of user votes.
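To make the idea concrete, here is a rough sketch of what a 50/50 score like this could look like in Python. This is not the exact formula ChatGPT came up with. It assumes the personal half is just my average rating per genre, and the IMDB half is an equal blend of the rating and a log-scaled vote count. All field names and sample titles are hypothetical.

```python
import math


def preference_by_genre(rated):
    """Average my own rating (0-10) per genre across titles I have rated."""
    totals = {}
    for item in rated:
        for genre in item["genres"]:
            total, count = totals.get(genre, (0.0, 0))
            totals[genre] = (total + item["my_rating"], count + 1)
    return {genre: total / count for genre, (total, count) in totals.items()}


def personal_score(title, genre_pref, default=5.0):
    """Mean genre preference for a title, scaled to 0-1 (assumed approach)."""
    prefs = [genre_pref.get(g, default) for g in title["genres"]]
    if not prefs:
        return default / 10
    return sum(prefs) / len(prefs) / 10


def imdb_score(title, max_votes):
    """Blend IMDB rating and log-scaled votes, each on 0-1 (assumed weights)."""
    rating_part = title["imdb_rating"] / 10
    votes_part = math.log10(1 + title["num_votes"]) / math.log10(1 + max_votes)
    return 0.5 * rating_part + 0.5 * votes_part


def score_catalog(catalog, rated):
    """Flag seen titles and attach a custom score: half personal, half IMDB."""
    seen = {item["title"] for item in rated}
    genre_pref = preference_by_genre(rated)
    max_votes = max(t["num_votes"] for t in catalog)
    scored = []
    for t in catalog:
        custom = 0.5 * personal_score(t, genre_pref) + 0.5 * imdb_score(t, max_votes)
        scored.append({**t, "seen": t["title"] in seen, "custom_score": custom})
    return scored


# Invented sample data for illustration only.
rated = [
    {"title": "A", "genres": ["Sci-Fi"], "my_rating": 9},
    {"title": "B", "genres": ["Romance"], "my_rating": 4},
]
catalog = [
    {"title": "A", "genres": ["Sci-Fi"], "imdb_rating": 8.0, "num_votes": 100000},
    {"title": "C", "genres": ["Sci-Fi"], "imdb_rating": 8.0, "num_votes": 100000},
    {"title": "D", "genres": ["Romance"], "imdb_rating": 8.0, "num_votes": 100000},
]

scored = score_catalog(catalog, rated)
# Recommendations: unseen titles, highest custom score first.
recommendations = sorted(
    (t for t in scored if not t["seen"]),
    key=lambda t: t["custom_score"],
    reverse=True,
)
print([t["title"] for t in recommendations])
```

Matching seen titles by name is the simplest option here; in practice a unique title ID is safer, since different movies can share a name.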
Success…My Very Own Recommendations Database
This worked quite well…and I finally had my very own database of over 32k titles that I could sort through to find other hidden gems for future viewing. The great thing is that you can either chat with GPT using natural language to find answers to your questions…or just play with the resulting CSV file in Excel or Apple Numbers.
The recommendations above are pretty solid…of course these can get even better if I provide more info on which titles I’ve seen (e.g. there are a few above that I’ve actually seen already but not yet rated) and better classifiers for language, sentiment, plot lines, etc. I also had the system run some exploratory analysis and visualizations…
Here’s what a portion of the full dataset looks like, complete with our custom scoring included:
Tips, If You Try This
This took a bit of experimentation to get just right…here’s what will give you the best results:
- Quantity of Data – The bigger your training dataset, the better…this will help ChatGPT better understand your preferences.
- Quality of Data – As with most things, the output you get will be subject to the ‘Garbage In, Garbage Out’ principle, so ensuring the data you feed in is clean and well labelled is critical.
- Promptcraft – Knowing what to ask and how to ask is half the battle…this is where some iterative testing and dialogue will come in handy to ensure you get as close as possible to what you want.
Remember that you can also get very different results out of the system based on the assumptions and weights you feed in, e.g. what you tell the model to pay more attention to. Is it IMDB votes & ratings? Is it your own leaning to specific genres, directors and actors? See example below…
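Here is a quick toy illustration of how much the weights matter. The two titles and their scores are invented, and `personal_weight` is just a name I picked for the share of the final score given to my own preferences.

```python
def blended_score(personal, popularity, personal_weight=0.5):
    """Weighted mix of a personal score and a popularity score (both 0-1)."""
    if not 0.0 <= personal_weight <= 1.0:
        raise ValueError("personal_weight must be between 0 and 1")
    return personal_weight * personal + (1 - personal_weight) * popularity


def rank(titles, personal_weight):
    """Return title names ordered from highest to lowest blended score."""
    ordered = sorted(
        titles,
        key=lambda t: blended_score(t["personal"], t["popularity"], personal_weight),
        reverse=True,
    )
    return [t["title"] for t in ordered]


# Invented scores for illustration only.
titles = [
    {"title": "Crowd Favorite", "personal": 0.40, "popularity": 0.95},
    {"title": "Niche Gem", "personal": 0.90, "popularity": 0.60},
]

print(rank(titles, personal_weight=0.2))  # leans on IMDB votes & ratings
print(rank(titles, personal_weight=0.8))  # leans on my own preferences
```

The same two titles swap places depending on which side you tell the model to pay more attention to.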
Full Video Walkthrough
Here’s a blow-by-blow overview of the process, if you’re interested:
Hope you enjoyed this latest experiment in analytics and personalization…I think this approach can be applied to quite a few things. What do you think? Leave a comment or DM me if you have thoughts or try something similar.