Actor Network Visualization
Find the shortest path between Govinda and Angelina Jolie in IMDb data with Python, using networkx or scikit-network.
- Notebook: How this video was created
- The data used to visualize the network
- The shortest path between actors
- IMDb data
- Codebase
You’ll learn how to analyze and visualize the social network of actors using IMDb data, covering:
- Data Acquisition and Preparation: Learn how to download and work with IMDb’s non-commercial datasets. This includes using Python to read and process large TSV files.
- Data Filtering and Wrangling: Understand how to clean and filter large datasets to focus on relevant information, such as specific actor categories (actors/actresses) and movie titles, to make the data more manageable for analysis.
- Actor Pairing and Network Creation: Discover how to identify significant collaborations between actors by setting thresholds for the minimum number of films and co-appearances. You will learn to create a network of actors based on these pairings.
- Handling Data Inconsistencies: Learn techniques to manage real-world data issues, such as variations in actor names, to ensure the accuracy of your network.
- Network Analysis with Python: Get introduced to powerful Python libraries for network analysis like networkx and scikit-network to build, manipulate, and study the structure of complex networks.
- Finding the Shortest Path: Learn how to apply graph algorithms to find the shortest path between two actors in the network, demonstrating the “degrees of separation” concept.
- Jupyter Notebook for Data Storytelling: See how to use Jupyter Notebooks to document and present a data analysis project, combining code, text, and visualizations to tell a compelling story.
- Exporting Results: Learn how to export your findings, such as the actor pairings, into formats like Excel for further use or presentation.
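The filtering, pairing, network-building and shortest-path steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the notebook's actual code: the rows are invented toy data laid out like IMDb's title.principals.tsv (columns tconst, nconst, category, with `\N` marking missing values), the IDs such as `nmA` are hypothetical, and the helper name `build_actor_graph` and both threshold values are chosen only for this example.

```python
import io
from collections import Counter
from itertools import combinations

import networkx as nx
import pandas as pd

# Toy stand-in for title.principals.tsv. All rows are invented for illustration.
TOY_TSV = "\n".join([
    "tconst\tnconst\tcategory",
    "tt1\tnmA\tactor",
    "tt1\tnmB\tactress",
    "tt1\tnmD\tdirector",
    "tt2\tnmA\tactor",
    "tt2\tnmB\tactress",
    "tt3\tnmB\tactress",
    "tt3\tnmC\tactor",
    "tt4\tnmB\tactress",
    "tt4\tnmC\tactor",
    "tt5\tnmC\tactor",
    "tt5\tnmE\tactor",
])


def build_actor_graph(principals, min_films=2, min_shared_films=2):
    """Build an undirected co-appearance graph (hypothetical helper)."""
    # Keep only acting credits.
    cast = principals[principals["category"].isin(["actor", "actress"])]

    # Drop people who appear in fewer than min_films distinct titles.
    film_counts = cast.groupby("nconst")["tconst"].nunique()
    cast = cast[cast["nconst"].map(film_counts) >= min_films]

    # Count how many titles each pair of people shares.
    pair_counts = Counter()
    for _, names in cast.groupby("tconst")["nconst"]:
        pair_counts.update(combinations(sorted(set(names)), 2))

    # Add an edge only for pairs that co-appear often enough.
    graph = nx.Graph()
    for (a, b), shared in pair_counts.items():
        if shared >= min_shared_films:
            graph.add_edge(a, b, films=shared)
    return graph


# IMDb's TSV files are tab-separated and use \N for missing values.
principals = pd.read_csv(io.StringIO(TOY_TSV), sep="\t", na_values="\\N")
graph = build_actor_graph(principals)

# Fewest-hops path between two people in the network.
path = nx.shortest_path(graph, source="nmA", target="nmC")
print(path)
```

On real data, `nx.shortest_path` raises `nx.NetworkXNoPath` when the two actors sit in disconnected parts of the network and `nx.NodeNotFound` when one of them was filtered out, so lowering the thresholds is one way to connect more actors at the cost of a larger graph.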
