🎬 Netflix Recommendation System

Machine Learning Based Content Recommendation Engine

Developed as a University Project for Vivekananda Global University (VGU)



📖 Project Overview

Netflix hosts thousands of movies and TV shows, which makes it hard for users to find content that matches their interests. This project implements a content-based recommendation system that suggests similar movies and TV shows based on their metadata.

The recommendation engine analyzes information such as:

  • 🎭 Genre
  • 📝 Description
  • 🎬 Director
  • 🎤 Cast Members
  • 📅 Release Year

Using Natural Language Processing (NLP) and Machine Learning, the system identifies content similarities and recommends the most relevant titles.


🎯 Objectives

  • Build an intelligent recommendation system
  • Apply Natural Language Processing techniques
  • Implement TF-IDF Vectorization
  • Calculate similarity using Cosine Similarity
  • Develop an interactive Streamlit application
  • Improve content discovery for users

🏗️ System Architecture

Netflix Dataset
       │
       ▼
Data Preprocessing
       │
       ▼
Feature Engineering
       │
       ▼
TF-IDF Vectorization
       │
       ▼
Cosine Similarity Matrix
       │
       ▼
Recommendation Engine
       │
       ▼
Streamlit Web Application

🚀 Features

  • 🎬 Content-Based Recommendation System
  • 🤖 Machine Learning Powered Recommendations
  • ⚡ Fast Similarity Search
  • 🎨 Interactive Streamlit Interface
  • 📊 Clean and Modular Project Structure
  • 🔍 Recommendation Based on Movie Metadata

🛠️ Technologies Used

| Technology | Purpose |
|---|---|
| Python | Programming Language |
| Pandas | Data Processing |
| NumPy | Numerical Operations |
| Scikit-Learn | Machine Learning |
| Streamlit | Web Application |
| TF-IDF | Text Vectorization |
| Cosine Similarity | Recommendation Engine |
| Pickle | Model Serialization |

📊 Dataset

The project uses the Netflix Titles Dataset containing metadata about Netflix movies and TV shows.

Features Used

  • Title
  • Genre
  • Description
  • Director
  • Cast
  • Release Year
  • Rating

⚙️ Methodology

1. Data Preprocessing

  • Handle missing values
  • Clean textual data
  • Prepare dataset for analysis
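
As an illustration of these steps, here is a minimal sketch using pandas. The column names (`listed_in`, `description`, `director`, `cast`, `title`) and the `preprocess` helper are assumptions made for the example; the actual `src/preprocessing.py` may be organized differently.

```python
import pandas as pd

# Assumed column names; adjust them to match the CSV actually in data/.
TEXT_COLUMNS = ["listed_in", "description", "director", "cast"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing text fields, normalise case and whitespace, drop bad rows."""
    out = df.copy()
    for col in TEXT_COLUMNS:
        out[col] = (
            out[col]
            .fillna("")                              # missing value -> empty string
            .astype(str)
            .str.lower()
            .str.replace(r"\s+", " ", regex=True)    # collapse repeated whitespace
            .str.strip()
        )
    # A title is required to look a row up later, so rows without one are dropped,
    # and only the first row is kept for each duplicated title.
    out = out.dropna(subset=["title"]).drop_duplicates(subset="title")
    return out.reset_index(drop=True)


# Tiny made-up sample standing in for the real dataset.
sample = pd.DataFrame({
    "title": ["Show A", "Show B", None],
    "listed_in": ["Dramas,  Thrillers", None, "Comedies"],
    "description": ["  A tense  heist. ", "A quiet town.", "Jokes."],
    "director": [None, "Jane Doe", "X"],
    "cast": ["Actor One, Actor Two", None, "Y"],
})
clean = preprocess(sample)
print(clean)
```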

2. Feature Engineering

Genres, descriptions, directors, and cast members are combined into a single text field per title, so that one vectorizer can process all of them together.
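
One way to build that combined field is sketched below. The `build_combined_text` helper and the column names are hypothetical and only illustrate the idea; they are not taken from `src/feature_engineering.py`.

```python
import pandas as pd

# Assumed column names for the metadata fields being merged.
FEATURE_COLUMNS = ["listed_in", "description", "director", "cast"]


def build_combined_text(df: pd.DataFrame, columns=FEATURE_COLUMNS) -> pd.Series:
    """Join the selected columns into one whitespace-separated string per row."""
    parts = df[list(columns)].fillna("").astype(str)
    combined = parts.agg(" ".join, axis=1)
    # Empty fields leave double spaces behind, so collapse them.
    return combined.str.replace(r"\s+", " ", regex=True).str.strip()


# Made-up rows for demonstration only.
titles = pd.DataFrame({
    "listed_in": ["dramas", "comedies"],
    "description": ["a tense heist", "a quiet town"],
    "director": [None, "jane doe"],
    "cast": ["actor one", None],
})
titles["combined"] = build_combined_text(titles)
print(titles["combined"].tolist())
```

A possible refinement, not confirmed by the source, is to remove spaces inside person names (for example `janedoe`) so that a shared first name alone does not count as a match.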

3. TF-IDF Vectorization

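The combined text is converted into numerical vectors using a TF-IDF vectorizer, which weights each term by how often it appears in a title's text and how rare it is across the whole catalog.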

4. Similarity Calculation

Cosine similarity measures how close two titles' TF-IDF vectors are: a value near 1 indicates heavily overlapping vocabulary, and 0 indicates no shared terms.
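
To make the measure concrete, the sketch below computes cosine similarity by hand as the dot product of two vectors divided by the product of their lengths. The three-dimensional vectors are made up for illustration; real TF-IDF vectors have one dimension per vocabulary term.

```python
import math


def cosine_similarity(a, b):
    """Return cos(angle) between two equal-length vectors, or 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors: the first two share weight on the same terms, the third shares none.
heist_a = [0.6, 0.8, 0.0]
heist_b = [0.8, 0.6, 0.0]
comedy = [0.0, 0.0, 1.0]

print(cosine_similarity(heist_a, heist_b))  # close to 0.96
print(cosine_similarity(heist_a, comedy))   # 0.0, no shared terms
```

For a whole TF-IDF matrix, `sklearn.metrics.pairwise.cosine_similarity` computes the same quantity for every pair of rows at once.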

5. Recommendation Generation

Given a selected title, the system ranks the other titles by similarity score and returns the most similar movies or TV shows.
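
The lookup can be sketched as follows, assuming a precomputed square similarity matrix whose rows and columns follow the order of a title list. The `recommend` function, its parameters, and the sample data are hypothetical and do not reproduce `src/recommender.py`.

```python
import numpy as np


def recommend(title, titles, similarity, top_n=5):
    """Return up to top_n (title, score) pairs most similar to `title`."""
    lookup = {t.lower(): i for i, t in enumerate(titles)}
    idx = lookup.get(title.lower())
    if idx is None:
        return []                               # unknown title: nothing to recommend
    scores = similarity[idx]
    order = np.argsort(scores)[::-1]            # indices from highest to lowest score
    ranked = [(titles[i], float(scores[i])) for i in order if i != idx]
    return ranked[:top_n]


# Made-up titles and similarity scores for demonstration.
titles = ["Heist Night", "City Chase", "Road Trip", "Quiet Town"]
similarity = np.array([
    [1.0, 0.8, 0.1, 0.3],
    [0.8, 1.0, 0.2, 0.4],
    [0.1, 0.2, 1.0, 0.5],
    [0.3, 0.4, 0.5, 1.0],
])

result = recommend("heist night", titles, similarity, top_n=2)
print(result)
```

The selected title is excluded explicitly because its similarity to itself is always the highest score in its row.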


📂 Project Structure

Netflix-Recommendation-System/
│
├── data/
│   └── netflix_titles.csv
│
├── notebooks/
│   └── EDA.ipynb
│
├── src/
│   ├── preprocessing.py
│   ├── feature_engineering.py
│   ├── vectorizer.py
│   ├── recommender.py
│   └── utils.py
│
├── models/
│   ├── similarity.pkl
│   └── tfidf.pkl
│
├── app/
│   ├── app.py
│   ├── recommendation.py
│   └── poster_fetcher.py
│
├── model.py
├── requirements.txt
├── README.md
└── .gitignore

🔧 Installation

Clone the Repository

git clone https://github.com/mukeshsharma99/Netflix-Recommendation-System.git

Navigate to Project Directory

cd Netflix-Recommendation-System

Create Virtual Environment

python -m venv venv

Activate Virtual Environment

Windows

venv\Scripts\activate

Linux/macOS

source venv/bin/activate

Install Dependencies

pip install -r requirements.txt

▶️ Run the Project

Build Recommendation Model

python model.py
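
The contents of `model.py` are not shown in this README. Since the project structure lists `.pkl` files under `models/`, the build step presumably serializes its artifacts with pickle; a minimal sketch of such a save and load round trip, with hypothetical helper names, might look like this:

```python
import pickle
import tempfile
from pathlib import Path


def save_artifact(obj, path):
    """Pickle `obj` to `path`, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_artifact(path):
    """Load a previously pickled object from `path`."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Demonstration in a temporary folder; a nested list stands in for the real matrix.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "models" / "similarity.pkl"
    save_artifact([[1.0, 0.8], [0.8, 1.0]], target)
    restored = load_artifact(target)

print(restored)
```

Pickle files should only be loaded when they come from a trusted source, because unpickling can execute arbitrary code.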

Launch Streamlit Application

streamlit run app/app.py

📈 Future Enhancements

  • Movie Poster Integration using TMDB API
  • Personalized Recommendations
  • Trending Content Section
  • User Authentication
  • AWS Cloud Deployment
  • Advanced Filtering Options

🎓 Learning Outcomes

This project provided practical experience with:

  • Data Preprocessing
  • Feature Engineering
  • Natural Language Processing (NLP)
  • TF-IDF Vectorization
  • Cosine Similarity
  • Recommendation Systems
  • Streamlit Development
  • End-to-End Machine Learning Projects

👨‍💻 Developer

Mukesh Kumar

B.Tech – Computer Science & Engineering

Vivekananda Global University (VGU)

GitHub: https://github.com/mukeshsharma99


⭐ Support

If you found this project useful, please consider giving it a Star ⭐ on GitHub.


Made with ❤️ using Python, Machine Learning, and Streamlit