US-Airline-Sentiment-Analysis

This project implements a data analytics pipeline that compares real-time streaming and offline batch processing using the Twitter US Airline Sentiment dataset. Leveraging tools such as Apache Kafka, Apache Spark, and PostgreSQL, I captured and processed tweet data to analyze sentiment distribution, hashtag frequency, and airline-specific sentiment trends.

This was done as a mini project for the course Database Technologies (UE22CS343BB3).

The streaming component ingested tweet data through Kafka and processed it using Spark Structured Streaming. In parallel, the batch mode processed the same dataset offline using Spark and Pandas for comparison. Both modes stored their outputs in PostgreSQL, enabling visual and statistical comparisons.
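As a rough illustration of the kind of aggregation both modes perform, here is a minimal standard-library sketch that counts sentiment labels and hashtags. It is not the code in `batch_processor.py`. The field names `airline_sentiment` and `text` are assumed to match the dataset's CSV columns, and the sample rows are made up for the example.

```python
import re
from collections import Counter

# Matches a hashtag and captures the word after the '#'.
HASHTAG_RE = re.compile(r"#(\w+)")

def summarize(tweets):
    """Return (sentiment_counts, hashtag_counts) for an iterable of tweet dicts."""
    sentiment_counts = Counter()
    hashtag_counts = Counter()
    for tweet in tweets:
        sentiment_counts[tweet["airline_sentiment"]] += 1
        # Lowercase so "#Delayed" and "#delayed" count as one hashtag.
        hashtag_counts.update(
            tag.lower() for tag in HASHTAG_RE.findall(tweet["text"])
        )
    return sentiment_counts, hashtag_counts

# Made-up rows for illustration only (not taken from the dataset).
sample = [
    {"airline": "United", "airline_sentiment": "negative",
     "text": "Two hours late again #delayed #fail"},
    {"airline": "Delta", "airline_sentiment": "positive",
     "text": "Great crew today #thanks"},
    {"airline": "United", "airline_sentiment": "negative",
     "text": "Still waiting #Delayed"},
]

sentiments, hashtags = summarize(sample)
print(sentiments.most_common())
print(hashtags.most_common(3))
```

The same counts could be produced with a Spark or Pandas group-by. Plain `Counter` objects are used here only so the sketch runs without extra dependencies.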

System Requirements

Python 3.8+
Apache Kafka and Zookeeper
Apache Spark
PostgreSQL installed and running
Java 8+
Python libraries (pandas, pyspark.sql, kafka, psycopg2, tabulate, matplotlib); the time and json modules ship with Python

General Steps to Run

Start Zookeeper and Kafka
Create Kafka topic
Start PostgreSQL
Create the database
Load Tweets into Kafka
Perform Real-Time Analysis (Spark Streaming)
Perform Batch Analysis
Comparison Script (Optional)
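The "Load Tweets into Kafka" step can be sketched roughly as follows. This is a hypothetical outline, not the contents of `producer.py`: the function name `publish_rows`, the selected columns, and the sample CSV rows are all assumptions. The `send` argument is any callable that accepts bytes, so the sketch runs without a broker.

```python
import csv
import io
import json

def publish_rows(csv_file, send,
                 fields=("airline", "airline_sentiment", "text")):
    """Read tweets from an open CSV file and pass each to send() as JSON bytes."""
    count = 0
    for row in csv.DictReader(csv_file):
        # Keep only the columns the downstream analysis needs.
        message = {name: row.get(name, "") for name in fields}
        send(json.dumps(message).encode("utf-8"))
        count += 1
    return count

# A plain list stands in for the Kafka producer in this demo.
sent = []
demo_csv = io.StringIO(
    "airline,airline_sentiment,text\n"
    "United,negative,Flight delayed again #delayed\n"
    "Delta,positive,Smooth trip #thanks\n"
)
total = publish_rows(demo_csv, sent.append)
print(total, json.loads(sent[0])["airline"])
```

With the kafka-python package, `send` could instead be something like `lambda payload: producer.send("tweets", payload)`, where `producer = KafkaProducer(bootstrap_servers="localhost:9092")`. The topic name and broker address here are placeholders, so use whatever you created in the earlier steps.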

Note: The project was developed and tested on WSL2 (Ubuntu), but the steps are not platform-specific. Adapt the database credentials and file paths to your environment.

Feel free to fork, extend, and use this for streaming & data engineering demos!

