Enhancing Cyberbullying Detection: A Multi-Algorithmic Approach

Code for our peer-reviewed paper published at IEEE ADICS 2024. 📄 Paper: https://ieeexplore.ieee.org/document/10533585

Overview

A comparative study of machine-learning and deep-learning methods for detecting text-based cyberbullying on social media. We benchmark classical ML pipelines against a deep-learning embedding model to find the most accurate, efficient detector for multi-class harassment text.

Dataset

~47,000 labeled tweets across 6 classes: religion, age, gender, ethnicity, other-cyberbullying, and not-cyberbullying. (Public Twitter cyberbullying dataset — Source: https://www.kaggle.com/datasets/andrewmvd/cyberbullying-classification.)
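A minimal sketch of loading the data and checking the class balance with pandas. The column names `tweet_text` and `cyberbullying_type` are assumptions about the CSV header, and `load_tweets` and `class_counts` are hypothetical helper names; verify them against the file you download.

```python
import io

import pandas as pd

# Assumed column names; check them against the header of the downloaded CSV.
TEXT_COL = "tweet_text"
LABEL_COL = "cyberbullying_type"


def load_tweets(source):
    """Read the labeled tweets and drop rows with a missing text or label."""
    df = pd.read_csv(source)
    return df.dropna(subset=[TEXT_COL, LABEL_COL]).reset_index(drop=True)


def class_counts(df):
    """Number of tweets per class, useful for spotting class imbalance."""
    return df[LABEL_COL].value_counts().to_dict()


# Tiny in-memory stand-in so the sketch runs without the real file.
sample_csv = io.StringIO(
    "tweet_text,cyberbullying_type\n"
    "placeholder tweet one,age\n"
    "placeholder tweet two,religion\n"
    "placeholder tweet three,age\n"
)
sample_df = load_tweets(sample_csv)
print(class_counts(sample_df))
```

With the real dataset, pass the path of the downloaded CSV to `load_tweets` instead of the in-memory sample.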

Methods

TF-IDF + Support Vector Classifier (SVC) — pipeline method
TF-IDF + Multinomial Naive Bayes (MNB) — pipeline method
GloVe word embeddings + deep-learning model (separate approach)
Metrics: accuracy, precision, recall, F1, specificity, training time
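The two pipeline methods above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the notebook's exact code: hyperparameters are library defaults, `build_pipelines` is a hypothetical helper, and the toy texts are placeholders standing in for real tweets.

```python
import time

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


def build_pipelines():
    """Return the two classical pipelines keyed by a short name.

    Hyperparameters are library defaults, not the paper's settings.
    """
    return {
        "svc": Pipeline([("tfidf", TfidfVectorizer()), ("clf", SVC())]),
        "mnb": Pipeline([("tfidf", TfidfVectorizer()), ("clf", MultinomialNB())]),
    }


# Placeholder data (hypothetical, for illustration only).
texts = [
    "faith church prayer", "prayer temple faith",
    "school grade kid", "kid class school",
    "weather sunny today", "today lunch weather",
]
labels = [
    "religion", "religion",
    "age", "age",
    "not_cyberbullying", "not_cyberbullying",
]

results = {}
for name, pipe in build_pipelines().items():
    start = time.perf_counter()          # wall-clock training time
    pipe.fit(texts, labels)
    elapsed = time.perf_counter() - start
    results[name] = (pipe.predict(texts), elapsed)
    print(f"{name}: trained in {elapsed:.4f} s")
```

Wrapping the vectorizer and classifier in one `Pipeline` means the TF-IDF vocabulary is fitted only on the training split, which avoids leaking test-set terms into the features.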

Results

Approach	Accuracy	Precision	Recall	F1	Specificity
SVC + TF-IDF	79.8%	0.826	0.798	0.796	99.45%
MNB + TF-IDF	70.9%	0.841	0.709	0.746	93.56%
GloVe (DL)	73.3%	0.755	0.733	0.742	98.37%

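Finding: the SVC + TF-IDF pipeline outperformed the GloVe deep-learning model on all five reported metrics, most notably accuracy (79.8% vs 73.3%) and specificity (99.45% vs 98.37%), at a small training-time cost (7.8 s vs 5.77 s). MNB + TF-IDF had the highest precision (0.841) but the lowest accuracy.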
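Specificity is not part of scikit-learn's standard classification report, so it has to be derived from the confusion matrix. The sketch below computes one-vs-rest specificity, TN / (TN + FP), per class. How the per-class values were averaged for the table above is not stated here, so the macro average shown is an assumption, and `per_class_specificity` is a hypothetical helper name.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def per_class_specificity(y_true, y_pred, labels):
    """One-vs-rest specificity TN / (TN + FP) for each class in `labels`."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)  # rows: true, cols: predicted
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp   # predicted as the class, but actually another class
    fn = cm.sum(axis=1) - tp   # actually the class, but predicted as another class
    tn = cm.sum() - tp - fp - fn
    return tn / (tn + fp)


# Placeholder labels for illustration only.
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "c"]
spec = per_class_specificity(y_true, y_pred, labels=["a", "b", "c"])
macro_specificity = spec.mean()  # assumption: unweighted mean over classes
print(spec, macro_specificity)
```

In the toy example, only class `b` receives a false positive (one of its three negatives), so its specificity is 2/3 while the other two classes score 1.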

Repository

SVM-NaiveBayes.ipynb — TF-IDF + SVC / MNB pipelines
Glove.ipynb — GloVe embedding deep-learning model
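For orientation, a common way to feed GloVe vectors into a deep-learning model is to build an embedding matrix indexed by the tokenizer's word ids. The sketch below shows that step with NumPy only. It assumes GloVe's plain-text format (a token followed by space-separated floats on each line), and `parse_glove`, `build_embedding_matrix`, and the toy `word_index` are hypothetical names, not taken from the notebook.

```python
import numpy as np


def parse_glove(lines):
    """Parse GloVe's text format: one token per line followed by its vector."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector of the word with id i; unknown words stay zero."""
    # Row 0 is left as zeros, assuming id 0 is reserved for padding.
    matrix = np.zeros((len(word_index) + 1, dim), dtype="float32")
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None:
            matrix[idx] = vec
    return matrix


# Toy 3-dimensional vectors standing in for a real GloVe file.
toy_lines = ["hello 0.1 0.2 0.3", "world 0.4 0.5 0.6"]
word_index = {"hello": 1, "world": 2, "unseen": 3}
embedding_matrix = build_embedding_matrix(word_index, parse_glove(toy_lines), dim=3)
print(embedding_matrix.shape)
```

With a real GloVe file, pass the open file handle to `parse_glove` and set `dim` to the vector size of that file. The resulting matrix can then be used as the initial weights of an embedding layer in the framework of your choice.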

How to run

Install the dependencies: pip install scikit-learn pandas numpy tensorflow. The DL notebook also needs pre-trained GloVe vectors.
Download the dataset (link above) and place it in the repo root.
Open the notebooks in Jupyter and run each one top to bottom.
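Before opening the notebooks, a quick check like the one below can confirm that the packages are importable in the active environment. It is only a convenience sketch; `missing_modules` is a hypothetical helper, and note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names, which differ from pip names in one case:
# the scikit-learn package is imported as sklearn.
REQUIRED = ["sklearn", "pandas", "numpy", "tensorflow"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing:", missing_modules(REQUIRED) or "none")
```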

Citation

N. P. S. Pendela et al., "Enhancing Cyberbullying Detection: A Multi-Algorithmic Approach," 2024 IEEE ADICS. DOI: 10.1109/ADICS58448.2024.10533585

Authors

Naga Prem Sai Pendela (first author) and team — Vardhaman College of Engineering.
