🎵 N-Gram Melody Generator 🎹

A Python-based AI music generation model using N-Grams, Lidstone Smoothing, and Perplexity evaluation.

The N-Gram Melody Generator adapts statistical language modeling to music generation. Trained on a dataset of musical notes (music.csv), the model learns note-transition probabilities and generates new note sequences that continue a set of initial starting notes (seeds).

✨ Key Features

Multiple N-Gram Architectures: Supports Uni-gram (baseline note frequencies), Bi-gram (1st-order Markov chain), and Tri-gram (2nd-order Markov chain) models.
Lidstone Smoothing: Implements an $\alpha$ smoothing parameter to handle zero-probability edge cases for unseen note sequences.
Modifier Handling: Option to include or ignore musical modifiers (like flats b and half-flats/korons k) to test how vocabulary size affects model performance.
Perplexity Evaluation: Automatically calculates the sequence Perplexity to measure how well the model predicts the generated melody (lower is better).
Grid Search Optimization: Includes a module to iteratively search for the optimal $\alpha$ parameter that minimizes validation perplexity.
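
As an illustration of the smoothing feature listed above, here is a minimal sketch of a Lidstone-smoothed N-gram estimate. The function names `train_ngram` and `lidstone_prob` and the toy note list are assumptions made for this example; they are not taken from the project's code.

```python
from collections import Counter

def train_ngram(notes, n):
    """Count n-grams and their (n-1)-note contexts in a list of note names."""
    positions = range(len(notes) - n + 1)
    ngrams = Counter(tuple(notes[i:i + n]) for i in positions)
    contexts = Counter(tuple(notes[i:i + n - 1]) for i in positions)
    return ngrams, contexts

def lidstone_prob(note, context, ngrams, contexts, vocab_size, alpha):
    """(count(context, note) + alpha) / (count(context) + alpha * vocab_size)."""
    context = tuple(context)
    return (ngrams[context + (note,)] + alpha) / (contexts[context] + alpha * vocab_size)

# Toy data for illustration only, not taken from music.csv.
notes = ["C", "D", "C", "D", "E"]
vocab = sorted(set(notes))
ngrams, contexts = train_ngram(notes, n=2)

for note in vocab:
    p = lidstone_prob(note, ["C"], ngrams, contexts, len(vocab), alpha=0.5)
    print(f"P({note} | C) = {p:.3f}")
```

In this toy example the transitions C to C and C to E never occur, yet each still receives 0.5 / 3.5 ≈ 0.143 instead of zero, and the three probabilities sum to 1.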

🛠️ Installation & Usage

You only need Python 3 and the pandas library to run this project.

Install dependencies:

pip install pandas

Run the model: Ensure your dataset is named music.csv (containing a column named note) and is located in the same directory.

python ngram_melody_generator.py

📊 Performance Summary & Results

Based on the model's evaluation, we observed how the complexity of the model (N), the smoothing parameter ($\alpha$), and vocabulary reduction (ignoring modifiers) affect the output quality.

🏆 Top Configurations (Lowest Perplexity)

Here is a structured breakdown of the best-performing models from the experiments. Lower perplexity indicates higher predictability and musical coherence.

| Rank | N-Gram Type | $\alpha$ Value | Ignore Modifiers | Best Seed Tested | Lowest Perplexity |
| --- | --- | --- | --- | --- | --- |
| 🥇 1st | Tri-gram | 0.5 | True | `[ A Bb C D ]` | 4.55 |
| 🥈 2nd | Tri-gram | 0.5 | True | `[ C B C D ]` | 4.83 |
| 🥉 3rd | Bi-gram | 0.5 | True | `[ C B C D ]` | 5.11 |
| 🏅 4th | Tri-gram | 0.5 | False | `[ A Bb C D ]` | 6.00 |

💡 Key Insight: The Tri-gram model combined with $\alpha = 0.5$ and ignore_modifiers=True yields the lowest perplexity (4.55). Reducing the vocabulary size (removing flats/korons) lowers perplexity: for the same seed and $\alpha$, the Tri-gram model drops from 6.00 to 4.55.
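
To make the perplexity figures easier to interpret, the sketch below computes perplexity from the probabilities a model assigned to each predicted note. The function name `perplexity` is hypothetical, and the use of the natural logarithm and of a plain average over all predicted notes are assumptions; the project may handle seed notes or the log base differently.

```python
import math

def perplexity(probs):
    """Exponential of the average negative log-probability per predicted note."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))

# A model that always spreads its belief evenly over 8 notes
# has a perplexity of 8 (up to floating-point rounding).
uniform = perplexity([1 / 8] * 10)
print(round(uniform, 6))
```

Read this way, a perplexity of 4.55 means the model is, on average, about as uncertain as if it were choosing uniformly among roughly 4.55 notes at each step.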

🔬 Detailed Experiment Logs

Below are the structured logs from the two main experiments conducted by the generator, including the generated melodies.

Experiment 1: Standard Evaluation (Alpha 1.0 vs 0.5)

Testing basic Uni, Bi, and Tri-gram models across predefined seeds.

Output 1:

==================================================
 Running with ignore_modifiers = False
==================================================

--- Model: Uni-gram | Alpha: 1.0 ---
Seed: [  C B C D   ] -> Gen: C B C D C S B G A G G C G F | Perplexity: 8.77

--- Model: Bi-gram | Alpha: 0.5 ---
Seed: [  A Bb C D  ] -> Gen: A Bb C D C A A C C D E E D C | Perplexity: 6.60

--- Model: Tri-gram | Alpha: 0.5 ---
Seed: [  A Bb C D  ] -> Gen: A Bb C D C C C G C C G G G D | Perplexity: 6.00

==================================================
 Running with ignore_modifiers = True
==================================================

--- Model: Bi-gram | Alpha: 0.5 ---
Seed: [  C B C D   ] -> Gen: C B C D D D E F F E S B A A | Perplexity: 5.11

--- Model: Tri-gram | Alpha: 0.5 ---
Seed: [  C B C D   ] -> Gen: C B C D D C B D E A A B C D | Perplexity: 6.34
Seed: [  A Bb C D  ] -> Gen: A B C D C B C D F E D E B D | Perplexity: 4.55
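
The "Seed -> Gen" lines above can be read as a seed being extended one note at a time. The following is a minimal sketch of that idea, assuming each next note is sampled in proportion to its Lidstone-smoothed count; the function name `generate`, its parameters, and the toy training data are hypothetical, and the project may pick notes differently (for example, by always taking the most probable one).

```python
import random
from collections import Counter

def generate(seed, notes, n=2, alpha=0.5, length=14, rng=None):
    """Extend `seed` to `length` notes by sampling from smoothed n-gram counts."""
    rng = rng or random.Random()
    vocab = sorted(set(notes))
    ngrams = Counter(tuple(notes[i:i + n]) for i in range(len(notes) - n + 1))
    melody = list(seed)  # assumes the seed holds at least n - 1 notes
    while len(melody) < length:
        # The context is the last n - 1 notes (empty for a uni-gram model).
        context = tuple(melody[len(melody) - (n - 1):])
        # The smoothed numerator is enough: the denominator is shared by all notes.
        weights = [ngrams[context + (v,)] + alpha for v in vocab]
        melody.append(rng.choices(vocab, weights=weights, k=1)[0])
    return melody

# Toy training data for illustration only, not taken from music.csv.
training = "C B C D E C B G A G A D C B C D E F E D".split()
seed_notes = ["C", "B", "C", "D"]
result = generate(seed_notes, training, n=3, alpha=0.5, rng=random.Random(0))
print(" ".join(result))
```

The default `length=14` mirrors the logged melodies, which contain 14 notes including the 4-note seed.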

Experiment 2: Grid Search & Optimal Alpha Generation

In this experiment, the system first searched for the best continuous $\alpha$ value before generating sequences.

Output 2:

============================================================
 OPTIONAL PART: Grid Search for Optimal Alpha
============================================================

--- Searching Optimal Alpha for 2-gram (Ignore Modifiers: False) ---
>> BEST ALPHA FOUND: 0.81 | Min Validation Perplexity: 8.79

--- Searching Optimal Alpha for 3-gram (Ignore Modifiers: False) ---
>> BEST ALPHA FOUND: 0.66 | Min Validation Perplexity: 10.10

============================================================
 MAIN PART: Generating Melodies (Ignore_Modifiers = False)
============================================================

--- Model: Bi-gram | Alpha: 0.5 ---
Seed: [    C B C D    ] -> Gen: C B C D E S D Eb D C C D C C | Perplexity: 5.15

--- Model: Tri-gram | Alpha: 0.5 ---
Seed: [   A Bb C D    ] -> Gen: A Bb C D D E F S G E B Ab F Bb | Perplexity: 9.19

============================================================
 MAIN PART: Generating Melodies (Ignore_Modifiers = True)
============================================================

--- Model: Tri-gram | Alpha: 1.0 ---
Seed: [    C B C D    ] -> Gen: C B C D E C B G A G A D C B | Perplexity: 6.72
Seed: [   A S B Db    ] -> Gen: A S B D C B D A G A B C S C | Perplexity: 6.04

--- Model: Tri-gram | Alpha: 0.5 ---
Seed: [    C B C D    ] -> Gen: C B C D E C E D C E G F G E | Perplexity: 4.83
Seed: [   A Bb C D    ] -> Gen: A B C D D E F S G C C C D C | Perplexity: 4.65
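
The grid search reported in this experiment can be sketched as a loop over candidate $\alpha$ values that keeps the one with the lowest validation perplexity. Everything named here is an assumption for illustration: the functions `validation_perplexity` and `grid_search_alpha`, the train/validation split, the 0.01 step (chosen only because the logged values have two decimals), and the choice to build the vocabulary from both splits.

```python
import math
from collections import Counter

def validation_perplexity(train, valid, n, alpha):
    """Perplexity of `valid` under a Lidstone-smoothed n-gram model fit on `train`."""
    vocab_size = len(set(train) | set(valid))  # assumption: shared vocabulary
    positions = range(len(train) - n + 1)
    ngrams = Counter(tuple(train[i:i + n]) for i in positions)
    contexts = Counter(tuple(train[i:i + n - 1]) for i in positions)
    log_sum, count = 0.0, 0
    for i in range(n - 1, len(valid)):
        context = tuple(valid[i - n + 1:i])
        p = (ngrams[context + (valid[i],)] + alpha) / (contexts[context] + alpha * vocab_size)
        log_sum += math.log(p)
        count += 1
    return math.exp(-log_sum / count)

def grid_search_alpha(train, valid, n, alphas):
    """Return the candidate alpha with the lowest validation perplexity."""
    return min(alphas, key=lambda a: validation_perplexity(train, valid, n, a))

# Toy data for illustration only, not taken from music.csv.
train = "C D E C D E C D G C D E".split()
valid = "C D E C D G".split()
alphas = [round(0.01 * k, 2) for k in range(1, 101)]  # 0.01, 0.02, ..., 1.00

best = grid_search_alpha(train, valid, n=2, alphas=alphas)
print(best, round(validation_perplexity(train, valid, 2, best), 2))
```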

🧠 Conclusion & Takeaways

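Context Matters: Tri-gram models reach the lowest perplexities in these experiments (4.55, against 5.11 for the best Bi-gram run), but they do not win every comparison. In Experiment 1 with modifiers ignored, the Bi-gram scores 5.11 on the seed `[ C B C D ]` while the Tri-gram scores 6.34, and the grid search reports a lower validation perplexity for the 2-gram (8.79) than for the 3-gram (10.10). Conditioning on the two previous notes gives the model more melodic context, but it also spreads the training data over more contexts.
Smoothing is Crucial: Lowering Lidstone's $\alpha$ from 1.0 (Laplace smoothing) to 0.5, or to the grid-search values (0.81 for the Bi-gram, 0.66 for the Tri-gram), sharpens the probability distribution because less probability mass is reserved for unseen note sequences. In Experiment 2 with modifiers ignored, the Tri-gram perplexity for the seed `[ C B C D ]` drops from 6.72 at $\alpha = 1.0$ to 4.83 at $\alpha = 0.5$.
Curse of Dimensionality: Ignoring flat (b) and koron (k) modifiers shrinks the vocabulary. The transition table becomes denser, so each probability is estimated from more observations. In Experiment 1 this lowers the Tri-gram perplexity for the seed `[ A Bb C D ]` from 6.00 to 4.55.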
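
The vocabulary reduction described in the last point can be sketched as a small normalization step applied before counting. This assumes notes are stored as strings in which a lowercase `b` or `k` follows the note letter, as in the seeds shown in the logs (`Bb`, `Db`); the function name `strip_modifiers` and the koron example `Ek` are hypothetical.

```python
def strip_modifiers(note):
    """Drop one trailing flat ('b') or koron ('k') marker, e.g. 'Bb' -> 'B'."""
    if len(note) > 1 and note[-1] in ("b", "k"):
        return note[:-1]
    return note

# Toy note list for illustration only.
notes = ["A", "Bb", "C", "D", "Ek", "S", "B"]
reduced = [strip_modifiers(n) for n in notes]

print(reduced)
print(len(set(notes)), "->", len(set(reduced)))  # distinct symbols before and after
```

Because `Bb` and `B` collapse into one symbol, the counts that were split across two vocabulary entries are pooled into a single entry.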

👨‍💻 Author

Developed by Sayyed Hossein Hosseini for Computational Music and Natural Language Processing (NLP) experimentation.

📄 License

This project is licensed under the MIT License. Feel free to use, modify, and distribute it as per the license terms. ⭐ Feel free to fork and contribute!

