Persian Language Speech Recognition System

Overview

This project implements a speech recognition system for Persian language commands using a Convolutional Neural Network (CNN). The system is developed in Python, leveraging TensorFlow and Keras, and it classifies spoken commands based on an audio dataset.

Features

CNN Model: Trained using a dataset of Persian commands.
Dataset: Custom Persian voice commands in WAV format.
Example Audio: Includes a sample audio file (roshan.wav) for testing.

File Structure

CI_Lab_Project.py: Python script that builds and trains the CNN model.
classification_model35.h5: Pretrained CNN model for classification.
Voices-wav.zip: Compressed dataset containing voice recordings.
roshan.wav: Sample audio file used for testing the model.

Installation

To run this project, you'll need Python and several libraries. Follow these steps:

Prerequisites

Python 3.x
TensorFlow
Keras
NumPy
LibROSA

Steps

Clone the repository

git clone https://github.com/AminLari/Persian-Language-Speech-Recognition-System.git

Install the dependencies

pip install tensorflow librosa keras numpy
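
As a quick sanity check after installation, the sketch below reports which of the listed packages cannot be found in the current environment. The helper name `missing_packages` is illustrative and not part of the repository.

```python
import importlib.util

# Import names of the packages listed under Prerequisites.
REQUIRED = ["tensorflow", "keras", "numpy", "librosa"]

def missing_packages(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print(f"Missing packages: {', '.join(missing)}")
    else:
        print("All dependencies found.")
```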

Usage

Extract the dataset: After downloading the repository, you need to extract the voice command dataset.
```
unzip Voices-wav.zip
```
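
On systems without the `unzip` command, the same step can be done from Python with the standard library. This is a minimal sketch; the function name `extract_dataset` and the default destination are assumptions, and the archive's internal folder layout is not documented here.

```python
import zipfile
from pathlib import Path

def extract_dataset(zip_path="Voices-wav.zip", dest="."):
    """Extract the archive into dest and return the paths of the .wav files it contained."""
    dest = Path(dest)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    # Keep only audio files; the archive may also hold folders or other files.
    return sorted(dest / name for name in names if name.lower().endswith(".wav"))
```

Calling `extract_dataset()` from the repository root extracts into the current directory, matching the `unzip` command above.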
Run the training and evaluation script: To train the CNN model on the dataset, run the following Python script. The script will load the dataset, preprocess the audio data, and train the model.
```
python CI_Lab_Project.py
```
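
The README does not say how recordings of different lengths are handled during preprocessing. A CNN with a fixed input shape needs every feature matrix to be the same size, so one common approach, shown here purely as an assumption about what such a step could look like, is to zero-pad or truncate each matrix along the time axis. The name `pad_or_truncate` and the `(n_coeffs, n_frames)` layout are hypothetical.

```python
import numpy as np

def pad_or_truncate(features, target_frames):
    """Force a (n_coeffs, n_frames) matrix to exactly target_frames columns."""
    n_frames = features.shape[1]
    if n_frames >= target_frames:
        # Too long: keep only the first target_frames frames.
        return features[:, :target_frames]
    # Too short: append zero-valued frames at the end.
    padding = np.zeros((features.shape[0], target_frames - n_frames), dtype=features.dtype)
    return np.concatenate([features, padding], axis=1)
```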
Test the model with a sample audio: After training the model, you can test it with a sample audio file. The provided sample file is roshan.wav. Modify the script to classify your own audio files if needed.
```
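import librosa
import numpy as np
# model.predict expects a feature array, not a path; `model` is the trained model from the script
test_audio = 'path/to/your/audio.wav'
signal, sr = librosa.load(test_audio, sr=None)
features = librosa.feature.mfcc(y=signal, sr=sr)  # assumption: must match the training settings
prediction = model.predict(features[np.newaxis, ..., np.newaxis])  # assumed input shape
print(f'Predicted Class: {np.argmax(prediction, axis=1)[0]}')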
```
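
A classifier of this kind typically outputs one score per class, so the raw prediction still has to be mapped to a command name. The sketch below shows that mapping in plain Python. The label names and scores are hypothetical placeholders, since the actual class list and its order are defined by the training script.

```python
def decode_prediction(probabilities, labels):
    """Return (label, probability) for the highest-scoring class."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], float(probabilities[best])

# Hypothetical label order and scores, for illustration only.
labels = ["command_a", "command_b", "command_c"]
scores = [0.05, 0.90, 0.05]
print(decode_prediction(scores, labels))
```

The label list must be in the same order as the class indices used during training, otherwise the decoded name will be wrong.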

Results

The framework extracts time-frequency features from audio signals using Mel-frequency cepstral coefficients (MFCCs).
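
To illustrate what an MFCC computation involves, here is a NumPy-only sketch of the usual pipeline: framing, windowing, power spectrum, mel filterbank, logarithm, and DCT. It is not the project's code. In practice a library routine such as `librosa.feature.mfcc` would be used, and the parameter values below (13 coefficients, 26 mel bands, 512-sample frames, 160-sample hop) are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising slope
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling slope
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sr, n_mfcc=13, n_mels=26, n_fft=512, hop=160):
    """Return an (n_frames, n_mfcc) matrix; signal must have at least n_fft samples."""
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    # Log energy in each mel band (small offset avoids log(0)).
    mel_energy = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the mel axis, keeping the first n_mfcc coefficients.
    n = np.arange(n_mels)
    basis = np.cos(np.pi * np.arange(n_mfcc)[:, None] * (2 * n + 1) / (2 * n_mels))
    return mel_energy @ basis.T
```

Each row of the result describes one short frame of audio, so the matrix can be treated like a small image, which is what makes a CNN a natural fit for classifying it.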

The model can be evaluated using the provided sample audio file (roshan.wav).

Sample Input: roshan.wav
Predicted Output: The model predicts the class label for the input audio file.

📞 Contact

If you have any questions, feedback, or suggestions regarding this project, feel free to reach out:

Name: Mohammadamin Lari
Email: mohammadamin.lari@gmail.com
GitHub: AminLari

You are welcome to create issues or pull requests to improve the project. Contributions are highly appreciated!
