jozrftamson/persian-language-speech-recognition-system

Persian Language Speech Recognition System

Overview

This project implements a speech recognition system for Persian language commands using a Convolutional Neural Network (CNN). The system is developed in Python, leveraging TensorFlow and Keras, and it classifies spoken commands based on an audio dataset.

Features

  • CNN Model: Trained using a dataset of Persian commands.
  • Dataset: Custom Persian voice commands in WAV format.
  • Example Audio: Includes a sample audio file (roshan.wav) for testing.

File Structure

  • CI_Lab_Project.py: Python script that builds and trains the CNN model.
  • classification_model35.h5: Pretrained CNN model for classification.
  • Voices-wav.zip: Compressed dataset containing voice recordings.
  • roshan.wav: Sample audio file used for testing the model.

Installation

To run this project, you'll need Python and several libraries. Follow these steps:

Prerequisites

  • Python 3.x
  • TensorFlow
  • Keras
  • NumPy
  • LibROSA

Steps

  1. Clone the repository
    git clone https://github.com/AminLari/Persian-Language-Speech-Recognition-System.git
    
  2. Install the dependencies
    pip install tensorflow librosa keras numpy
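
Before running the script, it can help to confirm that the packages listed under Prerequisites are importable in the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util

# Import names of the packages listed under Prerequisites.
REQUIRED = ["tensorflow", "keras", "numpy", "librosa"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported as missing, rerunning the `pip install` command above inside the same environment is the usual fix.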
    

Usage

  1. Extract the dataset: After downloading the repository, you need to extract the voice command dataset.
    unzip Voices-wav.zip
    
  2. Run the training and evaluation script: To train the CNN model on the dataset, run the following Python script. The script will load the dataset, preprocess the audio data, and train the model.
    python CI_Lab_Project.py
    
  3. Test the model with a sample audio: After training the model, you can test it with a sample audio file. The provided sample file is roshan.wav. Modify the script to classify your own audio files if needed.
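    # Classify a different file: predict() needs features shaped like the training input, not a path.
    import librosa
    import numpy as np
    test_audio = 'path/to/your/audio.wav'
    signal, sr = librosa.load(test_audio, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr)  # assumption: match the script's MFCC settings
    prediction = model.predict(mfcc[np.newaxis, ..., np.newaxis])
    print(f'Predicted Class: {np.argmax(prediction, axis=-1)[0]}')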
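
For step 1, the `unzip` command may not be available on every system (for example, a default Windows shell). As a hedged alternative, the archive can be extracted with Python's standard `zipfile` module. The function name `extract_dataset` is hypothetical, and the sketch assumes the archive sits in the current directory under the name given in File Structure.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive="Voices-wav.zip", target_dir="."):
    """Extract the archive and return the sorted list of extracted .wav paths."""
    target = Path(target_dir)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        names = zf.namelist()
    # Keep only audio files; the archive may also contain folders or other files.
    return sorted(target / n for n in names if n.lower().endswith(".wav"))


if __name__ == "__main__" and Path("Voices-wav.zip").exists():
    wav_files = extract_dataset()
    print(f"Extracted {len(wav_files)} WAV files")
```

The returned list of paths is only a convenience for checking how many recordings were found before training.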
    

Results

The framework extracts time-frequency features from the audio signals using Mel-frequency cepstral coefficients (MFCCs).
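
To illustrate what MFCC extraction involves, here is a NumPy-only sketch of the standard pipeline: framing, power spectrum, mel filterbank, logarithm, and a discrete cosine transform. It is not the code from `CI_Lab_Project.py`, which most likely relies on LibROSA. All parameter values (`n_mfcc`, `n_fft`, `hop`, `n_mels`) and the function names are assumptions chosen for illustration.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mfcc(signal, sr, n_mfcc=13, n_fft=512, hop=256, n_mels=26):
    """Return an (n_mfcc, n_frames) MFCC matrix for a mono float signal.

    The signal must contain at least n_fft samples.
    """
    # 1. Slice the signal into overlapping, Hann-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # 3. Mel filterbank energies on a log scale (small offset avoids log(0)).
    log_mel = np.log(power @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)
    # 4. Unnormalized DCT-II across mel bands; keep the first n_mfcc coefficients.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return dct @ log_mel.T


# Usage: one second of a synthetic 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(tone, sr)
print(features.shape)  # (13, 61)
```

The resulting matrix has one row per coefficient and one column per frame, which is the kind of 2D "image" a CNN can consume. Values will differ from LibROSA's output because the scaling, windowing, and default parameters are not the same.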

The model can be evaluated using the provided sample audio file (roshan.wav).

  • Sample Input: roshan.wav
  • Predicted Output: The model predicts the class label for the input audio file.
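
The raw output of a Keras classifier is typically a row of class scores rather than a readable label. The sketch below shows one way to map such a row to a label and a confidence value. It assumes the model ends in a softmax layer over the command classes; the label list and the function name `decode_prediction` are hypothetical, and the real label order must match the one used in training.

```python
import numpy as np

# Hypothetical label list; the real names and order come from the training script.
LABELS = ["roshan", "other_command_1", "other_command_2"]


def decode_prediction(probabilities, labels=LABELS):
    """Map one row of class probabilities to a (label, confidence) pair."""
    probabilities = np.asarray(probabilities).reshape(-1)
    if probabilities.size != len(labels):
        raise ValueError("number of scores does not match number of labels")
    best = int(np.argmax(probabilities))
    return labels[best], float(probabilities[best])


# Stand-in for the output of model.predict(...), shape (1, n_classes).
fake_output = np.array([[0.7, 0.2, 0.1]])
label, confidence = decode_prediction(fake_output)
print(f"Predicted Class: {label} ({confidence:.2f})")
```

Checking the number of scores against the number of labels catches the common mistake of using a label list that does not match the trained model.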

📞 Contact

If you have any questions, feedback, or suggestions regarding this project, you are welcome to create issues or pull requests to improve it. Contributions are highly appreciated!
