Audio & Speech Processing

Speech recognition, text-to-speech, audio classification, music generation, and audio ML.

Hugging Face: Audio Course - Comprehensive course on audio processing with transformers. (Beginner)
Coursera: Audio Signal Processing for Music Applications - Stanford/UPF course on audio analysis fundamentals. (Intermediate)
Python for Audio Signal Processing (YouTube) - Practical video series on audio ML with Python. (Beginner)
MIT: Music Technology - Free MIT course on the intersection of music and technology. (Beginner)

SpeechBrain - Open-source PyTorch toolkit for speech and audio processing. (Intermediate)
Mozilla Common Voice - Free open speech dataset for building voice applications. (All Levels)
AssemblyAI Tutorials - Practical guides on speech-to-text and audio AI. (Beginner)
Whisper (OpenAI) - Open-source speech recognition model with documentation and examples. (Intermediate)
Librosa Documentation - Python library for audio and music analysis with tutorials. (Beginner)
ESPnet - End-to-end speech processing toolkit with recipes and tutorials. (Advanced)

Music Information Retrieval (MIR) Tutorials - Free interactive notebooks on audio features and analysis. (Intermediate)
Kaldi Documentation - Toolkit for speech recognition research with extensive guides. (Advanced)
Awesome Speech Recognition (GitHub) - Curated list of speech recognition and synthesis papers. (All Levels)
