Resources for speech recognition, text-to-speech, audio classification, music generation, and other audio ML tasks.
- Hugging Face: Audio Course - Comprehensive course on audio processing with transformers. (Beginner)
- Coursera: Audio Signal Processing for Music Applications - Stanford/UPF course on audio analysis fundamentals. (Intermediate)
- Python for Audio Signal Processing (YouTube) - Practical video series on audio ML with Python. (Beginner)
- MIT: Music Technology - Free MIT course on the intersection of music and technology. (Beginner)

- SpeechBrain - Open-source PyTorch toolkit for speech and audio processing. (Intermediate)
- Mozilla Common Voice - Free open speech dataset for building voice applications. (All Levels)
- AssemblyAI Tutorials - Practical guides on speech-to-text and audio AI. (Beginner)
- Whisper (OpenAI) - Open-source speech recognition model with documentation and examples. (Intermediate)
- Librosa Documentation - Python library for audio and music analysis with tutorials. (Beginner)
- ESPnet - End-to-end speech processing toolkit with recipes and tutorials. (Advanced)

- Music Information Retrieval (MIR) Tutorials - Free interactive notebooks on audio features and analysis. (Intermediate)
- Kaldi Documentation - Toolkit for speech recognition research with extensive guides. (Advanced)
- Awesome Speech Recognition (GitHub) - Curated list of speech recognition and synthesis papers. (All Levels)