Bingsong Bai ShawnPi233

Bingsong Bai

I am a Foundation Model Algorithm Engineer at ModelBest (面壁智能, VoxCPM). I received my Master's degree from the School of Artificial Intelligence, Beijing University of Posts and Telecommunications (BUPT). My research interests lie at the intersection of Large Speech Models (LSM), Automatic Speech Recognition (ASR), Singing Voice Conversion (SVC), and Expressive Text-to-Speech (TTS).

Prior to joining ModelBest, I earned my Bachelor's degree in Computer Science and Technology from Ningbo University (Yangming Innovation Class). I have gained extensive industry experience through research and engineering internships at Zhipu AI (智谱AI, voice input method), Tencent Music Entertainment (TME, Lyra Lab / 天琴实验室), and Momo (陌陌).

I have been awarded the Zhejiang Provincial Government Scholarship (3 times) and the BUPT First-Class Academic Scholarship (2 times). My research has been accepted at top-tier conferences such as AAAI, Interspeech, ICASSP, and ISCSLP.

🔥 News

2026.03: 🚀 Joined ModelBest as a Large Speech Foundation Model Researcher.
2026.01: 🎉 One paper (SynParaSpeech) accepted by ICASSP 2026 as the first author!
2025.12: 🎉 One paper (HQ-SVC) accepted by AAAI 2026 as the first author!
2025.10: 🚀 Joined Zhipu AI as a Speech Large Model Research Intern.
2025.07: 🎸 Joined Tencent Music (QQ Music), focusing on multi-speaker conversational podcast TTS.
2025.03: 👫 Joined Momo, focusing on paralinguistic TTS and understanding.
2024.06: 🎉 One paper (SPA-SVC) accepted by Interspeech 2024 as the first author.

📑 Selected Research Papers

📎 For a full list of publications, please visit my Google Scholar.

HQ-SVC: High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios, Bingsong Bai, et al., AAAI 2026. [CCF-A]

SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding, Bingsong Bai, et al., ICASSP 2026. [CCF-B]

SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion, Bingsong Bai, et al., Interspeech 2024. [CCF-B]

ExpressiveSinger: Synthesizing Expressive Singing Voice as an Instrument, Fengping Wang, Bingsong Bai, et al., ISCSLP 2024.

🗣 Large Speech Models & TTS

GLM-ASR Nano: Participated in training and SFT of the SOTA open-source ASR model, reaching #1 on Hugging Face speech model download charts (440k+ downloads in 2 weeks).
Multi-Speaker Conversational TTS: Improved rhythm and pauses by 68.49% in AI podcasts (internal project at Tencent Music). Participated in QinYu-TTS.

🏆 Awards & Honors

2023, 2024: BUPT First-Class Academic Scholarship
2020, 2021, 2022: Zhejiang Provincial Government Scholarship (3 consecutive years)
2021: Mathematical Contest in Modeling (MCM) - International Second Prize
2020: Contemporary Undergraduate Mathematical Contest in Modeling (CUMCM) - Provincial Second Prize
