[Model][MIMO Audio] Unify sync/async code2wav into single path by NickCao · Pull Request #4359 · vllm-project/vllm-omni

NickCao · 2026-06-11T18:48:45Z

Purpose

Models with both custom_process_next_stage_input_func and async_chunk_process_next_stage_input_func maintain two separate producer functions and two decoder paths that are ~80% identical. This PR eliminates the duplication for MIMO Audio by routing both sync and async modes through the single async_chunk path.

Test Plan

vLLM Version: 0.23.0

vLLM-Omni Commit: d80d796

# Unit tests — stage input processors (15 passed)
pytest tests/model_executor/stage_input_processors/test_mimo_audio_llm2code2wav.py -v
pytest tests/model_executor/stage_input_processors/test_mimo_audio_flush_remaining_codes.py -v

# Unit tests — full streaming helpers suite (40 passed, includes updated full_payload tests)
pytest tests/model_executor/stage_input_processors/test_qwen3_omni_streaming_helpers.py -v

# Unit tests — batch decode (25 passed, sync and streaming parametrized)
pytest tests/model_executor/models/mimo_audio/test_mimo_audio_code2wav_batch_decode.py -v

# E2e — online serving (4 passed, both async_chunk and no_async_chunk)
pytest tests/e2e/online_serving/test_mimo_audio.py -v

Test Result

PASSED

chatgpt-codex-connector · 2026-06-11T18:48:50Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

NickCao · 2026-06-15T18:11:54Z

Rebased and retested.

hsliuustc0106

Posting one inline finding from review.

hsliuustc0106 · 2026-06-19T06:51:28Z

@qibaoyuan PTAL

Point both custom_process_next_stage_input_func and async_chunk_process_next_stage_input_func to the same llm2code2wav_async_chunk function. Add an unflatten_payload call so it handles both transport modes (the full_payload accumulator flattens dict keys; the async_chunk transport does not). Remove the duplicate _batch_decode_waveforms decoder and the if-is_async_chunk branch: the streaming decoder with left_context_size=0 produces identical output to the old sync path.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
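To illustrate why an unflatten step lets one function serve both transports, here is a minimal sketch. It assumes the full_payload accumulator joins nested keys with a "." separator; that separator, the key names, and the function name `unflatten_payload_sketch` are hypothetical and are not taken from the vllm-omni code.

```python
from typing import Any


def unflatten_payload_sketch(payload: dict[str, Any], sep: str = ".") -> dict[str, Any]:
    """Rebuild nested dicts from flattened keys such as "codes.audio".

    Assumption: flattened keys are joined with `sep`. A payload that is
    already nested has no separator in its keys and is copied through as-is.
    """
    nested: dict[str, Any] = {}
    for key, value in payload.items():
        parts = key.split(sep)
        node = nested
        # Walk (and create) the intermediate dicts for every key segment but the last.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


# Flattened input, as the full-payload transport is assumed to deliver it.
flat = {"codes.audio": [1, 2], "meta.finished": True}
# Nested input, as the async-chunk transport is assumed to deliver it.
nested = {"codes": {"audio": [1, 2]}, "meta": {"finished": True}}
```

Because the nested form passes through unchanged, the downstream code can read the same structure in either mode.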

…arams

Use module-level _DEFAULT_CODEC_CHUNK_FRAMES and _DEFAULT_CODEC_LEFT_CONTEXT_FRAMES as parameter defaults so callers don't need to repeat them.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
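The pattern described in this commit can be sketched as follows. The constant names come from the commit message, but their values (25 and 5) and the helper `plan_chunks` are placeholders invented for illustration, not the values or functions used in vllm-omni.

```python
# Placeholder values; the real defaults live in the vllm-omni module.
_DEFAULT_CODEC_CHUNK_FRAMES = 25
_DEFAULT_CODEC_LEFT_CONTEXT_FRAMES = 5


def plan_chunks(
    num_frames: int,
    chunk_frames: int = _DEFAULT_CODEC_CHUNK_FRAMES,
    left_context_frames: int = _DEFAULT_CODEC_LEFT_CONTEXT_FRAMES,
) -> list[tuple[int, int, int]]:
    """Return (context_start, chunk_start, chunk_end) for each chunk.

    Hypothetical helper: callers that are happy with the module defaults
    pass only `num_frames`.
    """
    return [
        (max(0, start - left_context_frames), start, min(start + chunk_frames, num_frames))
        for start in range(0, num_frames, chunk_frames)
    ]
```

A caller that needs sync-style behaviour can still override a single argument, for example `plan_chunks(10, chunk_frames=4, left_context_frames=0)`.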

…ing paths

Update test_mimo_audio_code2wav_batch_decode to exercise _batch_chunked_decode_streaming (the unified decode path) instead of the removed _batch_decode_waveforms. Every test is parametrized with left_context in {0, 1, 5}, covering sync (no strip), partial strip, and full strip.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
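The left-context stripping that these tests parametrize can be sketched with a toy decoder. Everything here is an assumption made for illustration: `toy_decode`, `chunked_decode`, and `SAMPLES_PER_FRAME` are hypothetical, and the toy decoder is context-free, so it only checks the strip bookkeeping, not the audio quality effect of context in a real codec.

```python
SAMPLES_PER_FRAME = 4  # toy upsampling factor, not the real codec's


def toy_decode(frames: list[int]) -> list[float]:
    """Stand-in decoder: each codec frame becomes SAMPLES_PER_FRAME samples."""
    return [float(code) for code in frames for _ in range(SAMPLES_PER_FRAME)]


def chunked_decode(frames: list[int], chunk_frames: int, left_context: int) -> list[float]:
    """Decode chunk by chunk, prepending up to `left_context` earlier frames
    to each chunk and stripping the samples those context frames produce."""
    wave: list[float] = []
    for start in range(0, len(frames), chunk_frames):
        ctx_start = max(0, start - left_context)
        decoded = toy_decode(frames[ctx_start:start + chunk_frames])
        # Drop the samples that belong to the context frames only.
        wave.extend(decoded[(start - ctx_start) * SAMPLES_PER_FRAME:])
    return wave


frames = list(range(12))
```

With `left_context=0` nothing is prepended and nothing is stripped, which corresponds to the sync case named in the commit message.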

…_chunk

Add a --no-async-chunk variant to test_params so both sync and streaming code2wav paths are exercised by the existing e2e tests.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com>

Signed-off-by: Nick Cao <ncao@redhat.com>

The upstream "Output Processor Phase 2" refactor renamed pooling_output → multimodal_output in the chunk_transfer_adapter but not in the full-payload mixin. Restore llm2code2wav_full_payload as a thin wrapper that bridges the kwarg mismatch so the non-async path works with both callers on vllm 0.23.0.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
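A thin wrapper of this kind might look like the sketch below. The signatures are assumptions: the real functions take other arguments, and the `_sketch` suffix marks both functions as hypothetical stand-ins rather than the vllm-omni implementations.

```python
from typing import Any, Optional


def llm2code2wav_async_chunk_sketch(request: dict, multimodal_output: Optional[Any] = None) -> dict:
    """Stand-in for the unified producer; it only knows the new kwarg name."""
    return {"req_id": request["req_id"], "codes": multimodal_output}


def llm2code2wav_full_payload_sketch(
    request: dict,
    pooling_output: Optional[Any] = None,
    multimodal_output: Optional[Any] = None,
) -> dict:
    """Accept either kwarg name and forward it under the new one."""
    output = multimodal_output if multimodal_output is not None else pooling_output
    return llm2code2wav_async_chunk_sketch(request, multimodal_output=output)


request = {"req_id": "req-0"}
old_style = llm2code2wav_full_payload_sketch(request, pooling_output=[1, 2, 3])
new_style = llm2code2wav_full_payload_sketch(request, multimodal_output=[1, 2, 3])
```

Callers that still pass `pooling_output` and callers that pass `multimodal_output` get the same result.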

…ter in async_chunk

The old llm2code2wav_full_payload truncated flat_codes at MAX_CODE2WAV_TOKENS and filtered zero-padded codec rows via _filter_zero_codec_rows before flattening. Both guards were lost when the function was replaced with a delegation to llm2code2wav_async_chunk. Restore them and drop the tensor-list-tensor round-trip.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
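The two guards can be sketched on plain Python lists. This is an illustration only: the real code works on tensors, the value 8 for MAX_CODE2WAV_TOKENS is a placeholder, and treating a row as zero-padded when every entry is 0 is an assumption about what _filter_zero_codec_rows does.

```python
MAX_CODE2WAV_TOKENS = 8  # placeholder; the real limit is defined in vllm-omni


def filter_zero_codec_rows(rows: list[list[int]]) -> list[list[int]]:
    """Drop rows that contain only zeros (assumed to be padding)."""
    return [row for row in rows if any(code != 0 for code in row)]


def flatten_codes(rows: list[list[int]], max_tokens: int = MAX_CODE2WAV_TOKENS) -> list[int]:
    """Filter padding rows first, then flatten, then truncate to the limit."""
    flat = [code for row in filter_zero_codec_rows(rows) for code in row]
    return flat[:max_tokens]


rows = [[1, 2, 3], [0, 0, 0], [4, 5, 6], [7, 8, 9]]
flat_codes = flatten_codes(rows)
```

Filtering before flattening matters in this sketch: once the rows are flattened, a padding row can no longer be told apart from legitimate zero codes.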

Provide a transfer_manager mock with code_prompt_token_ids, add req_id to request fixtures, and switch from dict access to OmniPayloadStruct attribute access to match the new return type.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
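The fixture changes described here could be sketched as below. The shape of `code_prompt_token_ids` (a dict keyed by request id), the `PayloadSketch` dataclass standing in for OmniPayloadStruct, and `build_payload` are all assumptions made for illustration; only the attribute names `code_prompt_token_ids` and `req_id` come from the commit message.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from unittest.mock import MagicMock


@dataclass
class PayloadSketch:
    """Hypothetical stand-in for OmniPayloadStruct: fields are attributes, not dict keys."""
    req_id: str
    codes: list = field(default_factory=list)


def build_payload(transfer_manager, request) -> PayloadSketch:
    """Toy producer that reads prompt token ids from the transfer manager."""
    prompt_ids = transfer_manager.code_prompt_token_ids[request.req_id]
    return PayloadSketch(req_id=request.req_id, codes=list(prompt_ids))


# Mock only what the producer touches; the per-request dict layout is assumed.
transfer_manager = MagicMock()
transfer_manager.code_prompt_token_ids = {"req-0": [11, 12, 13]}
request = SimpleNamespace(req_id="req-0")

payload = build_payload(transfer_manager, request)
```

Assertions then use `payload.codes` rather than `payload["codes"]`, mirroring the switch to attribute access.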

chatgpt-codex-connector · 2026-06-22T15:27:16Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

NickCao · 2026-06-22T17:19:41Z

Passed unit tests and manually examined the output audio.

NickCao requested review from ZeldaHuang, gcanlin, linyueqian, princepride, tzhouam, yenuo26 and yuanheng-zhao as code owners June 11, 2026 18:48

NickCao force-pushed the worktree-unified-chunk-path branch 2 times, most recently from 9343112 to 06d7946 Compare June 15, 2026 17:14

hsliuustc0106 reviewed Jun 19, 2026

Comment thread vllm_omni/model_executor/stage_input_processors/mimo_audio.py

hsliuustc0106 added the "tts" label (code related to tts models) Jun 19, 2026

NickCao and others added 6 commits June 22, 2026 09:49

[Test][MIMO Audio] Unskip test_audio_to_text_audio_001

cadff43

Signed-off-by: Nick Cao <ncao@redhat.com>

NickCao force-pushed the worktree-unified-chunk-path branch from 06d7946 to ca4162c Compare June 22, 2026 14:13

NickCao marked this pull request as draft June 22, 2026 14:15

NickCao added 2 commits June 22, 2026 10:18

NickCao force-pushed the worktree-unified-chunk-path branch from ca4162c to d80d796 Compare June 22, 2026 14:26

NickCao marked this pull request as ready for review June 22, 2026 15:27

NickCao wants to merge 8 commits into vllm-project:main from NickCao:worktree-unified-chunk-path

2 participants
