Defines the audio output service: rendering pipeline, the sequential
playback queue shared by speech and sound effects, remote-client rendering
(ovos.utterance.speak.b64 -> ovos.audio.speech), output lifecycle signals,
speaking-status query, stop integration, and the listen-triggered
ovos.mic.listen follow-up.
- §4 — renumber the Listen flag section from §4.5 to §4.4 (no §4.4 existed);
update its eight in-document references.
- §5.3 — ovos.audio.is_speaking: an absent or "default" session_id asks
about the device-local default session (SESSION-1 §3.1), not a wildcard
over all sessions.
The §9.6 listen field and the speak payload live in the PIPELINE-1 PR.
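
To illustrate the §5.3 change, here is a minimal sketch of the lookup rule for `ovos.audio.is_speaking`, assuming the service tracks the set of session ids that currently have speech playing. The names `DEFAULT_SESSION_ID`, `resolve_query_session`, and `is_speaking` are hypothetical and are not taken from the spec or from any OVOS package.

```python
from typing import Optional, Set

# Assumption: the device-local default session is identified by the literal "default".
DEFAULT_SESSION_ID = "default"


def resolve_query_session(session_id: Optional[str]) -> str:
    """Map an absent or "default" session_id to the device-local default session."""
    if session_id is None or session_id == DEFAULT_SESSION_ID:
        return DEFAULT_SESSION_ID
    return session_id


def is_speaking(active_sessions: Set[str], session_id: Optional[str] = None) -> bool:
    """Answer the query for exactly one session, never as a wildcard over all sessions."""
    return resolve_query_session(session_id) in active_sessions
```

With only a remote session speaking, a query without `session_id` would report `False` under this reading, because it asks about the default session alone.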
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
+This project was funded through the [NGI0 Commons Fund](https://nlnet.nl/commonsfund),
+a fund established by [NLnet](https://nlnet.nl) with financial support from the
+European Commission's [Next Generation Internet](https://ngi.eu) programme, under
+the aegis of [DG Communications Networks, Content and Technology](https://commission.europa.eu/about-european-commission/departments-and-executive-agencies/communications-networks-content-and-technology_en)
+under grant agreement No [101135429](https://cordis.europa.eu/project/id/101135429).

+Version numbers in this repository carry compatibility semantics anchored to
+the pre-specification behavior of the OVOS stack:
+
+| Version | Meaning |
+| --- | --- |
+| **V0** | The de facto, undocumented status quo — the behavior the stack ships before a subsystem is formalized. V0 is never written down as a spec; it is the reference point. |
+| **V1** | A formalization of behavior that is **compatible with V0**. A V0 component keeps working against a V1 implementation, even if degraded (missing optional fields, reduced guarantees, legacy namespaces honored). |
+| **V2** | Behavior that is **not backwards compatible** with V0. Adopting it requires coordinated migration (e.g. the `legacy_namespace` configuration gate). |
+
+Until launch day, every spec in this repository MUST be classified as V1 or
+V2. The classification is part of the spec header. Rules of thumb:
+
+- A spec that documents existing message flows, adds optional fields, or
+  introduces parallel namespaces while the legacy ones keep working → **V1**.
+- A spec that renames or removes message types, changes payload semantics, or
+  requires consumers to change before producers (or vice versa) → **V2**.
+- A single spec MAY contain V1 sections and V2 sections only if the V2 parts
+  are explicitly gated (configuration flag) and the ungated behavior is V1.
+
+Within a class, editorial revisions bump the spec's own revision number in
+its header; compatibility class changes (V1 → V2) are a new spec version, not
+a revision.
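
The rules of thumb above can be read as a small decision procedure. The sketch below is one possible encoding, not part of the repository: `SpecChange`, its field names, `classify`, and `is_valid_mixed_spec` are all hypothetical names chosen to mirror the wording of the three bullets.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class CompatClass(Enum):
    V1 = "V1"
    V2 = "V2"


@dataclass
class SpecChange:
    # Hypothetical flags, one per breaking condition named in the rules of thumb.
    renames_or_removes_message_types: bool = False
    changes_payload_semantics: bool = False
    requires_ordered_migration: bool = False  # consumers before producers, or vice versa
    gated_by_config_flag: bool = False


def classify(change: SpecChange) -> CompatClass:
    """Return V2 if any breaking condition holds, otherwise V1."""
    breaking = (
        change.renames_or_removes_message_types
        or change.changes_payload_semantics
        or change.requires_ordered_migration
    )
    return CompatClass.V2 if breaking else CompatClass.V1


def is_valid_mixed_spec(changes: Iterable[SpecChange]) -> bool:
    """A spec may mix classes only if every V2 part sits behind a configuration gate."""
    return all(
        c.gated_by_config_flag
        for c in changes
        if classify(c) is CompatClass.V2
    )
```

For example, a spec that adds an optional field and also renames a message type behind a gate would pass `is_valid_mixed_spec`, while the same rename without a gate would not.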
+
+## The 1.0 definition
+
+The compatibility classes define the project roadmap. The stack starts at V0
+(the undocumented status quo — beta). Each subsystem is formalized as V1, then
+migrated to V2 where the spec demands incompatible change. **OVOS is fully
+spec compliant when every subsystem operates on V2 — that state is the
+"breakthrough" in "from beta to breakthrough", and it is the 1.0 release

ovos-pipeline-1.md: 0 additions, 18 deletions
@@ -1130,7 +1130,6 @@ audio-capable deployment.
 |-------|------|----------|---------|
 | `utterance` | string | yes | The natural-language response string. |
 | `lang` | string | no | BCP-47 tag of the response language. When absent, the output stage resolves language from the session per OVOS-SESSION-1 §3.2. |
-| `listen` | bool | no | When `true`, the handler expects a follow-up utterance from the user after this response is delivered. Output consumers **SHOULD** re-open the user input channel (microphone, chat input affordance, etc.) once delivery is complete. Absent or `false` means no follow-up is expected. |
 
 **Derivation and session propagation.** A handler **MUST** derive each
 `ovos.utterance.speak` emission from the dispatch Message (§7) it
@@ -1147,26 +1146,9 @@ acts silently (playing a sound, toggling a device, queuing media) is
 conformant. When a handler emits multiple, the order of emission is the
 intended delivery order; the output stage **SHOULD** preserve it.
 
-**The `listen` flag and follow-up flows.** When a handler emits
-`ovos.utterance.speak` as the prompt in a `get_response` flow
-(OVOS-CONVERSE-1 §5), it **MUST** set `listen: true` on that Message.
-The flag is a protocol-level statement that the handler expects a
-follow-up utterance; every output consumer — audio, chat, any other
-delivery channel — reads it and re-opens the user input channel
-accordingly. Omitting the flag in a `get_response` flow is
-non-conformant: the user is asked a question but the input channel
-is never re-opened.
-
 **Broadcast.** `ovos.utterance.speak` carries no `destination` — it is
 broadcast. Any output component subscribed to the topic may consume it.
 
-**Remote-client variant.** When the intended recipient cannot render
-audio locally (e.g. a satellite without TTS), a handler or bridge MAY
-emit `ovos.utterance.speak.b64` instead. The audio output service
-processes this through the same TTS pipeline and emits
-`ovos.audio.speech` with base64-encoded audio for the client to play