perf: warm-boot fast path + atomic cloud request ids #29
Merged
Concurrent send() calls (MQTT thread, polling, entity commands) read self._id outside the lock and incremented it only after the call, so two in-flight requests could share the same id. The cloud rejects the duplicate with an empty result, and that failure was silent (logged at DEBUG only). Ids are now allocated atomically.
Also documents why the first-refresh property fetch stays sequential with 50-key batches: the robot rejects larger get_properties payloads (100/200 fail) and answers the cloud bridge serially, so parallel batches bring zero wall-clock gain (measured on device, 2026-07-02).
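The race described above can be illustrated with a minimal sketch. It assumes a plain threading.Lock guards the counter; the class and function names (RequestIdAllocator, allocate_concurrently) are hypothetical and are not taken from the integration's code.

```python
import threading


class RequestIdAllocator:
    """Hypothetical allocator: read and increment happen under one lock."""

    def __init__(self, start: int = 1) -> None:
        self._lock = threading.Lock()
        self._id = start

    def next_id(self) -> int:
        # Reading and incrementing inside the same critical section means
        # no two callers can ever observe the same value.
        with self._lock:
            request_id = self._id
            self._id += 1
            return request_id


def allocate_concurrently(allocator: RequestIdAllocator, workers: int = 8) -> list[int]:
    """Start `workers` threads that each allocate one id at the same moment."""
    ids: list[int] = []
    ids_lock = threading.Lock()
    barrier = threading.Barrier(workers)

    def worker() -> None:
        barrier.wait()  # release all threads together to maximise contention
        request_id = allocator.next_id()
        with ids_lock:
            ids.append(request_id)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return ids
```

Calling allocate_concurrently(RequestIdAllocator()) mirrors the shape of an 8-thread uniqueness check: every returned id should be distinct.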
First refresh previously blocked on 4 serial robot round-trips (~5-8 s depending on Dreame cloud latency). The coordinator now persists the device's answered-property inventory, keyed on model and firmware, in a homeassistant.helpers.storage.Store. On warm boots only the priority batch is fetched synchronously: every property DreameVacuumDeviceCapability.load() reads (some flags are absence-based) plus the primary state. The remaining present properties load from a background thread. Deferred entities are created identically (existence = inventory) but stay unavailable until their first value arrives, which is exactly the timeline they had before. Cold boots (first setup, firmware change) keep the full synchronous load and then publish the fresh inventory.
Measured on HA dev (same degraded-cloud window): entry setup 10.74 s cold -> 5.84 s warm; priority batch 1.8-2.0 s vs 4.7-8.4 s for the full load.
Proof of no loss: entity registry diff empty (258/258, unique_id and disabled state included) and zero entities downgraded from valued to unavailable/unknown across the change (146 recorder states compared).
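The cold/warm decision described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the coordinator's actual code: the stored inventory is modelled as a plain dict with hypothetical keys "model", "firmware" and "present", and BootPlan and plan_first_refresh are invented names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BootPlan:
    warm: bool
    sync_keys: list       # fetched before setup completes
    deferred_keys: list   # fetched later from a background thread


def plan_first_refresh(
    stored: Optional[dict],
    model: str,
    firmware: str,
    all_keys: Sequence[str],
    priority_keys: Sequence[str],
) -> BootPlan:
    """Decide which properties block setup (hypothetical helper)."""
    inventory_matches = (
        stored is not None
        and stored.get("model") == model
        and stored.get("firmware") == firmware
    )
    if not inventory_matches:
        # Cold boot (first setup or firmware change): full synchronous load.
        return BootPlan(warm=False, sync_keys=list(all_keys), deferred_keys=[])

    present = set(stored["present"])
    priority = set(priority_keys)
    # Priority keys are always fetched, even if missing from the inventory,
    # because some capability flags are derived from a property's absence.
    sync_keys = [key for key in all_keys if key in priority]
    # Non-priority keys are deferred only if the device answered them before.
    deferred_keys = [
        key for key in all_keys if key in present and key not in priority
    ]
    return BootPlan(warm=True, sync_keys=sync_keys, deferred_keys=deferred_keys)
```

Keeping the plan a pure function of the stored inventory makes the cold fallback easy to test: any mismatch in model or firmware yields the full synchronous load.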
Summary
Two changes that came out of the boot-time investigation:
fix: atomic cloud request ids
send() read self._id outside the lock and incremented it after the call: two simultaneous requests (MQTT thread, polling, entity commands) could go out with the same id, which the Dreame cloud rejects silently (empty result, DEBUG only). Ids are now allocated atomically, backed by a concurrency test (8 threads → 8 unique ids).
perf: warm boot via a persisted property inventory
The first refresh used to block on 4 sequential robot round-trips (~5-8 s depending on cloud latency; firmware ceiling: 50 keys/request, batches of 87/100/200 rejected, measured on device). The coordinator now persists the inventory of present properties (Store, keyed on model + firmware):
- On warm boots, only the priority batch (everything read by capability.load(), including the flags with absence semantics, plus the primary state) is loaded synchronously; the rest arrives from a background thread.
- Deferred entities stay unavailable until their first value arrives; their timeline is unchanged.
Measured (same network window): entry setup 10.74 s → 5.84 s; blocking batch 8.4 s → 1.8-2.0 s.
Proof that nothing was lost: registry diff empty (258/258 entities, unique_id + disabled state included); 0 entities downgraded from "valued" to unavailable/unknown across 146 states compared before/after; 1,897 tests green, mypy green.
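The 50-keys-per-request ceiling and the serial answering behaviour noted above imply a simple sequential batching loop. The sketch below is an assumption about how such batching could look; chunked, fetch_sequentially and the fetch_batch callback are hypothetical names, and the real get_properties call is not reproduced here.

```python
from typing import Callable, Iterator, Sequence

MAX_KEYS_PER_REQUEST = 50  # ceiling reported in the PR; larger payloads are rejected


def chunked(keys: Sequence[str], size: int = MAX_KEYS_PER_REQUEST) -> Iterator[list]:
    """Yield consecutive slices of at most `size` keys."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(keys), size):
        yield list(keys[start:start + size])


def fetch_sequentially(
    keys: Sequence[str],
    fetch_batch: Callable[[list], list],
) -> list:
    """Send one batch at a time, since parallel batches bring no wall-clock gain."""
    results: list = []
    for batch in chunked(keys):
        results.extend(fetch_batch(batch))
    return results
```

With a hypothetical list of 180 keys this produces four batches (50, 50, 50, 30), each sent only after the previous one has returned.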