Skip to content

Commit d857e20

Browse files
psjostromclaude
andauthored
refactor: move SyncOrchestrator off Dispatchers.Main onto a dedicated IO scope (#220)
* refactor: move SyncOrchestrator off Dispatchers.Main onto a dedicated IO scope Previously `SyncOrchestrator` inherited the foreground service's Main scope. Two problems with that: - A slow Room query or HTTP retry inside a `scope.launch` block runs on Main and blocks the UI. - Any exception escaping a network/DB catch hits Android's default Main-thread uncaught handler and kills the foreground service. The fix in #218 widened every leaf catch in the network clients, but the underlying architectural smell — sync work running on Main — remained. `SyncOrchestrator` now owns its own `CoroutineScope(SupervisorJob() + Dispatchers.IO + handler)`, mirroring the pattern in `NightscoutPusher` and `TidepoolUploader`. The `start()` signature loses its `scope` parameter; the service no longer leaks its Main scope into background work. `stop()` cancels the internal scope so a service shutdown propagates correctly. A `CoroutineExceptionHandler` is installed on the new scope as defense in depth — independent from the per-scope handlers in #218 (which protect StrimmaService/Pusher/Uploader scopes; this one protects the new sync scope). Test plan: - `./gradlew testDebugUnitTest` — full suite green - `./gradlew detekt` — clean - `./gradlew assembleRelease` — clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor: extract scopeCrashHandler factory + idempotence guard + tests Address review feedback for #220 (now that #218 is merged to main): - Extract `scopeCrashHandler(name)` factory in `receiver/`. Replaces 4 near-identical `CoroutineExceptionHandler { _, t -> DebugLog.log(...) }` blocks across StrimmaService, NightscoutPusher, TidepoolUploader, and SyncOrchestrator. One log format, one truncation length, one source. - Add idempotence guard to `SyncOrchestrator.start()` (mirrors UpdateChecker's `if (checkJob != null) return` pattern). Stray double-call no longer duplicates the periodic loops or stacks two DataStore collectors. `started` is `@VisibleForTesting`-exposed for the new test class. - Reorder `SyncOrchestrator.stop()` and add a comment explaining why `updateChecker.stop()` MUST be called explicitly (its internal `checkJob` field needs nulling — `scope.cancel()` cancels the launched coroutine but leaves the Job reference live, which breaks the next `start()`). - Trim redundant `treatmentSyncJob?.cancel()` / `exerciseSyncJob?.cancel()` in `stop()` — they're children of `scope` and get cancelled transitively. Keep the `= null` reassignments for clarity. - Add `@Volatile` to `var scope` and `var job` so a future cross-thread caller can't observe a torn write. - Rewrite KDoc to stop overselling — CalendarPoller still runs on Main scope. Comment the asymmetry at the call site in StrimmaService. - Update `@Suppress("LongParameterList")` rationale to acknowledge the dispatcher param is plumbing, not a collaborator. New `SyncOrchestratorTest` covers: idempotent start, scope-rebuild re-entry after stop, stop-without-start safety. Uses real collaborators across the graph; `SilentNightscoutClient` returns null/empty so launches exit cleanly without real HTTP. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 38725e0 commit d857e20

6 files changed

Lines changed: 268 additions & 39 deletions

File tree

app/src/main/java/com/psjostrom/strimma/network/NightscoutPusher.kt

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ import com.psjostrom.strimma.data.SettingsRepository
66
import com.psjostrom.strimma.di.IoDispatcher
77
import com.psjostrom.strimma.notification.AlertManager
88
import com.psjostrom.strimma.receiver.DebugLog
9+
import com.psjostrom.strimma.receiver.scopeCrashHandler
910
import kotlinx.coroutines.*
1011
import kotlinx.coroutines.flow.MutableStateFlow
1112
import kotlinx.coroutines.flow.StateFlow
@@ -29,12 +30,8 @@ class NightscoutPusher @Inject constructor(
2930
private const val PUSH_FAIL_ALERT_MS = 15 * 60 * 1000L // 15 minutes
3031
}
3132

32-
// Last-resort safety net for anything that escapes the leaf catches in NightscoutClient.
33-
// Without this, an uncaught throwable on the IO scope still kills the process via the
34-
// JVM default handler. The leaf catches are still the primary mechanism; this is the belt.
35-
private val crashHandler = CoroutineExceptionHandler { _, t ->
36-
DebugLog.log("Pusher scope uncaught: ${t.javaClass.simpleName}: ${t.message?.take(80)}")
37-
}
33+
// Belt for anything that escapes the leaf catches in NightscoutClient.
34+
private val crashHandler = scopeCrashHandler("Pusher")
3835
private var job = SupervisorJob()
3936
private var scope = CoroutineScope(job + dispatcher + crashHandler)
4037

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
package com.psjostrom.strimma.receiver
2+
3+
import kotlinx.coroutines.CoroutineExceptionHandler
4+
5+
private const val SCOPE_CRASH_LOG_LENGTH = 80
6+
7+
/**
8+
* Coroutine exception handler factory for service-owned scopes. Logs uncaught
9+
* throwables to [DebugLog] instead of letting them propagate to the JVM /
10+
* Android default uncaught handler — which on Main = process death and a
11+
* foreground-service restart loop.
12+
*
13+
* Use as a context element on every scope a service owns:
14+
* ```
15+
* CoroutineScope(SupervisorJob() + dispatcher + scopeCrashHandler("Service"))
16+
* ```
17+
*
18+
* The leaf catches in network / DB code remain the primary mechanism — this
19+
* is the belt against future code added without one.
20+
*/
21+
internal fun scopeCrashHandler(scopeName: String): CoroutineExceptionHandler =
22+
CoroutineExceptionHandler { _, t ->
23+
DebugLog.log("$scopeName scope uncaught: ${t.javaClass.simpleName}: ${t.message?.take(SCOPE_CRASH_LOG_LENGTH)}")
24+
}

app/src/main/java/com/psjostrom/strimma/service/StrimmaService.kt

Lines changed: 7 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@ import com.psjostrom.strimma.notification.NotificationHelper
3131
import com.psjostrom.strimma.receiver.DebugLog
3232
import com.psjostrom.strimma.receiver.GlucoseNotificationListener
3333
import com.psjostrom.strimma.receiver.XdripBroadcastReceiver
34+
import com.psjostrom.strimma.receiver.scopeCrashHandler
3435
import com.psjostrom.strimma.data.calendar.CalendarPoller
3536
import com.psjostrom.strimma.data.calendar.WorkoutEvent
3637
import com.psjostrom.strimma.data.workout.WorkoutMode
@@ -96,14 +97,9 @@ class StrimmaService : Service() {
9697
@Inject lateinit var syncOrchestrator: SyncOrchestrator
9798
@Inject lateinit var workoutModeManager: WorkoutModeManager
9899

99-
// Last-resort safety net: if anything escapes the leaf catches in network clients
100-
// (or future code added without one), log it instead of letting it propagate to the
101-
// Android default handler — which on Main = process death and a foreground-service
102-
// restart loop. The leaf catches are still the primary mechanism; this is the belt.
103-
private val crashHandler = CoroutineExceptionHandler { _, t ->
104-
DebugLog.log("Service scope uncaught: ${t.javaClass.simpleName}: ${t.message?.take(80)}")
105-
}
106-
private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Main + crashHandler)
100+
// Belt for anything that escapes the leaf catches in network clients (or future
101+
// code added without one). On Main an uncaught throwable is process death.
102+
private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Main + scopeCrashHandler("Service"))
107103
private var xdripReceiver: XdripBroadcastReceiver? = null
108104
private var followerJob: Job? = null
109105
private var lluFollowerJob: Job? = null
@@ -189,7 +185,9 @@ class StrimmaService : Service() {
189185

190186
scope.launch { updateNotification() }
191187
startStaleCheckLoop()
192-
syncOrchestrator.start(scope)
188+
syncOrchestrator.start()
189+
// CalendarPoller stays on Main scope — its blocking ContentResolver work
190+
// uses `withContext(IO)` internally, so the poll loop itself can stay on Main.
193191
calendarPollJob = calendarPoller.start(scope)
194192
observeCalendarForAlarms()
195193
observeWorkoutModeForNotificationRefresh()

app/src/main/java/com/psjostrom/strimma/service/SyncOrchestrator.kt

Lines changed: 60 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -4,16 +4,22 @@ import com.psjostrom.strimma.data.MS_PER_HOUR
44
import com.psjostrom.strimma.data.MS_PER_MINUTE
55
import com.psjostrom.strimma.data.ReadingDao
66
import com.psjostrom.strimma.data.SettingsRepository
7+
import androidx.annotation.VisibleForTesting
78
import com.psjostrom.strimma.data.health.ExerciseSyncer
9+
import com.psjostrom.strimma.di.IoDispatcher
810
import com.psjostrom.strimma.network.NightscoutPuller
911
import com.psjostrom.strimma.network.NightscoutPusher
1012
import com.psjostrom.strimma.network.TreatmentSyncer
1113
import com.psjostrom.strimma.receiver.DebugLog
14+
import com.psjostrom.strimma.receiver.scopeCrashHandler
1215
import com.psjostrom.strimma.tidepool.TidepoolUploader
1316
import com.psjostrom.strimma.update.UpdateChecker
1417
import com.psjostrom.strimma.webserver.LocalWebServer
18+
import kotlinx.coroutines.CoroutineDispatcher
1519
import kotlinx.coroutines.CoroutineScope
1620
import kotlinx.coroutines.Job
21+
import kotlinx.coroutines.SupervisorJob
22+
import kotlinx.coroutines.cancel
1723
import kotlinx.coroutines.delay
1824
import kotlinx.coroutines.flow.combine
1925
import kotlinx.coroutines.isActive
@@ -24,8 +30,12 @@ import javax.inject.Singleton
2430
/**
2531
* Manages periodic background jobs: push/upload retry, data pruning, treatment sync,
2632
* web server lifecycle, exercise sync, update checker, and initial push/pull.
33+
*
34+
* Owns its own IO-backed scope so the orchestrated jobs run off the main thread.
35+
* `CalendarPoller` is owned by `StrimmaService` and stays on Main — its blocking
36+
* work uses `withContext(IO)` internally — and so is not covered by this scope.
2737
*/
28-
@Suppress("LongParameterList") // DI constructor — each dependency is a distinct background concern
38+
@Suppress("LongParameterList") // 9 distinct background collaborators + 1 dispatcher for scope ownership
2939
@Singleton
3040
class SyncOrchestrator @Inject constructor(
3141
private val pusher: NightscoutPusher,
@@ -36,7 +46,8 @@ class SyncOrchestrator @Inject constructor(
3646
private val localWebServer: LocalWebServer,
3747
private val exerciseSyncer: ExerciseSyncer,
3848
private val nightscoutPuller: NightscoutPuller,
39-
private val updateChecker: UpdateChecker
49+
private val updateChecker: UpdateChecker,
50+
@IoDispatcher private val dispatcher: CoroutineDispatcher,
4051
) {
4152
companion object {
4253
private const val RETRY_INTERVAL_MINUTES = 5
@@ -45,14 +56,34 @@ class SyncOrchestrator @Inject constructor(
4556
private const val HOURS_PER_DAY = 24
4657
}
4758

59+
// Belt for anything that escapes the network/DB catches in this scope's launches.
60+
private val crashHandler = scopeCrashHandler("Sync")
61+
62+
// @Volatile because `start()` and `stop()` are invoked from the service
63+
// lifecycle (today: Main thread only), but the scope reference is read from
64+
// inside coroutines running on the IO dispatcher. Adds a memory barrier so
65+
// a future caller from another thread can't observe a torn write.
66+
@Volatile private var job = SupervisorJob()
67+
@Volatile private var scope = CoroutineScope(job + dispatcher + crashHandler)
68+
4869
private var treatmentSyncJob: Job? = null
4970
private var exerciseSyncJob: Job? = null
5071

51-
fun start(scope: CoroutineScope) {
52-
startRetryLoop(scope)
53-
startPruneLoop(scope)
54-
startTreatmentSyncLifecycle(scope)
55-
startWebServerLifecycle(scope)
72+
@VisibleForTesting
73+
internal var started: Boolean = false
74+
private set
75+
76+
fun start() {
77+
// Idempotent: a stray double-call must not duplicate the periodic loops
78+
// or stack two DataStore collectors. Mirrors UpdateChecker.start()'s
79+
// `if (checkJob != null) return` guard.
80+
if (started) return
81+
started = true
82+
83+
startRetryLoop()
84+
startPruneLoop()
85+
startTreatmentSyncLifecycle()
86+
startWebServerLifecycle()
5687
exerciseSyncJob = exerciseSyncer.start(scope)
5788
updateChecker.start(scope)
5889

@@ -62,17 +93,29 @@ class SyncOrchestrator @Inject constructor(
6293
}
6394

6495
fun stop() {
65-
treatmentSyncJob?.cancel()
66-
treatmentSyncJob = null
67-
exerciseSyncJob?.cancel()
68-
exerciseSyncJob = null
69-
localWebServer.stop()
96+
started = false
97+
98+
// updateChecker.stop() MUST be called explicitly — `scope.cancel()` cancels
99+
// the launched coroutine but leaves UpdateChecker's internal `checkJob`
100+
// reference live (just cancelled). The next `start()` would then early-return
101+
// on its `if (checkJob != null) return` guard.
102+
updateChecker.stop()
70103
pusher.stop()
71104
tidepoolUploader.stop()
72-
updateChecker.stop()
105+
localWebServer.stop()
106+
107+
// `treatmentSyncJob` and `exerciseSyncJob` are children of `scope` —
108+
// `scope.cancel()` below cancels them transitively. Just null the
109+
// references so a re-start() doesn't observe stale Jobs.
110+
treatmentSyncJob = null
111+
exerciseSyncJob = null
112+
113+
scope.cancel()
114+
job = SupervisorJob()
115+
scope = CoroutineScope(job + dispatcher + crashHandler)
73116
}
74117

75-
private fun startRetryLoop(scope: CoroutineScope) {
118+
private fun startRetryLoop() {
76119
scope.launch {
77120
while (isActive) {
78121
delay(RETRY_INTERVAL_MINUTES * MS_PER_MINUTE)
@@ -82,7 +125,7 @@ class SyncOrchestrator @Inject constructor(
82125
}
83126
}
84127

85-
private fun startPruneLoop(scope: CoroutineScope) {
128+
private fun startPruneLoop() {
86129
scope.launch {
87130
while (isActive) {
88131
val thirtyDaysAgo = System.currentTimeMillis() - PRUNE_RETENTION_DAYS * HOURS_PER_DAY * MS_PER_HOUR
@@ -92,7 +135,7 @@ class SyncOrchestrator @Inject constructor(
92135
}
93136
}
94137

95-
private fun startTreatmentSyncLifecycle(scope: CoroutineScope) {
138+
private fun startTreatmentSyncLifecycle() {
96139
scope.launch {
97140
combine(
98141
settings.treatmentsSyncEnabled,
@@ -119,7 +162,7 @@ class SyncOrchestrator @Inject constructor(
119162
}
120163
}
121164

122-
private fun startWebServerLifecycle(scope: CoroutineScope) {
165+
private fun startWebServerLifecycle() {
123166
scope.launch {
124167
settings.webServerEnabled.collect { enabled ->
125168
if (enabled) {

app/src/main/java/com/psjostrom/strimma/tidepool/TidepoolUploader.kt

Lines changed: 4 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -9,8 +9,8 @@ import androidx.annotation.VisibleForTesting
99
import com.psjostrom.strimma.data.ReadingDao
1010
import com.psjostrom.strimma.data.SettingsRepository
1111
import com.psjostrom.strimma.receiver.DebugLog
12+
import com.psjostrom.strimma.receiver.scopeCrashHandler
1213
import dagger.hilt.android.qualifiers.ApplicationContext
13-
import kotlinx.coroutines.CoroutineExceptionHandler
1414
import kotlinx.coroutines.CoroutineScope
1515
import kotlinx.coroutines.Dispatchers
1616
import kotlinx.coroutines.NonCancellable
@@ -67,12 +67,9 @@ class TidepoolUploader @Inject constructor(
6767
}
6868
}
6969

70-
// Last-resort safety net for anything that escapes the leaf catches in TidepoolClient
71-
// and the executeUpload outer catch. The leaf catches are still the primary mechanism;
72-
// this is the belt against future code added without one.
73-
private val crashHandler = CoroutineExceptionHandler { _, t ->
74-
DebugLog.log("Uploader scope uncaught: ${t.javaClass.simpleName}: ${t.message?.take(MAX_LOG_ERROR_LENGTH)}")
75-
}
70+
// Belt for anything that escapes the leaf catches in TidepoolClient and the
71+
// executeUpload outer catch.
72+
private val crashHandler = scopeCrashHandler("Uploader")
7673
private var job = SupervisorJob()
7774
private var scope = CoroutineScope(job + Dispatchers.IO + crashHandler)
7875
private val uploadMutex = Mutex()

0 commit comments

Comments
 (0)