Skip to content

Commit 79d885f

Browse files
authored
Remove ThreadLocal from ThreadLocalMap when finishing UndispatchedCor… (#3593)
* Remove ThreadLocal from ThreadLocalMap when finishing UndispatchedCoroutine * It addresses the problem with ThreadLocalMap.entries that may outlive the coroutine lifecycle and interfere with CPU consumption of other thread-locals on the same thread * No test provided as this is a non-functioanl change. The only reasonable way to check it is to reflectively walk over Thread class which is prohibited by Java since 11+. The only way is to eyeball Thread.currentThread().threadLocals size in debugger in the properly crafted unit test * Do not touch thread-locals if they were never set in UndispatchedCoroutine Fixes #3592
1 parent 6a6e62d commit 79d885f

File tree

1 file changed

+47
-10
lines changed

1 file changed

+47
-10
lines changed

kotlinx-coroutines-core/jvm/src/CoroutineContext.kt

+47-10
Original file line numberDiff line numberDiff line change
@@ -167,9 +167,11 @@ internal actual class UndispatchedCoroutine<in T>actual constructor (
167167
uCont: Continuation<T>
168168
) : ScopeCoroutine<T>(if (context[UndispatchedMarker] == null) context + UndispatchedMarker else context, uCont) {
169169

170-
/*
171-
* The state is thread-local because this coroutine can be used concurrently.
172-
* Scenario of usage (withContinuationContext):
170+
/**
171+
* The state of [ThreadContextElement]s associated with the current undispatched coroutine.
172+
* It is stored in a thread local because this coroutine can be used concurrently in suspend-resume race scenario.
173+
* See the followin, boiled down example with inlined `withContinuationContext` body:
174+
* ```
173175
* val state = saveThreadContext(ctx)
174176
* try {
175177
* invokeSmthWithThisCoroutineAsCompletion() // Completion implies that 'afterResume' will be called
@@ -178,8 +180,40 @@ internal actual class UndispatchedCoroutine<in T>actual constructor (
178180
* thisCoroutine().clearThreadContext() // Concurrently the "smth" could've been already resumed on a different thread
179181
* // and it also calls saveThreadContext and clearThreadContext
180182
* }
183+
* ```
184+
*
185+
* Usage note:
186+
*
187+
* This part of the code is performance-sensitive.
188+
* It is a well-established pattern to wrap various activities into system-specific undispatched
189+
* `withContext` for the sake of logging, MDC, tracing etc., meaning that there exists thousands of
190+
* undispatched coroutines.
191+
* Each access to Java's [ThreadLocal] leaves a footprint in the corresponding Thread's `ThreadLocalMap`
192+
* that is cleared automatically as soon as the associated thread-local (-> UndispatchedCoroutine) is garbage collected.
193+
* When such coroutines are promoted to old generation, `ThreadLocalMap`s become bloated and an arbitrary accesses to thread locals
194+
* start to consume significant amount of CPU because these maps are open-addressed and cleaned up incrementally on each access.
195+
* (You can read more about this effect as "GC nepotism").
196+
*
197+
* To avoid that, we attempt to narrow down the lifetime of this thread local as much as possible:
198+
* * It's never accessed when we are sure there are no thread context elements
199+
* * It's cleaned up via [ThreadLocal.remove] as soon as the coroutine is suspended or finished.
200+
*/
201+
private val threadStateToRecover = ThreadLocal<Pair<CoroutineContext, Any?>>()
202+
203+
/*
204+
* Indicates that a coroutine has at least one thread context element associated with it
205+
* and that 'threadStateToRecover' is going to be set in case of dispatchhing in order to preserve them.
206+
* Better than nullable thread-local for easier debugging.
207+
*
208+
* It is used as a performance optimization to avoid 'threadStateToRecover' initialization
209+
* (note: tl.get() initializes thread local),
210+
* and is prone to false-positives as it is never reset: otherwise
211+
* it may lead to logical data races between suspensions point where
212+
* coroutine is yet being suspended in one thread while already being resumed
213+
* in another.
181214
*/
182-
private var threadStateToRecover = ThreadLocal<Pair<CoroutineContext, Any?>>()
215+
@Volatile
216+
private var threadLocalIsSet = false
183217

184218
init {
185219
/*
@@ -213,19 +247,22 @@ internal actual class UndispatchedCoroutine<in T>actual constructor (
213247
}
214248

215249
fun saveThreadContext(context: CoroutineContext, oldValue: Any?) {
250+
threadLocalIsSet = true // Specify that thread-local is touched at all
216251
threadStateToRecover.set(context to oldValue)
217252
}
218253

219254
fun clearThreadContext(): Boolean {
220-
if (threadStateToRecover.get() == null) return false
221-
threadStateToRecover.set(null)
222-
return true
255+
return !(threadLocalIsSet && threadStateToRecover.get() == null).also {
256+
threadStateToRecover.remove()
257+
}
223258
}
224259

225260
override fun afterResume(state: Any?) {
226-
threadStateToRecover.get()?.let { (ctx, value) ->
227-
restoreThreadContext(ctx, value)
228-
threadStateToRecover.set(null)
261+
if (threadLocalIsSet) {
262+
threadStateToRecover.get()?.let { (ctx, value) ->
263+
restoreThreadContext(ctx, value)
264+
}
265+
threadStateToRecover.remove()
229266
}
230267
// resume undispatched -- update context but stay on the same dispatcher
231268
val result = recoverResult(state, uCont)

0 commit comments

Comments
 (0)