Skip to content

Data race on BufferedChannelIterator.continuation #3834

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
chaoren opened this issue Aug 1, 2023 · 2 comments
Closed

Data race on BufferedChannelIterator.continuation #3834

chaoren opened this issue Aug 1, 2023 · 2 comments
Assignees
Labels

Comments

@chaoren
Copy link
Contributor

chaoren commented Aug 1, 2023

Our internal TSan tests found the following data race. I patched a fix locally to put @Volatile on BufferedChannelIterator.continuation, but I'm not sure if that's appropriate to upstream.

Details
==================
WARNING: ThreadSanitizer: data race (pid=9770)
  Write of size 4 at 0x0000cfe3a350 by thread T78:
    #0 kotlinx.coroutines.channels.BufferedChannel$BufferedChannelIterator.tryResumeHasNextOnClosedChannel()V BufferedChannel.kt:1697 
    #1 kotlinx.coroutines.channels.BufferedChannel.resumeWaiterOnClosedChannel(Lkotlinx/coroutines/Waiter;Z)V BufferedChannel.kt:2174 
    #2 kotlinx.coroutines.channels.BufferedChannel.resumeReceiverOnClosedChannel(Lkotlinx/coroutines/Waiter;)V BufferedChannel.kt:2161 
    #3 kotlinx.coroutines.channels.BufferedChannel.cancelSuspendedReceiveRequests(Lkotlinx/coroutines/channels/ChannelSegment;J)V BufferedChannel.kt:2154 
    #4 kotlinx.coroutines.channels.BufferedChannel.completeClose(J)Lkotlinx/coroutines/channels/ChannelSegment; BufferedChannel.kt:1931 
    #5 kotlinx.coroutines.channels.BufferedChannel.isClosed(JZ)Z BufferedChannel.kt:2210 
    #6 kotlinx.coroutines.channels.BufferedChannel.isClosedForSend0(J)Z BufferedChannel.kt:2185 
    #7 kotlinx.coroutines.channels.BufferedChannel.isClosedForSend()Z BufferedChannel.kt:2182 
    #8 kotlinx.coroutines.channels.BufferedChannel.completeCloseOrCancel()V BufferedChannel.kt:1903 
    #9 kotlinx.coroutines.channels.BufferedChannel.closeOrCancelImpl(Ljava/lang/Throwable;Z)Z BufferedChannel.kt:1796 
    #10 kotlinx.coroutines.channels.BufferedChannel.close(Ljava/lang/Throwable;)Z BufferedChannel.kt:1755 
    #11 kotlinx.coroutines.channels.SendChannel$DefaultImpls.close$default(Lkotlinx/coroutines/channels/SendChannel;Ljava/lang/Throwable;ILjava/lang/Object;)Z Channel.kt:98 
    #12 kotlinx.coroutines.channels.ProducerCoroutine.onCompleted(Lkotlin/Unit;)V Produce.kt:143 
    #13 kotlinx.coroutines.channels.ProducerCoroutine.onCompleted(Ljava/lang/Object;)V Produce.kt:136 
    #14 kotlinx.coroutines.AbstractCoroutine.onCompletionInternal(Ljava/lang/Object;)V AbstractCoroutine.kt:93 
    #15 kotlinx.coroutines.JobSupport.tryFinalizeSimpleState(Lkotlinx/coroutines/Incomplete;Ljava/lang/Object;)Z JobSupport.kt:296 
    #16 kotlinx.coroutines.JobSupport.tryMakeCompleting(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; JobSupport.kt:860 
    #17 kotlinx.coroutines.JobSupport.makeCompletingOnce$third_party_kotlin_kotlinx_coroutines_kotlinx_coroutines_jvm(Ljava/lang/Object;)Ljava/lang/Object; JobSupport.kt:832 
    #18 kotlinx.coroutines.AbstractCoroutine.resumeWith(Ljava/lang/Object;)V AbstractCoroutine.kt:100 
    #19 kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(Ljava/lang/Object;)V ContinuationImpl.kt:46 
    #20 kotlinx.coroutines.DispatchedTask.run()V DispatchedTask.kt:108 
    #21 kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(Lkotlinx/coroutines/scheduling/Task;)V CoroutineScheduler.kt:584 
    #22 kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(Lkotlinx/coroutines/scheduling/Task;)V CoroutineScheduler.kt:793 
    #23 kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker()V CoroutineScheduler.kt:697 
    #24 kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run()V CoroutineScheduler.kt:684 
    #25 (Generated Stub) <null> 

  Previous read of size 4 at 0x0000cfe3a350 by thread T4:
    #0 kotlinx.coroutines.channels.BufferedChannel$BufferedChannelIterator.invokeOnCancellation(Lkotlinx/coroutines/internal/Segment;I)V BufferedChannel.kt:1647 
    #1 kotlinx.coroutines.channels.BufferedChannel.prepareReceiverForSuspension(Lkotlinx/coroutines/Waiter;Lkotlinx/coroutines/channels/ChannelSegment;I)V BufferedChannel.kt:718 
    #2 kotlinx.coroutines.channels.BufferedChannel.access$prepareReceiverForSuspension(Lkotlinx/coroutines/channels/BufferedChannel;Lkotlinx/coroutines/Waiter;Lkotlinx/coroutines/channels/ChannelSegment;I)V BufferedChannel.kt:37 
    #3 kotlinx.coroutines.channels.BufferedChannel$BufferedChannelIterator.hasNextOnNoWaiterSuspend(Lkotlinx/coroutines/channels/ChannelSegment;IJLkotlin/coroutines/Continuation;)Ljava/lang/Object; BufferedChannel.kt:3116 
    #4 kotlinx.coroutines.channels.BufferedChannel$BufferedChannelIterator.hasNext(Lkotlin/coroutines/Continuation;)Ljava/lang/Object; BufferedChannel.kt:1610 
    #5 kotlinx.coroutines.channels.ChannelsKt__Channels_commonKt.toList(Lkotlinx/coroutines/channels/ReceiveChannel;Lkotlin/coroutines/Continuation;)Ljava/lang/Object; Channels.common.kt:160 
    #6 kotlinx.coroutines.channels.ChannelsKt.toList(Lkotlinx/coroutines/channels/ReceiveChannel;Lkotlin/coroutines/Continuation;)Ljava/lang/Object; ??:1 
    #7 kotlinx.coroutines.channels.ChannelsKt__Channels_commonKt$toList$1.invokeSuspend(Ljava/lang/Object;)Ljava/lang/Object; Channels.common.kt 
    #8 kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(Ljava/lang/Object;)V ContinuationImpl.kt:33 
    #9 kotlinx.coroutines.DispatchedTask.run()V DispatchedTask.kt:108 
    #10 kotlinx.coroutines.EventLoopImplBase.processNextEvent()J EventLoop.common.kt:280 
    #11 kotlinx.coroutines.BlockingCoroutine.joinBlocking()Ljava/lang/Object; Builders.kt:85 
    #12 kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Lkotlin/coroutines/CoroutineContext;Lkotlin/jvm/functions/Function2;)Ljava/lang/Object; Builders.kt:59 
    #13 kotlinx.coroutines.BuildersKt.runBlocking(Lkotlin/coroutines/CoroutineContext;Lkotlin/jvm/functions/Function2;)Ljava/lang/Object; ??:1 
    #14 kotlinx.coroutines.TestBase.runTest(Lkotlin/jvm/functions/Function1;Ljava/util/List;Lkotlin/jvm/functions/Function2;)V TestBase.kt:224 
    #15 kotlinx.coroutines.TestBase.runTest$default(Lkotlinx/coroutines/TestBase;Lkotlin/jvm/functions/Function1;Ljava/util/List;Lkotlin/jvm/functions/Function2;ILjava/lang/Object;)V TestBase.kt:216 
    #16 kotlinx.coroutines.channels.TickerChannelCommonTest.testComplexOperator()V TickerChannelCommonTest.kt:97 
@chaoren chaoren added the bug label Aug 1, 2023
@ndkoval ndkoval self-assigned this Aug 2, 2023
@qwwdfsad
Copy link
Collaborator

qwwdfsad commented Aug 4, 2023

The data race is benign on the JVM and for the buffered channel:

T1:

  1. reads from a channel via an iterator, invokes hasNext
  2. parks the iterator into an internal queue (receiveImplOnNoWaiter call)
  3. sets up a cancellation handler ("remove the current cont from segment's waiters"), reading the continuation field if it's not null

T2:

  1. Starts closing sequence
  2. Finds T1's iterator in the receiver's queue
  3. Invokes iterator's cleanup sequence that implies setting continuation to null and invoking resumeWith(false), right between T1#2 and T1#3

The race is benign because it is allowed an expected to see null in T1#3 step -- the only scenario when it's possible is when close() already cleaned up all the segments from waiters and resumed the waiters, rendering the T1#3 step useless.

The race is not suppressable though, because this very code is also executable on K/N where there is no [yet] bening races; We either have to have a clear guidance ("it's allowed to do that, there is no UB/OoTA") or just put @Volatile on it

@qwwdfsad
Copy link
Collaborator

@chareon this is a benign race, feel free to safely mute it.

#3837 explains it and #3873 marks it explicitly

@qwwdfsad qwwdfsad removed the bug label Aug 31, 2023
igoriakovlev pushed a commit that referenced this issue Nov 2, 2023
* Mark BufferedChannelIterator.continuation as @BeningDataRace to address potential UB and OoTA on K/N
* Explain benign data-race on SelectImplementation.clauses and mark it with @BenignDataRace
* Explain benign data-race on SelectImplementation.internalResult and mark it with @BenignDataRace


Fixes #3834
Fixes #3843
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants