Skip to content

runBlocking can cause other coroutines to be dispatched on the calling Thread which should be blocked? #3204

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
steve-the-edwards opened this issue Feb 24, 2022 · 18 comments
Labels

Comments

@steve-the-edwards
Copy link
Contributor

The kdoc for runBlocking claims it blocks the thread from which it is called (interuptibly). But I have observed behaviour where other coroutines waiting for dispatch on that thread (e.g. on the main thread) will be dispatched by the event loop because runBlocking falls back to using the calling thread's eventLoop even when a specific other dispatcher (which may be busy) is specified for that which to run the coroutine launched by runBlocking on. Why is this and is this a bug? If not what is the explanation for this choice of behaviour?

E.g. This code can reproduce the issue:

class RunBlockingTester {
​
  val handlerThread: HandlerThread =
    HandlerThread("MyTestHandlerThread", Process.THREAD_PRIORITY_BACKGROUND)
​
  val executor: Executor = object : Executor {
​
    var handler: Handler? = null
​
    override fun execute(runnable: Runnable) {
      if (handler == null) {
        handler = Handler(handlerThread.looper)
      }
      handler!!.post(runnable)
    }
  }
​
  val testThreadDispatcher = executor.asCoroutineDispatcher()
​
  @Before fun setup() {
    handlerThread.start()
  }
​
  @Test fun orderingOfExecution(): Unit {
    runBlocking {
      println("A. Start ${Thread.currentThread()}")
      launch {
        println("B. Start main thread task. ${Thread.currentThread()}")
        yield()
        println("C. Finish main thread task (was a queued task!). ${Thread.currentThread()}")
      }
      println("D. Before 2nd runBlocking. ${Thread.currentThread()}")
      runBlocking(testThreadDispatcher) {
        println("E. This thread can't get going anyway. ${Thread.currentThread()}")
      }
      println("F. After 2nd runBlocking. ${Thread.currentThread()}")
    }
  }
}

This prints: A, D, B ,C then hangs. I would have thought the coroutine for B and C would never have gotten dispatched but runBlocking picks up the calling threads eventLoop to processNextEvent while waiting?? It should be blocked though?

I realize runBlocking has a note about not using it from within a coroutine - but I'm not sure it's possible to know that as you could be running on any coroutine but not within a suspend function and you wouldn't know your execution context, as any coroutine could call any non-suspend function.

Can we clarify the documentation here or enforce the constraints?

runBlocking has useful utility when calling coroutine based APIs from a blocking API.

@steve-the-edwards
Copy link
Contributor Author

Discussion in kotlinlang slack here: https://kotlinlang.slack.com/archives/C1CFAFJSK/p1645717325416749

@steve-the-edwards
Copy link
Contributor Author

steve-the-edwards commented Feb 25, 2022

Another way of asking this question with the underlying code would be to say why does runBlocking pull out the EventLoop of the ThreadLocal?

   if (contextInterceptor == null) {
        // create or use private event loop if no dispatcher is specified
        eventLoop = ThreadLocalEventLoop.eventLoop
        newContext = GlobalScope.newCoroutineContext(context + eventLoop)
    } else {
        // See if context's interceptor is an event loop that we shall use (to support TestContext)
        // or take an existing thread-local event loop if present to avoid blocking it (but don't create one)
        eventLoop = (contextInterceptor as? EventLoop)?.takeIf { it.shouldBeProcessedFromContext() }
            ?: ThreadLocalEventLoop.currentOrNull()
        newContext = GlobalScope.newCoroutineContext(context)
    }

Since it processes the next event on that EventLoop while blocking if it is available we by definition won't be blocking the ThreadLocal thread, which it claims it is.

                while (true) {
                    @Suppress("DEPRECATION")
                    if (Thread.interrupted()) throw InterruptedException().also { cancelCoroutine(it) }
                    val parkNanos = eventLoop?.processNextEvent() ?: Long.MAX_VALUE
                    // note: process next even may loose unpark flag, so check if completed before parking
                    if (isCompleted) break
                    parkNanos(this, parkNanos)
                }

My understanding would have been that we would only processNextEvent on the EventLoop if was passed in as the contextInterceptor and shouldBeProcessedFromContext(). That way we don't block that Dispatcher/EventLoop but we do block the ThreadLocal thread, which is what the context specifies we should do.

CC @elizarov are just supposed to consider runBlocking danger code? Because the API is not marked as such.

@steve-the-edwards
Copy link
Contributor Author

Another, simpler way to demonstrate the unexpected behaviour:

   runBlocking {
      println("A. Start ${Thread.currentThread()}")
      launch {
        println("B. Start scheduled coroutine. ${Thread.currentThread()}")
        yield()
        println("C. Finish scheduled coroutine. ${Thread.currentThread()}")
      }
      println("D. Before 2nd runBlocking. ${Thread.currentThread()}")
      runBlocking {
        println("E. Inside runBlocking. ${Thread.currentThread()}")
      }
      println("F. After 2nd runBlocking. ${Thread.currentThread()}")
    }

Prints:

A. Start Thread[Test worker @coroutine#1,5,main]
D. Before 2nd runBlocking. Thread[Test worker @coroutine#1,5,main]
B. Start scheduled coroutine. Thread[Test worker @coroutine#2,5,main]
E. Inside runBlocking. Thread[Test worker @coroutine#3,5,main]
F. After 2nd runBlocking. Thread[Test worker @coroutine#1,5,main]
C. Finish scheduled coroutine. Thread[Test worker @coroutine#2,5,main]

No expectation here that B should be dispatched before E.

@steve-the-edwards
Copy link
Contributor Author

And a simpler reproduction of the original case:

val testCoroutineDispatcher = TestCoroutineDispatcher().apply{
      pauseDispatcher()
    }
    runBlocking {
      println("A. Start ${Thread.currentThread()}")
      launch {
        println("B. Start scheduled coroutine. ${Thread.currentThread()}")
        yield()
        println("C. Finish scheduled coroutine. ${Thread.currentThread()}")
      }
      println("D. Before 2nd runBlocking. ${Thread.currentThread()}")
      runBlocking(testCoroutineDispatcher) {
        println("E. Inside runBlocking. ${Thread.currentThread()}")
      }
      println("F. After 2nd runBlocking. ${Thread.currentThread()}")
    }

Still dispatches B and C even though it should be blocked after D.

@steve-the-edwards
Copy link
Contributor Author

Thinking about this more ....

runBlocking is supposed to block the calling thread but to do so it tries to use the EventLoop of the calling thread to run other coroutines queued for dispatch on the thread.

Presumably it needs to use the EventLoop of the dispatcher it runs on to keep processing events - any coroutines waiting for dispatch - within the scope of the runBlocking lambda.

But that this would include other coroutines on the same thread (i.e. that we would pull in the ThreadLocal EventLoop) is unexpected behavior; at least to me!

If you are blocking the thread why run other coroutines waiting to run on that thread that are not within what you specified to be run in runBlocking?

Shouldn’t we have a sub-scoped EventLoop strictly for anything launch within runBlocking?

@qwwdfsad am I understanding this correctly?

@qwwdfsad
Copy link
Collaborator

qwwdfsad commented Mar 2, 2022

runBlocking blocks the thread, but while doing so, it progress all the coroutines that have been dispatched to this runBlocking loop. Otherwise, with structured concurrency, any non-trivial use of runBlocking would caused a deadlock that is, apart from being just a regular source of bugs, will also be hard to track.

This is not the bug, this is our intentional design and it's reflected in our documentation (though it probably might be improved):

The default CoroutineDispatcher for this builder is an internal implementation of event loop that processes continuations in this blocked thread until the completion of this coroutine

@steve-the-edwards
Copy link
Contributor Author

it progress all the coroutines that have been dispatched to this runBlocking loop.

Yes this makes sense to me and is necessary - my issue is that it also progresses other coroutines dispatched outside of the runBlocking loop. Why not create a separate EventLoop just for coroutines within this runBlocking? Is that impossible because they could still deadlock?

@qwwdfsad
Copy link
Collaborator

qwwdfsad commented Mar 3, 2022

Is that impossible because they could still deadlock?

Indeed. We've added it as part of #860

The core problem with runBlocking that works in a non-cooperative way (e.g. does not have all event-loops linked with each other) is that it is non-composable.

Meaning that when you have more than one library that use runBlocking and call each other, this will become a huge issue with potential deadlocks that cannot be workarounded apart from fixing the library. Actually, the change was driven by such cases and we decided that overall composability over all the ecosystem is much more important here

@arberg
Copy link

arberg commented Mar 3, 2022

@steve-the-edwards :

Yes this makes sense to me and is necessary - my issue is that it also progresses other coroutines dispatched outside of the runBlocking loop. Why not create a separate EventLoop just for coroutines within this runBlocking? Is that impossible because they could still deadlock?

Actually I was expected runBlocking { } if suspended would execute anything on the same dispatcher, I was expecting pretty much the only guarantee to be that runBlocking would not return, until its inner coroutine completed.

I was wrong.

It's as @qwwdfsad said:

runBlocking blocks the thread, but while doing so, it progress all the coroutines that have been dispatched to this runBlocking loop.

With my emphasis on THIS runBlocking loop.

So in this example

CoroutineScope(Dispatchers.Main).launch {
    println("MainCoordinator Outer - A. ${Thread.currentThread()}")
    delay(100)
    println("MainCoordinator Outer - B. ${Thread.currentThread()}")
}
// here we are on Main-thread
runBlocking {
    println("MainCoordinator C. Start ${Thread.currentThread()}")
    launch {
        println("MainCoordinator  D. Start main thread task. ${Thread.currentThread()}")
    }
    delay(10000)
    println("MainCoordinator  F. After 2nd runBlocking. ${Thread.currentThread()}")
}

The first coroutine with Outer A+B does not run until runBlocking returns and C+D+F has been printed. This surprised me.

Can someone confirm this is correct (and thus that my interpretation of @qwwdfsad is correct). I know that's what the code does, but would like confirmation that is the intended behavior.

@steve-the-edwards
Copy link
Contributor Author

Meaning that when you have more than one library that use runBlocking and call each other, this will become a huge issue with potential deadlocks that cannot be workarounded apart from fixing the library. Actually, the change was driven by such cases and we decided that overall composability over all the ecosystem is much more important here

Thanks @qwwdfsad that helps me understand and makes a lot of sense.

So how would you suggest calling a suspending API in a blocking way that wouldn't co-operate with the outer event loop? Example use case: Your process is crashing and you want one suspend API to be called but not to dispatch any other coroutines on the main thread.

@steve-the-edwards
Copy link
Contributor Author

Can someone confirm this is correct (and thus that my interpretation of @qwwdfsad is correct). I know that's what the code does, but would like confirmation that is the intended behavior.

@arberg I don't think this is true - this is exactly what we are talking about. (This won't work in a unit test unless you Dispatchers.setMain so that might be giving you bad results, but if you do, then the first scope can be dispatched for A. to run until it yields.

@arberg
Copy link

arberg commented Mar 4, 2022

@steve-the-edwards
I tested in in main thread in an android app, not in a Unit-test. I know I'm right about how it behaved. I do not know if it is intended behavior or a bug.

What I'm talking about is a slightly different variant of the initial question, though it was unclear to me whether that was part of your question.

Actually I'm not sure what you mean by first scope can be dispatched for A. Yes, first scope could run first, and would if using Main.immediate, or start=UNDISPATCHED, but (apparently?) won't as I wrote it.

@pacher
Copy link

pacher commented Mar 4, 2022

@arberg At least I understand it this way and think it is correct.

We are on the main thread launching a coroutine to be dispatched later on the same main thread and before this coroutine gets a chance to be dispatched we are blocking said main thread with runBlocking.

If we replace runBlocking with a call to a normal function doing the printing the result will be the same. Why should runBlocking be different and somehow be aware of what we are doing outside of it?

@arberg
Copy link

arberg commented Mar 9, 2022

@pacher
The way I understand it jobs are like a tree. runBlocking is one such job-tree, and coroutines dispatched outside the runBlocking are obviously not children of the runBlocking-node. According to my tests, and my understanding of what @qwwdfsad said, runBlocking blocks the thread so that only child-jobs of the runBlocking node can run.

@qwwdfsad: runBlocking blocks the thread, but while doing so, it progress all the coroutines that have been dispatched to this runBlocking loop.

@pacher We are on the main thread launching a coroutine to be dispatched later on the same main thread and before this coroutine gets a chance to be dispatched we are blocking said main thread with runBlocking.

I agree that is what happens.

@pacher If we replace runBlocking with a call to a normal function doing the printing the result will be the same. Why should runBlocking be different and somehow be aware of what we are doing outside of it?

Yes I agree that is a reasonable way to look at it. The difference though being that runBlocking is actually a coroutine that runs other coroutines and thus allows running any 'other' (child) coroutines. That includes everything that is a coroutine such as executing sub-coroutines on other threads (such as withContext), so I thought it would also allow other coroutines dispatched on same thread to run. But it won't, and I agree with you it makes total sense in that its 'Blocking'. I really shouldn't have been that surprised... but I was because I thought of it as a coroutine loop-handler, which it is, it just only handles it own children.

@steve-the-edwards
Copy link
Contributor Author

@arberg no that is wrong and that is exactly what my example is showing.

when runBlocking suspends other coroutines launched outside of its 'job tree' on that same thread can be dispatched. This is @qwwdfsad's point - runBlocking absorbs the outer event loop in order to prevent deadlock.

@arberg
Copy link

arberg commented Mar 13, 2022

@steve-the-edwards I suggest you try the code I included, and wonder why the Outer - A does not get executed, if it doesn't run before runBlocking starts.

@qwwdfsad
Copy link
Collaborator

So how would you suggest calling a suspending API in a blocking way that wouldn't co-operate with the outer event loop?

No idea honestly, never give it a proper thought. The name should clearly reflect that is not intended to be used on libraries (note: the util module in a large enough code bases is also considered to be a library) and that's deadlock-prone (here, specific OptIn annotation can be used).

@faucct
Copy link

faucct commented Mar 9, 2023

What about using CoroutineStart.UNDISPATCHED inside runBlocking if there is no CoroutineDispatcher specified, so the blocks without suspensions would be done before giving chances to external tasks?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants