Exception handling with structured concurrency (rx & async) #691

viorgu · 2018-10-09T14:56:42Z

Hi, I've been playing around with the new structured concurrency model and I noticed some unexpected behavior in how exceptions are handled for some coroutine builders.

rx2 builders

@Test
fun rxError() = runBlocking {
    val completable = rxCompletable {
        error("error in rx builder")
    }
    completable.onErrorComplete().await()
}

In this case I would expect the exception that occurred in the rxCompletable to be passed to rx's onError and be handled by onErrorComplete() but at the moment the exception is thrown and the current scope is canceled

async builder

@Test
fun asyncError() = runBlocking {
    val deferred = async {
        error("error in async")
    }

    try {
        deferred.await()
    } catch (e: Throwable) {
        println("Exception: ${e.message}")
    }
    println("Complete")
}

Here I would have expected the exception thrown in the async block to be handled by the try-catch, and while that does seem to happen, the exception is still thrown after the scope completes it's execution, so I'm seeing something like:

Exception: error in async
Complete

java.lang.IllegalStateException: error in async
...

Both of these issues are solved if I execute the builder on the GlobalScope or if I wrap them in a supervisorScope but the default behavior feels a bit strange and I'm not sure if it's intended

Kotlin version: 1.3.0-rc-131
kotlinx-coroutines version: 0.30.2-eap13

The text was updated successfully, but these errors were encountered:

elizarov · 2018-10-10T15:34:59Z

This is does so by design. The basic idea is that by default exceptions should always be thrown into the outer scope to immediately cancel all the other activities that might have been going on in this scope.

The slightly different behavior with async/await is a due to a special code that makes sure that if you were invoking await from the parent you still can catch the right exception so that you can use await { ... }.await() somewhat similarly to withContext { ... }.

This is the default behavior. You can override this behavior in one of two ways:

Use rxCompletable(NonCancellable) { .. } -- this creates a Completable instance that is not bound to the parent context at all (just like it is usual in Rx world), so cancelling parent does not affect it and its failure does not affect parent)
Use supervisorScope { rxCompletable { ... } } -- this creates a Completable under a supervisor that is still cancelled when parent is cancelled, but its failure does not affect the parent.

Does it help?

viorgu · 2018-10-10T22:16:41Z

I can see how the default behavior would make sense for something like launch or withContext but the current behavior of async is quite strange and difficult to understand.

In my example the execution of the code inside runBlocking completed successfully but I still get an exception afterwards and it's the same exception that I already "handled" in the try-catch block.
This can lead to some really head-scratching moments when debugging issues where the same exception seems both handled and unhandled at the same time.

Regarding the Rx example, I can't see a situation where I'd actually prefer the default behavior over the supervisorScope case. Since Rx already provides an exception handling mechanism I would intuitively imagine that I can handle any exception that occured in the rxCompletable block using the builtin methods.
With the default scope behavior, all of Rx's error handling operators become virtually useless as there's no way to pass an error down the subscriber chain.

The main purpose of a Rx Completable is to signal if an operation has completed successfully or not, and I feel this should happen regardless of the scope the builder was executed on.

My main concern at the moment is that people that are not very familiar with the way scopes work will just start adding GlobalScope. whenever something doesn't work the way they expect

elizarov · 2018-10-10T22:48:18Z

@viorgu async is used only if you need to run multiple things concurrenty. Consider the following code:

runBlocking {
    val d1 = async { foo1() }
    val d2 = async { foo2() }
    val r1 = d1.await()  // waiting for foo1 here
    val r2 = d2.await()    
    combine(r1, r2)
}

Now consider a case when foo1 works for a long time, while foo2 quickly throws an exception (while the code is still waiting for foo1. The failure of foo2 must immediately cancel runBlocking and cancel foo1 so that this error can be processed. There is simply no other way to it.

If you don't need to do multiple concurrent things, then you should not be using async, rxCompletable, future and other similar things like that. Does it help?

viorgu · 2018-10-15T10:27:15Z

I've given this issue some thought and I think I can understand the reasoning behind this design decision.

However, I recommend updating the documentation to somehow highlight the following facts (assuming I understand them correctly):

While in a standard CoroutineScope any unhandled exception encountered will immediately cancel the current and outer scope, regardless of the builder used
The async (and other similar) builders do not wait for or require an await to throw the exception

runBlocking {
    val deferred = async {
        error("error in async")
    }
    yield() // will evaluate the async block and throw the exception
    deferred.await() // doesn't reach this line
}

Coroutine builders will not handle the exceptions that occurred in their respective block

runBlocking {
    // this will still throw an exception and cancel the scope
    rxCompletable { error("error") }.onErrorComplete().await()
}

This behavior can be controlled by using a supervisorScope or a SupervisorJob

runBlocking {
    // the exception will be passed to the coroutine builder and handled by onErrorComplete()
    rxCompletable(SupervisorJob()) { error("error") }.onErrorComplete().await()
}

While I understand this is not the ideal type of code people should be writing using coroutine, I still think the behavior should be clearly documented for when someone inevitably runs into similar issues.

Sorry for being so insistent on this issue but it really caught me by surprise when some tests failed after upgrading them to the structured concurrency model and I'd like to help reduce the chance that it confuses others in the future.

kmansoft · 2018-10-24T16:28:46Z

Related: #753 I think...

kmansoft · 2018-10-24T16:38:28Z

This makes sense, but maybe...

Now consider a case when foo1 works for a long time, while foo2 quickly throws an exception (while the code is still waiting for foo1. The failure of foo2 must immediately cancel runBlocking and cancel foo1 so that this error can be processed. There is simply no other way to it.

... should require some explicit syntax on the developer's part?

If you look at #753 this kind of automatic "bubble up / cancel / kill everything" is very unexpected - and is also wrong (I think) when there is a coroutine (somewhere in the work tree) that does a catch / ignore / retry / workaround for an exception from one of its children?

The way things are now - if a coroutine launches some children that it knows can fail, and has code to catch and handle any potential exception (if any is thrown), it will still get killed / cancelled, even though the developer has explicitly and intentionally added code (try / catch) in his/her coroutine to catch and do whatever is appropriate.

And yet, the current design assumes that it knows better - that any exception anywhere in the tree is fatal and everything needs to be cancelled / killed. Isn't this too "arrogant"? :)

Second, the problem is made worse by Kotlin's decision to not have checked exceptions (I'm not trying to discuss that, just pointing it out as a fact) - which means that any catastrophic failure scenarios like this could lurk in the code for a long time / not caught by testing / during beta releases etc.

jcornaz · 2018-10-25T15:02:42Z

@kmansoft,

this kind of automatic "bubble up / cancel / kill everything" is very unexpected

I love it personally. I'm always sure to not leak coroutines/resources without the need to add any boilerplate code. If something goes wrong everything will be cancelled and I can still catch errors to do something relevant like reporting and/or retry.

the current design assumes that it knows better - that any exception anywhere in the tree is fatal and everything needs to be cancelled / killed

Not at all. The developer may choose to use a SupervisorScope or SupervisorJob in order to say "any exception in a child is not fatal for this scope/job".

viorgu changed the title ~~Exception handling with structured concurrency (rx & aync)~~ Exception handling with structured concurrency (rx & async) Oct 9, 2018

elizarov added question waiting for clarification labels Oct 10, 2018

qwwdfsad mentioned this issue Nov 6, 2018

Async builder and cancellation in structured concurrency #763

Closed

qwwdfsad closed this as completed Apr 20, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Exception handling with structured concurrency (rx & async) #691

Exception handling with structured concurrency (rx & async) #691

viorgu commented Oct 9, 2018

elizarov commented Oct 10, 2018 •

edited

Loading

viorgu commented Oct 10, 2018

elizarov commented Oct 10, 2018 •

edited

Loading

viorgu commented Oct 15, 2018

kmansoft commented Oct 24, 2018

kmansoft commented Oct 24, 2018 •

edited

Loading

jcornaz commented Oct 25, 2018

Exception handling with structured concurrency (rx & async) #691

Exception handling with structured concurrency (rx & async) #691

Comments

viorgu commented Oct 9, 2018

elizarov commented Oct 10, 2018 • edited Loading

viorgu commented Oct 10, 2018

elizarov commented Oct 10, 2018 • edited Loading

viorgu commented Oct 15, 2018

kmansoft commented Oct 24, 2018

kmansoft commented Oct 24, 2018 • edited Loading

jcornaz commented Oct 25, 2018

elizarov commented Oct 10, 2018 •

edited

Loading

elizarov commented Oct 10, 2018 •

edited

Loading

kmansoft commented Oct 24, 2018 •

edited

Loading