Skip to content

runBlocking executes other tasks instead of its own #4215

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
dkhalanskyjb opened this issue Aug 14, 2024 · 4 comments
Open

runBlocking executes other tasks instead of its own #4215

dkhalanskyjb opened this issue Aug 14, 2024 · 4 comments
Labels

Comments

@dkhalanskyjb
Copy link
Collaborator

dkhalanskyjb commented Aug 14, 2024

Describe the bug

A guy promises to an old lady that he'll fix a broken staircase, but then arrives to the house and notices that it's in complete disrepair. He calls his old friend to come and redo the circuitry. He doesn't touch the instruments: "My friend will need them, and after he's done, then I'll fix the staircase." The friend arrives and notices that the house is in complete disrepair. He calls another guy to come and fix a door that's hanging on one hinge. Meanwhile, he doesn't touch the instruments: "Let my friend fix the door first, and then I'll do my job". Repeat this situation enough times, and you get hundreds of people in one room, with no one doing anything. Repeat this more, and the house blows up because of how many people are squeezed in there.

These guys were runBlocking. The house is the stack space, and the set of tools is the current thread. The old lady is an unfortunate user of our code.

Reported in https://youtrack.jetbrains.com/issue/KT-66219.

Related to #3982: if we fix this, the reproducer for #3982 will no longer work, though the general issue will likely still be there. EDIT: yes, it will.

Provide a Reproducer

runBlocking {
    repeat(1000) {
        launch {
            try {
                runBlocking {
                    // do nothing
                }
            } catch (e: Throwable) {
                println(e)
            }
        }
    }
}

prints

java.lang.NoClassDefFoundError: Could not initialize class kotlin.internal.PlatformImplementationsKt
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
java.lang.StackOverflowError
@dkhalanskyjb
Copy link
Collaborator Author

(Excerpts from our design meeting with @qwwdfsad and @zuevmaxim)

One solution to this is to have runBlocking prioritize their own tasks. This can be done by each runBlocking adding its own CoroutineContext element to the coroutines launched inside it, recognizing them by that element, and giving them priority.

The problem is that runBlocking should perform work-stealing anyway to improve liveness, and it's not clear what strategy to use. We could take Dispatchers.Default's scheduler and repurpose it so that each runBlocking gets its own queue with work-stealing support, but that's complicated. A deterministic but seemingly arbitrary choice that supports liveness is to traverse the stack of all runBlockings running on the current thread (let's number them from 0, where 0 is the currently running runBlocking) and simulate a streetlight pattern: check own queue — if empty, check runBlocking #1 — if empty, check runBlocking #2 — and so on; but next time, check 0 — then 2 — then 3, and so on, and then 1; next time, check 0 — then 3, then 4, and so on, then 1, then 2. This way, we avoid a situation where runBlocking 0 and 1 communicate with one another without a chance for deeper runBlocking to do anything.

In any case, prioritizing own work has a downside: the following code will stop working:

runBlocking {
    var exit = false
    launch {
        runBlocking {
            while (!exit) yield()
        }
    }
    launch {
      exit = true
    }
}

The inner runBlocking will spin without giving exit = true a chance to run.

In summary, if this issue is actually bothering someone, we know how to fix it, but if it doesn't, we'd prefer not to change runBlocking's ordering needlessly.

@dovchinnikov
Copy link
Contributor

The problem is that runBlocking should perform work-stealing anyway to improve liveness

I still think this is the main problem. runBlocking is the only coroutine library primitive that does something out of scope.

@pablichjenkov
Copy link

Great analogy 👌

@ninja-
Copy link

ninja- commented Dec 29, 2024

👀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants