-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Support completing the test coroutine from outside the test thread. #1206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Note to reviewer: Reading the code, I believe invokeOnCompletion will get a invoked result if the coroutine was already completed in every case (including completion by another thread). Can you confirm that this understanding is correct and the code here is sound? Thanks, |
Any ETA on when this will be released? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed change does not fix the problem, but masks it, adding even more non-determinism and making test flakier and much harder to debug/diagnose for inexperienced developers.
For example, the test from #1222 now passes:
@Test
fun foo() = runBlockingTest {
println(1)
withContext(Dispatchers.IO) {
println(2)
}
println(3)
}
But not because the fix actually fixes that, but because it hides non-determinism with the additional check. You can see that by adding a random delay right after println(2)
and observing the failure ratio (it will be linearly increasing with the delay value). The similar effect can be achieved by adding Thread.sleep(rand())
right after advanceUntilIdle
call.
While the error message mentions this fact, it still appears pretty non-deterministic. E.g. a developer ran her tests once, they passed, the developer pushed changes and then tests fail on CI 15 minutes later.
I am sure we can provide better user experience here :)
Additional observation, not related to this particular fix, but to establish more common ground (sorry, I've missed the original design discussion, so maybe it was already discussed before):
End-users tests behave inconsistently depending on the target dispatcher. For example:
@Test
fun foo() = runBlockingTest {
launch {
delay(1000) // Or any blocking/external call, e.g. Thread.sleep
}
}
this test is perfectly valid, though launch
is not yet completed by the end of runBlockingTest
block, while the same code, but with different dispatcher (e.g. launch(Dispatchers.IO)
) is not.
Even worse, it is now timing dependent as we've seen in #1222. It is clear (for me) neither from the implementation point of view nor from API design shape why this is the desired behaviour for end users.
These facts (alongside with very suspicious invokeOnClose{}.dispose()
) makes me think that we should change the design of runBlockingTest
itself to make it deterministic and consistent.
The first thing that comes to my mind is an event loop with a timeout:
withTimeout(DEFAULT_TIMEOUT) {
while (!isCompleted()) {
advanceUntilIdle() // TODO should not burn 100% CPU when completion comes from another thread
}
}
if default timeout is any reasonable value (e.g. 1-5 seconds), then most of the determinism will disappear, and for integration tests or advanced use-cases we can introduce one more parameter and/or JUnit rule.
WDYT?
Thank you for the detailed review! Agreed completely this is a surprising API and introduces an unfortunate non-determinism. Putting together a new commit based on your suggestions. |
This requires a call to invokeOnCompletion to get the completion state from the other thread. Changes: - runBlockingTest now supports completing the test coroutine from another thread - success and failure path tests While fixing this, a subtle non-determinism was discovered that would lead to flakey tests. If another thread completed the test coroutine *during* the cleanup checks it was possible to modify the state of the test during the cleanup checks in a way that could expose false-positive or false-negative results randomly. To resolve this, a non-completed coroutine immediately after advanceUntilIdle is now considered a failure, even if it completes before the more aggressive cleanup checks. There is still a very brief window (between advanceTimeBy and getResultIfKnown) that non-determinism may be introduced, but it will fail with a descriptive error message at on random executions directing the developer to resolve the non-determinstic behavior. Note: testWithOddlyCompletingJob_fails may fail when the implementation of runBlockingTest changes (it's very tightly coupled).
As a result, runBlockingTest is now correct in the presence of repeated time-delayed dispatch from other dispatchers. Changes: - runBlockingTest will now allow a 30 second timeout for other dispatchers to complete coroutines - Introduced WaitConfig, SingleDispatcherWaitConfig, and MultiDispatcherWaitConfig to configure runBlockingTest timeout behavior - Added DelayController.queueState as a ConflatedBroadcastChannel to observe or poll the queue status of a TestCoroutineDispatcher - Added queue status of Idle, HasCurrentTask and HasDelayedTask as public APIs for quering the status of a TestCoroutineDispatcher - Added dependency on runBlocking from runBlockingTest - Improved documentation for threading concerns around resumeDispatcher
4da43bf
to
d3f40a9
Compare
@qwwdfsad @elizarov I've uploaded a proposed change that adds an event loop to runBlockingTest. PTAL! If you can take a look for API review I'll make any updates necessary. One big decision that I wasn't sure about - there is a way to implement this by making TestCoroutineDispatcher be an EventLoop. However, I do kind of like the current impl (using a state channel) since it also exposes the API requested in #1202. For the parameter, I ended up giving it a class since it was really confusing to pass a timeout of 0 for the expected failure case (it felt like it would lead to false positives when using). See
Let me know if you think this works or if you would prefer to see it implemented with an EventLoop. |
Is it possible that this is also very important for testing
Is the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made a first pass mostly at API surface. Please, take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed only API surface, let's put implementation details aside (I can help you with them later if necessary) and focus on improving public API shape and desired behaviour.
I've also added test tag to simplify feedback processing.
@@ -112,9 +113,75 @@ public interface DelayController { | |||
* Resumed dispatchers will automatically progress through all coroutines scheduled at the current time. To advance | |||
* time and execute coroutines scheduled in the future use, one of [advanceTimeBy], | |||
* or [advanceUntilIdle]. | |||
* | |||
* When the dispatcher is resumed, all execution be immediate in the thread that triggered it. This means | |||
* that the following code will not switch back from Dispatchers.IO after `withContext` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are here, could you please elaborate on why this behaviour is preferred over "classic" dispatching?
I am a bit concerned about how this behaviour is different from "release" builds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the main goal of this eager behavior was to make this test function testable without having to call runCurrent():
fun launchesCoroutine() {
scope.launch {
// do stuff
}
}
In a test that doesn't use multiple threads (TestCoroutineDispatcher is used for all coroutines) it provides eager entry into the launch body (during dispatch the launch body is immediately executed).
As this is both the normal structure for this code (at least on Android) it is nice to avoid extra calls to runCurrent()
in this test
@Test fun callLaunchesCoroutine() = runBlockingTest {
// assume `scope` is using this TestCoroutineDispatcher
launchesCoroutine()
// coroutine has already launched here
}
This thread switching behavior is an undesired side effect of that API choice.
So basically, I think there are two options here:
- Modify the behavior of
dispatch
to require a call torunCurrent()
in this test. This does make common test cases require an extra call, but leads to correct threading behavior if the developer hasn't injectedTestCoroutineDispatcher
throughout their code under test. - Keep the behavior (prefer eager entry of launch blocks) with the assumption most tests will not execute coroutines on multiple threads.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
"Keep the behavior" part is, of course, more preferable as it really simplifies writing simple unit tests for end users. I've just realized that this is exactly why we thought to make Uncofined
the default behaviour of TestMainDispatcher
.
I think it's easier to change the docs and explain that test dispatcher acts like unconfined one (and maybe implementation should use it as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a pretty good way to explain it. I'll add that to the docs in this PR since it's become all about threading.
* runBlockingTest { | ||
* pauseDispatcher() | ||
* withContext(Dispatchers.IO) { doIo() } | ||
* // runBlockingTest has returned to it's starting thread here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit counter-intuitive as well, pauseDispatcher
does not pause the dispatcher, but changes its behaviour
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea - this one I'm stuck on. It seems required to support the pauseDispatcher { } block in runBlockingTest but it has this side effect in the presence of multi-threading.
Minor notes: all |
Sorry delayed response (just got back to work after travel) - taking a look at comments above. |
Ok after reading all the feedback here's a rough plan for this PR:
The first will help us focus in on the API surface without the distractions of the extra classes/channel API. The second seems important and should land before this library is considered stable. I would like to advocate for keeping the current behavior since it makes single threaded testing easier (which is the recommended practice) but can see the complexity exposed by it in a multi-threading behavior change. I am curious if there is a way to unify both API goals? If so that would be the best approach. |
Waiting for updates. |
…is fixed / Kotlin/kotlinx.coroutines#1206 merged and both released
Hi @objcode, do you have any update on this? |
Definitely - sorry delay had a few interruptions. Let me put together the event loop impl. tonight and then pass it over to you for finalization. |
for external parking via a single suspend function.
@mariusboepple The actor issue you're seeing above is likely because you're not terminating the actor before runBlockingTest completes. In the case of an infinite loop such as an actor, you'll need to cancel it and ensure it's complete before exiting runBlockingTest or it will detect the leaked coroutine. |
[Dispatchers.Unconfined]
@qwwdfsad @elizarov updated PR with requested changes. I was unable to find a working solution using EventLoop/unpark() since that's all implemented in EventLoopImplBase so I externalized parking via a small additional public interface https://github.com/Kotlin/kotlinx.coroutines/pull/1206/files#diff-e3c98820170c8b47a0aad1428e940c2cR255 PTAL! |
Any update on this pull request? |
Moving it to |
Any updates on the release ? |
Where did the moved PR end up? This issue is closed, but kotlin-coroutines-test 1.3.8 still has this problem. |
1.4.0-M1 still has this issue |
It seems that 1.4.0 still has this issue |
kotlinx-coroutines-test:1.4.1 still has this issue |
kotlinx-coroutines-test:1.4.2 still has this issue |
@qwwdfsad
Can I ask you to link the succeeding PR (if any) here for future references? A couple of open issues (#1204 #1222) lead here and this PR kinda makes an impression it's been closed and abandoned for good. |
We've actually realized the amount of work needed to be done to get this module in order and decided to redesign it from scratch, trying to address all the issues with This process is quite long-running and painful because we, in addition, don't want to force to migrate all the users of this module to a new API and are trying to maintain backward compatibility. |
Implement an eventLoop for runBlockingTest.