kotlinx.coroutines.test: Option to disable virtual time and delay skipping #3179
Yep, I also think it's really important. Another use case is integration tests that need to interact with real-time systems that are unaware of virtual time and cannot be made aware in any way. Note that there was already a discussion on this topic, but I think it's worth bringing back.
Sure, we recognize the problem, it's just that the API shape is currently unclear.

```kotlin
withContext(StandardTestDispatcher(scheduler = testScheduler, skipDelays = false)) {
    // do whatever
}
```

But then, there's not much reason to use the test dispatchers at all. We would prefer to take our time with this because there are other, more intrusive aspects of the test API that would influence the API of this functionality (like mocking the IO and Default dispatchers), and it doesn't look like the issue is urgent, as there is a workaround (from the linked thread):

```kotlin
withContext(Dispatchers.Default.limitedParallelism(1)) {
    // delays are not skipped here
}
```

If there are usage patterns not covered by this, please share them.
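For concreteness, a minimal sketch (invented test, not from the thread) of how that workaround lets a single test mix virtual and real delays:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.test.runTest
import kotlinx.coroutines.withContext
import kotlin.test.Test

class PollingTest {
    @Test
    fun mixesVirtualAndRealDelays() = runTest {
        delay(10_000) // runs on the test dispatcher: virtual time, skipped instantly

        // Delays dispatched outside the test dispatcher are real:
        withContext(Dispatchers.Default.limitedParallelism(1)) {
            repeat(3) {
                delay(100) // really waits 100 ms per iteration
            }
        }
    }
}
```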
I think the issue with the current API is perhaps the naming, not the functionality. "runTest" is such a generic name that it implies it is just "the way to do tests"; `StandardTestDispatcher` implies this even more strongly. With such a generic name, I would think the functionality should be generic. In some contexts skipping delays breaks correctness contracts (networking), and in others it can cause heavy workloads to overrun (polling services in a cross-system test). Performing a delay when it is not needed, however, at worst makes a test time out or hang, which can be hard to debug, sure, but those cases should be fairly rare.

The name "runTest" doesn't really imply that skipping should happen by default. I would expect `runTest` to do the same thing as `runBlocking` by default, but return a `TestResult`, with optional parameters like `skipDelays`. With the current approach, a team might very easily get used to just using `runTest` for all their tests; then one day, when they have to write an integration test that polls services, they end up DoS-ing their own service because they never realised delays would be skipped.

Neither approach addresses the fact that both behaviors might need to be mixed, though. It is possible that you may want a test that skips animations but does not skip polling. There really is no way to do this unless you wrap all of your polling delays in the mentioned workaround. Perhaps calls to `delay` could be associated with some sort of tag; then you could do something like the sketch below.
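A rough sketch of how such a tag-based approach could be built today on top of the `limitedParallelism(1)` workaround; `RealTimeTags` and `taggedDelay` are invented names, not part of kotlinx-coroutines-test:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.withContext
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.coroutineContext
import kotlin.time.Duration

// Context element listing the delay tags that must use real time in a test.
class RealTimeTags(val tags: Set<String>) : AbstractCoroutineContextElement(RealTimeTags) {
    companion object Key : CoroutineContext.Key<RealTimeTags>
}

// A delay that opts out of delay skipping when its tag is marked as real-time
// in the current coroutine context.
suspend fun taggedDelay(duration: Duration, tag: String) {
    val realTime = coroutineContext[RealTimeTags]?.tags?.contains(tag) == true
    if (realTime) {
        // Delays dispatched outside the test dispatcher are not skipped.
        withContext(Dispatchers.Default.limitedParallelism(1)) { delay(duration) }
    } else {
        delay(duration) // virtual under a TestDispatcher, real otherwise
    }
}
```

A test could then run its body in `withContext(RealTimeTags(setOf("polling"))) { ... }` so that `taggedDelay(30.seconds, "polling")` really waits while `taggedDelay(300.milliseconds, "animation")` is skipped.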
Haha, here it comes back to bite us. It reminds me of our discussion with @dkhalanskyjb on semantic precision in #3050 (comment).
And I agree with you, @EanLombardo. I think the issue is that historically this library was JVM-only, so the only point of using `runTest` over `runBlocking` was delay skipping.
Could you elaborate? If you're doing networking, why are you not doing it in `Dispatchers.IO`?
I guess that's where opinions diverge. Skipping delays doesn't seem to be as reasonable a default as you seem to imply. Lots of test authors are surprised by it.
But would it wait, though? The predefined value would likely be returned instantly, and a wrapping `withTimeout` fails instantly anyway:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.test.runTest
import kotlinx.coroutines.withContext
import kotlinx.coroutines.withTimeout
import kotlin.test.Test

@OptIn(ExperimentalCoroutinesApi::class)
class CoroutineTest {
    @Test
    fun test() = runTest {
        regularSuspendFun() // fails instantly by default
    }

    private suspend fun regularSuspendFun() {
        withTimeout(1000) {
            someRealIO()
        }
    }

    private suspend fun someRealIO() = withContext(Dispatchers.IO) {
        Thread.sleep(100)
    }
}
```

All I'm saying is that this is surprising by default, to many people.
No, because catching exceptions is likely to be a facility that is welcome by default. My argument is that delay-skipping is not unanimously accepted as a desirable default, unlike other things you seem to have in the pipeline for `runTest`.
Depends on the problem area and one's test infrastructure, I guess. In general, this issue is getting way less attention than the other test-related ones.
In the case you show, no, there's no waiting. But it's also valuable to test that timeouts do trigger, like (pseudocode):

```kotlin
@Test
fun testLongResponsesTimingOut() = runTest {
    getServerResponse(alwaysSilentServer)
    assertUiStateIs<UnsuccessfulConnection>()
}

suspend fun getServerResponse(server: Server) {
    drawASpinner()
    try {
        withContext(Dispatchers.IO) {
            withTimeout(5.seconds) {
                server.requestResponse()
            }
        }
    } catch (e: TimeoutCancellationException) {
        updateUi(UnsuccessfulConnection())
    }
}
```
Oh, I am certain we're even going to break someone's tests that way, and they will come to us and complain that they were fully aware of the exceptions in their coroutines and wanted the test to pass anyway.
I've heard a saying that if you asked a hundred people to vote on whether all those who voted would get a free dinner, not all would agree. With an audience as extensive as ours, doing anything unanimously is impossible. So, other considerations come into play.

One consideration is as follows: it is wrong for a test to perform delay skipping when it shouldn't, it is wrong for a test not to perform delay skipping when it should, and by choosing a default, we choose which mistakes are more likely to happen. Now, judging by this very thread, it looks like the cases when delay skipping should not have happened but happened anyway are very obvious: you get an immediate timeout, or your network calls all fire in quick succession, or something like that. So, you run your test, observe the issue, write an irritated comment here, problem solved. And even when the problem was there, it was localized to a single test. Or if not, if that's some common initialization code, just fix that.

On the other hand, let's imagine that delay skipping doesn't happen when it should: the tests are simply slow, nothing visibly fails, and it's far less obvious that virtual time could have fixed it.
Fair enough.
My bad, wrong word. We obviously cannot satisfy everyone. My point is that we disagree on the "majority" in this case.
That's very fair as well. As I said, I'm not against the idea of enabling some test niceties by default. There is obviously a balance to strike, and the decision to enable those features by default is not all-or-nothing: we could enable some test features by default, and some could be opt-in. I don't know the details of the exception handling feature per se, so I'm not debating that one here, of course; it was merely an example. But I am discussing the number of use cases that do need delay-skipping versus the number that don't, which would inform the choice of default.
I'm 100% with you here. It's exactly the kind of test whose author would willingly decide they want controlled time to generate timeouts. In this situation, you want controlled time because that's the goal of the test, so you enable whatever feature you need to control time, or use a special dispatcher for it, etc. But that's one use case. The question is rather: for the unsuspecting regular test, who really needs delay-skipping? Which use cases did you have in mind for delay-skipping in tests that are not specifically about testing timing?
I totally agree with this approach, and I arrive at the opposite conclusions.
That wasn't my experience. It took me a while to understand what was going on the first time I faced this (and I already had decent experience with coroutines). It was affecting most of my tests, in particular all the tests that were calling a real service (spawned in a sidecar container for testing purposes). The proper fix, once the cause is understood, would be to write my own `runTest` replacement that doesn't skip delays.
This seems way more obvious to me than your previous point. If you have a speed problem, it's easy to see the relation between your delays and the speed of the test, and you would immediately look into facilities to speed things up or mock time itself. On the other hand, if you get random timeouts and crashes when you didn't use anything special, it's harder to identify why you're seeing this behaviour if you weren't aware of delay-skipping in the first place.
I have trouble understanding in which case someone would want that kind of mocking, though.

Summary: so I guess we agree on a few things. We disagree on the relative impact of skipping when undesired vs. not skipping when desired. IMO solving a slow test is obviously linked to time (and leads to researching things in this area), while identifying crash root causes is harder and might not lead to looking into delay-skipping.
When researching how people wrote tests with the old test framework, I saw this kind of code all the time: delays used to check how parallel execution behaves. "While this happens, if it takes this long, what will the rest of the system do", stuff like that. So, yeah, timings.
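A hedged sketch of the kind of test being described (names and numbers invented here): delays stand in for timed work so the test can assert on intermediate state, and virtual time makes the whole thing finish instantly.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertEquals

class ParallelBehaviorTest {
    @Test
    fun producerKeepsGoingWhileWeWatch() = runTest {
        val produced = mutableListOf<Int>()
        // A producer that emits one item per simulated second.
        val producer = launch {
            repeat(5) { i ->
                delay(1_000)
                produced += i
            }
        }
        // Let 2.5 virtual seconds pass and check the intermediate state.
        delay(2_500)
        assertEquals(listOf(0, 1), produced)
        producer.join() // the remaining virtual seconds are skipped too
    }
}
```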
If you know that delay skipping exists! How many people do you think would know this? There are some precedents of delay skipping, like Robolectric, but not all that many. Why would one even expect that delay skipping for tests exists? In production, when your code is slow, you don't research "how to make time-consuming parts take zero time", you go ahead and do performance optimization. How does one even arrive at the option of writing tests some other way? In this case, someone who didn't know about the possibility of delay skipping could instead simply try to use shorter delays.
My point didn't require the assumption that people already know such facilities exist. My point is that a Google search will lead to them pretty quickly.
Maybe you're right here. I am not sure, though. For instance, when you are calling a real service, you might research how to mock it if it's too slow. When you run tests with "sleeps"/"delays" in them, you might wonder if there is a better way, and that leads to this kind of question, which is exactly what I'm expecting in my assumptions.

On the other hand, if we assume people don't know about delay skipping, it may take quite some time before they realize why their `withTimeout` calls fail instantly.

Admittedly, now that people have had the problem, searching for "coroutine withTimeout fails instantly in test" leads to this SO question (2nd result for me), which is one of the questions I'm thinking about when saying people are surprised by it. So, maybe if we wait a bit longer and more people have trouble with it, it will become visible enough to not be a concern anymore 😄
Well, there is a concern with the solution being hacky. Without searching for this issue, one may not think of `Dispatchers.Default.limitedParallelism(1)` as the way to opt out of delay skipping.

I think a good API shape for this would be to have a coroutine context element, named something like `NoDelaySkipping`:

```kotlin
runTest {
    withContext(NoDelaySkipping) {
        initialize()
    }
    checkAssumptions()
}
```

or for the whole test:

```kotlin
runTest(NoDelaySkipping) {
    initialize()
    checkAssumptions()
}
```

Also, it would be easy to pass to some deeply-nested entities when doing mocking.
That would indeed look nice!
It being obvious is not necessarily the problem. There are some cases where delay correctness is important outside the scope of the test itself. For example, say I am writing an end-to-end integration test that checks production, whose job is to schedule some async work in system A, then poll system B every 30 seconds until I get the expected result, or notify someone that production is broken if that doesn't happen. It would be nice if I could write this test using existing coroutine-based client libraries and run it as part of a normal testing system. So I pull in the client library code, write it as a JUnit test, wrap the test body in `runTest`, execute the test, and start DoS-ing production unknowingly because my 30-second delay is actually instant.

The scale of testing matters: not all tests are unit tests, not all tests simply break the build, some of them make sure the world isn't broken. Highly generic testing utilities, like `assertFailsWith`, can be used for any level of test. `runTest` can be dangerous for certain tests, but that is not at all clear from its name. That is the part I have a problem with.
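A minimal sketch of the polling check being described (`SystemBClient`, `fetchResult`, and the job id are invented names); the comment marks where delay skipping turns a patient poller into a burst of requests:

```kotlin
import kotlinx.coroutines.delay
import kotlin.time.Duration.Companion.seconds

// Hypothetical client for the production system being checked end to end.
interface SystemBClient {
    suspend fun fetchResult(jobId: String): String?
}

suspend fun awaitResultFromSystemB(systemB: SystemBClient, jobId: String): String {
    repeat(20) {
        systemB.fetchResult(jobId)?.let { return it }
        // Intended as a real 30-second pause between polls. Inside runTest this
        // delay is virtual, so all 20 requests fire back to back instead.
        delay(30.seconds)
    }
    error("System B never produced a result for $jobId; production may be broken")
}
```

Wrapped in `runTest`, the ten minutes of intended polling collapse into an instant burst of twenty requests against production.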
Did you actually witness projects where this could happen, or is this just a theoretical danger? In my experience, if it's possible to DDoS a production system from an incorrectly-written test, the system as a whole is dangerously brittle. All it takes is one programmer not being careful enough and forgetting a `delay` in a polling loop.
Randomly ended up looking at this issue again, and sorry for spam, but could someone with sufficient permissions fix the 2 typos in the issue title? It's been bugging me since the beginning 😅
I understand that there could be some cases where a virtual clock might be useful to make tests run faster. Unfortunately, this breaks a whole lot of normal coroutine use cases out of the box, such as timeouts or Flow tests with Turbine.

I think this is not a good default behavior. I understand that using `delay` within tests is not great, but there are plenty of places in real code where timeouts are needed. `runTest` should be useful for multiplatform testing, but I think for now I will roll my own.
It does not break Turbine. We have tests to ensure virtual time skipping works correctly.
While delay skipping is working in Turbine, auto-advancing time causes some weird quirks that make delays very hard to test with Turbine. Or maybe I'm doing it wrong. Just filed a bug: cashapp/turbine#268
Supposedly, these use cases happen on `Dispatchers.IO` or another real dispatcher anyway, where delays are not skipped.
There is often a need to keep the regular `delay()` behavior, with an actual delay, for example when the code under test works with real time (e.g. via `kotlinx.datetime`). It could be something like:

```kotlin
runTest(noVirtualTime = true) { /* ... */ }
```
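A possible stopgap until such an option exists; the helper name and approach below are my own sketch, not part of the library. It wraps the test body in the workaround discussed above, so that delays inside it are real:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.test.TestResult
import kotlinx.coroutines.test.runTest
import kotlinx.coroutines.withContext

// Hypothetical helper approximating runTest(noVirtualTime = true): the body runs
// outside the test dispatcher, so delay() is not skipped. Note that runTest's
// overall timeout still applies to the whole test.
fun runTestWithRealDelays(testBody: suspend () -> Unit): TestResult = runTest {
    withContext(Dispatchers.Default.limitedParallelism(1)) {
        testBody()
    }
}
```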