Skip to content

Get rid of Thread.name augmentation with CoroutineId when debug mode … #3678

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

qwwdfsad
Copy link
Collaborator

…is enabled

See #3677 and #2234 for the rationale

Fixes #2234

@qwwdfsad qwwdfsad requested a review from dkhalanskyjb March 13, 2023 17:58
@qwwdfsad
Copy link
Collaborator Author

@dkhalanskyjb WDYT? Do you have any objections or comments?

This PR is draft because it has a bunch of tests failing, mostly guide-related once where coroutine id is part of the output.
I also deliberately avoid adding a feature flag to revert the behaviour -- that way we'll see if anybody actually ever relied on this behaviour and will be able to make much more weighted decision (instead of the maintenance of such flag nearly forever)

@dkhalanskyjb
Copy link
Collaborator

  • When I google "coroutinename", most of the top results do mention/request naming threads. Plenty of tutorials and internal lore exist around the existing behavior of the debug mode.
  • This behavior seems to be also used for debugging: https://grep.app/search?q=Log%5C.d.%2AThread%5C.current&regexp=true&filter[lang][0]=Kotlin
  • The way I read the issue about the ART crash, it seems to happen in production but not during debugging, so it should not interfere with the purpose of the debug mode.
  • Performance is typically expected to suffer a bit during debugging.

I don't understand why the debug mode is enabled in production, and from the discussion in #2234, our users don't, too. Something somewhere introduces the debug mode behavior, most of which is typically undesirable in production, not just this part. That's the actual root of the issue, the way I see it. As or Thread.name, yes, if we look at it as a feature for production builds, it's really problematic, but as debug functionality, I think it could be nice.

@elizarov
Copy link
Contributor

elizarov commented Mar 16, 2023

One thing worries me here. If we had ideal working coroutine debugging tools, we would not need Thread.name augmentation feature at all. But we don't. In particular, if I run the code with coroutines with under debugger and just press "pause", I don't see any coroutine information in the debugger. Currently, the augmented coroutine name would be my only way to figure out what's going on. Would not it be safe to leave it under a flag?

@qwwdfsad
Copy link
Collaborator Author

I don't understand why the debug mode is enabled in production

I tend to think completely the opposite. We should eliminate as many barriers as possible for folks to keep debug mode enabled (the same way a lot of folks have -ea in production) -- performance, non-trivial behaviours, and cumbersome scenarios.

The reason it is not enabled by default is straightforward -- it changes the semantics of the application: thread names, exception identities, slightly slowing down the performance etc. and it is not acceptable as the default for a sufficiently large framework.

Yet it would be nice to eliminate as many barriers as possible for an average user to have better diagnostics, observability and additional info as possible; coroutines are already hard to debug, and that's the reason we have requests like #3611 and DebugProbes improvements (that are enabled by default in all IDEA products).

For me, it seems like changing thread's name is not carrying its own weight and I'm yet to find the use of this behaviour for an actual app in the wild.

Roman has a good observation about the actual debugger though, it would be nice to keep this behaviour unless IDEA's debugger automatically augments thread's names in its tooltips. Taking into account current state of IDE team priorities, It's unlikely to happen in an observable timeframe though

@dkhalanskyjb
Copy link
Collaborator

Debugging is when you dive into code to find a bug. At that point, you disable compiler optimizations (where applicable), enable the (often stripped) debug symbols like function names, and don't care even if your code runs twice as slow as it does in production, as you want all the help you can get. For example, it's not inconceivable to record the complete execution of a program when debugging to obtain a consistent reproducer (https://rr-project.org/), or to instrument memory reads and writes (https://valgrind.org/, https://github.com/google/sanitizers/wiki/AddressSanitizer). In production, however, a ≈50x slowdown is often out of the question.

Clearly, diagnostics are often useful in production, despite their cost. After all, there's the whole sprawling telemetry industry, and its role is not limited to just building advertisement profiles, and telemetry is often heavier than, say, stacktrace recovery. The idea behind #3677 also seems solid to me: yes, what we have now for debugging can also be useful for diagnostics. However, there's a spectrum between "everything runs as fast as it can, but we get zero visibility into what's going on" and "everything is instrumented, the full logs are stored, and even the slightest anomalies are detected and reported," and my argument is making it feasible to enable debug mode in production, essentially jamming debugging and observing into a single point of that spectrum, is not a worthy goal.

@elizarov
Copy link
Contributor

elizarov commented Mar 17, 2023

Good point on telemetry tools. Our goal should be providing easy-to-use APIs for telemetry tools and working with the tool vendors so that they add proper support. Changing the thread name is just a cheap hack to get this kind of integration "for free" as in "zero new APIs", but, as it turns out, it is not free at all in terms of runtime cost.

@qwwdfsad qwwdfsad closed this Apr 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants