Skip to content

Stacktrace recovery #792

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Dec 13, 2018
Merged

Stacktrace recovery #792

merged 19 commits into from
Dec 13, 2018

Conversation

qwwdfsad
Copy link
Collaborator

@qwwdfsad qwwdfsad commented Nov 4, 2018

Basic exception stacktrace recovery mechanism

  • Implement CoroutineStackFrame in CancellableContinuationImpl, DispatchedContinuation and ScopeCoroutine
  • On coroutine resumption try to reflectively instantiate exception instance of the same type, but with augmented stacktrace
  • Recover stacktrace by walking over CoroutineStackFrame
  • Recover stacktrace on fast-path exceptions without CoroutineStackFrame walking to provide more context to an exception
  • Unwrap exceptions when doing aggregation in JobSupport
  • Add kill-switch to disable stacktrace recovery, introduce method to recover stacktrace on the exceptional fast-path
  • Add suspendCoroutineOrReturn on exceptional fast-path in await in order to provide "real" stacktrace

Design rationale:
All recovery of suspended continuations takes place in Dispatched.kt file, the only place where all calls to "resume*" ends up, so we don't have to remember about stacktrace recovery in every primitive we are implementing. But to provide more accurate stacktraces we have to recover it on every fast-path for better debuggability.

Fixes #493

@qwwdfsad qwwdfsad requested a review from elizarov November 4, 2018 12:35
@qwwdfsad
Copy link
Collaborator Author

qwwdfsad commented Nov 4, 2018

This is still work-in-progress. It is worth reviewing, but I am actively trying to adjust this branch for debug agent purposes and won't merge it until debug agent is ready.

For example, I've found that recovery is stable only if stacktrace is recovered from Dispatched.kt (where all resume* end up eventually), thus changing the initial implementation accordingly.
(The problem was in our variety of our suspension mechanisms and exception reporting and aggregation)

  * Implement CoroutineStackFrame in CancellableContinuationImpl, DispatchedContinuation and ScopeCoroutine
  * On coroutine resumption try to reflectively instantiate exception instance of the same type, but with augmented stacktrace
  * Recover stacktrace by walking over CoroutineStackFrame
  * Recover stacktrace on fast-path exceptions without CoroutineStackFrame walking to provide more context to an exception
  * Unwrap exceptions when doing aggregation in JobSupport
  * Add kill-switch to disable stacktrace recovery, introduce method to recover stacktrace on the exceptional fast-path
  * Add `suspendCoroutineOrReturn` on exceptional fast-path in await in order to provide "real" stacktrace

Design rationale:
All recovery of *suspended* continuations takes place in Dispatched.kt file, the only place where all calls to "resume*" ends up, so we don't have to remember about stacktrace recovery in every primitive we are implementing. But to provide more accurate stacktraces we *have to* recover it on every fast-path for better debuggability.

Fixes #493
  * Extract call to suspendCoroutineUninterceptedOrReturn behind recoverability check
  * Cache exceptions constructors into WeakHashMap in order to avoid expensive ctor lookup
@qwwdfsad qwwdfsad force-pushed the stacktrace-recovery branch from 9013717 to 25a886e Compare November 19, 2018 14:36
@qwwdfsad

This comment has been minimized.

@elizarov

This comment has been minimized.

@qwwdfsad qwwdfsad force-pushed the stacktrace-recovery branch from 55e9ff6 to f7c3e6b Compare November 29, 2018 09:17
@qwwdfsad qwwdfsad force-pushed the stacktrace-recovery branch 2 times, most recently from a0cd925 to ca3acc0 Compare November 30, 2018 16:20
  * Recover stacktraces via both fast and slow paths in scoped builders
  * Merge stracktraces of nested scoped builders, eliminate common prefix so it looks like a flat profile
  * Optimize stacktrace-related machinery
  * Copy meaningful part of original exception into recovered one, rename "Current coroutine stacktrace" to "Coroutien boundary"
  * Rename Exception.kt to StackTraceRecovery.kt to avoid name clash with another Exception.kt
@qwwdfsad qwwdfsad force-pushed the stacktrace-recovery branch from 1e37d13 to 1032f58 Compare December 5, 2018 15:28
  * Can be installed dynamically or from command line
  * Captures coroutine creation stacktrace and stores it in completion, automatically enhancing stacktrace recovery mechanism
  * Allows to dump and introspect all active coroutines
  * Allows to dump Job hierarchy
  * When installed from command line, dumps all coroutines on kill -5
  * Probe support in undispatched coroutines
qwwdfsad and others added 11 commits December 10, 2018 19:12
  * Do not store creation stacktrace of the coroutine, recover it from ArtificialStackFrame
  * Reduce lock window where possible to avoid contention
  * Avoid stacktrace copying where possible
# Conflicts:
#	common/kotlinx-coroutines-core-common/src/JobSupport.kt
#	core/kotlinx-coroutines-core/test/exceptions/SuppressionTests.kt
  * Readme
  * Do not fail on Windows during signal handler installation
  * Do not warn about missing artificial stackframe
…endency of target process), use shadow plugin for that to avoid clashes with other byte-buddy versions and publish shadow jar instead of regular one for debug module
Coroutines debug agent

Fixes #772
@qwwdfsad qwwdfsad merged commit 29493e1 into develop Dec 13, 2018
@qwwdfsad qwwdfsad deleted the stacktrace-recovery branch December 13, 2018 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants