Skip to content

Halt the 2D render loop when the canvas is removed. #132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

emackey
Copy link

@emackey emackey commented Dec 17, 2015

This is a partial fix for plotly.js#126 that will stop the 2D GL render loop when its canvas is removed from document.body.

The advantage of this approach is that it works even if purge is never called, for example when using Tabs.fresh() in the test dashboard. Any removal of the plot from the page at any level is enough to shut down the rendering.

A separate fix is still needed to address the same issue in 3D scenes.

@etpinard
Copy link
Contributor

Thanks so much for writing this partial fix and tackling the problem you found! 🍻

It seems great for you to use temporarily for the problem you're experiencing, but for a final solution we're going to delve into the issue a bit more.

First, we need to figure out if the situation discussed in #126 happens in gl3d graphs.

In this case where the gl3d is able to correctly handle lost/remove contexts, we should consider porting this sub-routine in the gl2d code.

@emackey
Copy link
Author

emackey commented Jan 4, 2016

Happy New Year! Here's my $0.02 on this situation. There are two different issues at play here, context lost, and runaway requestAnimationFrames.

I'll start with context lost. When you create many separate gl plots, each one gets its own WebGL context. Eventually there are too many for a single page, and the oldest ones are "lost" (destroyed by the browser). You can have code similar to what your gl3d plots have, where it listens for the context lost event, and takes steps to try to recover the context. But don't be too hasty: If the user's app has shown and discarded several different plots, the oldest contexts may belong to plots that the user dismissed some time ago. There's no good way for the Plotlyjs library to know if the user app is planning to resurrect that old plot, or if it has been permanently dismissed. The user app may simply create a fresh plot if the user clicks the button again. In this case, even if Plotly receives a context lost event, Plotly should not attempt to restore the context. It was lost with good reason, and the app may not want it back, particularly if restoring it costs contexts from more current plots.

This PR doesn't deal with context lost of course, it's all about requestAnimationFrame. The problem (design flaw?) with requestAnimationFrame is that it's a browser-window-level event, not tied to any particular DOM node, canvas, or context. With each new gl plot (2d and 3d), Plotly (or the 3d sub-module) schedules a new requestAnimationFrame callback, and each time one of these callbacks fires, it schedules itself for the next frame. This means if you've opened 4 gl plots, then all 4 callbacks will fire every 60th of a second or so, anytime the video card thinks a new animation frame should be constructed. That's just fine, until the app dismisses some of the plots. If a given gl plot has been purged, or removed from the page, then you're just using up CPU (and mobile battery, possibly) if you allow the requestAnimationFrame callback to continue re-scheduling itself for that particular plot.

You guys mentioned, in a different issue, that purge is slated to become private. Apps are not expected to call anything if they get rid of a plot, for example in response to a user clicking a "close" button on a pop-up plot in a single-page web app. Worst case, the app will simply set display: none; on the popup, and Plotly will never know that the requestAnimationFrame loop needs to be shut down. With this PR though, if the app removes the plot from the page (by actually removing the node, not just setting a CSS style), the 2D plot at least will notice that it's not on the page anymore, and will stop rescheduling the requestAnimationFrame callback, effectively shutting down the render loop for the dismissed plot. IMHO, a better solution would be to leave purge public, and fix it so it's always safe to call, and then purge can be responsible for stopping requestAnimationFrame.

The 3D gl module does need this same fix, but since it's in a separate repo, that fix can't be part of this same pull request. Hence this is marked a "partial" fix (complete for the 2D plots only).

@etpinard
Copy link
Contributor

etpinard commented Jan 5, 2016

@emackey Happy new year to you too. Thanks for your feedback.

What if we stored the requestAnimationFrame id in the scene object and call cancelAnimationFrame(id) upon context lost, as suggested here ?

Would that solve this issue or am I missing something?

@etpinard
Copy link
Contributor

etpinard commented Jan 5, 2016

Potentailly more useful info here: https://www.khronos.org/webgl/wiki/HandlingContextLost

@emackey
Copy link
Author

emackey commented Jan 7, 2016

No, context lost is a separate problem from requestAnimationFrame. Perhaps a more concrete example would help highlight the issues at hand.

Let's imagine a stand-alone, single-page web app that has several buttons on it that bring up different Plotly charts fed by data from the server. When the user clicks the first plot button, a graph pops up and the render loop starts. 60 times a second or so, the gl-based plot re-renders itself, whether it needs to or not. Soon you can hear the laptop's cooling fan spin up, or your mobile phone gets warm from the furious activity, although visually nothing is changing.

Next, the user clicks the app's little "x" button to dismiss their first plot. The plot is removed from the page, but the requestAnimationFrame callback is still re-scheduling itself at about 60fps. No gl contexts have yet been lost, and the seemingly dismissed plot's gl context is still valid and still being re-rendered to texture memory at full frame rate without being shown on the screen.

Next, the user brings up 7 more gl-based plots, one at a time, and dismisses each one after reviewing it. The browser allows 8 active gl contexts on a single page, so we have not yet seen our first "context lost" event take place. All 8 dismissed gl plots are re-rendering as rapidly as they can into texture memory behind the scenes, taking CPU, GPU, battery, memory, resources, and keeping their gl contexts active.

Our persistent user brings the total above 20 plots. Now the browser issues a warning: You can't have 20-some active gl contexts in a single page, so the oldest context is lost, and the page receives its very first event that one of its many contexts was lost. The code may react by following the advice on the above posts regarding reclaiming lost contexts, but the browser simply isn't going to allow more than the maximum number of contexts. So the moment the page tries to reclaim its first lost context, the browser can only restore that context by bumping off one of the more recent gl contexts, generating a new context lost event for that one. Worst case, this can trigger an endless game of whack-a-mole where the page has MAX+1 contexts and keeps asking to restore one, and keeps losing another one in the process, over and over.

In general, I would say the responsibility for handling lost contexts rests with the user app, not the Plotly library, since only the app knows which contexts are still needed by the user. At most, Plotly could offer a helper function for the app to call if the app ever receives a context lost event on a context that is known to still be actively used and visible to the user. But in most cases, the oldest gl context is likely not being shown to the user anymore, and should actually be lost and stay lost, to keep the active context count low.

As for requestAnimationFrame, I see two approaches: (a) detect when the plot is removed from the page, as this PR does, and automatically halt the render loop when that happens, or (b) make the purge function more robust, more publicly prominent, and encourage apps to call it when dismissing old plots, even if a new plot is not immediately forthcoming, so purge can be responsible for manually halting any gl render loops and destroying the associated context.

@etpinard
Copy link
Contributor

etpinard commented Jan 8, 2016

Ok. I see.

There’s no way, in general, to match a context lost event to a particular animation frame.

@etpinard etpinard closed this Jan 8, 2016
@etpinard etpinard reopened this Jan 8, 2016
@etpinard
Copy link
Contributor

etpinard commented Jan 8, 2016

Continuing ...

But, what if instead, newPlot (via Plots.purge) would clear the animation frame before requesting a new one on the same graph div?

Wouldn't that solve your issue (assuming that your buttons are all linked to the same graph div) ?

@bpostlethwaite
Copy link
Member

@mikolalysenko interesting topic, you have some input?

@etpinard etpinard added this to the On-par gl2d milestone Feb 24, 2016
@mikolalysenko
Copy link
Contributor

Sorry for the long delay in getting to this, but if I am correct in understanding what this PR does is that it stops rendering for a plot if its canvas is removed from the document? It seems like just checking if the canvas is stuck in the body could have unintended consequences for things like the image server or some applications.

I think the real problem is that Plots.purge() isn't calling the scene destructor. Probably the right thing to do is to stick in a call to scene.destroy() on any active 2D/3D scene objects.

Now that I look at this, I wonder if there are other places in the code where this happening now.

@etpinard
Copy link
Contributor

@emackey PR #300 is now merged and should take care of the issues you put forward here.

@etpinard etpinard closed this Mar 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants