Speed up Lib.extendDeep (_extend) for arrays that have no object, array elements #732

monfera · 2016-07-11T09:47:27Z

This PR has the same effect on speed as the suggestion in #726 (comment)

That suggestion isn't made obsolete by this PR: extendDeep is, by its nature, still expensive, and with large datasets Plotly.restyle will still allocate and populate very large arrays (though much faster if this PR is merged).

Similarly, this PR can be useful even outside the original speed problem, as it generally speeds up structure extension when large, simple arrays are present.
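
For readers who want a concrete picture of the idea, here is a minimal sketch. It is not the actual patch: the names hasNoObjectOrArrayElements and cloneValue are made up for illustration, and it assumes plain objects and arrays only, whereas the real _extend handles more cases (several sources, plain-object checks). The point is just that an array with no object or array elements can be copied in one shallow pass instead of element-by-element recursion.

```javascript
// Hypothetical helper (not from the PR): true when no element is an
// object or an array, so every element can be copied by value.
function hasNoObjectOrArrayElements(arr) {
    for (var i = 0; i < arr.length; i++) {
        if (arr[i] !== null && typeof arr[i] === 'object') return false;
    }
    return true;
}

// Simplified deep copy (illustration only; assumes plain objects and arrays).
function cloneValue(value) {
    if (Array.isArray(value)) {
        // Fast path: nothing inside needs its own deep copy,
        // so one shallow copy is already a complete copy.
        if (hasNoObjectOrArrayElements(value)) return value.slice();
        // Slow path: some elements are containers and must be recursed into.
        return value.map(function(item) { return cloneValue(item); });
    }
    if (value !== null && typeof value === 'object') {
        var out = {};
        Object.keys(value).forEach(function(key) {
            out[key] = cloneValue(value[key]);
        });
        return out;
    }
    return value; // numbers, strings, booleans, null, undefined
}
```

Either path returns a fresh array, so the existing extendDeep behavior (callers never share arrays with the source) is preserved.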

rreusser · 2016-07-11T11:22:01Z

On a related note: based on animation needs, I made a function called extendDeepNoArrays that deep-merges objects but simply transfers array references without copying them at all.
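
A rough sketch of that behavior, for illustration only: the function name mergeKeepingArrayRefs is hypothetical and this is not the code from the animation branch. It assumes plain objects and a single source. Objects are merged recursively, while any array is assigned by reference.

```javascript
// Illustrative stand-in for the idea described above, not plotly.js code.
function mergeKeepingArrayRefs(target, source) {
    Object.keys(source).forEach(function(key) {
        var value = source[key];
        if (Array.isArray(value)) {
            // Arrays are treated as data: share the reference, copy nothing.
            target[key] = value;
        } else if (value !== null && typeof value === 'object') {
            // Plain objects are merged recursively into a fresh or existing object.
            var existing = target[key];
            var isObject = existing !== null && typeof existing === 'object' &&
                !Array.isArray(existing);
            target[key] = mergeKeepingArrayRefs(isObject ? existing : {}, value);
        } else {
            target[key] = value;
        }
    });
    return target;
}
```

The cost is then independent of array length, at the price that the result and the source share their data arrays.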

monfera · 2016-07-11T11:27:42Z

@rreusser thanks for mentioning it; it would lead to higher efficiency and DRYness (my change keeps the existing behavior). What I don't know is whether it's OK to use it in the place where the slowdown occurs, which is where an event is emitted: https://github.com/plotly/plotly.js/blob/master/src/plot_api/plot_api.js#L2036.

I suppose the extension is deep so that if a user modifies the contents of the callback value, it has no effect. If that's not a concern, then it would be worth doing a PR with your extendDeepNoArrays ahead of the other parts.

... on the other hand queue will be 0 length for most things anyway.

rreusser · 2016-07-11T12:24:13Z

Hmm… yeah, wasn't sure exactly what your need is, but we were talking and came to the conclusion (with maybe only an exception or two) that effectively any arrays in plotly json (i.e. traces—except for the array of traces themselves, of course) are data that can be transferred without copying.

Perhaps a common function that copies traces/layouts without arrays and handles any edge cases would make sense at some point.

rreusser · 2016-07-11T12:26:31Z

See: https://github.com/plotly/plotly.js/blob/animate-api-take-3/src/lib/extend.js#L27-L29

monfera · 2016-07-11T12:57:41Z

@rreusser awesome, good to know! In the current case, an array of arrays is passed for extension, so probably the
gd.emit('plotly_restyle', Lib.extendDeep([], [redoit, traces])) call would need to switch to {}s:
gd.emit('plotly_restyle', Lib.extendDeepNoArrays({}, {redo: redoit, traces: traces})).

... or actually, one level deeper as traces is an array, or just doing it manually here.

So it looks possible to directly use extendDeepNoArrays for this case. I don't know how structures are handled generally, since as you say, sometimes an encountered array is not a plain array but just an (ordered) container of other things (e.g. traces) - maybe the way you call it also has assurances that an array is always flat.

What's in this PR speeds up those cases where you can't make assumptions about which arrays need to be deep copied. This commit is compatible with extendDeep rather than extendDeepNoArrays. I'll leave it to consensus here if this has any utility. There are a few dozen calls to extendDeep, some are for top-level objects i.e. in cloneplot.js and plot_api.js. If of no use, I'll just close the PR.

monfera · 2016-07-11T13:09:04Z

Forgot to add one more thing that might help: it would be good in general to store large arrays as typed JS arrays. They are easy to test for, e.g. a instanceof Float32Array or ArrayBuffer.isView(arg), and have other benefits. On the cons side, you can't s(p)lice them or push etc. Also, they only admit IEEE numbers, which include NaN and the infinities but not null, so the meaning of missing data points would need to switch from null to NaN. All in all it would be a big change, but it would make it easy to tell the kinds of arrays apart.
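
To make the detection and the null-to-NaN point concrete, here is a small sketch. The helper names isTypedArrayLike and toFloat64WithNaN are hypothetical, and nothing here reflects how plotly.js stores data; it only uses the standard ArrayBuffer.isView and Float64Array built-ins.

```javascript
// True for typed arrays such as Float32Array. Note that ArrayBuffer.isView
// also returns true for DataView, so it is slightly broader than a
// "typed array" check.
function isTypedArrayLike(arg) {
    return ArrayBuffer.isView(arg);
}

// Hypothetical conversion: a typed array cannot hold null, so missing
// points (null) are stored as NaN instead.
function toFloat64WithNaN(values) {
    var out = new Float64Array(values.length);
    for (var i = 0; i < values.length; i++) {
        out[i] = values[i] === null ? NaN : values[i];
    }
    return out;
}
```

A plain array like [1, null, 3] would come out as a Float64Array holding 1, NaN, 3, and any code that currently checks for null would need to check with isNaN instead.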

rreusser · 2016-07-11T14:13:53Z

@monfera Agreed on locking down any corner cases where this might not be true.

And hmm… agreed that typed arrays have benefits, but I might need extra convincing that the difference is worthwhile and actually the bottleneck before worrying that it might be premature optimization. 😄

rreusser · 2016-07-11T14:35:22Z

Until then at least, I'll write a wrapper function to extend layouts and traces while simply moving array references with the goal of explicitly handling any corner cases where this transfer is not valid. (which, unfortunately, I think will be confined to my PR at the moment)

etpinard · 2016-07-11T15:15:37Z

@monfera Thanks for bringing this up!

Have you done any benchmarking comparing extendDeep before and after your patch? From what I understand, your patch here is compatible with @rreusser's extendDeepNoArrays (or any other future plotly-JSON-specific extend method).

More generally, I'm thinking that we may not need to extend the redoit objects and traces arrays before emitting plotly_restyle and plotly_relayout. Instead, we should extend them inside Queue.add which, together with #726, will speed up the (most common) case where the Queue is not activated.
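
A sketch of what moving the copy into the queue could look like. Everything here is hypothetical: makeQueueSketch, emitThenQueue, the maxLength parameter and the deepCopy stand-in are illustration names, not the plotly.js Queue API. The idea is only that the deep copy happens inside add, and is skipped entirely when the queue is inactive.

```javascript
// Stand-in for a deep extend (assumes plain objects and arrays only).
function deepCopy(value) {
    if (Array.isArray(value)) {
        return value.map(function(item) { return deepCopy(item); });
    }
    if (value !== null && typeof value === 'object') {
        var out = {};
        Object.keys(value).forEach(function(key) { out[key] = deepCopy(value[key]); });
        return out;
    }
    return value;
}

// Hypothetical queue: maxLength === 0 means "not activated".
function makeQueueSketch(maxLength) {
    var entries = [];
    return {
        entries: entries,
        add: function(redoit, traces) {
            if (maxLength === 0) return; // inactive: no copy is ever made
            entries.push(deepCopy([redoit, traces]));
            if (entries.length > maxLength) entries.shift();
        }
    };
}

// The event listener receives the originals; only the queue pays for a copy.
function emitThenQueue(listener, queue, redoit, traces) {
    listener(redoit, traces);
    queue.add(redoit, traces);
}
```

The trade-off is the one raised earlier in the thread: a listener that mutates the values it receives would then be mutating the originals.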

monfera · 2016-07-11T16:48:43Z

@etpinard yes, in the context of loading additional 1k points when we already have a lot of points. It's quite significant:

Numbers: given 100k points, adding 1k new points takes:

around 340ms with the old extendDeep
around 90ms with the extendDeep speedup

The actual speedup is even more, because, clearly, this is not the only contribution to the overall runtime, so I think it's more like e.g. 40ms vs 300ms so around 10x or more speedup.

etpinard · 2016-07-11T16:55:55Z

Numbers: given 100k points, adding 1k new points takes:

around 340ms with the old extendDeep
around 90ms with the extendDeep speedup

Amazing.

monfera · 2016-07-11T17:00:59Z

@etpinard if you want to deeply extend [redoit, traces] - whether here or in queue - I'd recommend either of:

using @rreusser 's extendDeepNoArrays if you're okay with not deep-copying the arrays - in which case you'd need to actually make the call to extendDeepNoArrays in some loop, because traces itself is an array, and extendDeepNoArrays just takes all arrays as pointers if I understand it
or using this PR, if you want to stick to the current extendDeep semantics (which gives fresh arrays), or if you don't want to manually pry apart and loop over things a layer or two below [redoit, traces]

Speed up Lib.extendDeep (_extend) for primitive arrays

363aba3

monfera force-pushed the speed-up-extend branch from 22a07cc to f3e7d5b Compare July 11, 2016 10:34

monfera changed the title ~~Speed up Lib.extendDeep (_extend) for primitive arrays~~ Speed up Lib.extendDeep (_extend) for arrays that have no object, array elements Jul 11, 2016

monfera mentioned this pull request Jul 11, 2016

[concept] Queue length limit #726

Closed

monfera force-pushed the speed-up-extend branch from f3e7d5b to 363aba3 Compare July 11, 2016 10:58

monfera mentioned this pull request Jul 11, 2016

Reifies not yet covered, preexisting behavior of Lib.extendDeep #733

Merged

etpinard added status: discussion needed labels Jul 11, 2016

etpinard added status: reviewable and removed status: discussion needed labels Jul 11, 2016

etpinard merged commit 624b64f into plotly:master Jul 11, 2016
