Skip to content

fix: Performance improvement for scattergl with many points. Issue #7065 #7301

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 16, 2024

Conversation

giuseppe-straziota
Copy link
Contributor

@giuseppe-straziota giuseppe-straziota commented Dec 7, 2024

The performance issues are always intriguing, and I tried to determine why the browser freezes when handling a large number of points. I believe the problem stems from two main factors.

The first factor is the hover effect, which triggers a function that calculates the distance between the mouse pointer and each point in the set. This process can certainly contribute to the performance problem, I believe the best improvement would be using a filter calculated by a window determined by the pointer coordinates with a delta, in this way, we can calculate the distance only for a subset. One possible improvement could be using squared distances for comparison, thereby avoiding the computation of square roots, similar to the method utilized by the K-d tree algorithm, if the plot allows it.

The second factor involves a loop in the part of the code I modified, where "newDistance" is created for every point. This results in excessive overhead for the garbage collector, causing the browser to freeze.

for(var i = 0; i < cd.length; i++) {
            var newDistance = distfn(cd[i]);
            if(newDistance <= pointData.distance) {
                pointData.index = i;
                pointData.distance = newDistance;
            }
        }

I moved the variable outside the cycle and used a precalculated value for the array length. I conducted tests in the development environments provided by Plotly using the test_dashboard tool.
The values represent average times measured in milliseconds..

6000 10000 100000 1Milion 10Milion 20Milion
inside for 0,841015625 1,892871094 17,34858398 154,4843262 1412,1573 2733,622534
outside for 0,7019287109 1,51418457 13,50952148 113,6539307 1057,211401 1078,00

The red line shows the performance for the variable inside, the blue line outside. The vertical axis (Y-axis) represents the time measured in milliseconds.
image

Of course, the performance depends on the engine and the environment in which it runs, but I think this conveys the idea of performance.

Please let me know if this can be the first step to improving performance.

@gvwilson gvwilson requested a review from archmoj December 9, 2024 13:52
@gvwilson gvwilson added community community contribution fix fixes something broken performance something is slow labels Dec 9, 2024
@gvwilson
Copy link
Contributor

gvwilson commented Dec 9, 2024

@archmoj please have a look after the geojson stuff is sorted out - thank you

@giuseppe-straziota giuseppe-straziota changed the title fix: Performance improvement for scattergl with many points. Issue #7056 fix: Performance improvement for scattergl with many points. Issue #7065 Dec 10, 2024
@giuseppe-straziota
Copy link
Contributor Author

I apologize, everyone; I mistakenly referenced the wrong issue. The correct issue number is #7065 sorry.

Copy link
Contributor

@archmoj archmoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent PR. 🏆
Many many dancers:
💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃 💃

@archmoj archmoj merged commit f8d18de into plotly:master Dec 16, 2024
1 check passed
@giuseppe-straziota
Copy link
Contributor Author

I'm glad to have been helpful :)

@giuseppe-straziota giuseppe-straziota deleted the performance-improvement branch December 16, 2024 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community community contribution fix fixes something broken performance something is slow
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants