Improved detection of NaN values in axis_autotype #2473

Closed
tylerdbristow opened this issue Mar 13, 2018 · 6 comments · Fixed by #3070
Comments

@tylerdbristow

Dear Plotly,

Thank you for your open source plotly.js library. We are finding it very useful for our renewable energy and meteorological data. We have found an issue and would like to report it in hopes of finding a resolution.

We have time series data, for example 2 months of Clear Sky Insolation on a Horizontal Surface at a given location. We like to plot this data using line and histogram charts. When data is missing, we fill the gaps with 'nan' so that those points are excluded from the charts. Sometimes (we don't know the exact threshold), when there are too many missing values, the line chart's Y axis is no longer ordered by value: 'nan' appears on the axis as if it were a value, the other values appear in seemingly arbitrary order, and the entire graph is skewed. The only workaround we have found is to not show a chart when a given time series and parameter has too many missing values, although sometimes it appears to work as expected. Here is our application: power.larc.nasa.gov/data-access-viewer

I've attached two screenshots. In the first image, the line chart excludes the 'nan' values and removes them from the line; in the second, it adds 'nan' to the Y axis as if it were a value.

Thank you for your assistance with this issue.

Feel free to reach me at [email protected]

powerchart1
powerchart2

@alexcjohnson
Collaborator

This is an artifact of the algorithm we use to infer axis type. I guess you're using the string 'nan' rather than the "number" NaN. It might be smart for us to treat that string (and perhaps related values like 'NaN' and 'n/a') as one that doesn't count toward the categorical data point count. In the meantime, if you explicitly specify yaxis: {type: 'linear'}, you'll get the behavior you want.
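For readers who want to see the workaround in context, here is a minimal sketch of a line chart whose y data contains the string 'nan', with the y axis type set explicitly. The sample values, the dates, and the 'chart' element id are assumptions made up for illustration; only yaxis: {type: 'linear'} comes from the comment above.

```javascript
// Illustrative data only: a short series with the string 'nan' marking gaps.
const x = ['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04', '2018-01-05'];
const y = [5.1, 'nan', 'nan', 'nan', 6.0];

const data = [{ type: 'scatter', mode: 'lines', x: x, y: y }];

// Setting the axis type explicitly means the axis type is not inferred
// from the data, so the 'nan' strings cannot tip it toward 'category'.
const layout = { yaxis: { type: 'linear' } };

// 'chart' is a hypothetical element id; the call only runs where plotly.js is loaded.
if (typeof Plotly !== 'undefined') {
  Plotly.newPlot('chart', data, layout);
}
```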

@tylerdbristow
Author

Thank you Alex for your response. Will that (yaxis: {type: 'linear'}) work for the histograms also?

@alexcjohnson
Collaborator

Will that (yaxis: {type: 'linear'}) work for the histograms also?

Yes, pretty much all (cartesian) trace types pass their data into this same autotype routine. If you know the axis type ahead of time, you might as well provide it explicitly, as it avoids this kind of error and will also save the (brief) time needed to run autotype.

@alexcjohnson
Collaborator

A possibly more robust solution than adding more "exempt" strings (it would be language-agnostic, for example): count only distinct values when comparing the category count against the number count. Normally you'll have only one "missing" value, or possibly two if one marker is used for values missing within the series and a different one for values missing off the end of the series, whereas real category data will have multiple distinct non-numeric categories.
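The distinct-value idea can be sketched roughly as follows. This is not plotly.js's actual autotype code: inferAxisType is a hypothetical helper, and the decision rule (category only when distinct strings outnumber distinct numbers) is an assumed threshold chosen just to illustrate the proposal.

```javascript
// Hypothetical helper, NOT the plotly.js implementation.
// Compares the number of DISTINCT non-numeric strings against the number of
// DISTINCT numeric values, so a repeated placeholder such as 'nan' counts once.
function inferAxisType(values) {
  const distinctNumbers = new Set();
  const distinctStrings = new Set();

  for (const v of values) {
    if (v === null || v === undefined) continue;
    if (typeof v === 'string' && v.trim() === '') continue; // skip blank strings

    const n = typeof v === 'number' ? v : Number(v);
    if (Number.isFinite(n)) {
      distinctNumbers.add(n);
    } else if (typeof v === 'string') {
      distinctStrings.add(v); // e.g. 'nan', 'n/a', or a real category label
    }
    // A numeric NaN falls through and counts toward neither set.
  }

  // Assumed rule: category only if distinct strings outnumber distinct numbers.
  return distinctStrings.size > distinctNumbers.size ? 'category' : 'linear';
}

// Five 'nan' strings outnumber three numbers by raw count,
// but there is only one distinct string against three distinct numbers.
const mostlyMissing = [1, 2, 'nan', 'nan', 'nan', 'nan', 'nan', 3];
const realCategories = ['apples', 'pears', 'plums', 1];
console.log(inferAxisType(mostlyMissing), inferAxisType(realCategories));
```

With this rule a single placeholder string cannot outvote numeric data no matter how often it repeats, and no list of language-specific "missing" spellings is needed.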

@tylerdbristow
Author

Thanks. I went ahead and replaced all -999 (missing values) with NaN rather than 'nan' and I think that fixed the histograms and line charts. I will keep you posted but thanks again for your help.
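A replacement like the one described here could look roughly like the sketch below. The -999 sentinel comes from the comment above; the MISSING constant, the replaceMissing helper, and the sample values are assumptions for illustration.

```javascript
// Sentinel used for missing values in the source data (from the comment above).
const MISSING = -999;

// Hypothetical helper: swap the sentinel for the numeric NaN (not the string 'nan').
function replaceMissing(values) {
  return values.map(v => (v === MISSING ? NaN : v));
}

const raw = [5.2, -999, 4.8, -999, 6.1]; // illustrative values
const cleaned = replaceMissing(raw);
console.log(cleaned);
```

Note that JSON.stringify writes NaN as null, so if the cleaned array travels as JSON it arrives as null rather than NaN.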

alexcjohnson changed the title from "Line/Histogram Chart Issue" to "Improved detection of NaN values in axis_autotype" on Mar 13, 2018
@etpinard
Contributor

Related: #1413

3 participants