API/CLN: timeseries plotting #15071
Labels
API Design
Clean
Needs Discussion
Requires discussion from core team before further action
Visualization
plotting
Comments
As I recall, the time series plotting with periods originated in scikits.timeseries. I am not especially attached to it -- if you can unify / have a single code path for plotting without significantly changing functionality, sounds good to me.
jgehrcke added a commit to jgehrcke/ci-analysis that referenced this issue on Dec 7, 2020:
There's a lot of magic going on in how the datetime64 values actually encode datetimes in plots. Sharing an axis across (sub)plots is brittle w.r.t. these differences. Work around this here: make it so that individual timestamps have a non-zero value for seconds, by simply adding one second, shifting the whole data set by one second to the left. That prevents, I guess, an optimization from kicking in that would detect that individual timestamps fall on the full hour or on integer multiples of 30 or 15 minutes. Also see pandas-dev/pandas#15874 pandas-dev/pandas#15071 pandas-dev/pandas#31074 pandas-dev/pandas#29705 pandas-dev/pandas#29719 pandas-dev/pandas#18571 pandas-dev/pandas#11574 pandas-dev/pandas#22586
Inspired by the timedelta plotting issue, I thought I would look again at our time series plotting machinery. We know it is quite complex, and because of that several bugs, inconsistencies and unexpected behaviours exist (e.g. different results depending on the order in which several series are plotted, or wrong results when combining different types of time series; see among others #9053, #6608, #14322, ...).
There has been some discussion related to this on the tsplot refactor PR of @sinhrks #7670 (not merged).
One of the reasons for the complexity is the distinction between 'irregular' and 'regular' time series (see e.g. #7670 (comment)):
`x_compat=True`.
So part of the problems and confusion comes from the differences between the two (e.g. different label formatting) and from combining them, which leads to the question:
Do we need both types of timeseries plotting?
The question is why we convert a DatetimeIndex to periods for plotting. The reasons I can think of:
- `ts.plot()` is faster than `ts.plot(x_compat=True)`. However, I think this could be solved, as most of the time is spent converting the datetimes to floats (which should be vectorizable).

Other reasons that I am missing?
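To illustrate why the datetime-to-float conversion should be cheap, here is a minimal stdlib-only sketch. It assumes timestamps stored as integer nanoseconds since 1970-01-01 and an x axis measured in float days since that same epoch; the epoch choice and the helper name `ns_to_float_days` are assumptions made for illustration, not pandas or matplotlib code.

```python
from datetime import datetime, timezone

NS_PER_DAY = 86_400 * 10**9  # nanoseconds in one day


def ns_to_float_days(ns_since_epoch):
    """Convert integer nanoseconds since 1970-01-01 to float days since that epoch."""
    return [ns / NS_PER_DAY for ns in ns_since_epoch]


# Hypothetical sample data: three timestamps, half a day apart.
stamps = [
    datetime(2017, 1, 1, tzinfo=timezone.utc),
    datetime(2017, 1, 1, 12, tzinfo=timezone.utc),
    datetime(2017, 1, 2, tzinfo=timezone.utc),
]
# Store them the way a nanosecond-resolution index would: as plain integers.
ns_values = [int(s.timestamp()) * 10**9 for s in stamps]

days = ns_to_float_days(ns_values)
print(days)  # [17167.0, 17167.5, 17168.0]
```

The conversion is a single division per element, so on an array of integers it could be done as one array operation instead of a Python-level loop over datetime objects.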
But there are also clear drawbacks. Apart from the things mentioned above, you sometimes get clearly wrong behaviour: see e.g. the plot in #7670 (comment). In this case, dates that fall somewhere within a month are snapped to the month edges when a regular series with monthly frequency is plotted first.
Another example of 'wrong' plotting is a yearly series (but with freq 'A-dec', so end of year) plotted at the beginning of the year. See http://nbviewer.jupyter.org/gist/jorisvandenbossche/c0c68dce2fa02f1dfc4a8c343ec88cb6. But of course, in many cases, this behaviour can also be the desired behaviour.
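The snapping described above can be sketched without any plotting library. The following stdlib-only example compares a period-style x position (an integer count of months or years since 1970) with a float-days position. The helper names are hypothetical and this is not the pandas implementation; it only shows that a period-based axis cannot tell apart dates within the same period.

```python
from datetime import date


def month_ordinal(d):
    """Months elapsed since 1970-01: the date's slot on a monthly period axis."""
    return (d.year - 1970) * 12 + (d.month - 1)


def year_ordinal(d):
    """Years elapsed since 1970: the date's slot on a yearly period axis."""
    return d.year - 1970


def float_days(d):
    """Days elapsed since 1970-01-01 as a float: a continuous date axis."""
    return float((d - date(1970, 1, 1)).days)


mid_month = date(2017, 1, 15)
month_end = date(2017, 1, 31)

# On the monthly period axis both dates share one x position ...
print(month_ordinal(mid_month), month_ordinal(month_end))  # 564 564
# ... while on the float axis they stay 16 days apart.
print(float_days(mid_month), float_days(month_end))  # 17181.0 17197.0

# A year-end timestamp gets the same yearly slot as the first day of that year.
print(year_ordinal(date(2016, 12, 31)), year_ordinal(date(2016, 1, 1)))  # 46 46
```

Whether that loss of within-period position is a bug or the desired behaviour depends on the use case, which is the trade-off raised above.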
But do we need both? Would we want, if possible, to unify into one approach?
Can we unify both approaches?
Can we just use the matplotlib floats for timeseries plotting? Or always use the period-based machinery?
cc @pandas-dev/pandas-core (especially @TomAugspurger and @sinhrks, as I think you have been most involved in the plotting code recently, or @wesm for a historical viewpoint)
I know it's a long issue, but if you could give it a read and give your thoughts on this, very welcome!