Plotting on a datetime axis with different levels of granularity creates misleading plots #15874
Comments
So, this is unfortunate... The real fix will be fixing #15071, which is a good bit of work. Since the index is regular frequency, we use Another workaround is to draw the
In [89]: ax.axvline(pd.Period('2017-01-04T12'), color='r')
Out[89]: <matplotlib.lines.Line2D at 0x111f91518>
I don't think there's any way we can reliably detect this and warn the user though, at least not without false positives :/
Aha. I suspected this could be pretty complicated under the hood! Is there anywhere sensible we could update things in the documentation to warn people? If there is, I'll try and submit a PR sometime over the next couple of days. (By the way that Period fix didn't work for me - at least not when substituted directly for the
The best is probably near where we document
Huh, I thought that would do it, sorry.
There's a lot of magic going on in how the datetime64 values actually encode datetimes in plots. Sharing an axis across (sub)plots is brittle w.r.t. these differences. The workaround here: give every timestamp a non-zero value for seconds by simply adding one second, shifting the whole data set by one second to the left. That prevents, I guess, an optimization from kicking in which would detect that individual timestamps hit the full hour or integer multiples of 30 or 15 minutes. Also see pandas-dev/pandas#15874 pandas-dev/pandas#15071 pandas-dev/pandas#31074 pandas-dev/pandas#29705 pandas-dev/pandas#29719 pandas-dev/pandas#18571 pandas-dev/pandas#11574 pandas-dev/pandas#22586
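For anyone wanting to try the one-second workaround from the comment above, here is a minimal sketch. The series and its dates are made up for illustration, and whether the shift is acceptable depends on how precise the plot needs to be.

```python
import pandas as pd

# Hypothetical daily series; every timestamp sits exactly on midnight.
s = pd.Series(range(4), index=pd.date_range("2017-02-01", periods=4, freq="D"))

# Add one second to every timestamp so that none of them falls on a
# full hour (or a multiple of 15 or 30 minutes) any more.
shifted = s.copy()
shifted.index = s.index + pd.Timedelta(seconds=1)

print(shifted.index[0])
```

The values are untouched; only the index moves, so the shifted series can be plotted in place of the original.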
Issue setup:
Problem description
If you plot a Series with a DatetimeIndex of 'daily granularity', and subsequently add other things (e.g. axvline, scatter...) to that plot at a finer level of granularity (e.g. hourly), then the other things you add will be 'snapped' to the nearest day.
If, for example, we try and plot a vertical line at 12pm on the 4th Feb, the line will snap to midnight:
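The original snippet is not preserved in this copy, so what follows is only a rough reconstruction with made-up data. It plots a daily series first and then asks for a vertical line at noon. The last lines rest on an assumption: that the axis resolves positions as daily periods, which would drop the time of day.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one value per day (regular daily frequency).
s = pd.Series(range(7), index=pd.date_range("2017-02-01", periods=7, freq="D"))
noon = pd.Timestamp("2017-02-04 12:00")

fig, ax = plt.subplots()
s.plot(ax=ax)                # the daily series is plotted first
ax.axvline(noon, color="r")  # requested at noon; reported to be drawn at midnight

# Assumption: positions are resolved as daily periods, losing the time of day.
snapped = pd.Period(noon, freq="D").start_time
print(noon, "->", snapped)
```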
Reversing the order in which the things are plotted fixes the problem:
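Again the original code is missing here, so this is a sketch with the same hypothetical data, only with the two plotting calls swapped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one value per day (regular daily frequency).
s = pd.Series(range(7), index=pd.date_range("2017-02-01", periods=7, freq="D"))
noon = pd.Timestamp("2017-02-04 12:00")

fig, ax = plt.subplots()
ax.axvline(noon, color="r")  # the finer-grained artist goes on the axes first
s.plot(ax=ax)                # the daily series is added afterwards
```

Drawing the line first means the axes are already set up for plain datetimes before the daily series arrives, which seems to be why the order matters.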
An alternative fix is to specify the DatetimeIndex at a finer granularity:
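The code for this variant is not preserved either. One way it might look, assuming hourly data is available (the values below are invented), is to build the index with an hourly frequency so that noon is a representable position. The final check relies on the same assumption as before, namely that positions are resolved at the index frequency.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one value per hour over the same seven days.
hourly = pd.date_range("2017-02-01", periods=7 * 24, freq=pd.offsets.Hour())
s = pd.Series(range(len(hourly)), index=hourly)
noon = pd.Timestamp("2017-02-04 12:00")

fig, ax = plt.subplots()
s.plot(ax=ax)
ax.axvline(noon, color="r")

# At hourly resolution the requested time survives unchanged.
resolved = pd.Period(noon, freq=pd.offsets.Hour()).start_time
print(noon, "->", resolved)
```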
So, to wrap up, not a major problem, but certainly something that could catch you unawares... Perhaps this is more an issue on the matplotlib side though?
Output of pd.show_versions()
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.12.0
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.2.2
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.3
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: None
pandas_datareader: None