Pandas .plot() on regularly spaced timeseries result in slow plotting/interaction #31074
maybe you want |
jgehrcke added a commit to jgehrcke/ci-analysis that referenced this issue on Dec 7, 2020
There's a lot of magic going on in how the datetime64 values actually encode datetimes in plots. Sharing an axis across (sub)plots is brittle w.r.t. these differences. Work around this here: give every timestamp a non-zero value for seconds by simply adding one second, shifting the whole data set by one second to the left. That prevents, I guess, an optimization from kicking in that would detect that individual timestamps hit the full hour or integer multiples of 30 or 15 minutes. Also see pandas-dev/pandas#15874, pandas-dev/pandas#15071, pandas-dev/pandas#31074, pandas-dev/pandas#29705, pandas-dev/pandas#29719, pandas-dev/pandas#18571, pandas-dev/pandas#11574, pandas-dev/pandas#22586.
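The one-second shift described in that commit message could look roughly like the sketch below. The data, the 15-minute spacing, and the variable names are made up for illustration and are not taken from the referenced commit. Whether the shift actually avoids the slow code path is the commit author's guess and is not verified here.

```python
import pandas as pd

# Hypothetical data: timestamps that fall exactly on multiples of 15 minutes.
idx = pd.date_range("2020-12-07", periods=8, freq=pd.Timedelta(minutes=15))
s = pd.Series(range(8), index=idx)

# Add one second to every timestamp so that none of them has seconds == 0.
shifted = s.copy()
shifted.index = s.index + pd.Timedelta(seconds=1)

print(shifted.index[:3])
```

The values are untouched; only the index moves, so the shift is invisible at typical plot resolutions.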
I think this is a duplicate (core issue) of #10578, so closing in favor of that issue.
Code Sample, a copy-pastable example if possible
Problem description
I noticed that plotting time series using .plot() sometimes resulted in very slow and unresponsive plots, where it is difficult to interact with the figure (e.g. pan). I think this happens with regularly spaced time series where there either is a frequency defined or pandas is able to infer the frequency. Perhaps it has something to do with the plot tick labels on the x (time) axis? It does not happen when plotting time series that are irregular, and thus when pandas does not style the plot ticks and tick labels.
In the example code the first and the third plot are smooth to interact with, while the second plot is lagging terribly. Also see screenshot of the second plot with the pandas styled tick labels:
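A minimal sketch of the three-plot comparison described above, for readers who want to try it. This is not the original code sample: the series length, the random data, and the way the irregular series is built are all assumptions. Only the name `dt_range` and the minute spacing come from the description.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove to interact with the figure
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumed size: 10,000 points at one-minute spacing.
n = 10_000
dt_range = pd.date_range("2020-01-01", periods=n, freq=pd.Timedelta(minutes=1))
regular = pd.Series(np.random.default_rng(0).standard_normal(n).cumsum(), index=dt_range)

# Irregular variant (assumption): drop 100 random rows so pandas cannot infer a frequency.
keep = np.sort(np.random.default_rng(1).choice(n, size=n - 100, replace=False))
irregular = regular.iloc[keep]

fig, axes = plt.subplots(3, 1, figsize=(8, 9))
axes[0].plot(regular.index, regular.values)  # plot 1: pyplot on the regular series
regular.plot(ax=axes[1])                     # plot 2: pandas .plot() on the regular series
irregular.plot(ax=axes[2])                   # plot 3: pandas .plot() on the irregular series
```

With an interactive backend, panning the second axes is where the lag described above would be expected to show up.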

If the dt_range is changed from minute frequency to hour frequency (replace minutes with hours), the pandas .plot() becomes much smoother to interact with despite the series having the same size, I think because it has far fewer ticks and labels. So it might be a combination of the size of the series plotted and how the ticks are drawn/updated?
Are there ways to disable the styled pandas ticks somehow? I also notice that pyplot.plot() and pandas.plot() result in very different conversions of timestamps to numeric x values - is there a way to also disable that behavior in pandas so it is compatible with pyplot?
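On the question of disabling the pandas tick styling: pandas documents an `x_compat` plotting option that suppresses its automatic tick resolution adjustment for regular time series and hands the dates to matplotlib instead. The sketch below only shows that the two modes put different numeric values on the x axis. The data is made up, and whether `x_compat=True` also removes the lag reported here is not confirmed in this thread.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical regular series at one-minute spacing.
idx = pd.date_range("2020-01-01", periods=1_000, freq=pd.Timedelta(minutes=1))
s = pd.Series(np.arange(1_000, dtype=float), index=idx)

fig, (ax_default, ax_compat) = plt.subplots(2, 1)
s.plot(ax=ax_default)                # default: pandas' own time-series axis handling
s.plot(ax=ax_compat, x_compat=True)  # x_compat: matplotlib-style date numbers

# The numeric x limits differ because the timestamps are converted differently.
print(ax_default.get_xlim())
print(ax_compat.get_xlim())
```

The same switch can be applied globally through `pd.plotting.plot_params["x_compat"] = True`.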
Expected Output
Equally responsive plotting as with pyplot.plot()
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en
LOCALE : None.None
pandas : 0.25.3
numpy : 1.17.4
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 44.0.0.post20200106
Cython : 0.29.14
pytest : 5.3.2
hypothesis : 4.54.2
sphinx : 2.3.1
blosc : None
feather : None
xlsxwriter : 1.2.7
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : 7.11.1
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.2
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.2
sqlalchemy : 1.3.12
tables : 3.6.1
xarray : 0.14.1
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.7
Edit: This performance issue might be related to #15071