df.plot() very slow compared to explicit matplotlib on large dataframes #18236
Comments
Could you do some profiling to see where the times differ?
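For readers who have not profiled Python code before, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. The helper `profile_call` and the stand-in function `format_all` are hypothetical names made up for illustration; they are not part of pandas or matplotlib, and this is not the script used in this thread.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Profile a single call; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)  # keep only the top entries
    return result, buffer.getvalue()


def format_all(values):
    # Hypothetical stand-in for a per-element formatting hot spot.
    return [str(v) for v in values]


result, report = profile_call(format_all, range(100_000))
print(report)
```

With a DataFrame `df` in scope, `profile_call(df.plot)` would be the analogous call; the entries with the largest cumulative time are the first places to look.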
I am new to profiling in Python, but I guess the (unnecessary) time is spent in:
So pandas does a list comprehension over all elements in the index using the function. Is that necessary when you (at least in my case) can send the data directly to matplotlib?
The following change to
but this version has a problem if get_xticks() doesn't return an integer index. I also don't really understand how the old code works well in that case (returning '' is of course better than an index error, but I would think it would be good to have a tick there in that case too).
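To make the trade-off discussed here concrete, the sketch below contrasts formatting a label for every index element with formatting only the elements that sit under a tick. It is an illustration under stated assumptions, not the pandas code and not the change proposed in this thread: `eager_labels`, `lazy_labels`, and the `fmt` parameter are hypothetical names, and a plain list stands in for the index.

```python
def eager_labels(index, ticks, fmt=str):
    """Format every index element up front, then look up the tick positions."""
    labels = dict(zip(range(len(index)), [fmt(key) for key in index]))
    return [labels.get(t, "") for t in ticks]


def lazy_labels(index, ticks, fmt=str):
    """Format only the elements that sit under a tick position."""
    out = []
    for t in ticks:
        # Ticks that are non-integer (or NaN/inf) or out of range get an
        # empty label instead of raising an IndexError.
        if float(t).is_integer() and 0 <= int(t) < len(index):
            out.append(fmt(index[int(t)]))
        else:
            out.append("")
    return out


index = list(range(1000, 2000))  # hypothetical index values
ticks = [-200.0, 0.0, 250.0, 250.5, 999.0, 1200.0]
print(lazy_labels(index, ticks))
# → ['', '1000', '1250', '', '1999', '']
```

Both functions return the same labels for these ticks, but the lazy version calls `fmt` once per usable tick rather than once per index element, which is the cost the comment above is questioning.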
Not sure either. If your index is meaningless, we have the
Yeah, indeed it is a bit strange, as in many cases
Anyhow, I have a PR that should fix this: #18373
Result on my slow computer:
matplotlib took:0.48404979705810547
pandas took: 47.75511360168457
Why is df.plot() so much slower than the explicit matplotlib plot when plotting the same dataframe?
The above is a simple example, but I encountered the same difference on real-world data with times as the index when run on a much faster computer with more memory (a recent Anaconda download).
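A comparison of this kind could be set up roughly as sketched below. This is not the reporter's script: the frame contents, the row count, the headless `Agg` backend, and the helper names `timed` and `compare_plot_times` are all assumptions made for illustration, and the timings it produces will differ by machine and library version.

```python
import time


def timed(func, *args, **kwargs):
    """Run func once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return time.perf_counter() - start, result


def compare_plot_times(n_rows=100_000):
    """Return (matplotlib_seconds, pandas_seconds) for plotting the same data."""
    # Third-party imports are local so the timing helper stays stdlib-only.
    import matplotlib
    matplotlib.use("Agg")  # assumption: headless backend for scripted runs
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    # Hypothetical data; the frame used in the report is not shown here.
    df = pd.DataFrame({"a": np.random.randn(n_rows)})

    fig, ax = plt.subplots()
    t_mpl, _ = timed(ax.plot, df.index.values, df["a"].values)
    plt.close(fig)

    fig, ax = plt.subplots()
    t_pd, _ = timed(df.plot, ax=ax)
    plt.close(fig)
    return t_mpl, t_pd
```

Calling `compare_plot_times()` and printing the two values gives a like-for-like measurement of the plot calls themselves; rendering time (for example `fig.savefig`) is deliberately left out of this sketch.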
pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: 0.26.1
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 0.9.8
lxml: 3.8.0
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None