Iteration over DatetimeIndex stops at 10000 #21012
Comments
Thanks for the report, can confirm this regression.
I suppose this is the cause: #18848
Thanks for the comments, @jorisvandenbossche. I'm trying to make a PR, adding
It is complicated, so any improvements or another PR are welcome.
Not that it's likely to be a huge performance hit, but you could save some space by only tracking the current chunk length instead of keeping a list of lengths:
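A minimal sketch of the idea of tracking only the current chunk length. This is an illustration, not the commenter's original snippet and not the pandas implementation; the function name `iter_in_chunks` and the default chunk size are assumptions made for this example.

```python
def iter_in_chunks(values, chunksize=10000):
    """Yield every element of `values`, reading one chunk at a time.

    Hypothetical helper: only the length of the current chunk is tracked,
    rather than a precomputed list of all chunk lengths.
    """
    total = len(values)
    start = 0
    while start < total:
        # Length of the current chunk; the last chunk may be shorter.
        chunk_len = min(chunksize, total - start)
        for item in values[start:start + chunk_len]:
            yield item
        start += chunk_len


count = sum(1 for _ in iter_in_chunks(list(range(20000))))
print(count)  # 20000
```

Because the loop condition depends only on `start` and `total`, iteration continues past the first chunk without any per-chunk bookkeeping.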
Thank you, @cbertinato. This is simpler. But let
BTW, my change fails TestPanel.test_setitem, TestCategoricalRepr.test_categorical_index_repr_datetime, and TestCategoricalRepr.test_categorical_index_repr_datetime_ordered. It seems adding
I'm not sure, but I think this failure is a feature, not a bug. As for the former one, TestPanel.test_setitem, I have no idea at the moment. Sorry. pytest results:
This looks like the cause of the TestPanel.test_setitem failure.
It's a naive approach, but adding
The issue is not so much about dimensionality as it is about the identity of the index as an iterator. This leads to a somewhat deeper question: does it matter whether the index itself is an iterator? Perhaps the answer is: if the tests pass, it doesn't matter. If it does matter, then another solution is to make a separate iterator for indexes, instead of returning the index itself from
Anyhow, something along the lines of what you suggest would fix it, though something a bit more general, such as
A separate iterator sounds good.
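To illustrate the distinction discussed above, here is a minimal pure-Python sketch contrasting a container that is its own iterator with one that hands out a separate iterator object. All class names are hypothetical and nothing here is taken from the pandas code base.

```python
class SelfIteratingIndex:
    """Hypothetical container that acts as its own iterator."""

    def __init__(self, values):
        self._values = list(values)
        self._pos = 0  # iteration state lives on the container itself

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._values):
            raise StopIteration
        value = self._values[self._pos]
        self._pos += 1
        return value


class IndexIterator:
    """Hypothetical standalone iterator holding its own position."""

    def __init__(self, values):
        self._values = values
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._values):
            raise StopIteration
        value = self._values[self._pos]
        self._pos += 1
        return value


class SeparateIteratorIndex:
    """Hypothetical container that returns a fresh iterator each time."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        return IndexIterator(self._values)


shared = SelfIteratingIndex(range(3))
first_pass = list(shared)   # consumes the container's own state
second_pass = list(shared)  # state is already exhausted

fresh = SeparateIteratorIndex(range(3))
fresh_first = list(fresh)
fresh_second = list(fresh)

print(first_pass, second_pass)    # [0, 1, 2] []
print(fresh_first, fresh_second)  # [0, 1, 2] [0, 1, 2]
```

With a separate iterator, the container carries no iteration state, so code that checks whether an object is an iterator is not affected by the container itself.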
Code Sample
Problem description
Iteration over a DatetimeIndex stops unexpectedly after 10000 elements.
NOTE: this happens in 0.23.0rc2, but not in 0.22.0.
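A minimal sketch of a check for this behavior, which may differ from the reporter's original sample. The start date, frequency, and the three counting approaches are assumptions, chosen so that the result can be compared with the expected output below.

```python
import pandas as pd

# 20000 timestamps, i.e. more than the 10000 at which iteration reportedly stopped.
idx = pd.date_range("2018-01-01", periods=20000, freq="min")

n_list = len(list(idx))      # materialize through list()
n_gen = sum(1 for _ in idx)  # count through a generator expression

n_loop = 0
for _ in idx:                # count through a plain for loop
    n_loop += 1

counts = (n_list, n_gen, n_loop)
print(counts)
```

On a version without the regression, each count should equal the length of the index.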
Expected Output
(20000, 20000, 20000)
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.21.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: ja_JP.UTF-8
LOCALE: ja_JP.UTF-8
pandas: 0.23.0rc2
pytest: None
pip: 10.0.1
setuptools: 39.1.0
Cython: None
numpy: 1.14.3
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None