BUG: pd._libs.lib.is_iterator reports Timestamp as iterable on Cygwin #45158

Closed
2 of 3 tasks
DWesl opened this issue Jan 1, 2022 · 9 comments
Labels
Bug, Datetime, Upstream issue, Windows

Comments

@DWesl

DWesl commented Jan 1, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

>>> import pandas as pd
>>> stamp = pd.Timestamp("2010-01-03T07:23:48")
>>> pd._libs.lib.is_iterator(stamp)
True
>>> list(stamp)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Timestamp' object is not iterable
>>> pd.__version__
'1.3.4'
>>> import collections.abc
>>> isinstance(stamp, collections.abc.Iterator)
False
>>> next(stamp)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Timestamp' object is not an iterator
$ for ver in 3.{6,7,8}; do 
>     python${ver} -c "import collections.abc, pandas as pd; stamp = pd.Timestamp('1987-06-05T04:32:10'); 
> print(pd.__version__, pd._libs.lib.is_iterator(stamp), isinstance(stamp, collections.abc.Iterator))"; 
> done
1.1.4 True False
1.3.4 True False
1.3.4 True False

Issue Description

I have noticed two functions in pandas that use the above pattern, checking is_iterator and then calling list on the object if the check returns True:

pandas.core.indexing._LocationIndexer.__getitem__(self, tuple)
pandas.core.indexing._LocIndexer._getitem_axis(self, tuple)

This is causing tests to fail as I try to build 1.3.5. Both the attempt shown above and the attempt with 1.3.5 are on Cygwin. The problem does not occur with pandas 1.0.5, 1.1.2, or 1.3.5 on Windows. All attempts were made with the CPython implementation.
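
The pattern in question can be sketched in plain Python, without pandas. This is an illustration only: `normalize_key`, `always_true`, and `abc_check` are hypothetical names, and `datetime` stands in for `Timestamp` (which subclasses it). The real pandas code is more involved.

```python
from collections.abc import Iterator
from datetime import datetime


def normalize_key(key, is_iterator):
    """Turn one-shot iterators into a list; pass everything else through."""
    if is_iterator(key):
        # Only safe when is_iterator is right: list() needs a real iterable.
        return list(key)
    return key


def always_true(obj):
    """Hypothetical stand-in for a check that wrongly reports an iterator."""
    return True


def abc_check(obj):
    """A check based on the collections.abc.Iterator protocol."""
    return isinstance(obj, Iterator)


stamp = datetime(2010, 1, 3, 7, 23, 48)

# A correct check leaves the scalar untouched.
unchanged = normalize_key(stamp, abc_check)

# A false positive sends the scalar to list(), which raises TypeError.
try:
    normalize_key(stamp, always_true)
    failure = None
except TypeError as exc:
    failure = str(exc)
```

With a check that misreports a scalar as an iterator, the failure surfaces inside the indexing code rather than at the check itself, which matches the test failures described here.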

Expected Behavior

I would expect either is_iterator to return False or list to succeed.

Installed Versions

>>> pd.show_versions()
/usr/lib/python3.8/site-packages/h5py/__init__.py:36: UserWarning: h5py is running against HDF5 1.10.8 when it was built against 1.10.7, this may cause problems
  _warn(("h5py is running against HDF5 {0} when it was built against {1}, "

INSTALLED VERSIONS
------------------
commit           : 945c9ed766a61c7d2c0a7cbb251b6edebf9cb7d5
python           : 3.8.12.final.0
python-bits      : 64
OS               : CYGWIN_NT-10.0-19043
OS-release       : 3.3.3-341.x86_64
Version          : 2021-12-03 16:35 UTC
machine          : x86_64
processor        :
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.3.4
numpy            : 1.21.4
pytz             : 2021.3
dateutil         : 2.8.2
pip              : 21.3.1
setuptools       : 59.6.0
Cython           : 0.29.25
pytest           : 6.2.5
hypothesis       : 6.31.3
sphinx           : 4.0.3
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.7.1
html5lib         : 1.1
pymysql          : None
psycopg2         : None
jinja2           : 3.0.3
IPython          : 7.30.1
pandas_datareader: None
bs4              : 4.10.0
bottleneck       : 1.3.2
fsspec           : 2021.06.0
fastparquet      : None
gcsfs            : None
matplotlib       : 3.5.1
numexpr          : 2.8.0
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pyxlsb           : None
s3fs             : None
scipy            : 1.7.1
sqlalchemy       : 1.4.28
tables           : None
tabulate         : 0.8.7
xarray           : 0.18.2
xlrd             : None
xlwt             : None
numba            : None
DWesl added the Bug and Needs Triage labels Jan 1, 2022
DWesl changed the title from "BUG:" to "BUG: Timestamp reports as iterable on Cygwin" Jan 1, 2022
DWesl changed the title from "BUG: Timestamp reports as iterable on Cygwin" to "BUG: pd._libs.lib.is_iterator reports Timestamp as iterable on Cygwin" Jan 1, 2022
@mroeschke
Member

On OSX I get this on master

In [1]: >>> import pandas as pd
   ...: >>> stamp = pd.Timestamp("2010-01-03T07:23:48")
   ...: >>> pd._libs.lib.is_iterator(stamp)
Out[1]: False

> This is causing tests to fail as I try to build 1.3.5

Could you show a full traceback of the build logs?

mroeschke added the Build, Datetime, and Windows labels and removed the Needs Triage label Jan 13, 2022
@DWesl
Author

DWesl commented Jan 13, 2022

I'm not sure what you mean by "full traceback from build logs." The full build log is bigger than the maximum 65536-character limit for a comment, and I can't figure out how to attach.

@mroeschke
Member

You could pipe the build log to a text file and attach the file to a GitHub comment.

mroeschke added the Needs Info label Jan 13, 2022
@DWesl
Author

DWesl commented Jan 13, 2022

build.log
Found the attach button. I think this is the full build log for Python 3.7.

@DWesl
Author

DWesl commented Feb 6, 2022

I got a Cygwin compile running with GitHub Actions; this looks like one instance where this error arises. You should be able to see the build log from earlier on that same page.

@DWesl
Author

DWesl commented Feb 14, 2022

I updated the Cygwin compile to merge the latest main branch from pandas-dev/pandas: the same error still shows up in the test suite. You can see the full build log here.

What information do you still need?

mroeschke removed the Needs Info label Feb 14, 2022
@mroeschke
Member

Thanks for confirming @DWesl.

mroeschke removed the Build label Feb 14, 2022
@DWesl
Author

DWesl commented Feb 17, 2022

I just checked a few other cases, and it looks like PyIter_Check(obj) disagrees with isinstance(obj, collections.abc.Iterable) for os.environ and fractions.Fraction(0, 1), so I'm suspecting this is exposing an underlying CPython bug.
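
For reference, `collections.abc.Iterable` and `collections.abc.Iterator` test different things, so it matters which one is compared against `PyIter_Check`. Below is a small standard-library sketch on the two objects mentioned above. `describe` is a hypothetical helper, and `PyIter_Check` is a C API function that is not called here, so this shows only the Python-level side of the comparison.

```python
import os
from collections.abc import Iterable, Iterator
from fractions import Fraction


def describe(obj):
    """Return (is_iterable, is_iterator) according to the ABCs."""
    return isinstance(obj, Iterable), isinstance(obj, Iterator)


# os.environ defines __iter__ but not __next__: iterable, not an iterator.
env_flags = describe(os.environ)

# A Fraction defines neither method: not iterable, not an iterator.
frac_flags = describe(Fraction(0, 1))

# Calling iter() on an iterable yields an object that is both.
iter_flags = describe(iter(os.environ))
```

An object can be iterable without being an iterator, so a check meant to match `next(obj)` lines up with `Iterator`, while one meant to match `list(obj)` lines up with `Iterable`.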

DWesl added a commit to DWesl/pandas that referenced this issue Mar 22, 2022
Fixes pandas-dev#45158 

Keep original implementation if not on Cygwin, and choose the implementation at compile time to avoid performance hits from the extra branch.

[Test results on Cygwin are available here](https://github.com/DWesl/pandas/runs/5648361139?check_suite_focus=true)
@mroeschke
Member

Given #46477 (comment), it appears that this is an issue that should be fixed upstream. Closing.

mroeschke added the Upstream issue label Apr 24, 2022