
BUG: Implementation of is_list_like causes TypeError when working with Extension Arrays of ambiguous type #43195


Closed
3 tasks done
dcoukos opened this issue Aug 24, 2021 · 1 comment
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member


dcoukos commented Aug 24, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import pint
import pandas as pd
import pint_pandas  # registers the "pint[...]" extension dtype with pandas
import numpy as np

u = pint.UnitRegistry()
Q = pint.Quantity(np.random.rand(20), u.hertz)
df = pd.DataFrame(Q, dtype='pint[hertz]')
limit1 = -0.4*u.hertz
limit2 = 4*u.hertz

df.clip(limit1, limit2)

Problem description

Because is_list_like only checks whether an object defines __iter__, it is difficult for downstream libraries to correctly advertise their type when a single class must accommodate both array-like and scalar data. In addition, carrying dtype information (important for pint in particular, because it handles physical units) makes such a class evaluate as true for is_array_like even when it holds a scalar, because is_array_like relies on is_list_like.

Since Quantity does not know in advance what type of data it will receive, it always defines an __iter__ method. Calling that method, and thus letting the object report whether it holds scalar or array-like data, instead of only checking that the method exists would make is_list_like more robust.
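To make the distinction concrete, here is a minimal standard-library sketch. It is not the pandas implementation: ScalarOrArray is a hypothetical stand-in for a Quantity-style wrapper, and looks_list_like, looks_array_like, and iterates_like_list are illustrative names that only approximate the attribute-presence check described above and the proposed alternative of actually calling iter().

```python
class ScalarOrArray:
    """Hypothetical stand-in for a Quantity-style wrapper (not pint's class)."""

    dtype = "pint[hertz]"  # placeholder; only its presence matters here

    def __init__(self, magnitude):
        self.magnitude = magnitude

    def __iter__(self):
        # Always defined, but raises TypeError when the magnitude is a scalar,
        # mirroring the failure shown in the traceback.
        return iter(self.magnitude)


def looks_list_like(obj):
    """Approximation of a check that only tests for the presence of __iter__."""
    return hasattr(obj, "__iter__") and not isinstance(obj, (str, bytes, type))


def looks_array_like(obj):
    """List-like by the check above, plus a dtype attribute."""
    return looks_list_like(obj) and hasattr(obj, "dtype")


def iterates_like_list(obj):
    """Sketch of the proposal: call iter() and let the object answer."""
    if isinstance(obj, (str, bytes, type)):
        return False
    try:
        iter(obj)
    except TypeError:
        return False
    return True


scalar = ScalarOrArray(0.4)
array = ScalarOrArray([0.1, 0.2, 0.3])

print(looks_list_like(scalar), looks_array_like(scalar))  # presence-based checks
print(iterates_like_list(scalar), iterates_like_list(array))  # call-based check
```

With the presence-based checks the scalar wrapper is reported as list-like and array-like, whereas the call-based check distinguishes the two cases. One trade-off to keep in mind is that calling iter() runs user code, so it may be slower or have side effects for some objects.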

Traceback
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-13539a13f5ad> in <module>
----> 1 df.diff().clip(limit1, limit2)

~/opt/anaconda3/envs/aug20/lib/python3.9/site-packages/pandas/core/generic.py in clip(self, lower, upper, axis, inplace, *args, **kwargs)
   7713         result = self
   7714         if lower is not None:
-> 7715             result = result._clip_with_one_bound(
   7716                 lower, method=self.ge, axis=axis, inplace=inplace
   7717             )

~/opt/anaconda3/envs/aug20/lib/python3.9/site-packages/pandas/core/generic.py in _clip_with_one_bound(self, threshold, method, axis, inplace)
   7586             return self._clip_with_scalar(threshold, None, inplace=inplace)
   7587 
-> 7588         subset = method(threshold, axis=axis) | isna(self)
   7589 
   7590         # GH #15390

~/opt/anaconda3/envs/aug20/lib/python3.9/site-packages/pandas/core/ops/__init__.py in f(self, other, axis, level)
    447         axis = self._get_axis_number(axis) if axis is not None else 1
    448 
--> 449         self, other = align_method_FRAME(self, other, axis, flex=True, level=level)
    450 
    451         new_data = self._dispatch_frame_op(other, op, axis=axis)

~/opt/anaconda3/envs/aug20/lib/python3.9/site-packages/pandas/core/ops/__init__.py in align_method_FRAME(left, right, axis, flex, level)
    255     elif is_list_like(right) and not isinstance(right, (ABCSeries, ABCDataFrame)):
    256         # GH 36702. Raise when attempting arithmetic with list of array-like.
--> 257         if any(is_array_like(el) for el in right):
    258             raise ValueError(
    259                 f"Unable to coerce list of {type(right[0])} to Series/DataFrame"

~/opt/anaconda3/envs/aug20/lib/python3.9/site-packages/pint/quantity.py in __iter__(self)
    262         # as one calls iter(self) without waiting for the first element to be drawn from
    263         # the iterator
--> 264         it_magnitude = iter(self.magnitude)
    265 
    266         def it_outer():

TypeError: 'float' object is not iterable

Expected Output

Expected output is that the values of df should be constrained between limit1 and limit2.

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 5f648bf
python : 3.8.10.final.0
python-bits : 64
OS : Darwin
OS-release : 20.5.0
Version : Darwin Kernel Version 20.5.0: Sat May 8 05:10:33 PDT 2021; root:xnu-7195.121.3~9/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.2 (also tested on 1.4.0.dev0+496.g8814de9fbc, with the same result)
numpy : 1.21.1
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.1
setuptools : 52.0.0.post20210125
Cython : 0.29.23
pytest : 6.2.3
hypothesis : 6.13.14
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.24.1
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : 0.9.0
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 4.0.1
pyxlsb : None
s3fs : None
scipy : 1.6.2
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None


dcoukos commented Aug 24, 2021

Closing issue as this is probably best dealt with as an enhancement request.
