BUG: DataFrame.any() not returning Boolean series with skipna=False across columns with numeric and string types #38962
Comments
Thanks for the report @rmwenzel! This looks to be caused by the following
gives
In your second example, which works as you expected (assuming you meant
In the 1d case too:
gives
@mzeitlin11 thanks for the speedy response! I did mean By upstream I take it you mean a
That would be great! I think we should still leave this open, though, pending resolution of the
Done! See numpy/numpy#18129
@mzeitlin11 after some discussion, it looks like
Thanks @rmwenzel! I guess the question becomes whether pandas should match that behavior or always return something boolean. Your examples above show an inconsistency in how this is handled when reducing on different axes.
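For reference, a minimal sketch of what "that behavior" could look like at the NumPy level, assuming it refers to logical reductions over object-dtype arrays following Python's `or` semantics. The array is hypothetical and not taken from the report:

```python
import numpy as np

# Hypothetical data: an object array mixing a number and a string.
mixed = np.array([0, "a"], dtype=object)

# For object dtype, logical_or works like Python's `or`, so the reduction
# may hand back one of the elements rather than a bool.
raw = np.logical_or.reduce(mixed)

# Coercing explicitly always yields a plain Python bool.
as_bool = bool(raw)
print(type(raw).__name__, as_bool)
```

Whether pandas should pass the raw value through or coerce it as in the last step is exactly the open question here.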
@mzeitlin11 I was wondering the same thing. Consistency with behavior for rows (
Lines 440 to 446 in eead40e
Probably just want to ensure this returns only bools (typing here is also suspect because it can return an array of bools in addition to just a single boolean value).
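A rough sketch of that idea, using a hypothetical helper (any_as_bool is not a pandas or NumPy function, and this is not the code at the lines referenced above): cast the values to their truthiness first, so the reduction can only yield bools whether the result is a scalar or an array.

```python
import numpy as np


def any_as_bool(values, axis=None):
    """Hypothetical helper: any-style reduction that only returns bools."""
    # Cast each element to its truthiness up front so that object arrays
    # cannot leak their elements out of the reduction.
    truth = np.asarray(values, dtype=object).astype(bool)
    # Returns a NumPy bool for axis=None, or a bool array otherwise.
    return truth.any(axis=axis)


print(any_as_bool([[0, "a"], [0, ""]], axis=1))
```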
Closing as a duplicate in favor of #12863, feel free to chime in there if interested in working on this. Also looks like an issue on the
@mzeitlin11 Thanks for catching that other issue. Apologies for not digging deep enough to find it!
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
If the DataFrame's columns have different numeric types
the behavior is as expected
But if the columns have numeric and string types
it looks like the first column is returned
Mixed types across rows don't seem to be a problem. If I do
I get
I also don't have any issues if skipna=True. Expected behavior is to get a series of booleans either way.
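Until the reduction itself guarantees booleans, one possible workaround (a sketch, not a fix in pandas) is to convert each cell to its truthiness before reducing across columns. The frame and column names below are hypothetical, not the ones from this report:

```python
import pandas as pd

# Hypothetical frame: one numeric column and one object column of strings.
df = pd.DataFrame(
    {
        "num": [0, 1],
        "text": pd.Series(["", "b"], dtype=object),
    }
)

# Convert every cell to its truthiness first, then reduce across columns,
# so the result is a boolean Series regardless of the original dtypes.
result = df.astype(bool).any(axis=1)
print(result.dtype)
```

Note that NaN converts to True under this cast, which lines up with the documented skipna=False treatment of NA values as True.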
Output of pd.show_versions()
INSTALLED VERSIONS
commit : b5958ee
python : 3.9.1.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.1.5
numpy : 1.19.2
pytz : 2020.5
dateutil : 2.8.1
pip : 20.3.3
setuptools : 51.0.0.post20201207
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.19.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None