BUG: Coercing bool types to int in qcut #28802

Merged

Conversation

Contributor

@ryankarlos ryankarlos commented Oct 5, 2019

  • closes qcut raising TypeError for boolean Series #20303
  • tests added / passed: pytest pandas/tests/reshape/test_qcut.py pandas/tests/reshape/test_cut.py -v
  • passes black pandas
  • passes git diff upstream/master --name-only -- "*.py" | xargs flake8
  • whatsnew entry

@@ -444,7 +444,7 @@ def _coerce_to_type(x):
    if dtype is not None:
        # GH 19768: force NaT to NaN during integer conversion
        if is_bool_dtype(x):
Contributor Author

@ryankarlos ryankarlos Oct 5, 2019

@jreback I'm not entirely sure where this should go. Adding x.astype(int) under an elif is_bool_dtype(x) branch higher up makes the tests fail at the existing np.where(x.notna(), x.view(np.int64), np.nan) statement when x is an ndarray; they pass when x is a Series.
AttributeError: 'numpy.ndarray' object has no attribute 'isnan'

The tests pass if I add an is_bool_dtype(x) branch here with a new np.where condition that uses ~np.isnan(x), as I've done in this diff, to account for x being an ndarray.

Contributor

so don't use np.isnan, change this to

x = np.where(notna(x), x.astype(np.int64, copy=False), np.nan)
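
To illustrate why the function form helps: the top-level notna accepts both an ndarray and a Series, while the .notna() method exists only on pandas objects. A minimal sketch, assuming a hypothetical helper name (bool_to_float_with_nan is not part of pandas) and using the public pd.notna; copy=False is left out for brevity:

```python
import numpy as np
import pandas as pd


def bool_to_float_with_nan(x):
    # Hypothetical helper for illustration only, not part of pandas.
    # pd.notna works on both ndarray and Series, so no method lookup is needed.
    return np.where(pd.notna(x), x.astype(np.int64), np.nan)


arr = np.array([True, False, True])
ser = pd.Series(arr)

# The method form is only available on pandas objects.
array_has_method = hasattr(arr, "notna")
series_has_method = hasattr(ser, "notna")

# Both inputs go through the same code path and come back as float arrays.
from_array = bool_to_float_with_nan(arr)
from_series = bool_to_float_with_nan(ser)
```

Because np.nan is one of the np.where branches, the result is a float array even when no value is missing.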

Contributor Author

@ryankarlos ryankarlos Oct 5, 2019

I've gone with doing the integer conversion in the elif block higher up and leaving dtype as None as @jschendel suggested, rather than making any changes here.
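
Roughly, the approach described here looks like the sketch below. It is a simplified illustration rather than the merged diff: the name coerce_bool_sketch is hypothetical, and the datetime and timedelta branches of _coerce_to_type are omitted.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_bool_dtype


def coerce_bool_sketch(x):
    # Hypothetical, simplified stand-in that covers the bool branch only.
    dtype = None  # left as None for bool input, so no NaT-to-NaN step applies
    if is_bool_dtype(x):
        # Convert booleans to integers before any binning happens.
        x = x.astype(np.int64)
    return x, dtype


series_values, series_dtype = coerce_bool_sketch(pd.Series([True, False, True]))
array_values, array_dtype = coerce_bool_sketch(np.array([True, False]))
```

Since dtype stays None, the later block guarded by "if dtype is not None" is never entered for boolean input, which is why the ndarray case no longer hits the failing statement.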

@ryankarlos ryankarlos changed the title Coercing bool types to int in qcut BUG: Coercing bool types to int in qcut Oct 5, 2019
@jschendel jschendel added the Bug, Dtype Conversions (Unexpected or buggy dtype conversions), and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Oct 5, 2019
@jschendel jschendel added this to the 1.0 milestone Oct 5, 2019
@@ -110,6 +110,7 @@ Other

- Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` when replacing timezone-aware timestamps using a dict-like replacer (:issue:`27720`)
- Bug in :meth:`Series.rename` when using a custom type indexer. Now any value that isn't callable or dict-like is treated as a scalar. (:issue:`27814`)
- :func:`qcut` and `cut` now handle boolean input (:issue:`20303`)
Contributor

can you put :func:`cut` as well

Contributor

this needs to go in 1.0.0

Contributor Author

Done
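
For reference, the behaviour described by the whatsnew entry can be exercised with a short sketch like the one below. It assumes pandas 1.0 or later; the sample data and variable names are illustrative, and the input is balanced so that the qcut quantile edges stay unique.

```python
import pandas as pd

# Balanced boolean input: two True and two False values.
flags = pd.Series([True, False, False, True])

# Both calls accept boolean input once it is coerced to integers internally.
quantile_bins = pd.qcut(flags, 2)
equal_width_bins = pd.cut(flags, bins=2)

# Integer codes of the resulting categorical bins, one per input value.
qcut_codes = quantile_bins.cat.codes.tolist()
cut_codes = equal_width_bins.cat.codes.tolist()
```

False values land in the lower bin and True values in the upper bin for both functions.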

@jreback jreback merged commit c59c2df into pandas-dev:master Oct 8, 2019
Contributor

jreback commented Oct 8, 2019

thanks @ryankarlos

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
bongolegend pushed a commit to bongolegend/pandas that referenced this pull request Jan 1, 2020