Skip to content

Fix failure to convert string "uint64" to NaN #32541

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

roberthdevries
Copy link
Contributor

Including regression test

@roberthdevries roberthdevries force-pushed the fix-32394-to-numeric-fail-string-uint64 branch from 955550d to 0240f9c Compare March 8, 2020 22:11
@WillAyd WillAyd added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Mar 9, 2020
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - @gfyoung care to look?

Copy link
Member

@gfyoung gfyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

The fix I had back then was good, but I guess not all patches age well...

@WillAyd WillAyd merged commit a3985f8 into pandas-dev:master Mar 9, 2020
@WillAyd
Copy link
Member

WillAyd commented Mar 9, 2020

Great thanks @roberthdevries - keep the PRs coming!

result = lib.maybe_convert_numeric(
np.array(["uint64"], dtype=object), set(), coerce_numeric=True
)
assert np.isnan(result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a good test - does it replicate what the OP was? this is not user facing at all

pls follow what is going on in other tests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OP currently takes this path, though it'd be good to test at the pd.to_numeric level too.

@jreback
Copy link
Contributor

jreback commented Mar 9, 2020

@WillAyd this has a number of issues in testing

pls revert

the OP is not tested at all

@jreback jreback added this to the 1.1 milestone Mar 9, 2020
Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the issue is just lack of tests I think we can push that in an additional PR, rather than reverting.

result = lib.maybe_convert_numeric(
np.array(["uint64"], dtype=object), set(), coerce_numeric=True
)
assert np.isnan(result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OP currently takes this path, though it'd be good to test at the pd.to_numeric level too.

@jreback
Copy link
Contributor

jreback commented Mar 9, 2020

right just need another PR; @roberthdevries can u push a test for to_numeric

@WillAyd remember to set the version in the PR and issue before merging

@roberthdevries
Copy link
Contributor Author

@jreback I will add another PR with a test for to_numeric referencing the original issue

@roberthdevries roberthdevries deleted the fix-32394-to-numeric-fail-string-uint64 branch March 11, 2020 15:48
SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pd.to_numeric(..., errors="coerce") failing silently when strings contain "uint64"
5 participants