-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fix to numeric on decimal fields #14842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix to numeric on decimal fields #14842
Conversation
3996fe8
to
2d2488c
Compare
@jreback Does this look ok to you? |
@RogerThomas You don't need to open a new PR to update. You can just push again the same branch, and the PR on github will automatically be updated. |
@jorisvandenbossche, thanks yeah, I thought the same, but the other PR doesn't seem to have been updated, maybe it only updates if its status is open. Anyway, I'll leave this one here and leave the other one closed if it's ok with you. |
indeed |
@@ -173,6 +174,8 @@ def to_numeric(arg, errors='raise', downcast=None): | |||
values = arg.values | |||
elif isinstance(arg, (list, tuple)): | |||
values = np.array(arg, dtype='O') | |||
elif isinstance(arg, Decimal): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should not be special cased
instead remove the use of np.isscalar
and use is_scalar
, which I think should work with decimal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas.lib.isscalar doesn't return true for decimal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try changing that and see if anything breaks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback I did, I tried
if lib.isscalar(arg):
...
But I don't see how lib.isscalar
could return True for decimal
this is the isscalar code
def isscalar(object val):
"""
Return True if given value is scalar.
This includes:
- numpy array scalar (e.g. np.int64)
- Python builtin numerics
- Python builtin byte arrays and strings
- None
- instances of datetime.datetime
- instances of datetime.timedelta
- Period
"""
return (np.PyArray_IsAnyScalar(val)
# As of numpy-1.9, PyArray_IsAnyScalar misses bytearrays on Py3.
or PyBytes_Check(val)
# We differ from numpy (as of 1.10), which claims that None is
# not scalar in np.isscalar().
or val is None
or PyDate_Check(val)
or PyDelta_Check(val)
or PyTime_Check(val)
or util.is_period_object(val))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @jreback any more thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback I understand where you're coming from, but I don't see how to get around it in this case.
If you're happy with it, this PR should be ready to go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
were you able to move addtl logic to is_scalar as well (IOW, did that break things irrespective of this change)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback I can easily move the isinstance(value, Decimal)
check to the isscalar function, but I don't see the point, I'd still have to check again after this in order to call float(value)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok this is what I'd like to see; we are trying to treat decimal.Decimal
in a more thoughtful way
In [7]: d = decimal.Decimal('1.0')
In [8]: pandas.types.common.is_scalar(d)
Out[8]: False
In [9]: pandas.types.common.is_number(d)
Out[9]: True
I'd like [7] to be True.
further I'd like to add a is_decimal
in pandas.types.inference
then the last part can be something like:
elif isinstance(arg, (list, tuple)):
values = np.array(arg, dtype='O')
elif is_scalar(arg): # previous np.isscalar(arg):
if is_decimal(arg):
# coerce
arg = float(arg)
if is_number(arg):
return arg
is_scalar = True
values = np.array([arg], dtype='O')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also need to add some tests where decimals are in a list/array (they will be then subseuqnetly converted via maybe_convert_numeric
which is also fine.
Current coverage is 84.64% (diff: 100%)@@ master #14842 diff @@
==========================================
Files 144 144
Lines 51016 51019 +3
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43185 43187 +2
- Misses 7831 7832 +1
Partials 0 0
|
@@ -173,7 +174,9 @@ def to_numeric(arg, errors='raise', downcast=None): | |||
values = arg.values | |||
elif isinstance(arg, (list, tuple)): | |||
values = np.array(arg, dtype='O') | |||
elif np.isscalar(arg): | |||
elif lib.isscalar(arg): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import these from pandas.types.common (like everything else). you prob have to fix is_decimal to import also (it should be imported into pandas.types.inference, which pandas.types.common subseuqently imports). pandas.types.common is the internal API.
almost perfect. left some comments. |
Thanks @jreback, I actually had it like that originally but decided to just directly import it from lib. Changed back now |
ok lgtm. ping on green. |
…eric_on_decimal_fields
@jreback all ready to go |
thanks! |
closes pandas-dev#14827 Author: Roger Thomas <[email protected]> Author: Roger Thomas <[email protected]> Closes pandas-dev#14842 from RogerThomas/fix_to_numeric_on_decimal_fields and squashes the following commits: 91d989b [Roger Thomas] Merge branch 'master' of github.com:pandas-dev/pandas into fix_to_numeric_on_decimal_fields d7972d7 [Roger Thomas] Move isdecimal to internal api 1f1c62c [Roger Thomas] Add Test And Refactor is_decimal f1b69da [Roger Thomas] Merge branch 'master' of github.com:pandas-dev/pandas into fix_to_numeric_on_decimal_fields 2d2488c [Roger Thomas] Fix To Numeric on Decimal Fields
git diff upstream/master | flake8 --diff