BUG: Fix `pd.to_numeric` to have consistent behavior for date-like arguments #43315

hec10r · 2021-08-31T00:08:14Z

closes BUG: pd.to_numeric has an inconsistent behavior for datetime objects #43280
tests added / passed
Ensure all linting tests pass, see here for how to run them
whatsnew entry

pd.to_numeric behaved inconsistently when used on date-like objects and returned a wrong result when used with pd.NaT.

In master:

>>> import pandas as pd
>>> from datetime import datetime
>>> from functools import partial
>>> pd.to_numeric(datetime(2021, 8, 22), errors="coerce")
nan
>>> pd.to_numeric(pd.Series(datetime(2021, 8, 22)), errors="coerce")
0    1629590400000000000
dtype: int64
>>> pd.Series([datetime(2021, 8, 22)]).apply(partial(pd.to_numeric), errors="coerce")
0   NaN
dtype: float64
>>>
>>> pd.to_numeric(pd.NaT, errors="coerce")
nan
>>> pd.to_numeric(pd.Series(pd.NaT), errors="coerce")
0   -9223372036854775808
dtype: int64
>>> pd.Series(pd.NaT).apply(partial(pd.to_numeric), errors="coerce")
0   NaN
dtype: float64
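
For readers wondering where the two int64 values in the Series outputs above come from, here is a small sketch. It relies only on the public `Timestamp.value` and `NaT.value` attributes; the variable names are illustrative and are not part of the PR.

```python
import numpy as np
import pandas as pd

# Nanoseconds since the Unix epoch for midnight on 2021-08-22.
ts_ns = pd.Timestamp("2021-08-22").value

# NaT is stored internally as the smallest int64 value, which is
# what shows up when a NaT Series is converted to int64.
nat_ns = pd.NaT.value
int64_min = np.iinfo(np.int64).min

print(ts_ns)
print(nat_ns == int64_min)
```

So the Series path is exposing the underlying integer representation, while the scalar and `apply` paths coerce to NaN.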

In this PR:

>>> import pandas as pd
>>> from datetime import datetime
>>> from functools import partial
>>> pd.to_numeric(datetime(2021, 8, 22), errors="coerce")
nan
>>> pd.to_numeric(pd.Series(datetime(2021, 8, 22)), errors="coerce")
0   NaN
dtype: float64
>>> pd.Series([datetime(2021, 8, 22)]).apply(partial(pd.to_numeric), errors="coerce")
0   NaN
dtype: float64
>>>
>>> pd.to_numeric(pd.NaT, errors="coerce")
nan
>>> pd.to_numeric(pd.Series(pd.NaT), errors="coerce")
0   NaN
dtype: float64
>>> pd.Series(pd.NaT).apply(partial(pd.to_numeric), errors="coerce")
0   NaN
dtype: float64
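
The comparison above can be wrapped in a small helper that runs one value through the three call paths (scalar, Series, and `Series.apply`). This is only a sketch: `compare_paths` is a hypothetical name, not something in pandas or in this PR, and the results for date-like inputs depend on the pandas version, so the example runs it on a numeric string where all three paths are expected to agree.

```python
from functools import partial

import pandas as pd


def compare_paths(value):
    """Run pd.to_numeric on a single value through three call paths."""
    coerce = partial(pd.to_numeric, errors="coerce")
    return {
        "scalar": coerce(value),                    # plain scalar input
        "series": coerce(pd.Series([value])),       # whole-Series input
        "apply": pd.Series([value]).apply(coerce),  # element-wise via apply
    }


# A numeric string is parsed the same way on every path.
results = compare_paths("1.5")
for name, result in results.items():
    print(name, result)
```

Calling `compare_paths(pd.NaT)` or `compare_paths(datetime(2021, 8, 22))` is a quick way to see whether a given pandas build shows the inconsistency described here.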

Please let me know if this approach makes sense so I can add the tests.

pep8speaks · 2021-08-31T01:57:00Z

Hello @hec10r! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-08-31 14:35:27 UTC

Navaneethan2503 · 2021-08-31T11:42:30Z

I think the data is already treated as an object, so why are you defining it as an object again?

mroeschke

See #43280 (comment) as the behavior here is largely correct.

If you would like to address the other bugs noted there, that would be great!

jreback · 2021-08-31T22:49:50Z

pandas/core/tools/numeric.py

@@ -137,6 +141,15 @@ def to_numeric(arg, errors="raise", downcast=None):
    if errors not in ("ignore", "raise", "coerce"):
        raise ValueError("invalid error value specified")

+    # Handle inputs of "date" type as objects


umm what is this?

The idea was to change the behavior of pd.to_numeric for date-like objects. Refer to this discussion: #43280 (comment)

jreback · 2021-08-31T22:50:41Z

pandas/tests/tools/test_to_numeric.py

+    result = to_numeric(lis, **kwargs)
+    expected = to_numeric(ser, **kwargs).values
+
+    tm.assert_numpy_array_equal(result, expected)


why are you comparing vs a numpy array?

these should use assert_series_equal at a minimum

Because one input was a list and the other was a pd.Series. For the list, the output is a np.array, and series.values is also a np.array when the result is a pd.Series. It could have been done another way, though:

...
result = to_numeric(lis, **kwargs)
expected = to_numeric(ser, **kwargs)
tm.assert_series_equal(pd.Series(result), expected)
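
A self-contained version of that alternative might look like the sketch below. It assumes the public `pd.testing.assert_series_equal` in place of the internal `tm` helper, and `check_list_matches_series` is a hypothetical name rather than the test added in this PR. String inputs are used so the result does not depend on the date-like behavior under discussion.

```python
import pandas as pd


def check_list_matches_series(values, **kwargs):
    """Assert that to_numeric gives the same data for a list and a Series."""
    # For a list, to_numeric returns a NumPy array.
    result = pd.to_numeric(values, **kwargs)
    # For a Series, to_numeric returns a Series.
    expected = pd.to_numeric(pd.Series(values), **kwargs)
    # Wrap the array so both sides are compared as Series.
    pd.testing.assert_series_equal(pd.Series(result), expected)
    return expected


checked = check_list_matches_series(
    ["1", "2.5", "not a number"], errors="coerce"
)
print(checked)
```

Comparing as Series also checks the dtype and index, which comparing raw arrays through `.values` does not.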

jreback · 2021-08-31T22:51:15Z

pandas/tests/tools/test_to_numeric.py

@@ -372,39 +417,6 @@ def test_str(data, exp, transform_assert_equal):
    assert_equal(result, expected)


-def test_datetime_like(tz_naive_fixture, transform_assert_equal):


why are you removing these?

Because I was changing the behavior of the function for date-like objects

hec10r · 2021-09-01T00:14:24Z

@jreback Thanks for your review.

This PR will likely be closed, and I will open a new one based on the discussion that we had in #43280. It would be great to have your input there as well.

github-actions · 2021-10-02T00:04:19Z

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

mroeschke · 2021-10-16T20:19:19Z

Thanks for the PR, but as mentioned in #43280 (comment), this will probably need a different approach. Closing due to being stale.

hec10r added 2 commits August 30, 2021 18:56

Handle date arguments as objects

870769f

Add docstrings

1f0b361

hec10r changed the title ~~Fix datetime behavior to numeric~~ Fix pd.to_numeric to have consistent behavior for date-like arguments Aug 31, 2021

hec10r changed the title ~~Fix pd.to_numeric to have consistent behavior for date-like arguments~~ BUG: Fix pd.to_numeric to have consistent behavior for date-like arguments Aug 31, 2021

hec10r added 4 commits August 30, 2021 20:12

Add support for numpy arrays and remove deprecated conversion

08a7bb8

Remove tests that depend on the old behavior

05e1f2d

Add test_type_error and generalize test_ignore_error

5a5fd77

Add test_list_series

f844dc4

Test that the function returns the same results for list and pd.Series

PEP 8

ffc1a6e

Remove failing tests that depends in the old behavior

bcf1cca

mroeschke requested changes Aug 31, 2021

mroeschke mentioned this pull request Aug 31, 2021

modified changes for datetime and NaT Objects Behavior closes #43280 #43289

Closed

jreback requested changes Aug 31, 2021

github-actions bot added the Stale label Oct 2, 2021

mroeschke closed this Oct 16, 2021
