Skip to content

PR for Pandas issue #14872 / fillna() TZ datetime64 rounded #14905

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

RodolfoRFR
Copy link

@RodolfoRFR RodolfoRFR commented Dec 17, 2016

Hi Jeff,

The change adds a "datetime64[ns, UTC]" object to the _DATELIKE_DTYPES set in pandas/types/common.py.

No such object existed in this set and that is the reason the wrong method was executed on the inputs.

Object is created with "DatetimeTZDtype.new"

Thanks,
Rodolfo

@jreback
Copy link
Contributor

jreback commented Dec 17, 2016

this goes way overboard

see the pull request which closed the original issue

@RodolfoRFR
Copy link
Author

RodolfoRFR commented Dec 17, 2016

The changes in request #6587 added "dtype in _DATELIKE_DTYPES" to the method selection in the backfill_2d function (now in missing.py),

-        _method = getattr(algos, 'backfill_2d_inplace_%s' % dtype, None)
-    elif is_datetime64_dtype(values):
+        _method = getattr(algos, 'backfill_2d_inplace_%s' % dtype.name, None)
+    elif dtype in _DATELIKE_DTYPES or is_datetime64_dtype(values):

I initially tried adding an extra is_datetime64tz_dtype() to this line with no success. After tracing the process with pdb I found that the values are already coerced to integers in pandas/core/internals.py,

    def _interpolate_with_fill(self, method='pad', axis=0, inplace=False,
                               limit=None, fill_value=None, coerce=False,
                               downcast=None, mgr=None):
        """ fillna but using the interpolate machinery """
        .
        .
        .
        values, _, fill_value, _ = self._try_coerce_args(values, fill_value)

before being passed to /pandas/core/missing.interpolate_2d and then to backfill_2d. The dtype information is passed as dtype=datetime64[ns, UTC]. When it gets to these lines in the backfill_2d function,

elif dtype in _DATELIKE_DTYPES or is_datetime64_dtype(values):
      method = _backfill_2d_datetime

the return is always false because datetime64[ns, UTC] is not in _DATELIKE_DTYPES and the 'values' are integers.

Another way would be to either replace datetime64[ns, UTC] with datetime64[ns] before we get to this line or skip the coercion somehow. Unless I am missing something.

@jreback
Copy link
Contributor

jreback commented Dec 17, 2016

so this should simply be changed to pass the actual value (and not coerced ones) when this is called and drop al of the _DATELIKE_DTYPES that is really old code

use the is_datetime64_dtype and is_datetime64tz_dtype

@jreback
Copy link
Contributor

jreback commented Dec 17, 2016

actually you just need to check the dtype instead (of the values)

e.g.

(Pdb) p is_datetime64tz_dtype(dtype)
True

@jreback jreback added Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Datetime Datetime data dtype Timezones Timezone data dtype labels Dec 17, 2016
@codecov-io
Copy link

codecov-io commented Dec 18, 2016

Current coverage is 84.57% (diff: 75.00%)

Merging #14905 into master will increase coverage by 0.01%

@@             master     #14905   diff @@
==========================================
  Files           144        144          
  Lines         51043      51058    +15   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43160      43181    +21   
+ Misses         7883       7877     -6   
  Partials          0          0          

Powered by Codecov. Last update 73bc6cf...18802b4

@RodolfoRFR
Copy link
Author

Done! I was indeed missing something. Thank you for the feedback!

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add a whatsnew entry (0.19.2) ok if u can make these changes in next day or so


filled = data.fillna(method='bfill')

expected = pd.Series([datetime.datetime(2016, 12, 12, 22, 24, 6, 100001,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test needs to go in test_missing (search and out it with other similar tests)

@@ -908,6 +911,24 @@ def test_interp_timedelta64(self):
index=pd.to_timedelta([1, 2, 4]))
assert_series_equal(result, expected)

# GH 14872
def test_dtype_utc():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to take self

@jreback jreback added this to the 0.19.2 milestone Dec 18, 2016
@jreback jreback closed this in f3c5a42 Dec 18, 2016
@jreback
Copy link
Contributor

jreback commented Dec 18, 2016

thanks!

ischurov pushed a commit to ischurov/pandas that referenced this pull request Dec 19, 2016
closes pandas-dev#14872

Author: Rodolfo Fernandez <[email protected]>

Closes pandas-dev#14905 from RodolfoRFR/pandas-14872-e and squashes the following commits:

18802b4 [Rodolfo Fernandez] added 'self' to test_dtype_utc function in pandas/tests/series/test_missing
e0c6c7c [Rodolfo Fernandez] added line to whatsnew v0.19.2 and test to test_missing.py in series folder
e4ba7e0 [Rodolfo Fernandez] removed all references to _DATELIKE_DTYPES from /pandas/core/missing.py
5d37ce8 [Rodolfo Fernandez] added is_datetime64tz_dtype and changed evaluation from 'values' to dtype
19eecb2 [Rodolfo Fernandez] fixed style errors using flake8
59b91a1 [Rodolfo Fernandez] test modified
5a59eac [Rodolfo Fernandez] test modified
bc68bf7 [Rodolfo Fernandez] test modified
ba83fc8 [Rodolfo Fernandez] test
b7358de [Rodolfo Fernandez] bug fixed
jorisvandenbossche pushed a commit to jorisvandenbossche/pandas that referenced this pull request Dec 24, 2016
closes pandas-dev#14872

Author: Rodolfo Fernandez <[email protected]>

Closes pandas-dev#14905 from RodolfoRFR/pandas-14872-e and squashes the following commits:

18802b4 [Rodolfo Fernandez] added 'self' to test_dtype_utc function in pandas/tests/series/test_missing
e0c6c7c [Rodolfo Fernandez] added line to whatsnew v0.19.2 and test to test_missing.py in series folder
e4ba7e0 [Rodolfo Fernandez] removed all references to _DATELIKE_DTYPES from /pandas/core/missing.py
5d37ce8 [Rodolfo Fernandez] added is_datetime64tz_dtype and changed evaluation from 'values' to dtype
19eecb2 [Rodolfo Fernandez] fixed style errors using flake8
59b91a1 [Rodolfo Fernandez] test modified
5a59eac [Rodolfo Fernandez] test modified
bc68bf7 [Rodolfo Fernandez] test modified
ba83fc8 [Rodolfo Fernandez] test
b7358de [Rodolfo Fernandez] bug fixed

(cherry picked from commit f3c5a42)
ShaharBental pushed a commit to ShaharBental/pandas that referenced this pull request Dec 26, 2016
closes pandas-dev#14872

Author: Rodolfo Fernandez <[email protected]>

Closes pandas-dev#14905 from RodolfoRFR/pandas-14872-e and squashes the following commits:

18802b4 [Rodolfo Fernandez] added 'self' to test_dtype_utc function in pandas/tests/series/test_missing
e0c6c7c [Rodolfo Fernandez] added line to whatsnew v0.19.2 and test to test_missing.py in series folder
e4ba7e0 [Rodolfo Fernandez] removed all references to _DATELIKE_DTYPES from /pandas/core/missing.py
5d37ce8 [Rodolfo Fernandez] added is_datetime64tz_dtype and changed evaluation from 'values' to dtype
19eecb2 [Rodolfo Fernandez] fixed style errors using flake8
59b91a1 [Rodolfo Fernandez] test modified
5a59eac [Rodolfo Fernandez] test modified
bc68bf7 [Rodolfo Fernandez] test modified
ba83fc8 [Rodolfo Fernandez] test
b7358de [Rodolfo Fernandez] bug fixed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Datetime Datetime data dtype Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fillna() containing timezone aware datetime64 rounded
4 participants