Skip to content

BUG: Using DatetimeIndex.date with timezone returns incorrect date #2… #21281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jun 7, 2018
Merged

Conversation

jamestran201
Copy link

@jamestran201 jamestran201 commented Jun 1, 2018

Added 2 tests for DatetimeIndex.date in tests/indexes/datetimes/test_datetime.py. Please let me know if I should move them somewhere else.

@mroeschke mroeschke added Datetime Datetime data dtype Timezones Timezone data dtype labels Jun 1, 2018
@@ -2040,7 +2040,7 @@ def date(self):
Returns numpy array of python datetime.date objects (namely, the date
part of Timestamps without timezone information).
"""
return libts.ints_to_pydatetime(self.normalize().asi8, box="date")
return libts.ints_to_pydatetime(self.asi8, box="date")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be self._local_timestamps() instead of self.asi8 since those are essentially epoch (UTC) timestamps and we want to return local values.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. That might be it.

@@ -27,6 +27,21 @@ def test_roundtrip_pickle_with_tz(self):
unpickled = tm.round_trip_pickle(index)
tm.assert_index_equal(index, unpickled)

def test_date_accessor_with_tz(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of 2 separate tests, you can have one test pytest.mark.parameterize over the timezone aware and naive data. (e.g. dtype='datetime64[ns, CET]'' and dtype=None)

@codecov
Copy link

codecov bot commented Jun 2, 2018

Codecov Report

Merging #21281 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21281      +/-   ##
==========================================
+ Coverage   91.85%   91.85%   +<.01%     
==========================================
  Files         153      153              
  Lines       49549    49555       +6     
==========================================
+ Hits        45512    45518       +6     
  Misses       4037     4037
Flag Coverage Δ
#multiple 90.25% <100%> (ø) ⬆️
#single 41.87% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/datetimes.py 95.8% <100%> (+0.02%) ⬆️
pandas/core/series.py 94.12% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c65c57...3b15060. Read the comment docs.

@jamestran201
Copy link
Author

ASV for this commit:

[  0.00%] · For pandas commit hash d962273a:
[  0.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[  0.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 50.00%] ··· Running timeseries.DatetimeIndex.time_to_date                                                                     1.23±0.1ms;...
[ 50.00%] · For pandas commit hash 4274b840:
[ 50.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 50.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[100.00%] ··· Running timeseries.DatetimeIndex.time_to_date                                                                    1.20±0.08ms;...
BENCHMARKS NOT SIGNIFICANTLY CHANGED.

@@ -2040,7 +2040,11 @@ def date(self):
Returns numpy array of python datetime.date objects (namely, the date
part of Timestamps without timezone information).
"""
return libts.ints_to_pydatetime(self.normalize().asi8, box="date")
if (self.tz is None):
return libts.ints_to_pydatetime(self.normalize().asi8, box="date")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the .normalize() calls here? I think ints_to_pydatetime will strip the time information away when constructing the date

])
def test_date_accessor(self, test_input):
# GH 21230
from datetime import date
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is imported at the top already

# GH 21230
from datetime import date

assert test_input.date == np.array(date(2013, 1, 24))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use tm.assert_numpy_array_equal here.

@@ -27,6 +27,18 @@ def test_roundtrip_pickle_with_tz(self):
unpickled = tm.round_trip_pickle(index)
tm.assert_index_equal(index, unpickled)

@pytest.mark.parametrize("test_input", [
DatetimeIndex(['2013-01-24 15:01:00']),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, you could leave the DatetimeIndex construction in the test, and parametrize over the dtypes and pass that into the constructor.

@jorisvandenbossche jorisvandenbossche added this to the 0.23.1 milestone Jun 4, 2018
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Can you do a similar change for time attribute (just above date in the code) ?

@jorisvandenbossche
Copy link
Member

Can you do a similar change for time attribute (just above date in the code) ?

Then it would also close #21267

@jamestran201
Copy link
Author

Sure. There's some discussion in #21267 about whether time should return a time with the timezone in it. Could you tell me which is the correct behavior?

@jorisvandenbossche
Copy link
Member

I think converting back to the 0.22 behaviour is the best for now (which I think is no timezone in the output?) We can still later discuss if we want to add the timezone or not.

@mroeschke
Copy link
Member

@jorisvandenbossche regarding my comment (#21267 (comment)), it looks like the time behavior looks correct but there was some confusion stemming from the construction of the DatetimeIndex (#15938).

@@ -2040,7 +2040,11 @@ def date(self):
Returns numpy array of python datetime.date objects (namely, the date
part of Timestamps without timezone information).
"""
return libts.ints_to_pydatetime(self.normalize().asi8, box="date")
if (self.tz is None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob need
if self.tz is not None and self.tz is not utc
also write it more like

# add a comment here about tz conversions
if self.tz is not None and self.tz is not utc:
    stamps = self.asi8
else:
    stamps = self._local_timestamps()
return libts.init_to_pytdatetime(stamps, box="date")

# GH 21230

index = DatetimeIndex(['2013-01-24 15:01:00'], dtype=dtype)
tm.assert_numpy_array_equal(index.date,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use

result = 
expected = 

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add another date that is not ambiguous (e.g. its on the same date regardless of tz)
can you add a NaT as well

@@ -27,6 +27,17 @@ def test_roundtrip_pickle_with_tz(self):
unpickled = tm.round_trip_pickle(index)
tm.assert_index_equal(index, unpickled)

@pytest.mark.parametrize("dtype", [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to go in test_timezones (same subdir)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

place it near other tests that look at .date

@@ -85,7 +85,7 @@ Indexing
- Bug in :meth:`Series.reset_index` where appropriate error was not raised with an invalid level name (:issue:`20925`)
- Bug in :func:`interval_range` when ``start``/``periods`` or ``end``/``periods`` are specified with float ``start`` or ``end`` (:issue:`21161`)
- Bug in :meth:`MultiIndex.set_names` where error raised for a ``MultiIndex`` with ``nlevels == 1`` (:issue:`21149`)
-
- Bug in :attr:`DatetimeIndex.date` where an incorrect date is returned when the input date has a timezone (:issue:`21230`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be non-UTC timezone

@jamestran201
Copy link
Author

@jorisvandenbossche @mroeschke @jreback For #21267, it seems that whether the timezone should be returned still needs to be discussed. Is it ok if I change the behavior of time to match v0.22 for now (i.e: time does not return a Time with timezone information) since that was also an unintended change?

@jreback
Copy link
Contributor

jreback commented Jun 5, 2018

wouldn’t change this as it’s correct
pls update the doc string

looking at this again -

both .date and .time should be naive i think (and .date already is)

so ok to change .time back to naive

mention in docs that these are local (naive) date/times that are returned

pls create an issue about this for discussion as well (eg to in future make these tz-aware)

expected = np.array([date(2018, 6, 4), pd.NaT], ndmin=1)

index = DatetimeIndex(['2018-06-04 10:00:00', pd.NaT], dtype=dtype)
result = index.date
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u also test .time here

@jorisvandenbossche
Copy link
Member

wouldn’t change this as it’s correct

No, adding the timezone was unintended, and is inconsistent with how Timestamp itself works, and inconsistent with standard library (they both return naive local time).
So I would just change it back to how it was.

@jorisvandenbossche
Copy link
Member

@jreback sorry, missed the update of your comment. So yes, then we agree!

])
def test_date_accessor(self, dtype):
# Regression test for GH#21230
expected = np.array([date(2018, 6, 4), pd.NaT], ndmin=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the ndmin=1 needed here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed it to avoid a shape of (1,) when creating a numpy array with only 1 element. I'll remove it now.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment, looks good to me, thanks!

@jamestran201
Copy link
Author

Is there any way to trigger a re-build on CircleCI? That build failed with this error:

CondaHTTPError: HTTP 522 ORIGIN CONNECTION TIME-OUT for url <https://repo.anaconda.com/pkgs/main/linux-64/libpq-10.3-h1ad7b7a_0.tar.bz2>
Elapsed: 00:42.034156
CF-RAY: 426f9db31db895a4-IAD

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

@jorisvandenbossche jorisvandenbossche merged commit a363e1a into pandas-dev:master Jun 7, 2018
@jorisvandenbossche
Copy link
Member

@tmnhat2001 Thanks a lot!

@jamestran201
Copy link
Author

Thanks everyone.

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Jun 12, 2018
…andas-dev#21281)

* BUG: Using DatetimeIndex.date with timezone returns incorrect date pandas-dev#21230
* Fix bug where DTI.time returns a tz-aware Time instead of tz-naive pandas-dev#21267

(cherry picked from commit a363e1a)
TomAugspurger pushed a commit that referenced this pull request Jun 12, 2018
…21281)

* BUG: Using DatetimeIndex.date with timezone returns incorrect date #21230
* Fix bug where DTI.time returns a tz-aware Time instead of tz-naive #21267

(cherry picked from commit a363e1a)
david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018
…andas-dev#21281)

* BUG: Using DatetimeIndex.date with timezone returns incorrect date pandas-dev#21230
* Fix bug where DTI.time returns a tz-aware Time instead of tz-naive pandas-dev#21267
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Datetime.index.date returns incorrect date post upgrade to version 0.23
6 participants