Datetime.index.date returns incorrect date post upgrade to version 0.23 #21230

Closed
MarekOzana opened this issue May 28, 2018 · 5 comments · Fixed by #21281
Labels: Regression (Functionality that used to work in a prior pandas version), Timezones (Timezone data dtype)

@MarekOzana

Code Sample, a copy-pastable example if possible

import pandas as pd

df1 = pd.DataFrame(data=[24, 25],
                   index=pd.DatetimeIndex(['2013-01-24 15:01:00+01:00',
                                           '2013-01-25 15:01:00+01:00'],
                                          dtype='datetime64[ns, CET]',
                                          name='Date', freq=None))
print(df1.index.date)

The above code prints:
[datetime.date(2013, 1, 23) datetime.date(2013, 1, 24)]

Problem description

DatetimeIndex.date returns incorrect dates for a timezone-aware index. The behaviour was correct up to version 0.22 and appears to be incorrect after upgrading to version 0.23.

Expected Output

[datetime.date(2013, 1, 24) datetime.date(2013, 1, 25)]

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.0
pytest: 3.5.1
pip: 10.0.1
setuptools: 39.1.0
Cython: None
numpy: 1.12.1
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@mroeschke
Member

Looks like a timezone issue, since this works correctly without a timezone dtype. Investigation and PRs welcome!

In [7]: d = pd.DatetimeIndex(['2013-01-24 15:01:00+01:00','2013-01-25 15:01:00+01:00'])

In [8]: d.date
Out[8]:
array([datetime.date(2013, 1, 24), datetime.date(2013, 1, 25)],
      dtype=object)

In [9]: d.tz_localize('CET').date
Out[9]:
array([datetime.date(2013, 1, 23), datetime.date(2013, 1, 24)],
      dtype=object)

@mroeschke mroeschke added Bug Timezones Timezone data dtype labels May 29, 2018
@ssikdar1
Contributor

ssikdar1 commented May 31, 2018

Hmm is this possibly on purpose?

import pandas as pd

df1 = pd.DataFrame(data=[24, 25],
                   index=pd.DatetimeIndex(['2013-01-24 15:01:00+01:00',
                                           '2013-01-25 15:01:00+01:00'],
                                          dtype='datetime64[ns, CET]',
                                          name='Date', freq=None))
print(type(df1.index))

Prints

<class 'pandas.core.indexes.datetimes.DatetimeIndex'>

Looking at the date property of DatetimeIndex:
https://github.com/pandas-dev/pandas/blob/master/pandas/core/indexes/datetimes.py#L2037

    @property
    def date(self):
        """
        Returns numpy array of python datetime.date objects (namely, the date
        part of Timestamps without timezone information).
        """
        return libts.ints_to_pydatetime(self.normalize().asi8, box="date")

So here the datetime.date obj is intentionally being returned without the timezone information.
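
For comparison, dropping the timezone before calling `.date` should keep the local wall-clock date. A minimal sketch, not taken from the thread: it assumes `tz_localize(None)` keeps the local wall-clock time (the documented pandas behaviour) and builds the index from naive strings localized to CET rather than with the `dtype=` argument used in the report.

```python
import datetime

import pandas as pd

# Same wall-clock values as the report, localized to CET explicitly.
idx = pd.DatetimeIndex(
    ["2013-01-24 15:01:00", "2013-01-25 15:01:00"]
).tz_localize("CET")

# tz_localize(None) drops the timezone but keeps the local wall-clock time.
naive_local = idx.tz_localize(None)

# .date on the tz-naive index involves no UTC conversion.
dates = list(naive_local.date)
expected = [datetime.date(2013, 1, 24), datetime.date(2013, 1, 25)]
print(dates == expected)
```

This only works around the accessor; it does not change what `.date` returns on the tz-aware index itself.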

@jorisvandenbossche jorisvandenbossche added Regression Functionality that used to work in a prior pandas version and removed Bug labels May 31, 2018
@jorisvandenbossche jorisvandenbossche added this to the 0.23.1 milestone May 31, 2018
@jorisvandenbossche
Member

I suppose this is due to #18163, which improved the performance of the .date accessor, but probably forgot to deal with timezones.

Before it relied on the date() method of the individual timestamp objects, which is still correct:

In [19]: df1.index[0]
Out[19]: Timestamp('2013-01-24 14:01:00+0100', tz='CET')

In [20]: df1.index[0].date()
Out[20]: datetime.date(2013, 1, 24)
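
The per-element path described above can also be used directly as a stopgap. A minimal sketch, assuming a CET-localized index equivalent to the one in the report; it is slower than the vectorized accessor because it calls `Timestamp.date()` once per element.

```python
import datetime

import pandas as pd

idx = pd.DatetimeIndex(
    ["2013-01-24 15:01:00", "2013-01-25 15:01:00"]
).tz_localize("CET")

# Timestamp.date() returns the date of the local (tz-aware) wall-clock time.
dates = [ts.date() for ts in idx]

expected = [datetime.date(2013, 1, 24), datetime.date(2013, 1, 25)]
print(dates == expected)
```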

@jorisvandenbossche
Member

cc @tmnhat2001: you are welcome to take a look if you have time.

@jamestran201

jamestran201 commented Jun 1, 2018

I did some more digging into why some test cases fail. It seems that when DatetimeIndex.normalize() is used, the dates are converted properly when the original timezone is behind UTC, but when it is ahead of UTC the returned value is incorrect.

In [41]: index = pd.DatetimeIndex(['2013-01-24 15:01:00'], dtype='datetime64[ns, EST]', freq=None)
In [42]: index.date
Out[42]: array([datetime.date(2013, 1, 24)], dtype=object)

In [43]: index = pd.DatetimeIndex(['2013-01-24 15:01:00'],dtype='datetime64[ns, CET]', freq=None)
    ...:

In [44]: index.date
Out[44]: array([datetime.date(2013, 1, 23)], dtype=object)
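
A possible explanation for the asymmetry, offered as an assumption rather than something confirmed in this thread: `normalize()` yields local midnight, and the property shown earlier passes `asi8` (which holds UTC-based integers) on without a timezone. Local midnight in a zone behind UTC is later the same day in UTC, while local midnight in a zone ahead of UTC falls on the previous UTC day. The sketch below only illustrates that arithmetic with current pandas; `utc_date_of_local_midnight` is a hypothetical helper, not a pandas function.

```python
import datetime

import pandas as pd


def utc_date_of_local_midnight(tz):
    """Return the UTC calendar date of local midnight for 2013-01-24 in `tz`."""
    idx = pd.DatetimeIndex(["2013-01-24 15:01:00"]).tz_localize(tz)
    local_midnight = idx.normalize()  # 2013-01-24 00:00 in the local zone
    return local_midnight.tz_convert("UTC")[0].date()


behind = utc_date_of_local_midnight("EST")  # UTC-5: midnight is 05:00 UTC
ahead = utc_date_of_local_midnight("CET")   # UTC+1: midnight is 23:00 UTC, a day earlier
print(behind, ahead)
```

If the assumption holds, reading the date from the UTC value instead of the local one would produce exactly the off-by-one-day result seen for CET and the correct result for EST.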

jorisvandenbossche pushed a commit that referenced this issue Jun 7, 2018
…21281)

* BUG: Using DatetimeIndex.date with timezone returns incorrect date #21230
* Fix bug where DTI.time returns a tz-aware Time instead of tz-naive #21267
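
The behaviour named in the two commit bullets can be checked with a short script. A minimal sketch, assuming a pandas version that includes the fix; it compares the vectorized accessors against the per-element `Timestamp` methods and is not the test added in the pull request.

```python
import pandas as pd

idx = pd.DatetimeIndex(
    ["2013-01-24 15:01:00", "2013-01-25 15:01:00"]
).tz_localize("CET")

# Vectorized accessors should agree with the per-element methods.
dates_match = list(idx.date) == [ts.date() for ts in idx]
times_match = list(idx.time) == [ts.time() for ts in idx]

# DatetimeIndex.time is expected to be tz-naive.
times_are_naive = all(t.tzinfo is None for t in idx.time)

print(dates_match, times_match, times_are_naive)
```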
TomAugspurger pushed a commit that referenced this issue Jun 12, 2018
…21281)

* BUG: Using DatetimeIndex.date with timezone returns incorrect date #21230
* Fix bug where DTI.time returns a tz-aware Time instead of tz-naive #21267

(cherry picked from commit a363e1a)
5 participants