Call unique() on a timezone aware datetime series returns non timezone aware result #13565
This is correct and as expected, you get a UTC numpy array back, and numpy displays things in your local timezone. When this eventually returns an
|
Yes, the output is always in UTC even if the input dates are from a different timezone. That isn't the issue. But as a pandas user I would expect to get timezone-aware datetimes (with the UTC timezone info) if I run unique() on timezone-aware datetimes, which I think is the intuitive expectation. Anyway, if this is the expected behavior, it should be documented. |
and I pointed you to the other issue. |
@paulgueltekin The problem is that But indeed, this could be documented somewhere (in the docstring? and in the tutorial docs on date functionality) |
@paulgueltekin Do you want to add a note to the |
@jorisvandenbossche Yes, I can do that. |
After #13979:
Let me know if anything should be added to the docstring. |
Given the discussion in #13395, options for this issue are:
Personally, I don't have a strong preference for either. |
comment here; should return an |
Also keep in mind that changing this might break some existing code. |
@paulgueltekin That's why this is an API change. Furthermore, this will break loudly. |
Call unique() on a timezone aware datetime series returns non timezone aware result.
Code Sample
import pandas as pd
import pytz
import datetime
In [242]: ts = pd.Series([datetime.datetime(2011,2,11,20,0,0,0,pytz.utc), datetime.datetime(2011,2,11,20,0,0,0,pytz.utc), datetime.datetime(2011,2,11,21,0,0,0,pytz.utc)])
In [243]: ts
Out[243]:
0 2011-02-11 20:00:00+00:00
1 2011-02-11 20:00:00+00:00
2 2011-02-11 21:00:00+00:00
dtype: datetime64[ns, UTC]
In [244]: ts.unique()
Out[244]: array(['2011-02-11T20:00:00.000000000', '2011-02-11T21:00:00.000000000'], dtype='datetime64[ns]')
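For anyone who needs the unique values to keep their timezone regardless of what unique() returns in a given pandas version, one possible workaround is to deduplicate with drop_duplicates(), which returns a Series and so keeps the tz-aware dtype. This is only a sketch and not something proposed in this thread; the variable name unique_ts is made up for illustration, and the input is built with pd.to_datetime(..., utc=True) instead of pytz to avoid the extra dependency.

```python
import pandas as pd

# Same values as in the report: two identical timestamps and one distinct one, all in UTC.
ts = pd.Series(
    pd.to_datetime(
        ["2011-02-11 20:00:00", "2011-02-11 20:00:00", "2011-02-11 21:00:00"],
        utc=True,
    )
)

# drop_duplicates() returns a Series, so the timezone-aware dtype is preserved.
# reset_index(drop=True) only renumbers the result; it is optional.
unique_ts = ts.drop_duplicates().reset_index(drop=True)

print(unique_ts)
print(unique_ts.dt.tz)
```

The result is a Series, not an array, so call .tolist() or similar on it if a plain sequence of timezone-aware Timestamp objects is needed.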
output of
pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.9.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: de_AT.UTF-8
pandas: 0.18.1
nose: 1.3.4
pip: 8.1.2
setuptools: 22.0.5
Cython: 0.21.1
numpy: 1.11.0
scipy: 0.14.0
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: 1.2.3
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: 3.1.1
numexpr: 2.4
matplotlib: 1.4.2
openpyxl: 2.3.5
xlrd: 0.9.2
xlwt: 0.7.4
xlsxwriter: None
lxml: 3.6.0
bs4: None
html5lib: 1.0b3
httplib2: 0.9
apiclient: None
sqlalchemy: 0.9.8
pymysql: None
psycopg2: None
jinja2: 2.7.3
boto: None
pandas_datareader: None