
Bug: Resample removes timezone localization from Period.start_time #28039

Open
ngehrsitz opened this issue Aug 20, 2019 · 3 comments
Labels
Bug, Error Reporting, Period, Resample, Timezones

Comments

@ngehrsitz

ngehrsitz commented Aug 20, 2019

Code Sample

from pandas import DatetimeIndex, Series
from pytz import timezone

index = DatetimeIndex(['2019-01-01 06:00', '2019-01-01 07:00', '2019-01-01 08:00'], tz=timezone('Europe/Berlin'))
s = Series(index=index, data=[1, 2, 3])
print(s.index.tz)  # the source index is tz-aware
rs = s.resample("3h", kind="period").sum()
for x in rs.index:
    # both boundaries should report the timezone of the source index
    print(x.start_time.tz)
    print(x.end_time.tz)

Problem description

All Periods in the resampled PeriodIndex should have their start_time and end_time Timestamps set with the correct timezone.
Since the DatetimeIndex was tz-aware, these attributes of the PeriodIndex should be too.
This is probably related to #13238 and #15777.

Expected Output

Europe/Berlin
Europe/Berlin
Europe/Berlin

Tested Version

pandas 0.25.0

@mroeschke
Member

We normally show a warning when converting a tz aware Timestamp to a Period (applied in #22549)

In [15]: s.index[0].to_period('3h')
/anaconda3/envs/pandas-dev/bin/ipython:1: UserWarning: Converting to Period representation will drop timezone information.
  #!/anaconda3/envs/pandas-dev/bin/python
Out[15]: Period('2019-01-01 06:00', '3H')

Guess in this case the warning is not being surfaced.
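
To check whether the warning is surfaced in a given environment, it can be recorded explicitly. This is a minimal sketch using only the standard warnings module and Timestamp.to_period, not the resample code path from the report; the variable names are illustrative.

```python
import warnings

import pandas as pd

ts = pd.Timestamp("2019-01-01 06:00", tz="Europe/Berlin")

with warnings.catch_warnings(record=True) as caught:
    # "always" ensures the warning is recorded even if it was raised before
    warnings.simplefilter("always")
    period = ts.to_period("3h")

# List whatever was emitted during the conversion
for w in caught:
    print(w.category.__name__, w.message)

# A Period carries no timezone, so its boundaries are naive
print(period.start_time.tz)  # None
```

If nothing is listed for the equivalent resample call, the warning is being lost somewhere along that path rather than filtered by the console.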

mroeschke added the Error Reporting, Period, Resample, and Timezones labels Aug 20, 2019
@jreback
Contributor

jreback commented Aug 21, 2019

Slightly off-topic: the kind argument to resample should be deprecated, as it is a very confusing API; convert to a PeriodIndex yourself if that is what you want. The current implementation is very opaque.

@ngehrsitz
Author

@mroeschke I am not getting any warning, neither in the Python 3.6 console nor in PyCharm.
@jreback The reason I am using resample in the first place is that I have data that is randomly distributed and indexed with a DatetimeIndex. If I resample without the kind argument, my index starts at second 12 and not at second 18 as I want it to. I also have dynamically changing data points and frequencies, so I am not sure how to set base correctly.

Code Sample

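from pandas import DataFrame, DatetimeIndex
from pytz import timezone

index = DatetimeIndex(['2019-01-01 06:17:18', '2019-01-01 06:17:22', '2019-01-01 06:18:00'], tz=timezone('US/Eastern'))
d = DataFrame(index=index, data=[[1, 1], [2, 2], [3, 3]])
print(d.index.tz)
# without kind, the first bin starts at 06:17:12 instead of 06:17:18
incorrect = d.resample("8s").sum()
rs = d.resample("8s", kind="period").sum()
for x in rs.index:
    # both boundaries should report the timezone of the source index
    print(x.start_time.tz)
    print(x.end_time.tz)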
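
For the bin alignment alone, newer pandas releases (1.1 and later, so not the 0.25.0 tested here) accept an origin argument on resample. The following is a sketch under that assumption; it does not address the Period timezone problem itself, and the variable name anchored is illustrative.

```python
import pandas as pd

index = pd.DatetimeIndex(
    ["2019-01-01 06:17:18", "2019-01-01 06:17:22", "2019-01-01 06:18:00"],
    tz="US/Eastern",
)
d = pd.DataFrame([[1, 1], [2, 2], [3, 3]], index=index)

# origin="start" anchors the first bin at the first timestamp (06:17:18)
# instead of aligning bins to midnight; the result keeps a tz-aware index.
anchored = d.resample("8s", origin="start").sum()

print(anchored.index[0])
print(anchored.index.tz)
```

Because the result is still indexed by a DatetimeIndex, the timezone does not have to be carried separately.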

If you are not working with your native timezone this requires nasty hacks since you now have to carry the timezone data along with the resampled dataframe.
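
The kind of workaround meant here can be sketched as follows. localized_bounds is a hypothetical helper, not a pandas function, and the PeriodIndex is built directly with period_range rather than through resample. It assumes the naive period boundaries are wall times in the original timezone; tz_localize may raise for ambiguous or nonexistent times around DST transitions.

```python
import pandas as pd


def localized_bounds(period, tz):
    """Hypothetical helper: re-attach a timezone to a Period's naive bounds."""
    start = period.start_time.tz_localize(tz)
    end = period.end_time.tz_localize(tz)
    return start, end


# The timezone has to be remembered separately from the PeriodIndex
tz = "Europe/Berlin"
periods = pd.period_range("2019-01-01 06:00", periods=2, freq="3h")

for p in periods:
    start, end = localized_bounds(p, tz)
    print(start, end)
```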

3 participants