DateTimeIndex.tz_convert() does not apply DST from 2038 onward #33061
Comments
cc @mroeschke. @obeavers is the issue likely in pandas or pytz? |
Looks like it's probably a pytz issue: stub42/pytz#31. |
IIRC from discussions from @pganssle this is basically the 2038 problem. The system tz files only specify DST transitions up to the end of the epoch. |
It's not actually that the system tz files only specify DST transitions to the end of the epoch, it's just that IIUC Shameless plug for PEP 615: discussion is ongoing! |
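For reference, the "end of the epoch" mentioned in this thread is the largest instant a signed 32-bit Unix timestamp can represent. A minimal sketch of that arithmetic, using only the standard library (the names `EPOCH`, `INT32_MAX` and `last_32bit_instant` are illustrative, not from pandas or pytz):

```python
from datetime import datetime, timedelta, timezone

# Unix time counts seconds since 1970-01-01T00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value that fits in a signed 32-bit integer.
INT32_MAX = 2**31 - 1

# Last instant representable as a signed 32-bit Unix timestamp.
last_32bit_instant = EPOCH + timedelta(seconds=INT32_MAX)
print(last_32bit_instant.isoformat())  # 2038-01-19T03:14:07+00:00
```

Any transition table stored with 32-bit timestamps cannot describe a DST change after this instant, which is consistent with the symptom reported for 2038 and later.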
Is it worth raising a warning in cases where users are attempting timezone conversions past 2038? |
Yes, it’d be absolutely worth raising a warning.
|
Do you mean in The reason I say it's probably still "no" even upstream is that warning on dates after 2038 is a somewhat arbitrary cut-off, and it's not entirely clear what you are warning about anyway. Issuing a warning for 2038 and not 2037 implies that the data we have about time zones in 2037 is good, but the data in 2038 is bad. In reality, the further into the future you go, the less likely the time zone data is to be accurate for any given zone. It's true that the Version 2 data has better guesses than the Version 1 data after 2038, but I don't put a huge amount of stock into those guesses in the first place. I'll note that this is one of the places that you're likely to start getting bitten sooner rather than later by the way (I'll note that the PEP 615 time zones are very fast -- considerably faster than |
Might be worth documenting in the user guide rather than provide a runtime warning: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#time-zone-handling |
As the user who made the mistake, I think a runtime warning is important here: code snippets get copied and pasted, and it would have been fantastic to know something was wrong deep in the code.
|
+1 to documenting this in our user guide. I don't think we should be warning. The underlying tz library should be able to detect and warn for that better than we can. |
can I take a look at this issue? |
@narendra20 sure. I think the agreement from the maintainers is that a note in the user guide is the best way forward. |
@narendra20 are you moving forward with this? If not, I can! (Seems like a doc fix is a good first issue.) |
take |
Proposed wording:
Warning: If you are using dates beyond 13 Jan 2038, note that pandas does not apply daylight saving time adjustments to timezone-aware dates. This is partly because the underlying libraries do not currently address the Year 2038 Problem, and partly because there is some discussion on how reliable any DST settings that far into the future will be.
For example, for two dates that are in British Summer Time and so would normally be GMT+1, both the following evaluate as true:
assert pd.Timestamp('2037-03-31T010101', tz='Europe/London') != pd.Timestamp('2037-03-31T010101', tz='GMT')
assert pd.Timestamp('2038-03-31T010101', tz='Europe/London') == pd.Timestamp('2038-03-31T010101', tz='GMT') |
Thanks. Is the above wording OK, do you think? |
I disagree with that wording, it implies that pandas actively prevents time zone changes after the epochalypse, when in fact this has nothing to do with pandas. Some time zone providers (notably the only ones I will comment on the PR. |
Code Sample, a copy-pastable example if possible
Problem description
Wow, this one hurt. US/Eastern timezone is DST-adjusted (blend of EST/EDT) whereas EST is just EST.
The second and third assert statements above should both return False.
Surprised this hasn't come up before.
This is apparently related to a UNIX issue: https://en.wikipedia.org/wiki/Year_2038_problem. With that said, it seems the dtype is datetime64 with some pandas customizations on timezone. Supposedly 64 bit should have solved this.
Expected Output
Both of the following should pass:
assert np.all(pd.date_range('1/1/2038', periods=8760, freq='H', tz='EST').time == pd.date_range('1/1/2038', periods=8760, freq='H', tz='US/Eastern').time) == False # fails
assert np.all(pd.date_range('1/1/2039', periods=8760, freq='H', tz='EST').time == pd.date_range('1/1/2039', periods=8760, freq='H', tz='US/Eastern').time) == False # fails
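Since the thread concludes that the behaviour comes from the time zone provider rather than from pandas, one way to check a provider directly is to compare its UTC offsets in winter and summer of a given year. The sketch below is an assumption-laden illustration, not pandas or pytz code: `observes_dst` and `report` are hypothetical helper names, and attaching the zone via `tzinfo=` only gives meaningful results for PEP 495-style zones such as `zoneinfo` (Python 3.9+) or `dateutil`, not for pytz zones, which need `localize()`.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def observes_dst(tz: tzinfo, year: int) -> bool:
    """Return True if tz reports different UTC offsets in mid-January and mid-July."""
    winter = datetime(year, 1, 15, 12, tzinfo=tz).utcoffset()
    summer = datetime(year, 7, 15, 12, tzinfo=tz).utcoffset()
    return winter != summer


def report(zone_name: str = "America/New_York", years=(2037, 2038, 2039)) -> None:
    """Print whether the named zone shows a DST shift in each year.

    Assumes Python 3.9+ and that IANA time zone data is available
    (system tz database or the tzdata package).
    """
    from zoneinfo import ZoneInfo

    zone = ZoneInfo(zone_name)
    for year in years:
        print(year, observes_dst(zone, year))
```

Calling `report()` on a machine with time zone data installed shows, year by year, whether that provider still reports a seasonal offset change for the zone.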
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.8.1.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.0.0
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.1.0.post20200127
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.11.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.13.0
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
numba : None