BUG: Odd timezone offset change with old datetimes with tz_convert #41834
Comments
I think this is due to a shortcoming / bug in pytz. For example, for the first case of "Australia/Broken_Hill":
>>> import datetime
>>> import pytz
>>> tz1 = pytz.timezone("Australia/Broken_Hill")
>>> tz1
<DstTzInfo 'Australia/Broken_Hill' LMT+9:26:00 STD>
>>> print(tz1.utcoffset(datetime.datetime(2000, 1, 1)))
10:30:00
>>> print(tz1.utcoffset(datetime.datetime(1900, 1, 1)))
9:26:00
So for the older date, pytz falls back to LMT ("Local Mean Time"). Compare that with zoneinfo:
>>> import zoneinfo
>>> tz2 = zoneinfo.ZoneInfo("Australia/Broken_Hill")
>>> tz2
zoneinfo.ZoneInfo(key='Australia/Broken_Hill')
>>> print(tz2.utcoffset(datetime.datetime(2000, 1, 1)))
10:30:00
>>> print(tz2.utcoffset(datetime.datetime(1900, 1, 1)))
9:30:00
This gives the correct offset for the older datetime.
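To see where the offset shifts, the same zoneinfo check can be run over several years. This is only a rough sketch: `offsets_by_year` is a hypothetical helper, not part of pandas or zoneinfo, and it assumes Python 3.9+ with a time zone database available (the system one or the `tzdata` package). The year 1890 is included purely as a probe, and its result depends on the installed database.

```python
import datetime
import zoneinfo


def offsets_by_year(zone_name, years):
    """Return {year: UTC offset on 1 January of that year} for a zone.

    Hypothetical helper for illustration; it passes a naive datetime to
    utcoffset(), the same way the REPL session above does.
    """
    tz = zoneinfo.ZoneInfo(zone_name)
    return {year: tz.utcoffset(datetime.datetime(year, 1, 1)) for year in years}


offsets = offsets_by_year("Australia/Broken_Hill", [1890, 1900, 2000])
for year, offset in offsets.items():
    print(year, offset)
```

For 1900 and 2000 this should reproduce the 9:30:00 and 10:30:00 values shown above.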
Huh, interesting!
I don't know exactly from which point in time there are clear timezone rules, and what the different libraries return for a datetime before the start of those rules (probably the "local mean time"). For example, for an even older date, both
(pytz just rounds to the minute)
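As a rough illustration of the rounding remark: assuming the tz database lists Broken Hill's local mean time as UTC+9:25:48 (a value not quoted in this thread), rounding to the nearest minute yields the 9:26:00 that pytz printed above. `round_to_minute` is a hypothetical helper and is not claimed to be how pytz implements it.

```python
import datetime


def round_to_minute(offset):
    """Round a timedelta to the nearest whole minute (halves round up)."""
    seconds = int(offset.total_seconds())
    minutes = (seconds + 30) // 60
    return datetime.timedelta(minutes=minutes)


# Assumed LMT offset for Australia/Broken_Hill, taken from the tz database.
lmt = datetime.timedelta(hours=9, minutes=25, seconds=48)
rounded = round_to_minute(lmt)
print(rounded)  # 9:26:00
```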
Yeah it gets really fuzzy really fast.
Thanks for the report, but as mentioned I think this is ultimately a pytz issue (we also have a note in our timezone docs that tz libraries may have different timezone definitions). Since this is an upstream issue, I'm unsure if there's a pandas-specific fix here, but happy to reopen if there is.
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
Returns:
Problem description
Timezone offset seems to change when it shouldn't? I'm not 100% sure this is not the correct behavior, but it seems odd.
I hope this is known behavior and not something esoteric.
Expected Output
Probably this:
I'm not sure these timezones even existed then so this might be an invalid calculation.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2cb9652
python : 3.7.0.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-73-lowlatency
Version : #82-Ubuntu SMP PREEMPT Wed Apr 14 19:19:50 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.2.4
numpy : 1.19.1
pytz : 2020.5
dateutil : 2.8.1
pip : 20.2.3
setuptools : 50.3.0.post20201006
Cython : None
pytest : 6.2.1
hypothesis : None
sphinx : 2.4.4
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.2
html5lib : None
pymysql : None
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.2
IPython : 7.18.1
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.1
numexpr : None
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : 1.0.1
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : 1.3.19
tables : None
tabulate : 0.8.7
xarray : None
xlrd : 2.0.1
xlwt : None
numba : None