Recognize timezoned labels when accessing dataframes. #17920

Closed
Changes from 1 commit
54 commits
4671aeb
Recognize timezoned labels when accessing dataframes.
1kastner Oct 19, 2017
2297833
Merge branch 'master' of https://github.com/pandas-dev/pandas into er…
Oct 31, 2017
69b517e
Make `test_access_datetimeindex_with_timezoned_label` PEP08 compliant.
Oct 31, 2017
6532e76
add translate function for converting time zones.
Oct 31, 2017
c354271
Move NaT to self-contained module (#18014)
jbrockmendel Nov 1, 2017
a9202fb
Separate out arithmetic tests for datetimelike indexes (#18049)
jbrockmendel Nov 1, 2017
88bf001
Adding skip to test failing because of lxml import (#17747) (#17748)
datapythonista Nov 1, 2017
7d8c9ab
a zillion flakes (#18046)
jbrockmendel Nov 1, 2017
1310680
TST: separate out grouping-type tests (#18057)
jreback Nov 1, 2017
46d9416
BUG: DataFrame.groupby() interprets tuple as list of keys
GuessWhoSamFoo Nov 1, 2017
c8a604e
CLN: some lint issues
jreback Nov 1, 2017
de7a065
read_html(): rewinding [wip] (#18017)
LiamIm Nov 1, 2017
7c0a3be
CI: temp disable scipy on windows 3.6 build (#18078)
jreback Nov 2, 2017
8844b2e
DOC: Remove duplicate 'in' from contributing.rst (#18040) (#18076)
mattayes Nov 2, 2017
62695a2
improve test output for Categoricals (#18069)
topper-123 Nov 2, 2017
7691209
MAINT: Remove np.array_equal calls in tests (#18047)
gfyoung Nov 2, 2017
edad476
Move scalar arithmetic tests to tests.scalars (#18075)
jbrockmendel Nov 2, 2017
bd958a1
Update Contributing Environment section (#18052)
TomAugspurger Nov 2, 2017
ef9a06c
Index tests in the wrong places (#18074)
jbrockmendel Nov 2, 2017
ba279c0
Move comparison utilities to np_datetime; (#18080)
jbrockmendel Nov 2, 2017
2a31f7b
Separate _TSObject into conversion (#18060)
jbrockmendel Nov 2, 2017
aa5ea0f
Port Timedelta implementation to tslibs.timedeltas (#17937)
jbrockmendel Nov 3, 2017
4bfbca9
COMPAT: compare platform return on 32-bit (#18090)
jreback Nov 3, 2017
dd761d3
Fix 18068: Updates merge_asof error, now outputs datatypes (#18082)
manrajgrover Nov 3, 2017
a6353dd
TST: Add regression test for empty DataFrame groupby (#18097)
Licht-T Nov 4, 2017
c440981
BUG: Fix the error when reading the compressed UTF-16 file (#18091)
Licht-T Nov 4, 2017
2c3faad
BUG: Implement PeriodEngine to fix PeriodIndex truncate bug (#17755)
Licht-T Nov 4, 2017
fff48bb
standardize indentation, arrange in allphabetical order (#18104)
jbrockmendel Nov 4, 2017
00f61bb
BLD: Make sure to copy ZIP files for parser tests (#18108)
gfyoung Nov 4, 2017
69a3b06
Revert "CI: temp disable scipy on windows 3.6 build (#18078)" (#18105)
jreback Nov 4, 2017
ffd363b
Masking and overflow checks for datetimeindex and timedeltaindex ops …
jbrockmendel Nov 4, 2017
8587a3d
BUG: Override mi-columns in to_csv if requested (#18110)
gfyoung Nov 5, 2017
763b5f7
fix failing tests.
1kastner Nov 5, 2017
9456b77
Merge branch 'error-on-non-naive-datetime-strings' of https://github.…
1kastner Nov 5, 2017
fd49175
Rewrite naive/timezone matrix condition, Improve test cases
1kastner Nov 5, 2017
d944bfd
adjust as it was before (un-done changes)
1kastner Nov 5, 2017
1641bf2
Add tz keyword.
1kastner Nov 5, 2017
f12caa1
Merge branch 'master' of https://github.com/pandas-dev/pandas into er…
1kastner Nov 13, 2017
1a3ab3b
Apply suggestions of review
Nov 13, 2017
31ef655
refactor: replace _utc() with utc
Nov 13, 2017
edfd895
fix flake8 issues
1kastner Nov 13, 2017
fbf8a1c
Merge branch 'master' of https://github.com/pandas-dev/pandas into er…
1kastner Nov 13, 2017
9f0dc5d
replace datetime.datetime with pd.Timestamp
Nov 14, 2017
817bfef
Merge branch 'master' of https://github.com/pandas-dev/pandas into er…
1kastner Nov 16, 2017
5c11e02
Add whatsnew and documentation.
1kastner Nov 16, 2017
6a218e5
Fix variable name in documentation
1kastner Nov 16, 2017
577d742
Apply review suggestions.
1kastner Nov 17, 2017
931b7f9
Merge branch 'master' of https://github.com/pandas-dev/pandas into er…
1kastner Nov 17, 2017
02aa59f
Merge branch 'master' of https://github.com/pandas-dev/pandas into er…
1kastner Nov 17, 2017
0e4c499
Move change to bug and rename into result and expected
1kastner Nov 23, 2017
16fe3c3
Merge branch 'master' of https://github.com/pandas-dev/pandas into er…
1kastner Nov 23, 2017
5724292
Add CET timezoned datetime index as another test case
1kastner Nov 26, 2017
a4f3a5c
Merge branch 'master' of https://github.com/pandas-dev/pandas into er…
1kastner Nov 26, 2017
8a2176d
Adjust for flake8
Nov 26, 2017
35 changes: 20 additions & 15 deletions pandas/core/indexes/datetimes.py
@@ -1273,52 +1273,57 @@ def _parsed_string_to_bounds(self, reso, parsed):
         lower, upper: pd.Timestamp

         """
+        if parsed.tzinfo is None:
Contributor

So you need to do something different here. Leave the generated dates in the current tz (i.e. self.tz). Then convert them to the parsed tz, BUT after that localize them back into the original timezone. There are a number of cases; here's an example.

# index in US/Eastern, parsed in UTC
In [22]: pd.Timestamp('20130101', tz='US/Eastern')
Out[22]: Timestamp('2013-01-01 00:00:00-0500', tz='US/Eastern')

In [23]: pd.Timestamp('20130101', tz='US/Eastern').tz_convert('UTC')
Out[23]: Timestamp('2013-01-01 05:00:00+0000', tz='UTC')

In [24]: pd.Timestamp('20130101', tz='US/Eastern').tz_convert('UTC').tz_localize(None).tz_localize('US/Eastern')
Out[24]: Timestamp('2013-01-01 05:00:00-0500', tz='US/Eastern')

# index is naive, parsed is UTC, in effect no change here
In [25]: pd.Timestamp('20130101')
Out[25]: Timestamp('2013-01-01 00:00:00')

In [26]: pd.Timestamp('20130101').tz_localize('UTC').tz_localize(None)
Out[26]: Timestamp('2013-01-01 00:00:00')

As I am writing this, it looks overly complicated. I might instead choose to raise if the timezones don't match (they can be the same tz or both None).
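
The conversion idea in the comment above can be sketched roughly as follows. This is only an illustration, not the code in the PR: `align_label_to_index_tz` is a hypothetical name, it uses a plain `tz_convert` rather than the convert, strip, and re-localize chain shown in the comment, and treating a tz-aware label on a naive index as UTC wall time is an assumption.

```python
import pandas as pd


def align_label_to_index_tz(label, index_tz):
    """Hypothetical helper: express `label` in the timezone of the index."""
    ts = pd.Timestamp(label)
    label_is_naive = ts.tzinfo is None
    index_is_naive = index_tz is None

    if label_is_naive:
        # A naive label is interpreted in the index's own timezone (if any).
        return ts if index_is_naive else ts.tz_localize(index_tz)
    if index_is_naive:
        # Assumption: compare an aware label to a naive index as UTC wall time.
        return ts.tz_convert("UTC").tz_localize(None)
    # Both aware: keep the same instant, displayed in the index's timezone.
    return ts.tz_convert(index_tz)


# Aware label against a US/Eastern index, then against a naive index.
print(align_label_to_index_tz("2013-01-01T00:00+00:00", "US/Eastern"))
print(align_label_to_index_tz("2016-01-01T00:00-02:00", None))
```

The stricter alternative mentioned in the comment would replace the two mixed branches with a raised error.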

Author

As I elaborated in #16785, the timezones do not need to match. Fairly common cases involving daylight saving time require some flexibility here. So my idea is to replace target_tz = parsed.tzinfo with the kind of conversion you mentioned. I might give it a shot next week. Dealing only with the timezone of the DatetimeIndex seems reasonable.
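
A small sketch of the daylight saving time situation may help here. The zone `Europe/Berlin`, the dates, and the variable names are illustrative assumptions, not taken from the PR or the linked issue. Labels written with a fixed UTC offset do not match the zone's offset all year round, yet they still denote well-defined instants.

```python
import pandas as pd

# Labels as a logger with a fixed +01:00 offset might write them all year.
winter_label = pd.Timestamp("2017-01-15T12:00+01:00")
summer_label = pd.Timestamp("2017-07-15T12:00+01:00")

# Converting to a zone that observes daylight saving time keeps the instant
# but adjusts the wall-clock time and offset where necessary.
winter_local = winter_label.tz_convert("Europe/Berlin")
summer_local = summer_label.tz_convert("Europe/Berlin")

print(winter_local)  # offset matches the label in winter
print(summer_local)  # offset differs from the label in summer
```

Requiring the label's offset to equal the index's timezone would reject the summer label even though it is unambiguous.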

Author

I am really tired but tried to find a solution. If it works, great; if not, at least it shows the idea. I will come back to it as soon as possible. Right now I am having trouble installing the development environment under Windows 10, and all the solutions I have found look quite time-consuming. Sorry if I am spamming you with non-working code.

+            target_tz = self.tz
+        else:
+            target_tz = parsed.tzinfo
+
         if reso == 'year':
-            return (Timestamp(datetime(parsed.year, 1, 1), tz=self.tz),
+            return (Timestamp(datetime(parsed.year, 1, 1), tz=target_tz),
                     Timestamp(datetime(parsed.year, 12, 31, 23,
-                                       59, 59, 999999), tz=self.tz))
+                                       59, 59, 999999), tz=target_tz))
         elif reso == 'month':
             d = libts.monthrange(parsed.year, parsed.month)[1]
             return (Timestamp(datetime(parsed.year, parsed.month, 1),
-                              tz=self.tz),
+                              tz=target_tz),
                     Timestamp(datetime(parsed.year, parsed.month, d, 23,
-                                       59, 59, 999999), tz=self.tz))
+                                       59, 59, 999999), tz=target_tz))
         elif reso == 'quarter':
             qe = (((parsed.month - 1) + 2) % 12) + 1  # two months ahead
             d = libts.monthrange(parsed.year, qe)[1]   # at end of month
             return (Timestamp(datetime(parsed.year, parsed.month, 1),
-                              tz=self.tz),
+                              tz=target_tz),
                     Timestamp(datetime(parsed.year, qe, d, 23, 59,
-                                       59, 999999), tz=self.tz))
+                                       59, 999999), tz=target_tz))
         elif reso == 'day':
             st = datetime(parsed.year, parsed.month, parsed.day)
-            return (Timestamp(st, tz=self.tz),
+            return (Timestamp(st, tz=target_tz),
                     Timestamp(Timestamp(st + offsets.Day(),
-                                        tz=self.tz).value - 1))
+                                        tz=target_tz).value - 1))
         elif reso == 'hour':
             st = datetime(parsed.year, parsed.month, parsed.day,
                           hour=parsed.hour)
-            return (Timestamp(st, tz=self.tz),
+            return (Timestamp(st, tz=target_tz),
                     Timestamp(Timestamp(st + offsets.Hour(),
-                                        tz=self.tz).value - 1))
+                                        tz=target_tz).value - 1))
         elif reso == 'minute':
             st = datetime(parsed.year, parsed.month, parsed.day,
                           hour=parsed.hour, minute=parsed.minute)
-            return (Timestamp(st, tz=self.tz),
+            return (Timestamp(st, tz=target_tz),
                     Timestamp(Timestamp(st + offsets.Minute(),
-                                        tz=self.tz).value - 1))
+                                        tz=target_tz).value - 1))
         elif reso == 'second':
             st = datetime(parsed.year, parsed.month, parsed.day,
                           hour=parsed.hour, minute=parsed.minute,
                           second=parsed.second)
-            return (Timestamp(st, tz=self.tz),
+            return (Timestamp(st, tz=target_tz),
                     Timestamp(Timestamp(st + offsets.Second(),
-                                        tz=self.tz).value - 1))
+                                        tz=target_tz).value - 1))
         elif reso == 'microsecond':
             st = datetime(parsed.year, parsed.month, parsed.day,
                           parsed.hour, parsed.minute, parsed.second,
                           parsed.microsecond)
-            return (Timestamp(st, tz=self.tz), Timestamp(st, tz=self.tz))
+            return (Timestamp(st, tz=target_tz), Timestamp(st, tz=target_tz))
         else:
             raise KeyError

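
The timezone selection in this diff can be tried outside pandas with a simplified standard-library sketch. `string_bounds` is a hypothetical stand-in that covers only three resolutions, uses `datetime.timezone` objects instead of pandas timezones, and ends each interval one microsecond (not one nanosecond) before the next period starts.

```python
import calendar
from datetime import datetime, timedelta, timezone


def string_bounds(parsed, reso, index_tz=None):
    """Hypothetical sketch: return (lower, upper) for a parsed label."""
    # Use the label's own timezone if it has one, else the index's timezone.
    target_tz = index_tz if parsed.tzinfo is None else parsed.tzinfo

    if reso == "year":
        lower = datetime(parsed.year, 1, 1, tzinfo=target_tz)
        upper = datetime(parsed.year, 12, 31, 23, 59, 59, 999999,
                         tzinfo=target_tz)
    elif reso == "month":
        last_day = calendar.monthrange(parsed.year, parsed.month)[1]
        lower = datetime(parsed.year, parsed.month, 1, tzinfo=target_tz)
        upper = datetime(parsed.year, parsed.month, last_day,
                         23, 59, 59, 999999, tzinfo=target_tz)
    elif reso == "day":
        lower = datetime(parsed.year, parsed.month, parsed.day,
                         tzinfo=target_tz)
        # Last representable instant of the same day.
        upper = lower + timedelta(days=1) - timedelta(microseconds=1)
    else:
        raise KeyError(reso)
    return lower, upper


# A label carrying a -02:00 offset keeps that offset in its bounds.
label = datetime(2016, 1, 1, tzinfo=timezone(timedelta(hours=-2)))
print(string_bounds(label, "day", index_tz=timezone.utc))
```

With this selection the bounds carry the label's timezone, so comparing them against the index still relies on the conversion discussed in the review comments.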
24 changes: 24 additions & 0 deletions pandas/tests/indexing/test_datetime.py
@@ -123,6 +123,30 @@ def test_consistency_with_tz_aware_scalar(self):
         result = df[0].at[0]
         assert result == expected

+    def test_access_datetimeindex_with_timezoned_label(self):
+
Member

Can you add the github issue number here as a comment? (see the test after this one for an example)

Author

Of course

Author

Added now.

+        idx = pd.DataFrame(index=pd.date_range('2016-01-01T00:00', '2016-03-31T23:59', freq='T'))
+
+        former_naive_endpoint_idx = idx[
+            "2016-01-01T00:00-02:00"
+            :
+            "2016-01-01T02:03"
+        ]
+
+        former_non_naive_endpoint_idx = idx[
+            pd.Timestamp("2016-01-01T00:00-02:00")
+            :
+            pd.Timestamp("2016-01-01T02:03")
+        ]
+
+        assert len(former_naive_endpoint_idx) == len(former_non_naive_endpoint_idx)
+
+        assert former_naive_endpoint_idx.iloc[0].name == former_non_naive_endpoint_idx.iloc[0].name
+        assert former_naive_endpoint_idx.iloc[1].name == former_non_naive_endpoint_idx.iloc[1].name
+        assert former_naive_endpoint_idx.iloc[2].name == former_non_naive_endpoint_idx.iloc[2].name
+        assert former_naive_endpoint_idx.iloc[3].name == former_non_naive_endpoint_idx.iloc[3].name
+
+
     def test_indexing_with_datetimeindex_tz(self):

         # GH 12050
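
The four element-wise assertions in the new test suggest that each slice is expected to contain four rows. A hedged sketch of that arithmetic is below. It assumes the tz-aware start label is compared to the naive index as UTC wall time, which this page does not confirm, and the variable names are illustrative.

```python
import pandas as pd

start_label = pd.Timestamp("2016-01-01T00:00-02:00")
end_label = pd.Timestamp("2016-01-01T02:03")

# Assumption: on a naive index the aware label is read as UTC wall time.
start_naive = start_label.tz_convert("UTC").tz_localize(None)

# Minute-frequency rows between the two endpoints, both inclusive.
expected_rows = pd.date_range(start_naive, end_label, freq="min")
print(list(expected_rows))
```

Under that assumption the slice runs from 02:00 to 02:03 inclusive, which matches the four `iloc` positions the test checks.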