-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: Add CalendarDay ('CD') offset #22288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #22288 +/- ##
==========================================
+ Coverage 92.04% 92.05% +<.01%
==========================================
Files 169 169
Lines 50782 50787 +5
==========================================
+ Hits 46742 46751 +9
+ Misses 4040 4036 -4
Continue to review full report at Codecov.
|
Should be ready for a look @jbrockmendel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks pretty good. I think need a bit of docs in timeseries.rst e.g. about when to use CD.
@@ -3174,3 +3180,60 @@ def test_last_week_of_month_on_offset(): | |||
slow = (ts + offset) - offset == ts | |||
fast = offset.onOffset(ts) | |||
assert fast == slow | |||
|
|||
|
|||
@pytest.mark.parametrize('box, assert_func', [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use tm.assert_equal
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am testing 2 array like and 1 scalar here, and tm.assert_equal
is not defined for scalars:
https://github.com/pandas-dev/pandas/blob/master/pandas/util/testing.py#L1497
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you then split this test , it is very confusing
@@ -2,6 +2,7 @@ | |||
from datetime import date, datetime, timedelta | |||
|
|||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a fixture that need to add CalendarDay, which systematically tests lots of things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will exisit here since CalendarDay
is in offsets.__all__
pandas/pandas/tests/tseries/offsets/conftest.py
Lines 5 to 6 in cf70d11
@pytest.fixture(params=[getattr(offsets, o) for o in offsets.__all__]) | |
def offset_types(request): |
""" | ||
Apply scalar arithmetic with CalendarDay offset. Incoming datetime | ||
objects can be tz-aware or naive. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about adding a CD and a D which crosses dst?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A test with CD that crosses dst exists here:
def test_CalendarDay_with_timezone(box, assert_func): |
A test with D that crosses dst exists here:
Day: ['11/4/2012', '11/4/2012 23:00']}.items() |
Looks like I am running into a Windows/platform specific bug: On OSX
But on the AppVeyor test (Win32)
Unfortunately I can't reproduce this locally. Guessing it may due to either:
|
This may be a hassle to check, but does |
Hmm not sure I quite understand @jbrockmendel. Check that |
The issue is that |
@jreback all green. |
index = cls._simple_new(arr.astype('M8[ns]'), freq=None, tz=tz) | ||
else: | ||
# Create a linearly spaced date_range in local time | ||
arr = np.linspace(start.value, end.value, periods) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do this (does it work?)
pandas/core/arrays/datetimes.py
Outdated
end = end.tz_localize(tz, ambiguous=False) | ||
if tz is not None: | ||
# Localize the start and end arguments | ||
if start is not None and start.tz is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could use _maybe_localize_point(start, getattr(start, 'tz' None))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I think that should work!
@jreback Addressed your comments and all green. |
pandas/core/arrays/datetimes.py
Outdated
# 1) freq = a Timedelta-like frequency (Tick) | ||
# 2) freq = None i.e. generating a linspaced range | ||
if isinstance(freq, Tick) or freq is None: | ||
localize_args = {'tz': tz, 'ambiguous': False} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its worthile to add freq to _maybe_localize_point
then you don;t need localize_args
; it will be inside that function
pandas/core/arrays/datetimes.py
Outdated
else: | ||
# Create a linearly spaced date_range in local time | ||
arr = np.linspace(start.value, end.value, periods) | ||
index = cls._simple_new(arr.astype('M8[ns]'), freq=None, tz=tz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can add copy=False
I think
Returns | ||
------- | ||
ts : Timestamp | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g. figure out localize_args here (from a passed freq)
Hello @mroeschke! Thanks for updating the PR.
Comment last updated on September 05, 2018 at 16:52 Hours UTC |
@jreback made the requested changes and all green |
thanks! @mroeschke nice addition |
Sorry that I did not comment earlier (also didn't read all review comments), but I think this is a much bigger break in behaviour than we think / is documented. This completely changes the default behaviour of Now on master:
Previously:
This also changed the inferring of freq (now '24H' instead of 'D'), changes the behaviour of |
(I would personally revert this PR until we have better discussed it) |
I left a #22274 (comment) with the rational. In terms of documentation, there's a section in the whatsnew v0.24 that notes this API change: pandas/doc/source/whatsnew/v0.24.0.txt Lines 288 to 326 in 70c9003
And new reference in |
Yeah, there were certainly nice new docs about the CalendarDay, but the illustration of the changed behaviour was only for, IMO, a minor use case of arithmetic with offset objects, while the more relevant user-facing changes as date_range and resample were not mentioned. |
That's a fair concern; I can certainly give more attention to this impact on |
git diff upstream/master -u -- "*.py" | flake8 --diff
Adds
CalendarDay
and'CD'
offset and alias respectively and changesDay
('D'
) indate_range
to respect calendar day.With regards to
CalendarDay
added tests for:CalendarDay
arithmetic with itself, datetimes, and timedeltas (which should raise)'CD'
with use indate_range
andtimedelta_range
(which should raise)'CD'
Also refactored
_generate_range
fordate_range
since we no longer have to check forDay