-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Bitesize offsets #17318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bitesize offsets #17318
Conversation
Hello @jbrockmendel! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on September 23, 2017 at 16:21 Hours UTC |
plural = 's' | ||
else: | ||
plural = '' | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this whole thing should just be moved to a separate function, much more clear this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you're referring to as "this whole thing".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this formatting function (e.g. repr should just call it, passing parameters in).
@@ -642,9 +623,6 @@ def get_str(td): | |||
else: | |||
return '+' + repr(self.offset) | |||
|
|||
def isAnchored(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls don't simply remove things. instead in a separate PR deprecate them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method is still available. It's inherited verbatim from the parent class.
Codecov Report
@@ Coverage Diff @@
## master #17318 +/- ##
==========================================
+ Coverage 91.01% 91.02% +0.01%
==========================================
Files 162 162
Lines 49558 49564 +6
==========================================
+ Hits 45105 45116 +11
+ Misses 4453 4448 -5
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17318 +/- ##
==========================================
- Coverage 91.22% 91.21% -0.01%
==========================================
Files 163 163
Lines 49655 49673 +18
==========================================
+ Hits 45296 45308 +12
- Misses 4359 4365 +6
Continue to review full report at Codecov.
|
I think this test error is unrelated. Pls confirm. |
this was a while ago, rebase and we will see |
needs a rebase |
plural = 's' | ||
else: | ||
plural = '' | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this formatting function (e.g. repr should just call it, passing parameters in).
return fstr | ||
|
||
def _offset_str(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try not to add new things which just clutter up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_offset_str
is currently a method of BusinessDay
, which has its own implementation of freqstr
. Adding this dummy method lets us get rid of the duplicate freqstr method. Same idea with __repr__/_repr_attrs
. Strictly less clutter.
pandas/tseries/offsets.py
Outdated
@@ -710,6 +689,7 @@ def __init__(self, **kwds): | |||
kwds['end'] = self._validate_time(kwds.get('end', '17:00')) | |||
self.kwds = kwds | |||
self.offset = kwds.get('offset', timedelta(0)) | |||
self._offset = self.offset # alias for backward compat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't you just define .offset
as a property to return ._offset
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely tried this, don't remember off the top why it didn't work. Maybe when the caffeine kicks in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, let's try that type of refactor again.
pandas/tseries/offsets.py
Outdated
@@ -710,6 +689,7 @@ def __init__(self, **kwds): | |||
kwds['end'] = self._validate_time(kwds.get('end', '17:00')) | |||
self.kwds = kwds | |||
self.offset = kwds.get('offset', timedelta(0)) | |||
self._offset = self.offset # alias for backward compat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, let's try that type of refactor again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you run the freq asv's to make sure nothing has changed here.
pandas/tseries/offsets.py
Outdated
if 'offset' in state: | ||
# Older versions have offset attribute instead of _offset | ||
assert '_offset' not in state, list(state.keys()) | ||
state['_offset'] = state['offset'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
state['_offset'] = state.pop('offset')
can you remove the assert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on the pop
. Change assert to ValueError if both keys are (somehow) there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you may want to add a 0.20.3 pickle that adds things for every frequency.
in separate PR.
you can use pandas/tests/io/generate_legacy_storage_files.py
, update to add all of the offsets. Then generate and add to the repo. All tests should pass. (this is all with 0.20.3), make the modification locally but run it with the older python
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make a note to do this once other fixes to offsets are done.
pandas/tseries/offsets.py
Outdated
return out | ||
@property | ||
def offset(self): | ||
# Alias for backward compat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better comment on what this attribute is (its not for backward compat, rather its the API).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand. It explicitly is for backward compat since we are trying to standardize on _offset
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
offset
is a user API, the _offset
is merely an implementation detail (e.g. how we implement it). add a doc-string on what this returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Though to the extent that we can lock down what constitutes the user-facing API, offset
probably doesn't belong in it.
@@ -507,8 +513,18 @@ def freqstr(self): | |||
else: | |||
fstr = code | |||
|
|||
try: | |||
if self._offset: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Not all subclasses define the _offset
attribute. That is something I intend to standardize, but this PR is explicitly intended to be limited in scope.
Begrudgingly...
|
This is repeatable.
|
|
After updating, I get the same timings too. But it looks like this may be due to a bug in master. Manually stepping through the benchmark,
which for me (py27, ubuntu...) is giving:
and the resulting |
Same on py35 |
git bisect tells me the first bad commit was b59f107 |
yeah looks like this was a bug introduced there, hmm. |
let me see if I can work around: #17637 |
After #17637, benchmarks are now unaffected. |
does this mean they are the same (worse) or the same as master? |
|
Just pushed. Besides the requested docstring, had to edit the asv to avoid the new OverflowError. |
@@ -56,7 +56,7 @@ def setup(self): | |||
self.no_freq = self.rng7[:50000].append(self.rng7[50002:]) | |||
self.d_freq = self.rng7[:50000].append(self.rng7[50000:]) | |||
|
|||
self.rng8 = date_range(start='1/1/1700', freq='B', periods=100000) | |||
self.rng8 = date_range(start='1/1/1700', freq='B', periods=75000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok great.
lgtm. ping on green. side note, I believe you can make |
That is pretty much the original motivation here. There are a couple more steps between here and there, since until we get |
thanks! |
This is the first of several PRs cleaning up
tseries.offsets
. The ultimate goals of this series of PRs are:DateOffset.__eq__
, as that gets called byPeriod.__eq__
.DateOffset
immutable, since it is attached to aPeriod
object which is supposed to be immutable (TODO: fill in the appropriate GH issue)tseries.offsets
into cython so that_libs.period
and_libs.tslib
can import it guilt-free and not need to do run-time imports. See TODO comment in_libs.__init__
:The biggest impediment to the immutability goal is the
kwds
attribute, which is just adict
. The first couple of steps in this sequence is focused on whittling down the number of attributes set at runtime.This PR is mainly fixing typos and removing redundant methods.
git diff upstream/master -u -- "*.py" | flake8 --diff