-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Rolling groupby should not maintain the by column in the resulting DataFrame #32332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
41 commits
Select commit
Hold shift + click to select a range
a6ad6c5
make sure exclusions are applied before the groupby object reaches ro…
54903a7
pass object with exclusions earlier on
MarcoGorelli d81c572
revert accident
MarcoGorelli e049205
check for side effects in obj
MarcoGorelli 5f43f3a
wip
MarcoGorelli 3b1c3ff
Wip
MarcoGorelli 207a507
change _selected_obj according to self.mutated
MarcoGorelli c9b34b2
sanitize
MarcoGorelli b2ce758
update old tests
MarcoGorelli c307fdc
fix old docstring
MarcoGorelli 35f2b2d
finish correcting old docstring
MarcoGorelli 2bb7932
use getattr
MarcoGorelli eed4122
old fix
26d9dec
simpler fix
68b4299
try making Window's own _shallow_copy
159a3d6
more wip
0b393b6
typeerror!
9fc50ee
add some types, comment tests
7fe2fcf
fix typing
8a34ff4
better fix
8953bda
correct return annotation
b2b8a42
simplify code
d476191
fix upstream, remove try except
63e3d85
document change as api-breaking
17373ad
Merge remote-tracking branch 'upstream/master' into rolling-groupby
MarcoGorelli a7ca8eb
remove elements from exclusions within _obj_with_exclusions
MarcoGorelli 35e7340
remove no-longer-necessary performance warning
MarcoGorelli 17090da
simplify _obj_with_exclusions
MarcoGorelli 7c7f79c
correct comment
MarcoGorelli 9b98dad
Merge remote-tracking branch 'upstream/master' into rolling-groupby
4d1c5a6
deal with multiindexed columns case
0dde4a8
use elif in _obj_with_exclusions, add test for column multiindex, dis…
1e4ba56
simplify unique_column_names
6312725
rewrite using get_level_values
d9748d1
revert 'or' which was accidentally changed to 'and'
1623902
only patch _apply, cov and corr
5d7a477
reinstate change to _obj_with_exclusions
d117624
exclude 'by' column in Rolling.count
367e671
don't modify _selected_obj - instead, patch obj before we reach apply
d7cf9f1
slight simplification to _shallow_copy
d251715
comment on patch in corr and cov
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -209,11 +209,18 @@ def _obj_with_exclusions(self): | |
if self._selection is not None and isinstance(self.obj, ABCDataFrame): | ||
return self.obj.reindex(columns=self._selection_list) | ||
|
||
if len(self.exclusions) > 0: | ||
return self.obj.drop(self.exclusions, axis=1) | ||
else: | ||
elif not self.exclusions or not isinstance(self.obj, ABCDataFrame): | ||
return self.obj | ||
|
||
# there may be elements in self.exclusions that are no longer | ||
# in self.obj, see GH 32468 | ||
nlevels = self.obj.columns.nlevels | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. huh? why are you doing all of this |
||
unique_column_names = { | ||
j for i in range(nlevels) for j in self.obj.columns.get_level_values(i) | ||
} | ||
exclusions = self.exclusions.intersection(unique_column_names) | ||
return self.obj.drop(exclusions, axis=1) | ||
|
||
def __getitem__(self, key): | ||
if self._selection is not None: | ||
raise IndexError(f"Column(s) {self._selection} already selected") | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -35,6 +35,8 @@ def _dispatch(name: str, *args, **kwargs): | |
def outer(self, *args, **kwargs): | ||
def f(x): | ||
x = self._shallow_copy(x, groupby=self._groupby) | ||
# patch for GH 32332 | ||
x.obj = x._obj_with_exclusions | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I really don't like this |
||
return getattr(x, name)(*args, **kwargs) | ||
|
||
return self._groupby.apply(f) | ||
|
@@ -82,6 +84,8 @@ def _apply( | |
# TODO: can we de-duplicate with _dispatch? | ||
def f(x, name=name, *args): | ||
x = self._shallow_copy(x) | ||
# patch for GH 32332 | ||
x.obj = x._obj_with_exclusions | ||
|
||
if isinstance(name, str): | ||
return getattr(x, name)(*args, **kwargs) | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an expl here, a couple of sentences; this the issue number.