-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: groupby
then resample
on column gives incorrect results if the index is out of order
#59408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
aram-cedarwood
wants to merge
18
commits into
pandas-dev:main
from
aram-cedarwood:groupby-then-resample-if-index-out-of-order
Closed
Changes from 10 commits
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
c21c5c9
Index should be ignored when using the on keyword argument to resample
aram-cedarwood 626b722
add test for when index is set from column
aram-cedarwood 70d6ee8
minor: change test function name
aram-cedarwood 5d77941
add test case
aram-cedarwood c92ec0b
move reset_index
aram-cedarwood 1ef4d3b
minor: add GH issue number to tests
aram-cedarwood 55fe96b
add test
aram-cedarwood 13f7414
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cedarwood 10836a2
push reset_index to _set_grouper
aram-cedarwood 62bdbdf
simplify
aram-cedarwood dd566f1
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cedarwood 7a0ad97
remove tests per comments
aram-cedarwood 612e275
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cedarwood bd91de4
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cedarwood 8a7f662
change test_groupby_resample_on_api_with_getitem
aram-cedarwood 9852678
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
aram-cedarwood 7777766
add comments
f784a3d
Merge branch 'main' into groupby-then-resample-if-index-out-of-order
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason we need to do a
.take
in theRangeIndex
case, but not otherwise?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, returning to this. I put in that condition to make this test pass:
def test_groupby_resample_on_api_with_getitem(self):
which was apparently added as part of #17813I have not had a chance to look too deeply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding
index=list("xyzwt")
to the DataFrame in that test makes the op fail.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the issue happens on
pandas.core.resample:1559
. We pass only the groupx
but the entire axisself.ax
. I wonder if splitting the axis as we dodata
on L967 there and passing it tof
would resolve.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @rhshadrach, coming back to this, sorry for the hiatus. Can you elaborate on the above please? (I think maybe the word "not" is missing somewhere; I'm not entirely sure what you mean.) What file are you referring to re: L967? Can you give me a link to a line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did in my previous comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in your previous comment you provided a link for this ^
I ask again because the link you provided in your previous comment does not contain
data
. Are you sure you didn't mean a different line here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it need to contain data? We pass the
func
from the highlighted line to groupby, which has a DataSplitter. This splitsdata
. My comment is that I wonder if we should be splittingaxis
as we do thisdata
on L967.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rhshadrach for your patience. I am new to making changes to groupby.
data
to be split downstream? Can you please elaborate on how the data is split? It would help greatly.axis
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure anymore and I doubt it would be helpful.
groupby splits data into groups and operates on each group.
x
in the code above is one of these groups, not the entire input data, whileself.ax
is the entire input's axis.I would suggest determining first if this is the right direction to go before spending time on figuring out how to do it. As I've stated, I'm not sure this is the root cause of the issue (or even an issue at all!).