ENH: Handle extension arrays in algorithms.diff #31025
I'm not thrilled about this, but it may be unavoidable. Happy to hear of an alternative.
But if it doesn't support `-`, the operation will also raise a TypeError? Isn't that what you are doing below yourself?
Right. I meant the need for the hasattr just to please the type checker. If you do a diff on an EA that doesn't implement `__sub__`, you'll get the TypeError either way.
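To illustrate the point about the TypeError, here is a minimal sketch of a diff built from a shift and a subtraction. `ToyArray`, `NoSubArray`, and this `diff` helper are hypothetical stand-ins, not the pandas implementation; the sketch only assumes that the array exposes a `shift` method and, optionally, `__sub__`.

```python
from typing import List, Optional


class ToyArray:
    """Hypothetical 1-D array-like that supports shift and subtraction."""

    def __init__(self, values: List[Optional[float]]):
        self.values = list(values)

    def shift(self, periods: int = 1) -> "ToyArray":
        # Move values by `periods` positions, filling the gap with None.
        n = len(self.values)
        if periods == 0:
            return ToyArray(self.values)
        if abs(periods) >= n:
            return ToyArray([None] * n)
        if periods > 0:
            return ToyArray([None] * periods + self.values[:-periods])
        return ToyArray(self.values[-periods:] + [None] * (-periods))

    def __sub__(self, other: "ToyArray") -> "ToyArray":
        # Element-wise subtraction that propagates missing values.
        return ToyArray(
            [
                None if a is None or b is None else a - b
                for a, b in zip(self.values, other.values)
            ]
        )


class NoSubArray:
    """Hypothetical array-like that has shift but does not define __sub__."""

    def __init__(self, values):
        self.values = list(values)

    def shift(self, periods: int = 1) -> "NoSubArray":
        return NoSubArray(self.values)


def diff(arr, periods: int = 1):
    # No hasattr(arr, "__sub__") guard: if the type does not support `-`,
    # Python raises TypeError on its own when evaluating the expression.
    return arr - arr.shift(periods)
```

With this sketch, `diff(ToyArray([1, 3, 6, 10])).values` gives `[None, 2, 3, 4]`, while `diff(NoSubArray([1, 2]))` raises a TypeError without any explicit check.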
`com.values_from_object` was the old behavior of Series.diff. So this should be identical to the old behavior, just with an extra function call to go through `self.array.diff`.
Why do you need to override the base class version? Doesn't that work as well for numpy?
More technical debt from the 1D arrays inside 2D blocks :(
Oh, and this is on ObjectValuesExtensionBlock, but is only useful for PeriodArray. IntervalArray is the only other array to use this, and doesn't implement `__sub__`. Is it worth creating a `PeriodBlock` just for this (IMO no)?
Why is this only needed for ObjectValuesExtensionBlock, and not for ExtensionBlock?
I suppose that in principle, we can hit this from ExtensionBlock.
We hit the problem when going from a NonConsolidatable block type (like period) to a consolidatable one (like object). In that case, the values passed to `make_block` will be 1d, but the block is expecting them to be 2d.

In practice, I think that for most EAs, `ExtensionArray.__sub__` will return another extension array. So while ExtensionBlock.diff isn't 100% correct for all EAs, I think trying to handle this case is not worth it. What do you think?
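The 1d versus 2d mismatch described above can be sketched with plain Python lists. Everything here is a toy model: `ndim`, `ensure_2d`, and `ToyConsolidatableBlock` are hypothetical names for illustration and do not mirror the actual pandas block classes or `make_block`.

```python
def ndim(values):
    """Return 2 for a list of lists, otherwise 1 (toy dimensionality check)."""
    return 2 if values and isinstance(values[0], list) else 1


def ensure_2d(values):
    """Wrap 1-D values as a single row so a 2-D block can hold them."""
    return values if ndim(values) == 2 else [values]


class ToyConsolidatableBlock:
    """Hypothetical block that, like a consolidatable block, expects 2-D values."""

    def __init__(self, values):
        if ndim(values) != 2:
            raise ValueError("expected 2-D values, got 1-D")
        self.values = values


# A diff on a 1-D array yields a 1-D result (here: plain objects).
result_1d = [None, 2, 3, 4]

# Passing the 1-D result straight through fails; reshaping first works.
block = ToyConsolidatableBlock(ensure_2d(result_1d))
```

In this toy model, constructing `ToyConsolidatableBlock(result_1d)` directly raises a ValueError, which is the kind of mismatch the comment describes when the result of a diff no longer belongs in the original non-consolidatable block type.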
It certainly seems fine to ignore this corner case for now.