-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: dispatch Block.fillna to putmask/where #45911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
if self.dtype.kind == "m": | ||
try: | ||
res_values = self.values.fillna(value, limit=limit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for putting this behind a if self.dtype.kind == "m":
check?
Because this now will call super().fillna()
for all EAs, but the parent fillna()
implementation doesn't actually call EA.fillna()
(this was catched in the geopandas test suite)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for putting this behind a if self.dtype.kind == "m": check?
to get the FutureWarning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which future warning?
Let me rephrase my question: what's the reason for no longer calling EA.fillna() here for all EAs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which future warning?
The FutureWarning on L1494-1502 just below this.
Let me rephrase my question: what's the reason for no longer calling EA.fillna() here for all EAs?
Primarily code-sharing, with the side-benefit that we may actually respect the "inplace" keyword (see TODO comment L1508)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the point of the EA.fillna()
method is that this gets called from the DataFrame/Series fillna, so that an EA can provide a specialized implementation. So I think we should keep that logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so that an EA can provide a specialized implementation
does anyone (e.g. pyarrow) actually have a specialized implementation where it makes a real difference? e.g. is more performant than going through putmask/where?
the point of the EA.fillna() method is that this gets called from the DataFrame/Series fillna
I see "the point" of the EA interface in general as being to provide a minimal(ish) set of primitives from which we can implement the Series/DataFrame(/Index!) API. If Series.foobar can be implemented efficiently without EA.foobar, that is a reason to remove EA.foobar, not to use it unnecessarily.
(Though in this case we still use EA.fillna for pad/bfill/interpolate cases, so I'm not actually advocating getting rid of it)
Last, while the corrected 'inplace' behavior was not the motivator behind this change, I do think it is an important improvement.
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Among other things, we avoid re-calculating 'mask' when splitting.