Skip to content

REF: do extract_array earlier in series arith/comparison ops #28066

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 2, 2019

Conversation

jbrockmendel
Copy link
Member

With this, the middle third of _arith_method_SERIES and _comp_method_SERIES are array-specific and can be refactored out (separate step) to become a) block-wise implementation for DataFrame and b) PandasArray implementation.

This also simplifies the NullFrequencyError handling nicely.

@@ -520,13 +524,29 @@ def column_op(a, b):
return result


def dispatch_to_extension_op(op, left, right):
def dispatch_to_extension_op(op, left, right, keep_null_freq: bool = False):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins do we have a way of typing left and right as "not a Series or Index"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u type left here (EA / np.ndarray)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins do we have a way of typing left and right as "not a Series or Index"?

i'm not aware of being able to exclude types.

if a particular type raises, (in this case TYpeError?) then maybe could use overloads with a return type of NoReturn https://mypy.readthedocs.io/en/latest/more_types.html#the-noreturn-type (New in version 3.5.4)

could maybe use the following pattern to allow checking with older Python...

if TYPE_CHECKING:
    from typing import NoReturn
else:
    NoReturn = None

@jbrockmendel
Copy link
Member Author

@jreback gentle ping; this turns out to be a blocker(ish) for fixing #28080.

@@ -520,13 +524,29 @@ def column_op(a, b):
return result


def dispatch_to_extension_op(op, left, right):
def dispatch_to_extension_op(op, left, right, keep_null_freq: bool = False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u type left here (EA / np.ndarray)

@jreback jreback added ExtensionArray Extending pandas with custom dtypes or arrays. Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Aug 24, 2019
@jreback jreback added this to the 1.0 milestone Aug 24, 2019
@jbrockmendel
Copy link
Member Author

Types added

@TomAugspurger
Copy link
Contributor

Can you expand on keep_null_freq? It seems strange to have that specific of an option in a general method like dispatch_to_extension_op. When / why do we sometimes convert that to a TypeError?

@jbrockmendel
Copy link
Member Author

Can you expand on keep_null_freq? It seems strange to have that specific of an option in a general method like dispatch_to_extension_op. When / why do we sometimes convert that to a TypeError?

If we have a Series that is dt64, dt64tz, or td64, ser+1 should always raise TypeError. But because of the way we wrap/unwrap for dispatch_to_extension_op, we end up operating on a DTA/TDA, which will raise NullFrequencyError on arr+1. In these cases we need to catch NullFrequencyError and re-raise as TypeError.

The exception is when we have a Series[int] and we're adding/subtracting a DTA/TDA/DTI/TDI, in which case NullFrequencyError is the correct thing to raise.

@TomAugspurger
Copy link
Contributor

So the complexity comes from having different rules for op(Series[datetime], int) and op(DatetimeArray, int)? Can / should we align that behavior instead?

@jbrockmendel
Copy link
Member Author

Can / should we align that behavior instead?

The DTA/TDA/DTI/TDI/Timestamp behavior has been deprecated, now just need to be patient

@TomAugspurger
Copy link
Contributor

So in the future keep_null_freq won't be needed? In that case, +1 on the changes here, if we could add a TODO to remove it once the deprecation is enforced.

@jbrockmendel
Copy link
Member Author

sounds good

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since keep_null_freq is just a temporary thing until the deprecation can be enforced, I'm +1 on this.

I'd recommend giving @jreback about 48 hours to look, and then merge if he hasn't had a chance by them.

@jreback jreback merged commit 2cd7888 into pandas-dev:master Sep 2, 2019
@jreback
Copy link
Contributor

jreback commented Sep 2, 2019

thanks @jbrockmendel

yeah the keep_null_freq stuff is a bit hard to follow, though you have documented it, so I think ok

@jbrockmendel jbrockmendel deleted the knf branch September 2, 2019 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants