Continue de-nesting core.ops #19448

jbrockmendel · 2018-01-29T18:27:20Z

Move isinstance(other, ABCDataFrame) checks to consistently be the first thing checked in Series ops
Remove force kwarg, define it in the one place it is used.
Remove kludge for PeriodIndex
Handle categorical_dtype earlier in _arith_method_SERIES, decreasing complexity of the closure.
Handle scalar na other earlier in _comp_method_SERIES, decreasing complexity of the closure.
Remove broken broadcasting case from _arith_method_FRAME (closes ops._arith_method_FRAME typo? #19421)

jbrockmendel · 2018-01-30T00:08:00Z

Just had to revert the removal of an ugly case, longstanding bug #5284, #5035.

codecov · 2018-01-30T00:50:51Z

Codecov Report

Merging #19448 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #19448      +/-   ##
==========================================
+ Coverage   91.62%   91.62%   +<.01%     
==========================================
  Files         150      150              
  Lines       48728    48732       +4     
==========================================
+ Hits        44645    44650       +5     
+ Misses       4083     4082       -1

Flag	Coverage Δ
#multiple	`89.99% <100%> (ø)`	⬆️
#single	`41.75% <48.64%> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/core/sparse/series.py	`95.26% <ø> (ø)`	⬆️
pandas/core/ops.py	`95.74% <100%> (+0.21%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

jreback

not clear what you moved and what you added. need tests for new paths.

jreback · 2018-01-30T11:28:04Z

pandas/core/ops.py

    for name, method in new_methods.items():
+        # inplace SparseArray methods do not get overriden; everything else
+        # does
+        force = not (issubclass(cls, np.ndarray) and name.startswith('__i'))


use ABCSparseArray (may need to disentangle a bit)

can you fix this up / make comments more clear here
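
The `force` expression in the diff above can be read as a per-method override policy. The following is a minimal sketch, not the pandas implementation: `ArrayBase` stands in for `np.ndarray`, `SparseLike` for an ndarray subclass, and the `add_methods` helper is written here for illustration only. In-place methods that the subclass already defines are left alone, and everything else is overwritten.

```python
class ArrayBase:
    """Stand-in for np.ndarray in this sketch (hypothetical)."""

    def __add__(self, other):
        return "inherited __add__"

    def __iadd__(self, other):
        return "inherited __iadd__"


class SparseLike(ArrayBase):
    """Stand-in for an ndarray subclass such as SparseArray (hypothetical)."""

    def __add__(self, other):
        return "own __add__"

    def __iadd__(self, other):
        return "own __iadd__"


def add_methods(cls, new_methods):
    """Attach new_methods to cls, keeping in-place methods cls already defines."""
    for name, method in new_methods.items():
        # In-place methods of array subclasses are not overridden;
        # everything else is.
        force = not (issubclass(cls, ArrayBase) and name.startswith('__i'))
        if force or name not in cls.__dict__:
            setattr(cls, name, method)


new_methods = {
    '__add__': lambda self, other: "new __add__",
    '__iadd__': lambda self, other: "new __iadd__",
}
add_methods(SparseLike, new_methods)
```

Note that a prefix test such as `startswith('__i')` also matches names like `__invert__`, which may be part of why the reviewer asks for a clearer comment.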

jreback · 2018-01-30T11:28:41Z

pandas/core/ops.py

@@ -658,6 +649,10 @@ def wrapper(left, right, name=name, na_op=na_op):
                                    index=left.index, name=res_name,
                                    dtype=result.dtype)

+        elif is_categorical_dtype(left):


tests that hit this

Is the request for comments in the code for what tests hit this path? Or confirmation here that such tests exist?

tests.categorical.test_operators.TestCategoricalOps.test_numeric_like_ops hits this path. The change here is catching this case early (and explicitly) instead of in the else: clause within the na_op above (see point where this PR changes elif isinstance(x, np.ndarray) to else: assert isinstance(x, np.ndarray)).

my requests are about things that are being added. e.g. if it's just refactoring then existing tests are ok, but some things look like they are catching additional cases, so these should have tests. a coverage analysis can tell you (compare before and after)

if it's just refactoring then existing tests are ok

This is very nearly pure refactoring to catch things earlier, more explicitly, and with fewer levels of indentation. I'll go through the diff and annotate any exceptions to this rule. The is_categorical_dtype check on 652 replaces the else previously on 621 (i.e. not an exception to the rule).

jreback · 2018-01-30T11:28:56Z

pandas/core/ops.py

            result = _comp_method_OBJECT_ARRAY(op, x, y)
+
+        elif is_datetimelike_v_numeric(x, y):


tests that hit this

tests.series.test_operators.TestSeriesComparisons.test_comparison_invalid. This is just moved up a few lines and down one level of indentation.

jreback · 2018-01-30T11:29:10Z

pandas/core/ops.py

+
+        elif is_datetimelike_v_numeric(x, y):
+            raise TypeError("invalid type comparison")
+
        else:

            # we want to compare like types
            # we only want to convert to integer like if
            # we are not NotImplemented, otherwise


comment no longer relevant?

The comment may be more verbose than it needs to be, but is still relevant. Once #19301 goes in we can dispatch to DTI and TDI and the needs_i8_conversion block below can be simplified quite a bit.

jreback · 2018-01-30T11:29:36Z

pandas/core/ops.py

        elif isinstance(other, (np.ndarray, pd.Index)):
            # do not check length of zerodim array
            # as it will broadcast
            if (not is_scalar(lib.item_from_zerodim(other)) and
                    len(self) != len(other)):
                raise ValueError('Lengths must match to compare')

-            if isinstance(other, ABCPeriodIndex):


test that hit this

tests.series.test_operators.TestSeriesComparisons.test_nat_comparisons (2 parametrized cases)

I think removing this check is the main piece of code actually removed instead of just rearranged. The comment mentions a temp workaround and it is no longer necessary.

jreback · 2018-01-30T11:29:50Z

pandas/core/ops.py

-                       "'series <op> np.asarray(other)'.")
-                raise TypeError(msg.format(op=op, typ=self.dtype))
+        elif (isinstance(other, pd.Categorical) and
+              not is_categorical_dtype(self)):


tests that hit this

tests.categorical.test_operators.TestCategoricalOpsWithFactor.test_comparisons

jreback · 2018-01-30T11:29:55Z

pandas/core/ops.py

+                            .format(op=op, typ=self.dtype))
+
+        elif is_scalar(other) and isna(other):
+            # numpy does not like comparisons vs None


test that hit this

tests.series.test_operators.TestSeriesComparisons.test_nat_comparisons_scalar, tests.series.test_operators.TestSeriesComparisons.test_more_na_comparisons, tests.series.test_arithmetic.TestTimestampSeriesComparison.test_timestamp_equality, tests.series.test_arithmetic.TestTimestampSeriesComparison.test_timestamp_compare_series

total of 10 cases between these with parametrization

(This is moved down from na_op)
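
The scalar-NA branch above short-circuits before the values reach numpy. A minimal sketch of the result it produces, based on the lines this PR removes from na_op (the helper name `compare_with_scalar_na` is hypothetical):

```python
import numpy as np


def compare_with_scalar_na(values, name):
    """Result of comparing every element of `values` with a missing scalar.

    Only `!=` is True against a missing value; every other comparison
    is False.
    """
    if name == '__ne__':
        return np.ones(len(values), dtype=bool)
    return np.zeros(len(values), dtype=bool)
```

Handling this case up front means the closure that does the real comparison never sees a None or NaN scalar.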

jreback · 2018-01-30T11:30:18Z

pandas/core/ops.py

                if yrav.shape != mask.shape:
-                    yrav = np.empty(mask.shape, dtype=yrav.dtype)
-                    yrav.fill(yrav.item())
+                    # FIXME: GH#5284, GH#5035, GH#19448


tests that hit this

See #19421. Only one test hits this, and only in py3:

    s = Series([2, 3, 4, 5, 6, 7, 8, 9, datetime(2005, 1, 1)])
    s[::2] = np.nan
    d = DataFrame({'A': s})
    with pytest.raises(ValueError):
        d.__and__(s, axis='columns')

This is a change and not just a refactor. As per the comments, it is horribly broken, and better to raise intentionally than accidentally have the yrav.item() raise a ValueError.
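
One reading of why the removed lines raise by accident: `yrav` is rebound to the new empty array before `.item()` is called, so `.item()` runs on an array of shape `mask.shape` rather than on the original one. A small sketch with made-up values for `mask` and `yrav`:

```python
import numpy as np

mask = np.array([True, False, True])
yrav = np.array([5])

# Same order of operations as the removed lines: rebind first, then fill.
yrav = np.empty(mask.shape, dtype=yrav.dtype)
try:
    yrav.fill(yrav.item())
    raised = False
except ValueError:
    # ndarray.item() without arguments only works on size-1 arrays
    raised = True
```

With a three-element mask the ValueError comes from `.item()`, not from any deliberate check.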

jbrockmendel · 2018-01-31T00:58:14Z

pandas/core/ops.py

-            else:
-                raise TypeError("{typ} cannot perform the operation "
-                                "{op}".format(typ=type(x).__name__,
-                                              op=str_rep))


After this PR, the cases currently caught by this else: raise TypeError are caught by the is_categorical_dtype check on 652.

jbrockmendel · 2018-01-31T00:59:12Z

pandas/core/ops.py

-                if name == '__ne__':
-                    return np.ones(len(x), dtype=bool)
-                else:
-                    return np.zeros(len(x), dtype=bool)


The is_datetimelike_v_numeric and is_scalar cases are moved to earlier in the checking process, not removed.

jbrockmendel · 2018-01-31T00:59:47Z

pandas/core/ops.py

-        elif isinstance(other, ABCDataFrame):  # pragma: no cover
-            return NotImplemented
+            res_values = na_op(self.values, other.values)
+            return self._constructor(res_values, index=self.index, name=name)


Everything in this part of the diff (788-799) is pure refactor, moving the ABCDataFrame check to the top spot for consistency
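
Putting the deferral check first relies on standard Python operator dispatch: returning NotImplemented makes the interpreter try the other operand's reflected method. A minimal sketch with hypothetical `Column` and `Table` classes playing the Series and DataFrame roles:

```python
class Table:
    """Hypothetical higher-priority container (plays the DataFrame role)."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Reached only after the left operand returned NotImplemented.
        return Table(other.value + self.value)


class Column:
    """Hypothetical lower-priority container (plays the Series role)."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Defer first, before any other type checks.
        if isinstance(other, Table):
            return NotImplemented
        if isinstance(other, Column):
            return Column(self.value + other.value)
        return Column(self.value + other)


result = Column(1) + Table(2)
```

Here `result` is a `Table`, because `Column.__add__` bowed out before doing any work.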

jbrockmendel · 2018-01-31T01:01:22Z

pandas/core/ops.py

-
-                    # let null fall thru
-                    if not isna(y):
-                        y = bool(y)


This is just moving the not isna(y) outside of the try/except block to be more specific about what's being tried.
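
A rough sketch of the same idea in plain Python, assuming a hypothetical `is_missing` stand-in for isna and a hypothetical `scalar_bool_op` helper: the coercion happens before the try block, so only the operation itself is guarded.

```python
import math
import operator


def is_missing(value):
    """Hypothetical stand-in for isna(): True for None and float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def scalar_bool_op(values, y, op):
    """Apply a boolean `op` between each element of `values` and scalar `y`."""
    # Coerce outside the try block; let null fall through unchanged.
    if not is_missing(y):
        y = bool(y)
    try:
        # Only the operation itself is guarded.
        return [op(x, y) for x in values]
    except TypeError:
        raise TypeError("cannot compare values with a scalar of type "
                        "[{typ}]".format(typ=type(y).__name__))


anded = scalar_bool_op([True, False], 1, operator.and_)
```

If `bool(y)` itself were to fail, the error would now surface directly instead of being rewritten by the except clause.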

jbrockmendel · 2018-01-31T01:01:50Z

pandas/core/ops.py

+
+            res_values = na_op(self.values, other)
+            unfilled = self._constructor(res_values, index=self.index)
+            return filler(unfilled).__finalize__(self)


All the edits to this function (895-918) are cleanup/refactor.

jbrockmendel · 2018-01-31T01:04:13Z

pandas/core/ops.py

                result = np.empty(x.size, dtype=x.dtype)
                mask = notna(xrav)
                xrav = xrav[mask]
-                if np.prod(xrav.shape):
+                if xrav.size:


Using xrav.size instead of np.prod(xrav.shape) here and above is to move towards joining this case with the case above it. This masking logic is done almost identically in Series/DataFrame/Panel methods and one of the next steps will be to de-duplicate these.
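
The masking pattern described here can be sketched as follows. This is a simplified illustration, not the pandas code: it uses `np.isnan` in place of notna, assumes a float array `x` and a scalar `y`, and the function name `masked_binop` is hypothetical.

```python
import numpy as np


def masked_binop(x, y, op):
    """Apply `op` only where `x` is not NaN; keep NaN elsewhere."""
    xrav = x.ravel()
    result = np.empty(x.size, dtype=float)
    mask = ~np.isnan(xrav)
    xrav = xrav[mask]
    # xrav.size equals np.prod(xrav.shape); both are 0 when nothing is left
    if xrav.size:
        with np.errstate(all='ignore'):
            result[mask] = op(xrav, y)
    result[~mask] = np.nan
    return result.reshape(x.shape)


x = np.array([[1.0, np.nan], [3.0, 4.0]])
out = masked_binop(x, 10, np.add)
```

Because the emptiness test reads the same in every branch, the near-identical copies become easier to merge later.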

jbrockmendel · 2018-01-31T01:05:08Z

pandas/core/ops.py

                    with np.errstate(all='ignore'):
                        result[mask] = op(xrav, yrav)
-            elif hasattr(x, 'size'):
+
+            elif isinstance(x, np.ndarray):


isinstance(x, np.ndarray) instead of hasattr(x, 'size') to be more explicit. The other case that otherwise gets to this point in tests is Categorical, but that raises shortly after this anyway.

can you add a comment here
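
To illustrate why the explicit check is stricter, here is a small sketch with a hypothetical `SizedThing` class: any object exposing a `size` attribute passes the hasattr test, while only real ndarrays pass the isinstance test.

```python
import numpy as np


class SizedThing:
    """Hypothetical non-ndarray object that happens to expose `size`."""
    size = 3


arr = np.arange(3)
duck = SizedThing()

checks = {
    'hasattr': (hasattr(arr, 'size'), hasattr(duck, 'size')),
    'isinstance': (isinstance(arr, np.ndarray), isinstance(duck, np.ndarray)),
}
```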

jreback

looks good. indicated where some comments should be added.

jreback · 2018-01-31T11:24:14Z

pandas/core/ops.py

    for name, method in new_methods.items():
+        # inplace SparseArray methods do not get overriden; everything else
+        # does
+        force = not (issubclass(cls, np.ndarray) and name.startswith('__i'))


can you fix this up / make comments more clear here

jreback · 2018-01-31T11:24:35Z

pandas/core/ops.py

@@ -795,39 +785,43 @@ def wrapper(self, other, axis=None):
        if axis is not None:
            self._get_axis_number(axis)

-        if isinstance(other, ABCSeries):
+        if isinstance(other, ABCDataFrame):  # pragma: no cover


add a comment here about early failing

jreback · 2018-01-31T11:25:56Z

pandas/core/ops.py

-                       "of dtype {typ}.\nIf you want to compare values, use "
-                       "'series <op> np.asarray(other)'.")
-                raise TypeError(msg.format(op=op, typ=self.dtype))
+        elif (isinstance(other, pd.Categorical) and


use is_categorical_dtype(other)

This is catching pd.Categorical specifically (as opposed to CategoricalIndex or Series[Categorical])

that's totally not obvious, is this tested or needed? that seems oddly specific

Well the two branches preceding this handle ABCSeries and pd.Index cases, and the (existing) error message specifically refers to a Categorical

ok, pls revisit at some point, this should prob be less specific and more about a non-ndarray like (e.g. an ExtensionArray check)

jreback · 2018-01-31T11:26:17Z

pandas/core/ops.py

@@ -899,26 +892,30 @@ def wrapper(self, other):

        self, other = _align_method_SERIES(self, other, align_asobject=True)

-        if isinstance(other, ABCSeries):
+        if isinstance(other, ABCDataFrame):


comment here

jreback · 2018-01-31T11:26:31Z

pandas/core/ops.py

                    with np.errstate(all='ignore'):
                        result[mask] = op(xrav, yrav)
-            elif hasattr(x, 'size'):
+
+            elif isinstance(x, np.ndarray):


can you add a comment here

jreback · 2018-02-02T11:28:57Z

pandas/core/ops.py

-                       "of dtype {typ}.\nIf you want to compare values, use "
-                       "'series <op> np.asarray(other)'.")
-                raise TypeError(msg.format(op=op, typ=self.dtype))
+        elif (isinstance(other, pd.Categorical) and


ok, pls revisit at some point, this should prob be less specific and more about a non-ndarray like (e.g. an ExtensionArray check)

jreback · 2018-02-02T11:29:55Z

thanks!

jbrockmendel added 2 commits January 29, 2018 10:20

continue de-nesting core.ops

b048b47

kludge for long-broken issue

fb6f534

jreback requested changes Jan 30, 2018

use ABCSparseArray

5105786

jbrockmendel commented Jan 31, 2018

jreback requested changes Jan 31, 2018

jreback added the Clean label Jan 31, 2018

jbrockmendel added 2 commits January 31, 2018 08:20

requested comments

82fe7a1

Merge branch 'master' of https://github.com/pandas-dev/pandas into op…

399fcd5

…s_kwargs5

jreback approved these changes Feb 2, 2018

jreback added this to the 0.23.0 milestone Feb 2, 2018

jreback merged commit 7db4bea into pandas-dev:master Feb 2, 2018

jbrockmendel deleted the ops_kwargs5 branch February 4, 2018 16:43

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018

Continue de-nesting core.ops (pandas-dev#19448)

4e0a32d
