BUG/TST: run and fix all arithmetic tests with+without numexpr #40463

jorisvandenbossche · 2021-03-16T10:40:58Z

This PR adds an auto-used fixture for pandas/tests/arithmetic that sets expressions._MIN_ELEMENTS to 0, to force taking the numexpr path even if our test data are small (which would otherwise never exercise the numexpr code path).

Further, it makes some fixes to get the tests passing with this better coverage.
The fixes are a bit quick-and-dirty, but at least already show what is needed to get things working.

jorisvandenbossche · 2021-03-16T10:44:07Z

pandas/core/ops/array_ops.py

@@ -199,7 +207,9 @@ def arithmetic_op(left: ArrayLike, right: Any, op):
    rvalues = ensure_wrapped_if_datetimelike(right)
    rvalues = _maybe_upcast_for_op(rvalues, lvalues.shape)

-    if should_extension_dispatch(lvalues, rvalues) or isinstance(rvalues, Timedelta):
+    if should_extension_dispatch(lvalues, rvalues) or isinstance(
+        rvalues, (Timedelta, BaseOffset, Timestamp, NaTType)


this additional check could maybe be moved into should_extension_dispatch (although it is not necessarily related to "is extension array", but rather to "don't take the numexpr path")

yah, IIRC should_extension_dispatch is only used here, so might as well refactor/rename/move/whatever is most convenient.

There is a comment below about why Timedelta is included; can you update it for the others?

i think this should check rvalues is NaT rather than use an isinstance check

Changed to right is NaT and updated the comment.

jorisvandenbossche · 2021-03-16T10:47:12Z

pandas/core/ops/array_ops.py

@@ -157,8 +159,14 @@ def _na_arithmetic_op(left, right, op, is_cmp: bool = False):
    """
    import pandas.core.computation.expressions as expressions

+    if isinstance(right, str):
+        # can never use numexpr
+        func = op


Basically, with a string argument, numexpr will fail with a "wrong" error message. Alternatively, _can_use_numexpr in expressions.py could be updated to check for this and avoid the numexpr path (currently it only checks objects with dtypes, not scalars).

let's make an effort to keep numexpr-specific logic in _can_use_numexpr/expressions

@jbrockmendel would you be OK with leaving the check here as is, short term? I have a next PR that moves this check inside a can_use_numexpr function inside expressions.py (#41122), so that will clean this up.
But I would like to merge this PR before #41122 since this one is adding a lot of test coverage for with/without numexpr.

yeah ok for now, but let's for sure move later
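The idea of keeping the "can we use the fast path" decision in one predicate might look roughly like the sketch below. Everything here is an assumption for illustration: `MIN_ELEMENTS`, `can_use_fast_path`, `fast_engine`, and `evaluate` are hypothetical names, and the "fast" engine is a plain-Python stand-in rather than numexpr.

```python
import operator

# Assumed threshold; 0 mimics "always try the fast path", as in the tests.
MIN_ELEMENTS = 0


def can_use_fast_path(left, right) -> bool:
    """Hypothetical predicate deciding whether the accelerated path is allowed."""
    if isinstance(right, str):
        # A string scalar can never be handled by a numeric engine.
        return False
    return len(left) >= MIN_ELEMENTS


def fast_engine(op, left, right):
    """Stand-in for an accelerated engine (plain Python here)."""
    return [op(x, right) for x in left]


def evaluate(op, left, right):
    """Return (path, result) so the chosen code path is visible."""
    if can_use_fast_path(left, right):
        return "fast", fast_engine(op, left, right)
    # Fall back to the plain operator, which raises the usual errors.
    return "plain", [op(x, right) for x in left]


print(evaluate(operator.add, [1, 2], 10))        # ('fast', [11, 12])
print(evaluate(operator.add, ["a", "b"], "c"))   # ('plain', ['ac', 'bc'])
```

With the decision in a single predicate, callers do not need their own `isinstance(right, str)` special case.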

pandas/core/ops/array_ops.py

pandas/tests/arithmetic/test_numeric.py

jbrockmendel · 2021-03-16T22:15:59Z

pandas/tests/arithmetic/conftest.py

+    _MIN_ELEMENTS = expr._MIN_ELEMENTS
+    expr._MIN_ELEMENTS = request.param
+    yield request.param
+    expr._MIN_ELEMENTS = _MIN_ELEMENTS


can we get rid of some of the setup/teardown in test_expressions with this?

Potentially something similar could be used there as well, yes. But this PR focuses on the tests/arithmetic/ tests; #40497 is the general issue for modernizing test_expressions.py
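The save/override/restore pattern in the fixture above can also be written as a context manager, which is one way it could be reused outside a pytest fixture. This is a minimal sketch, not code from the PR: `expr` is a hypothetical `SimpleNamespace` stand-in for the expressions module, `min_elements` is an invented helper name, and the default of 10_000 is an arbitrary placeholder.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Hypothetical stand-in for the module holding the threshold;
# the default value here is an arbitrary placeholder.
expr = SimpleNamespace(_MIN_ELEMENTS=10_000)


@contextmanager
def min_elements(value):
    """Temporarily override expr._MIN_ELEMENTS and always restore it."""
    saved = expr._MIN_ELEMENTS
    expr._MIN_ELEMENTS = value
    try:
        yield value
    finally:
        # Runs even if the body raises, so later tests see the original value.
        expr._MIN_ELEMENTS = saved


with min_elements(0):
    inside = expr._MIN_ELEMENTS  # 0 while the block is active

after = expr._MIN_ELEMENTS  # restored to the original threshold
print(inside, after)  # 0 10000
```

The `try`/`finally` is the one difference from a bare save-and-restore: the threshold is put back even when the body raises.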

jreback · 2021-03-20T01:23:19Z

can you merge master

jreback

can you merge master again; @jbrockmendel comments addressed?

pandas/_libs/tslibs/nattype.pyx

jbrockmendel · 2021-04-06T04:48:34Z

@jbrockmendel comments addressed?

Not yet

jorisvandenbossche · 2021-04-23T12:54:36Z

This is ready for another review now

jorisvandenbossche · 2021-04-23T12:57:01Z

~~(although there are some failures about a wrong error message with the latest commit, I see, but aside from that should already be reviewable)~~ this should be fixed, wrong bracket alignment ;)

jreback

can you rebase as well

jreback · 2021-04-26T11:51:34Z

pandas/core/ops/array_ops.py

@@ -157,8 +159,14 @@ def _na_arithmetic_op(left, right, op, is_cmp: bool = False):
    """
    import pandas.core.computation.expressions as expressions

+    if isinstance(right, str):
+        # can never use numexpr
+        func = op


yeah ok for now, but let's for sure move later

jreback · 2021-04-26T11:52:26Z

pandas/core/ops/array_ops.py

@@ -246,7 +259,10 @@ def comparison_op(left: ArrayLike, right: Any, op) -> ArrayLike:
                "Lengths must match to compare", lvalues.shape, rvalues.shape
            )

-    if should_extension_dispatch(lvalues, rvalues):
+    if should_extension_dispatch(lvalues, rvalues) or (


this is the same check as above (L212); can you put the common parts in a function?

It's slightly different: here I need an additional and not is_object_dtype(lvalues.dtype), because in that case we need to take the comp_method_OBJECT_ARRAY code path below

sure i see that, that's why i said common parts (meaning the datetime-likes + NaT)

Yes, but note the brackets: currently the checks for the scalars are first combined with is_object_dtype, before being combined with should_extension_dispatch. So that means I cannot move those scalar checks inside should_extension_dispatch without changing the behaviour of the overall check.
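The bracket argument can be checked with a small truth table. In this sketch the three booleans are abstractions of the real conditions (`dispatch` for the should_extension_dispatch result, `scalar` for the datetime-like/NaT scalar check, `is_object` for the object-dtype check), and `current_check` and `folded_check` are hypothetical names; it assumes "moving the scalar check inside" means or-ing it into the dispatch helper.

```python
from itertools import product


def current_check(dispatch: bool, scalar: bool, is_object: bool) -> bool:
    # The scalar check is paired with "not object dtype" before the outer "or".
    return dispatch or (scalar and not is_object)


def folded_check(dispatch: bool, scalar: bool, is_object: bool) -> bool:
    # Scalar check folded into the dispatch helper: the object guard is lost.
    return dispatch or scalar


# Collect every input combination where the two formulations disagree.
mismatches = [
    combo
    for combo in product([False, True], repeat=3)
    if current_check(*combo) != folded_check(*combo)
]
print(mismatches)  # [(False, True, True)]
```

The single mismatch is the object-dtype case with a datetime-like scalar, which is the case that needs the comp_method_OBJECT_ARRAY path.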

jreback · 2021-04-26T11:53:11Z

pandas/tests/arithmetic/test_numeric.py

@@ -405,6 +406,11 @@ def test_ser_div_ser(self, dtype1, any_real_dtype):
                name=None,
            )
        expected.iloc[0:3] = np.inf
+        if first.dtype == "int64" and second.dtype == "float32":


is the reverse excluded as well?

The reverse (float32 + int64) is not tested, as the first.dtype is always int64/float64/uint64

(but yeah, the reverse order would also result in float32 instead of float64 when numexpr is used)

jreback · 2021-04-26T21:57:38Z

thanks @jorisvandenbossche

…s-dev#40463)

BUG/TST: run and fix all arithmetic tests with+without numexpr

a4dea4a

jorisvandenbossche added the Bug, Testing (pandas testing functions or related to the test suite), and Numeric Operations (Arithmetic, Comparison, and Logical operations) labels Mar 16, 2021

jorisvandenbossche requested a review from jbrockmendel March 16, 2021 10:40

jorisvandenbossche mentioned this pull request Mar 16, 2021

PERF: no need to check for DataFrame in pandas.core.computation.expressions #40445

Merged

jbrockmendel reviewed Mar 16, 2021

pandas/core/ops/array_ops.py

jbrockmendel reviewed Mar 16, 2021

pandas/tests/arithmetic/test_numeric.py (outdated)

jbrockmendel reviewed Mar 16, 2021

jorisvandenbossche added 2 commits April 1, 2021 10:16

Merge remote-tracking branch 'upstream/master' into ops-test-numexpr

24d087e

fix test for comparison with NaT

6851089

jreback added this to the 1.3 milestone Apr 2, 2021

jreback requested changes Apr 2, 2021

jbrockmendel reviewed Apr 6, 2021

pandas/_libs/tslibs/nattype.pyx (outdated)

jorisvandenbossche added 8 commits April 20, 2021 14:46

Merge remote-tracking branch 'upstream/master' into ops-test-numexpr

d6f23ba

revert change error message test

5122675

fix case with object dtype and scalar right operand

82a7247

Merge remote-tracking branch 'upstream/master' into ops-test-numexpr

340232a

fix for wrong error message with Timestamp scalar

e294267

add comment

0804e63

Merge remote-tracking branch 'upstream/master' into ops-test-numexpr

d97ab38

check right is NaT

dcf38cf

jorisvandenbossche added 2 commits April 23, 2021 16:40

wrong bracket

036cf62

Merge remote-tracking branch 'upstream/master' into ops-test-numexpr

06d76e0

This was referenced Apr 26, 2021

PERF/REF: Check use of numexpr earlier in the DataFrame operation #41122

Closed

REF: move check for disallowed bool arithmetic ops out of numexpr-related expressions.py #41161

Merged

jreback requested changes Apr 26, 2021

Merge remote-tracking branch 'upstream/master' into ops-test-numexpr

6ff3be0

jreback approved these changes Apr 26, 2021

jreback merged commit 0158382 into pandas-dev:master Apr 26, 2021

jorisvandenbossche deleted the ops-test-numexpr branch April 27, 2021 06:38

jorisvandenbossche mentioned this pull request Apr 27, 2021

TST: run frame/series arithmetic tests with+without numexpr #41178

Merged

yeshsurya pushed a commit to yeshsurya/pandas that referenced this pull request May 6, 2021

BUG/TST: run and fix all arithmetic tests with+without numexpr (panda…

78c3065

…s-dev#40463)

JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021

BUG/TST: run and fix all arithmetic tests with+without numexpr (panda…

013572b

…s-dev#40463)
