BUG: flex op with DataFrame, Series and ea vs ndarray #34277

jbrockmendel · 2020-05-20T16:04:18Z

~~I have a couple ideas in mind that might make the edit in blocks here unnecessary, but want to profile them first, so will do that cleanup in a refactor following the bugfix.~~ Updated with a much cleaner implementation; it also de-special-cases one of our timedelta64 tests.

jorisvandenbossche · 2020-05-22T14:39:28Z

This overlaps now with #34312

jorisvandenbossche · 2020-05-22T14:39:44Z

pandas/core/ops/__init__.py

+        rvalues = rvalues.reshape(1, -1)
+
+    rvalues = np.broadcast_to(rvalues, frame.shape)
+    return type(frame)(rvalues, index=frame.index, columns=frame.columns)


Why do we need to turn this into a DataFrame?

The alternative is to re-implement the ensure-shape-shape-values code in ops.blockwise (see the first commit, which did it that way)

Why does it need to be the same shape? Shouldn't the block op or array op be able to handle this broadcasting themselves?

ndarray handles broadcasting, but DTA/TDA don't. I checked with seberg a few months ago, who said there isn't a perf penalty to operating on the broadcasted ndarray.

I've got plans to make it so we don't need to wrap the ndarray in a DataFrame (which we actually do in _align_method_FRAME too).
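
The reshape-and-broadcast step in the diff above can be sketched in plain NumPy. This is only an illustration with made-up values (`frame_shape` and the contents of `rvalues` are assumptions, not taken from the PR). It shows that `np.broadcast_to` returns a read-only view rather than a copy, so the broadcasted array does not have to be materialized before operating on it.

```python
import numpy as np

# Hypothetical stand-ins: a length-3 row vector aligned on the columns
# of a frame with 4 rows and 3 columns.
rvalues = np.array([1.0, 2.0, 3.0])
frame_shape = (4, 3)

# Same two steps as in the diff: make the values 2D, then broadcast.
rvalues_2d = rvalues.reshape(1, -1)
broadcasted = np.broadcast_to(rvalues_2d, frame_shape)

print(broadcasted.shape)                       # (4, 3)
print(np.shares_memory(broadcasted, rvalues))  # True: a view, not a copy
print(broadcasted.strides[0])                  # 0: every row reuses the same data
print(broadcasted.flags.writeable)             # False: the view is read-only
```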

> but DTA/TDA don't.

You mean the "hidden" 2D version of DTA/TDA? (EAs in general handle broadcasting)

> EAs in general handle broadcasting

I'm not sure that's accurate in general (in particular the op(ea_len_1, arr_len_n) case), but that can be discussed separately.

> (in particular the op(ea_len_1, arr_len_n))

Ah, yes, indeed, that's a broadcasting case that will not be generally supported (we probably should support it, though; worth opening an issue about that).

Now, I was thinking about the op(frame_with_EA_column, array_len_n) case, where the array is aligned on the columns of the frame. That means each EA (1 column) gets a scalar, I think: op(EA, scalar)?
Or we might want to change this to an array of length 1 in case the scalar loses information (which can happen when the 1D array-like is an EA / Series[EA] instead of a numpy array)?
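
To make the op(frame_with_EA_column, array_len_n) case concrete, here is a small sketch of the observable behaviour, assuming a pandas version that includes this fix (1.1 or later). The frame, its column names and the values are invented for illustration. It only compares the flex op against a per-column op(column, scalar); it says nothing about how pandas dispatches internally.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one extension-dtype column and one numpy-backed column.
df = pd.DataFrame(
    {
        "a": pd.array([1, 2, 3], dtype="Int64"),
        "b": [1.0, 2.0, 3.0],
    }
)

# A length-2 ndarray, aligned on the two columns of the frame.
other = np.array([10, 20])
result = df.add(other, axis="columns")

# Per-column equivalent: each column is combined with a single scalar.
expected = pd.DataFrame({"a": df["a"] + other[0], "b": df["b"] + other[1]})

print(result)
print(result.equals(expected))
```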

jorisvandenbossche · 2020-05-22T17:53:39Z

I could also port your test to my PR, so can you try to explain why this fix would be better? (it might be doing more now, but considering only what is required for fixing the original issue, i.e. the interval flex op test you added)

jbrockmendel · 2020-05-22T17:55:28Z

> I could also port your test to my PR

please do!

jorisvandenbossche · 2020-05-22T17:57:46Z

Only if you agree it's a good fix (I don't fully understand the rationale of the change in this PR, so I can't really judge at this time which fix is "better")

jbrockmendel · 2020-05-22T18:11:26Z

> Only if you agree it's a good fix (I don't fully understand the rationale of the change in this PR, so I can't really judge at this time which fix is "better")

I agree that #34312 is a valid fix to a legitimate regression, and is well-written and minimally invasive. So I would have no problem with merging it on green (especially if the test is ported). It is especially worthwhile if we don't expect this to get merged for 1.1.

The advantage of the approach in this PR is that it continues operating blockwise in cases where #34312 does not. It also gets a bunch of code out of _combine_series_frame, which will allow for streamlining in an upcoming pass.

jorisvandenbossche · 2020-05-22T18:26:30Z

OK, I added the test in the other PR, then we can continue the discussion here regardless of the regression fix.

jbrockmendel · 2020-05-22T18:35:57Z

docbuild fail looks unrelated

jorisvandenbossche · 2020-05-22T18:37:19Z

> docbuild fail looks unrelated

It's failing everywhere -> #34306

jbrockmendel · 2020-05-23T16:55:56Z

green

jreback · 2020-05-25T17:43:11Z

thanks @jbrockmendel

jbrockmendel added 7 commits May 20, 2020 08:10

test for axis=1 case

baf4e38

GH ref

ea3aaba

Merge branch 'master' of https://github.com/pandas-dev/pandas into bu…

4ef99a7

…g-arith-flex

REF: send broadcastable Series through DataFrame route

612fab6

Merge branch 'master' of https://github.com/pandas-dev/pandas into bu…

c59650c

…g-arith-flex

REF: re-use DataFrame dispatch code

b857721

typo fixup

ddb9fa3

jbrockmendel added the Bug and Numeric Operations (Arithmetic, Comparison, and Logical operations) labels May 21, 2020

jorisvandenbossche reviewed May 22, 2020

Merge branch 'master' of https://github.com/pandas-dev/pandas into bu…

5c85d1a

…g-arith-flex

jbrockmendel mentioned this pull request May 22, 2020

REGR: fix op(frame, series) with extension dtypes #34312

Closed

port test from pandas-dev#34312

86eaed9

jorisvandenbossche reviewed May 22, 2020

doc/source/whatsnew/v1.1.0.rst (outdated, resolved)

revert whatsnew

a99028e

jreback added this to the 1.1 milestone May 22, 2020

jbrockmendel added 3 commits May 22, 2020 18:02

Merge branch 'master' of https://github.com/pandas-dev/pandas into bu…

95442be

…g-arith-flex

Merge branch 'master' of https://github.com/pandas-dev/pandas into bu…

1c36218

…g-arith-flex

Merge branch 'master' of https://github.com/pandas-dev/pandas into bu…

19d8c7d

…g-arith-flex

jorisvandenbossche reviewed May 25, 2020

pandas/core/ops/__init__.py (outdated, resolved)

Merge branch 'master' of https://github.com/pandas-dev/pandas into bu…

e9ca4c2

…g-arith-flex

rename

ee17990

jreback merged commit 014d8ea into pandas-dev:master May 25, 2020

jbrockmendel deleted the bug-arith-flex branch May 25, 2020 17:45

jorisvandenbossche mentioned this pull request May 25, 2020

BUG: op(frame, series) raising NotImplementedError with extension dtypes #34311

Closed
