WIP/PERF: block-wise ops for frame-with-series axis=1 #32997

jbrockmendel · 2020-03-25T03:57:50Z

xref #32779 which does the same thing for frame-with-frame ops. This one is turning out to be the most difficult of the cases to get right.

Posting this for comments, in large part because there are still two failing tests locally, and these may just be behaviors we want to change.

# test_scalar_na_logical_ops_corners
from datetime import datetime

import numpy as np
import pytest
from pandas import DataFrame, Series

s = Series([2, 3, 4, 5, 6, 7, 8, 9, datetime(2005, 1, 1)])
s[::2] = np.nan
d = DataFrame({"A": s})

with pytest.raises(TypeError):
    d.__and__(s, axis="columns")   # <-- when done block-wise, this does not raise

The other one is harder to post an example of because the test relies on fixtures, which makes it a PITA to copy/paste, but the upshot is this: op(frame[mixed-float-dtypes], series[float64]) currently downcasts each column back to its original dtype (since each column is effectively operating against a scalar), whereas when done block-wise all columns come back as float64.
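The dtype difference described above can be illustrated with plain NumPy. This is only an analogy under stated assumptions, not the pandas code path: the `columns` dict and `row` list are hypothetical stand-ins for the frame and the series, and pandas applies its own casting logic on top of what NumPy does.

```python
import numpy as np

# Hypothetical stand-ins for a frame with mixed float dtypes
# and a float64 series aligned on the columns.
columns = {
    "a": np.array([1.0, 2.0], dtype=np.float32),
    "b": np.array([3.0, 4.0], dtype=np.float64),
}
row = [10.0, 20.0]  # one Python float per column

# Column-wise: each column is combined with a single Python scalar,
# so the float32 column keeps its dtype.
columnwise = {
    name: col + row[i] for i, (name, col) in enumerate(columns.items())
}

# Block-wise: the float32 block is combined with a float64 array,
# so ordinary array-with-array promotion gives float64.
block32 = columns["a"].reshape(-1, 1)         # shape (2, 1)
row64 = np.array(row[:1], dtype=np.float64)   # shape (1,)
blockwise = block32 + row64

print(columnwise["a"].dtype, blockwise.dtype)
```

Whether the block-wise result should be cast back to the original dtypes afterwards is exactly the behavior question raised in this PR; the sketch only shows where the difference comes from.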


TomAugspurger · 2020-04-01T19:04:20Z

pandas/core/array_algos/npcompat.py

+def tile(arr: ArrayLike, shape) -> ArrayLike:
+    raise NotImplementedError


Is this leftover debug stuff?

more or less, will remove

TomAugspurger · 2020-04-01T19:05:58Z

pandas/core/array_algos/npcompat.py

+        else:
+            values = values.reshape(-1, 1)
+
+    btvalues = np.broadcast_to(values, shape)


Is the result btvalues being returned to a user? broadcast_to is typically (always?) a readonly view, and I don't think we want to be returning readonly results.

The way it's used in this PR, it is not returned to a user; it is only used as an intermediate object in the arithmetic op.
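A small NumPy-only sketch of the point under discussion: `np.broadcast_to` returns a read-only view, but using that view as an operand produces a fresh, writable array. The names `row` and `block` are illustrative and are not taken from the PR.

```python
import numpy as np

row = np.array([1.0, 2.0, 3.0])
block = np.arange(6, dtype=np.float64).reshape(2, 3)

# broadcast_to gives a read-only view with the block's shape.
bt = np.broadcast_to(row, block.shape)
is_readonly = not bt.flags.writeable

# Writing into the view fails, which is why it should not be handed to users.
try:
    bt[0, 0] = 99.0
    write_failed = False
except ValueError:
    write_failed = True

# Used only as an intermediate operand, the view is harmless:
# the arithmetic result is a new array that owns its data.
result = block + bt
```

Here `result` is writable and does not share memory with `row`, so the read-only flag never reaches the caller as long as the broadcast view stays internal.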

TomAugspurger · 2020-04-01T19:07:29Z

pandas/core/arrays/datetimelike.py

@@ -455,6 +455,12 @@ def ravel(self, *args, **kwargs):
        data = self._data.ravel(*args, **kwargs)
        return type(self)(data, dtype=self.dtype)

+    @property
+    def T(self):


Where is this used? In an only datetimelike context?

TomAugspurger · 2020-04-01T19:08:08Z

pandas/core/array_algos/npcompat.py

+    if isinstance(arr, np.ndarray):
+        result = btvalues
+    else:
+        result = type(arr)._from_factorized(btvalues, arr)


Was the result of that other discussion not to use _from_factorized outside of pd.factorize? Or was that just _values_for_factorize?

That has more or less stalled. Joris made a good point that e.g. fletcher doesn't use _values_for_factorize, so I've come around to the opinion that we shouldn't have _values_for_factorize/_from_factorized at all, since they are each only used once in EA.factorize.

Then the issue becomes whether we can use _values_for_argsort, and if we have a constructor that can round-trip those.

TomAugspurger · 2020-04-01T19:15:07Z

pandas/core/indexes/period.py

+    def __getitem__(self, key):
+        # PeriodArray.__getitem__ returns PeriodArray for 2D lookups,
+        #  so we need to issue deprecation warning and cast here


Are you planning to split this into its own PR? Don't want to have it held up by review here.

Hadn't planned on it, but might as well. This PR is back-burner until at least the frame-with-frame case is done.

jbrockmendel · 2020-04-03T16:50:15Z

mothballing to clear the queue

jbrockmendel and others added 30 commits March 17, 2020 18:22

BUG/API: _values_for_factorize/_from_factorized round-trip

912c2d0

copy license text from: tidyverse/haven (pandas-dev#32756)

c33ba80

BLD: Suppressing warnings when compiling pandas/_libs/writers (pandas-dev#32795)

0a42227

Avoid bare pytest.raises in dtypes/test_dtypes.py (pandas-dev#32672)

4e401cb

PERF: Using Numpy C-API when calling np.arange (pandas-dev#32804)

679e5d3

Co-authored-by: MomIsBestFriend <>

TYP: annotate to_numpy (pandas-dev#32809)

84f287e

fstring format added in pandas//tests/io/test_common.py:144: (pandas-dev#32813)

5b92d03

BUG: Series.__getitem__ with downstream scalars (pandas-dev#32684)

0f0ec28

CLN: Using clearer imports (pandas-dev#32459)

02ac976

Co-authored-by: MomIsBestFriend <>

REF: Implement core._algos (pandas-dev#32767)

0ad9c82

CLN: Consolidate numba facilities (pandas-dev#32770)

964cedb

CLN: remove _ndarray_values (pandas-dev#32768)

23ac98a

BLD: Suppressing errors while compling pandas/_libs/groupby (pandas-dev#32794)

8e5ba59

TYP: PandasObject._cache (pandas-dev#32775)

3aa4226

Implement C Level Timedelta ISO Function; fix JSON usage (pandas-dev#30903)

bc24a8c

Fixturize JSON tests (pandas-dev#31191)

92cf475

PERF: fix SparseArray._simple_new object initialization (pandas-dev#32821)

ebeb6bc

Avoid bare pytest.raises in indexes/categorical/test_indexing.py (pandas-dev#32797)

6473fcd

See also (pandas-dev#32820)

192d736

TYP: annotate (pandas-dev#32730)

10c7b04

TST: Parametrize in pandas/tests/internals/test_internals.py (pandas-dev#32687)

662aef3

* TST: Parametrize in pandas/tests/internals/test_internals.py
* Addressed lint issues
* Addressing lint issues

Co-authored-by: MomIsBestFriend <>

TYP: update setup.cfg (pandas-dev#32829)

b295c02

CLN: Update docstring decorator from Appender to doc (pandas-dev#32828)

804dfc6

BUG: Fix segfault on dir of a DataFrame with a unicode surrogate character in the column name (pandas-dev#32701)

1db3b09

PERF: skip non-consolidatable blocks when checking consolidation (pandas-dev#32826)

563da98

CLN: remove DatetimeLikeArray._add_delta (pandas-dev#32799)

78e0ccd

Error on C Warnings (pandas-dev#32163)

e8acc26

CLN: simplify MultiIndex._shallow_copy (pandas-dev#32772)

6d74398

DOC: use new pydata-sphinx-theme name (pandas-dev#32840)

21d3859

jbrockmendel and others added 17 commits March 20, 2020 19:41

REF: pass align_keys to apply

c8651ed

DOC: FutureWarning in Sphinx build when calling read_parquet (pandas-dev#32833)

5edf4e1

checkpoint passing

5717ae5

Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast

88806dd

checkpoint passing

579a31a

Checkpoint 2 failures, both argueable

0617a17

Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast

968ba87

Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast

7a9c3f6

Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast

577ebf4

Implement block-wise ops for frame-series with axis=1

9738893

Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast

2a1cb23

rebase

37938db

Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast

742c962

Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast

30d6c2e

update test

f9d3895

update test

5abec0b

whatsnew

ae38398

TomAugspurger reviewed Apr 1, 2020


jbrockmendel added the Performance (memory or execution speed performance) and Numeric Operations (arithmetic, comparison, and logical operations) labels Apr 1, 2020

jbrockmendel closed this Apr 3, 2020

jbrockmendel mentioned this pull request Apr 4, 2020

BUG: 2D indexing on DTA/TDA/PA #33290

Merged

jbrockmendel deleted the bcast branch November 20, 2021 23:23
