WIP/PERF: block-wise ops for frame-with-series axis=1 #32997

Closed
wants to merge 47 commits into from
Changes from all commits
Commits
47 commits
912c2d0
BUG/API: _values_for_factorize/_from_factorized round-trip
jbrockmendel Mar 18, 2020
c33ba80
copy license text from: tidyverse/haven (#32756)
jamin-aws-ospo Mar 18, 2020
0a42227
BLD: Suppressing warnings when compiling pandas/_libs/writers (#32795)
ShaharNaveh Mar 18, 2020
4e401cb
Avoid bare pytest.raises in dtypes/test_dtypes.py (#32672)
Vlek Mar 18, 2020
679e5d3
PERF: Using Numpy C-API when calling `np.arange` (#32804)
ShaharNaveh Mar 18, 2020
84f287e
TYP: annotate to_numpy (#32809)
jbrockmendel Mar 18, 2020
5b92d03
fstring format added in pandas/tests/io/test_common.py:144 (#32813)
mglasder Mar 18, 2020
0f0ec28
BUG: Series.__getitem__ with downstream scalars (#32684)
jbrockmendel Mar 19, 2020
02ac976
CLN: Using clearer imports (#32459)
ShaharNaveh Mar 19, 2020
0ad9c82
REF: Implement core._algos (#32767)
jbrockmendel Mar 19, 2020
964cedb
CLN: Consolidate numba facilities (#32770)
mroeschke Mar 19, 2020
23ac98a
CLN: remove _ndarray_values (#32768)
jbrockmendel Mar 19, 2020
8e5ba59
BLD: Suppressing errors while compiling pandas/_libs/groupby (#32794)
ShaharNaveh Mar 19, 2020
3aa4226
TYP: PandasObject._cache (#32775)
topper-123 Mar 19, 2020
bc24a8c
Implement C Level Timedelta ISO Function; fix JSON usage (#30903)
WillAyd Mar 19, 2020
628513e
BUG: read_csv: fix wrong exception on permissions issue (#32737)
roberthdevries Mar 19, 2020
92cf475
Fixturize JSON tests (#31191)
WillAyd Mar 19, 2020
ebeb6bc
PERF: fix SparseArray._simple_new object initialization (#32821)
jorisvandenbossche Mar 19, 2020
6473fcd
Avoid bare pytest.raises in indexes/categorical/test_indexing.py (#32…
Vlek Mar 19, 2020
192d736
See also (#32820)
farhanreynaldo Mar 19, 2020
10c7b04
TYP: annotate (#32730)
jbrockmendel Mar 19, 2020
662aef3
TST: Parametrize in pandas/tests/internals/test_internals.py (#32687)
ShaharNaveh Mar 19, 2020
b295c02
TYP: update setup.cfg (#32829)
simonjayhawkins Mar 19, 2020
804dfc6
CLN: Update docstring decorator from Appender to doc (#32828)
raisadz Mar 19, 2020
1db3b09
BUG: Fix segfault on dir of a DataFrame with a unicode surrogate char…
roberthdevries Mar 19, 2020
563da98
PERF: skip non-consolidatable blocks when checking consolidation (#32…
jorisvandenbossche Mar 19, 2020
78e0ccd
CLN: remove DatetimeLikeArray._add_delta (#32799)
jbrockmendel Mar 19, 2020
e8acc26
Error on C Warnings (#32163)
WillAyd Mar 19, 2020
6d74398
CLN: simplify MultiIndex._shallow_copy (#32772)
topper-123 Mar 19, 2020
21d3859
DOC: use new pydata-sphinx-theme name (#32840)
jorisvandenbossche Mar 19, 2020
c8651ed
REF: pass align_keys to apply
jbrockmendel Mar 19, 2020
5edf4e1
DOC: FutureWarning in Sphinx build when calling read_parquet (#32833)
ShaharNaveh Mar 20, 2020
5717ae5
checkpoint passing
jbrockmendel Mar 21, 2020
88806dd
Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast
jbrockmendel Mar 21, 2020
579a31a
checkpoint passing
jbrockmendel Mar 21, 2020
0617a17
Checkpoint 2 failures, both arguable
jbrockmendel Mar 23, 2020
968ba87
Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast
jbrockmendel Mar 23, 2020
7a9c3f6
Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast
jbrockmendel Mar 24, 2020
577ebf4
Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast
jbrockmendel Mar 25, 2020
9738893
Implement block-wise ops for frame-series with axis=1
jbrockmendel Mar 25, 2020
2a1cb23
Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast
jbrockmendel Mar 30, 2020
37938db
rebase
jbrockmendel Mar 30, 2020
742c962
Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast
jbrockmendel Apr 1, 2020
30d6c2e
Merge branch 'master' of https://github.com/pandas-dev/pandas into bcast
jbrockmendel Apr 1, 2020
f9d3895
update test
jbrockmendel Apr 1, 2020
5abec0b
update test
jbrockmendel Apr 1, 2020
ae38398
whatsnew
jbrockmendel Apr 1, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.1.0.rst
@@ -266,6 +266,7 @@ Performance improvements
- Performance improvement in :class:`Timedelta` constructor (:issue:`30543`)
- Performance improvement in :class:`Timestamp` constructor (:issue:`30543`)
- Performance improvement in flex arithmetic ops between :class:`DataFrame` and :class:`Series` with ``axis=0`` (:issue:`31296`)
- Performance improvement in arithmetic operations between :class:`DataFrame` and :class:`Series` (:issue:`32997`)
- The internal index method :meth:`~Index._shallow_copy` now copies cached attributes over to the new index,
avoiding creating these again on the new index. This can speed up many operations that depend on creating copies of
existing indexes (:issue:`28584`, :issue:`32640`, :issue:`32669`)
34 changes: 34 additions & 0 deletions pandas/core/array_algos/npcompat.py
@@ -0,0 +1,34 @@
"""
Implementations of high-level numpy functions that are ExtensionArray-compatible.
"""
import numpy as np

from pandas._typing import ArrayLike


def tile(arr: ArrayLike, shape) -> ArrayLike:
raise NotImplementedError
Comment on lines +9 to +10
Contributor: Is this leftover debug stuff?
Member Author: More or less, will remove.


def broadcast_to(arr: ArrayLike, shape, orient=None) -> ArrayLike:
if isinstance(arr, np.ndarray):
values = arr
else:
# ExtensionArray
values = arr._values_for_factorize()[0]

# TODO: if we are ndim==size==1 it shouldn't matter whether rowlike/columnlike?
if values.ndim == 1 and orient is not None:
# Support treating a 1-dimensional array as either a row or column
assert orient in ["rowlike", "columnlike"]
if orient == "rowlike":
values = values.reshape(1, -1)
else:
values = values.reshape(-1, 1)

btvalues = np.broadcast_to(values, shape)
Contributor: Is the result btvalues being returned to a user? broadcast_to is typically (always?) a read-only view, and I don't think we want to be returning read-only results.
Member Author: The way it's used in this PR, it is not returned to a user, but used as an intermediate object in the arithmetic op.

if isinstance(arr, np.ndarray):
result = btvalues
else:
result = type(arr)._from_factorized(btvalues, arr)
Contributor: Was the result of that other discussion not to use _from_factorized outside of pd.factorize? Or was that just _values_for_factorize?
Member Author: That has more or less stalled. Joris made a good point that e.g. fletcher doesn't use _values_for_factorize, so I've come around to the opinion that we shouldn't have _values_for_factorize/_from_factorized at all, since they are each only used once in EA.factorize.

Then the issue becomes whether we can use _values_for_argsort, and whether we have a constructor that can round-trip those.

return result
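The review exchange above hinges on np.broadcast_to returning a read-only view, which is why btvalues must stay an internal intermediate (or be copied) rather than reach the user. A minimal sketch of that behavior, in plain NumPy with nothing pandas-specific:

```python
import numpy as np

arr = np.array([1, 2, 3])
bt = np.broadcast_to(arr, (2, 3))

# The broadcast result is a read-only view over the original data.
assert bt.flags.writeable is False

# In-place mutation of the view is rejected with a ValueError.
try:
    bt[0, 0] = 99
except ValueError:
    mutated = False
else:
    mutated = True
assert not mutated

# Taking a copy yields an ordinary writeable array when mutation is needed.
writable = bt.copy()
writable[0, 0] = 99
```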
3 changes: 2 additions & 1 deletion pandas/core/arrays/boolean.py
@@ -321,7 +321,8 @@ def _values_for_factorize(self) -> Tuple[np.ndarray, int]:

@classmethod
def _from_factorized(cls, values, original: "BooleanArray") -> "BooleanArray":
return cls._from_sequence(values, dtype=original.dtype)
mask = values == -1
return cls(values.astype(bool), mask)

_HANDLED_TYPES = (np.ndarray, numbers.Number, bool, np.bool_)

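The new _from_factorized body rebuilds the NA mask from the sentinel instead of round-tripping through _from_sequence. A minimal sketch of that reconstruction in plain NumPy (the int8 codes and the -1 NA sentinel mirror what the diff assumes):

```python
import numpy as np

# Factorize-style codes for [True, False, NA]; -1 marks NA.
values = np.array([1, 0, -1], dtype="int8")

# Reconstruction mirroring the new _from_factorized body:
mask = values == -1          # recover which entries were NA
data = values.astype(bool)   # the value under the mask is irrelevant

assert mask.tolist() == [False, False, True]
assert data.tolist() == [True, False, True]
```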
18 changes: 14 additions & 4 deletions pandas/core/arrays/datetimelike.py
@@ -455,6 +455,12 @@ def ravel(self, *args, **kwargs):
data = self._data.ravel(*args, **kwargs)
return type(self)(data, dtype=self.dtype)

@property
def T(self):
Contributor: Where is this used? In a datetimelike-only context?
Member Author: Yes.

# Note: we drop any freq
data = self._data.T
return type(self)(data, dtype=self.dtype)

@property
def _box_func(self):
"""
@@ -561,7 +567,7 @@ def __getitem__(self, key):
else:
key = check_array_indexer(self, key)

is_period = is_period_dtype(self)
is_period = is_period_dtype(self.dtype)
if is_period:
freq = self.freq
else:
@@ -577,7 +583,7 @@ def __getitem__(self, key):
freq = self.freq

result = getitem(key)
if result.ndim > 1:
if result.ndim > 1 and not is_period and not is_datetime64tz_dtype(self.dtype):
# To support MPL which performs slicing with 2 dim
# even though it only has 1 dim by definition
return result
@@ -1208,9 +1214,13 @@ def _add_timedelta_arraylike(self, other):

self_i8 = self.asi8
other_i8 = other.asi8
# TODO: do we need to worry about these having the same row/column order?
new_values = checked_add_with_arr(
self_i8, other_i8, arr_mask=self._isnan, b_mask=other._isnan
)
self_i8.ravel(),
other_i8.ravel(),
arr_mask=self._isnan.ravel(),
b_mask=other._isnan.ravel(),
).reshape(self.shape)
if self._hasnans or other._hasnans:
mask = (self._isnan) | (other._isnan)
new_values[mask] = iNaT
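The _add_timedelta_arraylike change ravels the 2-D i8 inputs so the 1-D helper can run, then reshapes the result back. The same pattern, with a hypothetical stand-in for checked_add_with_arr (not the pandas function itself):

```python
import numpy as np

def add_1d(a, b):
    # Stand-in for a helper that only handles 1-D input.
    assert a.ndim == 1 and b.ndim == 1
    return a + b

a = np.arange(6, dtype="i8").reshape(2, 3)
b = np.ones((2, 3), dtype="i8")

# ravel -> 1-D helper -> reshape back to the original 2-D shape
result = add_1d(a.ravel(), b.ravel()).reshape(a.shape)

assert result.shape == (2, 3)
assert (result == a + b).all()
```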
16 changes: 13 additions & 3 deletions pandas/core/arrays/period.py
@@ -624,10 +624,20 @@ def _addsub_int_array(
assert op in [operator.add, operator.sub]
if op is operator.sub:
other = -other
res_values = algos.checked_add_with_arr(self.asi8, other, arr_mask=self._isnan)

mask = self._isnan
if self.ndim == self.size == 1 and other.ndim == 2:
# TODO: more general case? should this be handled by DataFrame
# op before we get here?
arr = np.broadcast_to(self._data[:, None], other.shape)
self = type(self)(arr, freq=self.freq)

res_values = algos.checked_add_with_arr(
self.asi8.ravel(), other.ravel(), arr_mask=self._isnan,
)
res_values = res_values.view("i8")
res_values[self._isnan] = iNaT
return type(self)(res_values, freq=self.freq)
res_values[mask] = iNaT
return type(self)(res_values.reshape(self.shape), freq=self.freq)

def _add_offset(self, other):
assert not isinstance(other, Tick)
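The _addsub_int_array change broadcasts a 1-D array of ordinals against a 2-D integer operand before the checked addition. The shape manipulation, sketched with plain integer arrays standing in for Period ordinals:

```python
import numpy as np

ordinals = np.array([100, 101, 102], dtype="i8")  # 1-D, shape (3,)
other = np.arange(6).reshape(3, 2)                # 2-D integer operand

# ordinals[:, None] makes a (3, 1) column, which broadcasts across
# other's shape so each column repeats the original 1-D values.
arr = np.broadcast_to(ordinals[:, None], other.shape)

assert arr.shape == (3, 2)
assert (arr[:, 0] == ordinals).all()
assert (arr[:, 1] == ordinals).all()
```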
10 changes: 9 additions & 1 deletion pandas/core/arrays/timedeltas.py
@@ -39,7 +39,7 @@
from pandas.core.algorithms import checked_add_with_arr
from pandas.core.arrays import datetimelike as dtl
import pandas.core.common as com
from pandas.core.construction import extract_array
from pandas.core.construction import array, extract_array

from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import Tick
@@ -521,8 +521,16 @@ def __truediv__(self, other):
# Note: we do not do type inference on the result, so either
# an object array or numeric-dtyped (if numpy does inference)
# will be returned. GH#23829
# FIXME: the above comment is no longer accurate... sometimes
result = [self[n] / other[n] for n in range(len(self))]
result = np.array(result)
if self.ndim == 2:
# FIXME: kludge, just trying to get the tests passing
res = extract_array(array(result.ravel()), extract_numpy=True)
result = res.reshape(result.shape)
if result.dtype.kind == "m":
# TODO: no real reason for this, but we test it
result = np.asarray(result)
return result

else:
12 changes: 12 additions & 0 deletions pandas/core/indexes/period.py
@@ -32,6 +32,7 @@
validate_dtype_freq,
)
import pandas.core.common as com
from pandas.core.indexers import deprecate_ndim_indexing
import pandas.core.indexes.base as ibase
from pandas.core.indexes.base import (
InvalidIndexError,
@@ -350,6 +351,17 @@ def _int64index(self) -> Int64Index:
# ------------------------------------------------------------------------
# Index Methods

def __getitem__(self, key):
# PeriodArray.__getitem__ returns PeriodArray for 2D lookups,
# so we need to issue deprecation warning and cast here
Comment on lines +354 to +356
Contributor: Are you planning to split this into its own PR? Don't want to have it held up by review here.
Member Author: Hadn't planned on it, but might as well. This PR is on the back burner until at least the frame-with-frame case is done.

result = super().__getitem__(key)

if isinstance(result, PeriodIndex) and result._data.ndim == 2:
# this is not actually a valid Index object
deprecate_ndim_indexing(result._data)
return result._data._data
return result

def __array_wrap__(self, result, context=None):
"""
Gets called after a ufunc. Needs additional handling as
22 changes: 15 additions & 7 deletions pandas/core/internals/managers.py
@@ -410,11 +410,7 @@ def apply(self: T, f, filter=None, align_keys=None, **kwargs) -> T:
if f == "where":
align_copy = True

aligned_args = {
k: kwargs[k]
for k in align_keys
if isinstance(kwargs[k], (ABCSeries, ABCDataFrame))
}
aligned_args = {k: kwargs[k] for k in align_keys}

for b in self.blocks:
if filter is not None:
@@ -426,8 +422,20 @@ def apply(self: T, f, filter=None, align_keys=None, **kwargs) -> T:
b_items = self.items[b.mgr_locs.indexer]

for k, obj in aligned_args.items():
axis = obj._info_axis_number
kwargs[k] = obj.reindex(b_items, axis=axis, copy=align_copy)._values
if isinstance(obj, (ABCSeries, ABCDataFrame)):
axis = obj._info_axis_number
kwargs[k] = obj.reindex(
b_items, axis=axis, copy=align_copy
)._values
else:
# We should have an ndarray or ExtensionArray
if obj.ndim == 2:
# FIXME: kludge; shouldn't need the ndim restriction
assert obj.shape[0] == self.shape[0], (
obj.shape,
self.shape,
)
kwargs[k] = obj[b.mgr_locs.indexer]

if callable(f):
applied = b.apply(f, **kwargs)
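In the new apply() branch, a pre-aligned 2-D right operand is sliced per block with obj[b.mgr_locs.indexer], so each block's op only sees the rows that block owns. A sketch with a plain ndarray standing in for the operand (the block locations here are made up for illustration):

```python
import numpy as np

right = np.arange(12).reshape(4, 3)  # one row per DataFrame column
block_locs = np.array([0, 2])        # rows owned by one (hypothetical) block

per_block = right[block_locs]        # the slice handed to that block's op

assert per_block.shape == (2, 3)
assert (per_block[0] == right[0]).all()
assert (per_block[1] == right[2]).all()
```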
13 changes: 12 additions & 1 deletion pandas/core/ops/__init__.py
@@ -17,6 +17,7 @@
from pandas.core.dtypes.generic import ABCDataFrame, ABCIndexClass, ABCSeries
from pandas.core.dtypes.missing import isna

from pandas.core.array_algos.npcompat import broadcast_to
from pandas.core.construction import extract_array
from pandas.core.ops.array_ops import (
arithmetic_op,
@@ -333,6 +334,16 @@ def column_op(a, b):
# in which case we specifically want to operate row-by-row
assert right.index.equals(left.columns)

rvals = right._values
if hasattr(rvals, "reshape"):
# i.e. ndarray, DatetimeArray, TimedeltaArray, PeriodArray
right = broadcast_to(rvals, left.shape, orient="rowlike").T

array_op = get_array_op(func, str_rep=str_rep)
bm = left._data.apply(array_op, right=right, align_keys=["right"])
return type(left)(bm)

# still needed for two tests with PeriodArray
if right.dtype == "timedelta64[ns]":
# ensure we treat NaT values as the correct dtype
# Note: we do not do this unconditionally as it may be lossy or
@@ -343,7 +354,7 @@ def column_op(a, b):
return {i: func(a.iloc[:, i], b[i]) for i in range(len(a.columns))}

else:

# FIXME: this will be wrong for Categorical `b`
def column_op(a, b):
return {i: func(a.iloc[:, i], b.iloc[i]) for i in range(len(a.columns))}

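The new fast path broadcasts the Series values "rowlike" across the frame's shape and then transposes, so the operand lines up with the internal (n_columns x n_rows) block layout before dispatching per block. The shape logic in isolation, with plain NumPy arrays:

```python
import numpy as np

left_shape = (4, 3)              # DataFrame: 4 rows, 3 columns
rvals = np.array([10, 20, 30])   # Series aligned to the columns (axis=1)

# "rowlike" orientation: treat the 1-D values as a single row, broadcast
# down the rows, then transpose into block (column-major) layout.
rowlike = np.broadcast_to(rvals.reshape(1, -1), left_shape)  # (4, 3)
blockwise = rowlike.T                                        # (3, 4)

assert rowlike.shape == (4, 3)
assert blockwise.shape == (3, 4)
assert (blockwise[1] == 20).all()  # one constant row per original column
```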
11 changes: 9 additions & 2 deletions pandas/tests/arithmetic/test_datetime64.py
@@ -1065,6 +1065,7 @@ def test_dt64arr_add_sub_parr(
"unsupported operand",
"descriptor.*requires",
"ufunc.*cannot use operands",
"Addition/subtraction of integers and integer-arrays",
]
)
assert_invalid_addsub_type(dtarr, parr, msg)
@@ -1417,7 +1418,10 @@ def test_dt64arr_add_sub_offset_ndarray(self, tz_naive_fixture, box_with_array):

other = np.array([pd.offsets.MonthEnd(), pd.offsets.Day(n=2)])

warn = None if box_with_array is pd.DataFrame else PerformanceWarning
warn = PerformanceWarning
if box_with_array is pd.DataFrame and tz is not None:
warn = None

with tm.assert_produces_warning(warn):
res = dtarr + other
expected = DatetimeIndex(
@@ -2378,7 +2382,10 @@ def test_dti_addsub_object_arraylike(
expected = pd.DatetimeIndex(["2017-01-31", "2017-01-06"], tz=tz_naive_fixture)
expected = tm.box_expected(expected, xbox)

warn = None if box_with_array is pd.DataFrame else PerformanceWarning
warn = PerformanceWarning
if box_with_array is pd.DataFrame and tz is not None:
warn = None

with tm.assert_produces_warning(warn):
result = dtarr + other
tm.assert_equal(result, expected)
43 changes: 20 additions & 23 deletions pandas/tests/arithmetic/test_timedelta64.py
@@ -1066,7 +1066,13 @@ def test_td64arr_sub_periodlike(self, box_with_array, tdi_freq, pi_freq):

# TODO: parametrize over box for pi?
tdi = tm.box_expected(tdi, box_with_array)
msg = "cannot subtract|unsupported operand type"
msg = "|".join(
[
"cannot subtract",
"unsupported operand type",
"Addition/subtraction of integers and integer-arrays",
]
)
with pytest.raises(TypeError, match=msg):
tdi - pi

@@ -1318,14 +1324,11 @@ def test_td64arr_add_offset_index(self, names, box):
tdi = tm.box_expected(tdi, box)
expected = tm.box_expected(expected, box)

# The DataFrame operation is transposed and so operates as separate
# scalar operations, which do not issue a PerformanceWarning
warn = PerformanceWarning if box is not pd.DataFrame else None
with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
res = tdi + other
tm.assert_equal(res, expected)

with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
res2 = other + tdi
tm.assert_equal(res2, expected)

@@ -1344,14 +1347,11 @@ def test_td64arr_add_offset_array(self, box_with_array):
tdi = tm.box_expected(tdi, box)
expected = tm.box_expected(expected, box)

# The DataFrame operation is transposed and so operates as separate
# scalar operations, which do not issue a PerformanceWarning
warn = PerformanceWarning if box is not pd.DataFrame else None
with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
res = tdi + other
tm.assert_equal(res, expected)

with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
res2 = other + tdi
tm.assert_equal(res2, expected)

@@ -1380,10 +1380,7 @@ def test_td64arr_sub_offset_index(self, names, box_with_array):
tdi = tm.box_expected(tdi, box)
expected = tm.box_expected(expected, xbox)

# The DataFrame operation is transposed and so operates as separate
# scalar operations, which do not issue a PerformanceWarning
warn = PerformanceWarning if box is not pd.DataFrame else None
with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
res = tdi - other
tm.assert_equal(res, expected)

@@ -1399,10 +1396,7 @@ def test_td64arr_sub_offset_array(self, box_with_array):
tdi = tm.box_expected(tdi, box_with_array)
expected = tm.box_expected(expected, box_with_array)

# The DataFrame operation is transposed and so operates as separate
# scalar operations, which do not issue a PerformanceWarning
warn = None if box_with_array is pd.DataFrame else PerformanceWarning
with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
res = tdi - other
tm.assert_equal(res, expected)

@@ -1473,28 +1467,31 @@ def test_td64arr_add_sub_object_array(self, box_with_array):
[pd.Timedelta(days=1), pd.offsets.Day(2), pd.Timestamp("2000-01-04")]
)

warn = PerformanceWarning if box_with_array is not pd.DataFrame else None
with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
result = tdarr + other

expected = pd.Index(
[pd.Timedelta(days=2), pd.Timedelta(days=4), pd.Timestamp("2000-01-07")]
)
expected = tm.box_expected(expected, box_with_array)
if box_with_array is pd.DataFrame:
expected = expected.astype(object)
tm.assert_equal(result, expected)

msg = "unsupported operand type|cannot subtract a datelike"
with pytest.raises(TypeError, match=msg):
with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
tdarr - other

with tm.assert_produces_warning(warn):
with tm.assert_produces_warning(PerformanceWarning):
result = other - tdarr

expected = pd.Index(
[pd.Timedelta(0), pd.Timedelta(0), pd.Timestamp("2000-01-01")]
)
expected = tm.box_expected(expected, box_with_array)
if box_with_array is pd.DataFrame:
expected = expected.astype(object)
tm.assert_equal(result, expected)


7 changes: 7 additions & 0 deletions pandas/tests/extension/base/reshaping.py
@@ -324,3 +324,10 @@ def test_transpose(self, data):
self.assert_frame_equal(result, expected)
self.assert_frame_equal(np.transpose(np.transpose(df)), df)
self.assert_frame_equal(np.transpose(np.transpose(df[["A"]])), df[["A"]])

def test_factorize_roundtrip(self, data):
# GH#32673
values = data._values_for_factorize()[0]
result = type(data)._from_factorized(values, data)

self.assert_equal(result, data)