Commit 3f5870b

phofl authored and pooja-subramaniam committed
PERF: Avoid re-computing mask in nanmedian (pandas-dev#50838)
* PERF: Avoid re-computing mask in nanmedian
* Add gh ref
* Fix
1 parent 13ad1d2 commit 3f5870b

2 files changed, +9 -5 lines changed

doc/source/whatsnew/v2.0.0.rst

+1
@@ -872,6 +872,7 @@ Performance improvements
 - Performance improvement in :class:`Period` constructor when constructing from a string or integer (:issue:`38312`)
 - Performance improvement in :func:`to_datetime` when using ``'%Y%m%d'`` format (:issue:`17410`)
 - Performance improvement in :func:`to_datetime` when format is given or can be inferred (:issue:`50465`)
+- Performance improvement in :meth:`Series.median` for nullable dtypes (:issue:`50838`)
 - Performance improvement in :func:`read_csv` when passing :func:`to_datetime` lambda-function to ``date_parser`` and inputs have mixed timezone offsetes (:issue:`35296`)
 - Performance improvement in :func:`isna` and :func:`isnull` (:issue:`50658`)
 - Performance improvement in :meth:`.SeriesGroupBy.value_counts` with categorical dtype (:issue:`46202`)
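
For readers unfamiliar with the term, the whatsnew entry's "nullable dtypes" are the masked extension dtypes such as ``Int64``, which store a separate boolean mask alongside the data. A minimal usage sketch of the user-facing call the entry refers to follows; the sample values are made up for illustration and no timing is claimed.

```python
import pandas as pd

# "Int64" (capital I) is a nullable integer dtype: missing entries are
# recorded in a separate mask and shown as <NA>.
s = pd.Series([1, 2, None, 4], dtype="Int64")

# Missing values are skipped by default, so this is the median of 1, 2, 4.
result = s.median()
print(result)
```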

pandas/core/nanops.py

+8-5
@@ -746,16 +746,19 @@ def nanmedian(values, *, axis: AxisInt | None = None, skipna: bool = True, mask=
     2.0
     """
 
-    def get_median(x):
-        mask = notna(x)
-        if not skipna and not mask.all():
+    def get_median(x, _mask=None):
+        if _mask is None:
+            _mask = notna(x)
+        else:
+            _mask = ~_mask
+        if not skipna and not _mask.all():
             return np.nan
         with warnings.catch_warnings():
             # Suppress RuntimeWarning about All-NaN slice
             warnings.filterwarnings(
                 "ignore", "All-NaN slice encountered", RuntimeWarning
             )
-            res = np.nanmedian(x[mask])
+            res = np.nanmedian(x[_mask])
         return res
 
     values, mask, dtype, _, _ = _get_values(values, skipna, mask=mask)
@@ -796,7 +799,7 @@ def get_median(x):
 
     else:
         # otherwise return a scalar value
-        res = get_median(values) if notempty else np.nan
+        res = get_median(values, mask) if notempty else np.nan
     return _wrap_results(res, dtype)
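
The idea in the hunks above can be tried outside pandas with a small standalone sketch. It is not the pandas implementation: ``median_with_mask`` is a hypothetical name, ``~np.isnan`` stands in for pandas' ``notna``, and the input is assumed to be a 1-D float array. As in the diff, a supplied mask is assumed to mark missing entries with ``True``, so it is inverted before indexing.

```python
import warnings

import numpy as np


def median_with_mask(values, mask=None, skipna=True):
    """Median of a 1-D float array, reusing a precomputed mask if given.

    Hypothetical helper for illustration; `mask` is True where a value
    is missing (assumed convention, matching the inversion in the diff).
    """
    values = np.asarray(values, dtype="float64")
    if mask is None:
        # No mask available: scan the data to find the valid entries.
        keep = ~np.isnan(values)
    else:
        # Mask already known: invert it instead of scanning the data again.
        keep = ~np.asarray(mask, dtype=bool)
    if not skipna and not keep.all():
        return np.nan
    with warnings.catch_warnings():
        # Suppress RuntimeWarning about All-NaN slice
        warnings.filterwarnings(
            "ignore", "All-NaN slice encountered", RuntimeWarning
        )
        return float(np.nanmedian(values[keep]))


# The masked slot holds an arbitrary placeholder (99.0); only the mask
# decides that it is excluded.
data = np.array([1.0, 99.0, 3.0, 5.0])
missing = np.array([False, True, False, False])

with_mask = median_with_mask(data, missing)
without_mask = median_with_mask(np.array([1.0, np.nan, 3.0, 5.0]))
strict = median_with_mask(data, missing, skipna=False)
```

Both calls with ``skipna=True`` take the median of 1.0, 3.0 and 5.0; the ``skipna=False`` call returns NaN because one entry is masked.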