Min max sparse #41159

Merged: 24 commits, Apr 28, 2021
Changes from 22 commits
Commits
b1fb854
Updated qcut for Float64DType
taytzehao Apr 16, 2021
7af63a2
Fixes from pre-commit [automated commit]
taytzehao Apr 16, 2021
c4259aa
Added test and documentation for qcut Float64DType support
taytzehao Apr 18, 2021
043ae70
Merge branch 'master' of github.com:pandas-dev/pandas
taytzehao Apr 18, 2021
48f21c6
Merge branch 'master' of https://github.com/taytzehao/pandas
taytzehao Apr 18, 2021
29fddab
Updated qcut test formatting
taytzehao Apr 18, 2021
e489742
Merge branch 'master' of github.com:pandas-dev/pandas
taytzehao Apr 19, 2021
fd75ac5
Merge branch 'master' of github.com:pandas-dev/pandas into min_max_sp…
taytzehao Apr 26, 2021
e0f4253
Update sparse array minmax method
taytzehao Apr 26, 2021
7b39607
Update sparse array minmax method 2
taytzehao Apr 26, 2021
3b22ab2
Update sparse array minmax method 3
taytzehao Apr 26, 2021
02d7f32
Update sparse array minmax method 4
taytzehao Apr 26, 2021
c6fe7e2
Update sparse array minmax method 5
taytzehao Apr 26, 2021
dddb6f6
Update sparse array minmax method 6
taytzehao Apr 26, 2021
9c03b52
Update sparse array minmax method 6
taytzehao Apr 26, 2021
227c282
Update sparse array minmax method 7
taytzehao Apr 26, 2021
5dabba3
Update sparse array minmax method 8
taytzehao Apr 26, 2021
85a99b4
Added test case and function to support NaN
taytzehao Apr 26, 2021
0e6a384
Resolve precommit issue
taytzehao Apr 27, 2021
dc7b704
Merge branch 'master' of github.com:pandas-dev/pandas into min_max_sp…
taytzehao Apr 27, 2021
10c40e5
Resolve precommit issue rst precommit issue resolved
taytzehao Apr 27, 2021
13545b4
Add test coverage percentage
taytzehao Apr 27, 2021
ff83373
Updated what's new and SparseArray minmax logic
taytzehao Apr 28, 2021
9e5fe39
Merge branch 'master' of github.com:pandas-dev/pandas into min_max_sp…
taytzehao Apr 28, 2021
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.3.0.rst
@@ -877,7 +877,7 @@ Sparse

- Bug in :meth:`DataFrame.sparse.to_coo` raising ``KeyError`` with columns that are a numeric :class:`Index` without a 0 (:issue:`18414`)
- Bug in :meth:`SparseArray.astype` with ``copy=False`` producing incorrect results when going from integer dtype to floating dtype (:issue:`34456`)
-
- Bug in :class:`SparseArray` type as :meth:`max` and :meth:`min` do not exist (:issue:`40921`)
Review comment (Member):

suggestion: "Implemented :meth:`SparseArray.min` and :meth:`SparseArray.max` (:issue:`40921`)"


ExtensionArray
^^^^^^^^^^^^^^
24 changes: 24 additions & 0 deletions pandas/core/arrays/sparse/array.py
@@ -1392,6 +1392,30 @@ def mean(self, axis=0, *args, **kwargs):
        nsparse = self.sp_index.ngaps
        return (sp_sum + self.fill_value * nsparse) / (ct + nsparse)

    def max(self, axis=0, *args, **kwargs):
        nv.validate_max(args, kwargs)

        if self.sp_index.ngaps > 0 and np.all(self._valid_sp_values < 0):

            if self.size > 0 and self._valid_sp_values.size == 0:
                return np.nan
Review comment (Member):

Wouldn't self.fill_value be relevant here?


            return 0
        else:
            return np.amax(self._valid_sp_values, axis)
Review comment (Member):

Any particular reason for np.amax instead of np.max?
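
On the np.amax question: to my knowledge, NumPy documents np.amax and np.amin as aliases of np.max and np.min, so the choice is stylistic. A minimal check follows; the array values are made up for illustration and are not taken from the PR.

```python
import numpy as np

# Illustrative values only (an assumption, not data from the PR).
values = np.array([3.0, -1.0, 7.5, 2.0])

# Both spellings reduce the array along axis 0 to the same scalar.
largest_amax = np.amax(values, axis=0)
largest_max = np.max(values, axis=0)
smallest_amin = np.amin(values, axis=0)
smallest_min = np.min(values, axis=0)
```

Since the results agree, np.max and np.min would read more conventionally in the method bodies.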


    def min(self, axis=0, *args, **kwargs):
        nv.validate_min(args, kwargs)

        if self.sp_index.ngaps > 0 and np.all(self._valid_sp_values > 0):

            if self.size > 0 and self._valid_sp_values.size == 0:
                return np.nan

            return 0
        else:
            return np.amin(self._valid_sp_values, axis)

    # ------------------------------------------------------------------------
    # Ufuncs
    # ------------------------------------------------------------------------
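
The review question about self.fill_value can be made concrete with a rough sketch. The hypothetical helper below (sparse_extreme_sketch is an invented name, not the pandas implementation) treats the fill value as a candidate whenever at least one position is not explicitly stored. It assumes numeric data and skip-NaN semantics, and uses only the public SparseArray attributes sp_values, npoints, and fill_value.

```python
import numpy as np
import pandas as pd


def sparse_extreme_sketch(arr, kind="max"):
    """Hypothetical helper illustrating a fill_value-aware min/max."""
    # Explicitly stored values, with NaN dropped (skip-NaN semantics assumed).
    stored = np.asarray(arr.sp_values, dtype=float)
    stored = stored[~np.isnan(stored)]

    # Positions that are not explicitly stored hold fill_value, so it is a
    # candidate whenever such a gap exists and fill_value is not missing.
    has_gaps = len(arr) > arr.npoints
    if has_gaps and not pd.isna(arr.fill_value):
        stored = np.append(stored, float(arr.fill_value))

    if stored.size == 0:
        return np.nan
    return stored.max() if kind == "max" else stored.min()


# Illustrative inputs (assumptions, not taken from the PR's tests).
mostly_five = pd.arrays.SparseArray([-1, -2, 5, 5, 5], fill_value=5)
with_nan = pd.arrays.SparseArray([0.0, 1.0, 2.0, np.nan, 4.0])
all_nan = pd.arrays.SparseArray([np.nan, np.nan, np.nan])
```

For mostly_five, every stored value is negative while the fill value 5 occupies three positions. As I read the diff, max as written would take the return 0 branch for such an array, which is the case the reviewer's question points at.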
22 changes: 22 additions & 0 deletions pandas/tests/arrays/sparse/test_array.py
@@ -1311,3 +1311,25 @@ def test_dropna(fill_value):
    df = pd.DataFrame({"a": [0, 1], "b": arr})
    expected_df = pd.DataFrame({"a": [1], "b": exp}, index=pd.Int64Index([1]))
    tm.assert_equal(df.dropna(), expected_df)


class TestMinMax:
    plain_data = np.arange(5).astype(float)
    data_neg = plain_data * (-1)
    data_NaN = SparseArray(np.array([0, 1, 2, np.nan, 4]))
    data_all_NaN = SparseArray(np.array([np.nan, np.nan, np.nan, np.nan, np.nan]))

    @pytest.mark.parametrize(
        "raw_data,max_expected,min_expected",
        [
            (plain_data, [4], [0]),
            (data_neg, [0], [-4]),
            (data_NaN, [4], [0]),
            (data_all_NaN, [np.nan], [np.nan]),
        ],
    )
    def test_maxmin(self, raw_data, max_expected, min_expected):
        max_result = SparseArray(raw_data).max()
        min_result = SparseArray(raw_data).min()
        assert max_result in max_expected
        assert min_result in min_expected
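
A note on the `in` assertions in test_maxmin: Python list membership checks identity before equality, so `np.nan in [np.nan]` succeeds only because both sides are the same np.nan object. The sketch below illustrates that, together with a more explicit comparison. The helper name matches_expected is hypothetical and not part of the PR.

```python
import math

import numpy as np


def matches_expected(result, expected):
    """Hypothetical helper: treat NaN as equal to NaN, otherwise use ==."""
    if isinstance(expected, float) and math.isnan(expected):
        return isinstance(result, float) and math.isnan(result)
    return result == expected


# Membership succeeds here through the identity check on the np.nan singleton.
singleton_found = np.nan in [np.nan]

# A separately created NaN is not found, because NaN != NaN under ==.
fresh_nan_found = float("nan") in [np.nan]
```

A test that relies on `in` therefore passes only while the method returns the np.nan object itself. An explicit NaN check would not depend on that detail.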