SubClassedDataFrame.groupby().mean() etc. use method of SubClassedDataFrame #51765
Changes from 9 commits
@@ -283,7 +283,9 @@ def aggregate(self, func=None, *args, engine=None, engine_kwargs=None, **kwargs)
 )

 # result is a dict whose keys are the elements of result_index
-result = Series(result, index=self.grouper.result_index)
+result = self._obj_1d_constructor(
+    result, index=self.grouper.result_index
+)
 result = self._wrap_aggregated_output(result)
 return result

@@ -687,7 +689,7 @@ def value_counts(
 # in a backward compatible way
 # GH38672 relates to categorical dtype
 ser = self.apply(
-    Series.value_counts,
+    self._obj_1d_constructor.value_counts,
     normalize=normalize,
     sort=sort,
     ascending=ascending,
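The replacement line in this hunk looks up `value_counts` as an attribute of the constructor itself, which only works when the constructor is a class. A minimal sketch of that dependency, using toy stand-ins: `BaseSeries`, `MySeries` and `dispatching_constructor` are hypothetical names for illustration, not pandas objects.

```python
class BaseSeries:
    """Toy stand-in for a 1-D container (not pandas.Series)."""

    def __init__(self, values):
        self.values = list(values)

    def value_counts(self):
        # Count occurrences of each value.
        counts = {}
        for v in self.values:
            counts[v] = counts.get(v, 0) + 1
        return counts


class MySeries(BaseSeries):
    """Hypothetical subclass."""


def dispatching_constructor(values):
    """Hypothetical function-style constructor: callable, but not a class."""
    return MySeries(values)


# Class constructor: the method can be looked up on it and passed around.
method = MySeries.value_counts
counts = method(MySeries(["a", "b", "a"]))

# Function constructor: the same attribute lookup has nothing to find.
has_method = hasattr(dispatching_constructor, "value_counts")
```

With a function-style constructor, an expression shaped like `constructor.value_counts` would raise `AttributeError` before `apply` is ever called.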
@@ -706,7 +708,7 @@ def value_counts(
     llab = lambda lab, inc: lab[inc]
 else:
     # lab is a Categorical with categories an IntervalIndex
-    cat_ser = cut(Series(val), bins, include_lowest=True)
+    cat_ser = cut(self.obj._constructor(val), bins, include_lowest=True)
     cat_obj = cast("Categorical", cat_ser._values)
     lev = cat_obj.categories
     lab = lev.take(
@@ -1289,9 +1291,9 @@ def aggregate(self, func=None, *args, engine=None, engine_kwargs=None, **kwargs)
 elif relabeling:
     # this should be the only (non-raising) case with relabeling
     # used reordered index of columns
-    result = cast(DataFrame, result)
+    result = cast(self.obj._constructor, result)

Review comment: these will be wrong if `_constructor` is not a class

     result = result.iloc[:, order]
-    result = cast(DataFrame, result)
+    result = cast(self.obj._constructor, result)
     # error: Incompatible types in assignment (expression has type
     # "Optional[List[str]]", variable has type
     # "Union[Union[Union[ExtensionArray, ndarray[Any, Any]],
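On the `cast(self.obj._constructor, result)` lines: `typing.cast` does nothing at runtime, so these lines would not fail when executed. The concern is static typing, since a runtime attribute or a function is not a valid type expression for a type checker. A small sketch of the runtime half of that statement; `Frame` and `frame_factory` are hypothetical names used only here.

```python
from typing import cast


class Frame:
    """Toy stand-in for a DataFrame-like class."""


def frame_factory(data=None):
    """Hypothetical constructor that is a function rather than a class."""
    return Frame()


value = Frame()

# At runtime cast() returns its second argument unchanged,
# whatever is passed as the first argument.
same_with_class = cast(Frame, value) is value
same_with_callable = cast(frame_factory, value) is value
```

Both results are `True`, so the change is invisible at runtime. Only the type checker sees the difference between `cast(DataFrame, result)` and a cast to a non-type.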
@@ -1334,7 +1336,7 @@ def aggregate(self, func=None, *args, engine=None, engine_kwargs=None, **kwargs)
 else:
     # GH#32040, GH#35246
     # e.g. test_groupby_as_index_select_column_sum_empty_df
-    result = cast(DataFrame, result)
+    result = cast(self._obj_1d_constructor, result)

Review comment: `_constructor`, not `_1d_constructor`. Also I think this is OK as is

Reply: I have also wrestled with whether results should stick to the subclass or just be DataFrames

Reply: What does mypy resolve

     result.columns = self._obj_with_exclusions.columns.copy()

 if not self.as_index:
@@ -1462,7 +1464,7 @@ def _wrap_applied_output_series(
     is_transform: bool,
 ) -> DataFrame | Series:
 kwargs = first_not_none._construct_axes_dict()
-backup = Series(**kwargs)
+backup = self._obj_1d_constructor(**kwargs)
 values = [x if (x is not None) else backup for x in values]

 all_indexed_same = all_indexes_same(x.index for x in values)
@@ -1857,7 +1859,9 @@ def _apply_to_column_groupbys(self, func) -> DataFrame:

 if not len(results):
     # concat would raise
-    res_df = DataFrame([], columns=columns, index=self.grouper.result_index)
+    res_df = self.obj._constructor(
+        [], columns=columns, index=self.grouper.result_index
+    )
 else:
     res_df = concat(results, keys=columns, axis=1)

Review comment: this is trouble because in general we can't assume that `_constructor` is a class

Reply: What else could it be (practically speaking, I know it's Callable)?

Reply: Geopandas has a callable that can dispatch to different classes. @jorisvandenbossche has argued against deprecating allowing this.
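To make the "callable that can dispatch to different classes" concrete, here is a toy sketch of the pattern. It is not GeoPandas code: `PlainFrame`, `GeoFrame` and the `"geometry"` rule are assumptions chosen only to illustrate the idea.

```python
class PlainFrame:
    """Toy stand-in for a DataFrame-like base class."""

    def __init__(self, columns):
        self.columns = dict(columns)

    @property
    def _constructor(self):
        # The simple case: the constructor is the class itself.
        return PlainFrame


class GeoFrame(PlainFrame):
    """Hypothetical subclass whose constructor is a dispatching function."""

    @property
    def _constructor(self):
        def _dispatch(columns):
            # Keep the subclass only while a "geometry" column is present.
            if "geometry" in columns:
                return GeoFrame(columns)
            return PlainFrame(columns)

        return _dispatch


gf = GeoFrame({"geometry": [0], "x": [1]})
kept = gf._constructor({"geometry": [0]})
dropped = gf._constructor({"x": [1]})
```

Here `gf._constructor` is callable but is not a `type`, so code that treats it as a class (attribute lookups, `cast`, `issubclass`) cannot rely on it.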
Reply: I've added a check whether `self._obj_1d_constructor` is a `Series`
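The check itself is not shown in this excerpt. One way such a guard could look, as a sketch under assumptions: `resolve_1d_class` is a hypothetical helper, and `BaseSeries` and `MySeries` are toy stand-ins for `Series` and a subclass.

```python
class BaseSeries:
    """Toy stand-in for Series."""


class MySeries(BaseSeries):
    """Hypothetical Series subclass."""


def resolve_1d_class(constructor, default=BaseSeries):
    """Return `constructor` if it is a subclass of `default`, else `default`.

    isinstance(..., type) is tested first because issubclass() raises
    TypeError when its first argument is not a class.
    """
    if isinstance(constructor, type) and issubclass(constructor, default):
        return constructor
    return default


from_subclass = resolve_1d_class(MySeries)
from_function = resolve_1d_class(lambda values: MySeries())
from_unrelated = resolve_1d_class(dict)
```

With this shape, a function-style constructor falls back to the base class, so class-only operations stay safe. The cost is that the subclass is not preserved on that path.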