ENH: add Groupby.attrs namespace to access groupby attributes #53642

topper-123 · 2023-06-13T06:56:41Z

Feature Type

Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas

Problem Description

The attributes of groupby objects are currently accessible from the groupby directly, but they are hidden, i.e. they don't show up in dir calls:

>>> df = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3]})
>>> dfg = df.groupby("a")
>>> dfg.keys
'a'
>>> "keys" in dir(dfg)
False
>>> dfg._hidden_attrs
frozenset({'as_index',
           'axis',
           'dropna',
           ...,
           'observed',
           'sort'})

I assume this was done because we want the groupby namespace to contain mainly groupby methods, i.e. to keep it from becoming noisy.

Feature Description

It is beneficial to be able to access the attributes, and instead of using hidden attributes I propose a public/non-hidden attrs namespace, so to access an attribute, users can do e.g. dfg.attrs.keys.

This can also form the basis for a groupby repr and the groupby repr could take its data from the groupby attrs.
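A minimal sketch of what such a namespace could look like, assuming it simply forwards a fixed set of names to the underlying groupby object. The class name GroupByAttrs and the tuple _EXPOSED are hypothetical and not part of pandas; the attribute names listed are ones the groupby object already carries today.

```python
# Hypothetical sketch: GroupByAttrs and _EXPOSED are illustrative names only,
# not pandas API. The listed names are attributes a groupby object already has.
_EXPOSED = ("keys", "as_index", "sort", "dropna", "observed", "group_keys")


class GroupByAttrs:
    """Namespace that forwards a fixed set of attributes to a groupby object."""

    def __init__(self, groupby):
        self._groupby = groupby

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for the forwarded names.
        if name in _EXPOSED:
            return getattr(self._groupby, name)
        raise AttributeError(name)

    def __dir__(self):
        # Unlike the hidden attributes on the groupby itself, these names
        # show up in dir() and in tab completion.
        return list(_EXPOSED)

    def __repr__(self):
        # A groupby repr could draw its data from the same names.
        items = ", ".join(f"{n}={getattr(self, n)!r}" for n in _EXPOSED)
        return f"GroupByAttrs({items})"
```

With the dfg from the example above, GroupByAttrs(dfg).keys would return 'a' and "keys" in dir(GroupByAttrs(dfg)) would be True, while dir(dfg) itself stays unchanged.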

I'm not sure about the attrs name because we already have DataFrame.attrs, so I'm definitely open to suggestions for better names.

Alternative Solutions

The alternatives are:

1. keep things as they are / keep the attributes hidden
2. make the hidden attributes public

IMO both have disadvantages. With option 1, the attributes remain difficult to discover. With option 2, the groupby namespace becomes very large and groupby methods and attributes get mixed together, which makes the methods harder to discover.

An attrs attribute would avoid both of those disadvantages.

Additional Context

No response


jreback · 2023-06-13T10:48:18Z

can u show what one would actually do with these?

topper-123 · 2023-06-27T08:33:22Z

> can u show what one would actually do with these?

This would help with introspecting groupby objects. Concretely, I often pass groupby objects through several functions, and if something unexpected happens I think it's worthwhile to be able to inspect the groupby object to understand what's happening.

rhshadrach · 2023-06-28T02:00:43Z

Overall I'm +0. I personally don't use groupby objects like this (they are always created / thrown away), but I can see the benefit. If we are going this route, I certainly don't think we should allow setting them (e.g. gb.attrs.keys = [1, 2, 3]).

But I'm not sure about mutability: if a user gets gb.attrs.keys or gb.attrs.obj, are we returning objects through which they can mutate (perhaps accidentally) the internal state of the groupby object and create hard-to-understand errors, or are we going to return copies? Currently users can do this via gb.obj and it doesn't seem to cause issues, but maybe users just don't know about it.
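One way to picture the "no setting, return copies" option is a read-only snapshot. The sketch below is only an illustration under that assumption: snapshot_attrs and _NAMES are hypothetical names, not pandas API, and obj is deliberately left out because copying the underlying data could be expensive.

```python
import copy
from types import MappingProxyType

# Hypothetical helper, not part of pandas.
_NAMES = ("keys", "as_index", "sort", "dropna", "observed", "group_keys")


def snapshot_attrs(groupby, names=_NAMES):
    """Return a read-only mapping holding copies of the groupby's attributes."""
    values = {}
    for name in names:
        # Copy each value so that, for example, a list of keys cannot be used
        # to mutate the groupby's internal state through the snapshot.
        values[name] = copy.deepcopy(getattr(groupby, name))
    # MappingProxyType rejects item assignment, so attributes cannot be set.
    return MappingProxyType(values)
```

Assigning into the returned mapping raises TypeError, and mutating a returned value only touches the copy. The trade-off is that a snapshot can go stale and that copying is not free, which is part of why this is an open design question rather than a settled answer.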

I'd be opposed to exposing _obj_with_exclusions or _selected_obj in their current state, but would be more favorable once it gets cleaned up.

topper-123 added the Enhancement and Needs Triage labels Jun 13, 2023

rhshadrach mentioned this issue Jun 26, 2023

ENH: add columns attribute to DataFrameGroupBy #53583

Closed

3 tasks

mroeschke added the Groupby label and removed the Needs Triage label Jul 17, 2024
