ENH:AttributeError: 'SeriesGroupBy' object has no attribute 'kurtosis' #40139

rsuhada · 2021-03-01T09:54:02Z

Code Sample, a copy-pastable example

import pandas as pd

df = pd.DataFrame({'birb': ['Falcon', 'Falcon', 'Falcon', 'Falcon',
                            'Parrot', 'Parrot', 'Parrot', 'Parrot'],
                   'value': [390., 350., 390., 350.,
                             30., 20., 30., 20.]})
df.groupby('birb').aggregate({'value': ['count', 'mean', 'kurtosis']})

Problem description

The code raises AttributeError: 'SeriesGroupBy' object has no attribute 'kurtosis' instead of returning the group-wise kurtosis.
Using 'skew' in the same position works as expected.

Expected Output

       value
       count   mean kurtosis
birb
Falcon     4  370.0     -6.0
Parrot     4   25.0     -6.0
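
For reference, the -6.0 above matches the bias-corrected (Fisher) excess kurtosis that pandas documents for Series.kurt. A minimal pure-Python sketch of that estimator reproduces it for both groups. The helper name sample_excess_kurtosis is hypothetical and is not a pandas function:

```python
def sample_excess_kurtosis(values):
    """Bias-corrected excess kurtosis (Fisher definition, normal == 0.0).

    Hypothetical helper for illustration only; requires at least four
    values that are not all identical.
    """
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are required")
    mean = sum(values) / n
    # Second and fourth central moments (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    # Biased excess kurtosis.
    g2 = m4 / m2 ** 2 - 3.0
    # Small-sample correction.
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)


falcon = sample_excess_kurtosis([390.0, 350.0, 390.0, 350.0])
parrot = sample_excess_kurtosis([30.0, 20.0, 30.0, 20.0])
print(falcon, parrot)  # -6.0 -6.0
```

Each group holds two values repeated twice, so the biased excess kurtosis is -2 and the correction for n = 4 scales it to -6.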

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : 7d32926
python : 3.7.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-66-generic
Version : #74~18.04.2-Ubuntu SMP Fri Feb 5 11:17:31 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_IE.UTF-8
LOCALE : en_IE.UTF-8

pandas : 1.2.2
numpy : 1.17.5
pytz : 2021.1
dateutil : 2.8.1
pip : 21.0.1
setuptools : 49.2.1
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.20.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : None
odfpy : None
openpyxl : 3.0.6
pandas_gbq : None
pyarrow : 3.0.0
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : 0.52.0

rhshadrach · 2021-03-03T04:27:51Z

I don't believe SeriesGroupBy is documented as having a kurtosis method. Are you finding something different?

rsuhada · 2021-03-03T07:14:12Z

You are right... it just tripped me up that the first three moments are implemented, but not the fourth one.
One can work around it with the scipy implementation of kurtosis, but it would be nice to have it directly here.

rhshadrach · 2021-03-04T00:32:51Z

+1 on adding; I think this would just require adding it to the common_apply_allowlist. In the meantime, from https://stackoverflow.com/questions/37345493/kurtosis-on-groupby-of-pandas-dataframe-doesnt-work

df.groupby('a').apply(pd.DataFrame.kurt)

This is essentially what is occurring under the hood for skew. Might want to use pd.Series instead of pd.DataFrame here.
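
A sketch of that workaround applied to the example frame from the report, assuming pandas is installed. The names grouped, per_group and summary are illustrative only. Passing pd.Series.kurt as a callable sidesteps the string lookup that raises the AttributeError:

```python
import pandas as pd

df = pd.DataFrame({
    "birb": ["Falcon"] * 4 + ["Parrot"] * 4,
    "value": [390.0, 350.0, 390.0, 350.0, 30.0, 20.0, 30.0, 20.0],
})

grouped = df.groupby("birb")["value"]

# Apply Series.kurt to each group's values; the result is indexed by group.
per_group = grouped.apply(pd.Series.kurt)

# Named aggregation: string names for the built-in reductions and a
# callable for kurtosis, so the output column can still be called "kurtosis".
summary = grouped.agg(count="count", mean="mean", kurtosis=pd.Series.kurt)
print(summary)
```

Named aggregation is used here so the column label does not depend on the callable's own name.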

jaepil-choi · 2024-10-03T18:12:12Z

Can't believe this is still a thing. mean, std, skew, count, min, max, median and everything else work, but not kurtosis? I mean... why?

rhshadrach · 2024-10-03T19:28:08Z

@jaepil-choi - PRs are welcome!

jaepil-choi · 2024-10-03T19:38:58Z

@rhshadrach

I found that the pd.DataFrame.kurt workaround works. Would a very basic temporary fix, such as simply converting 'kurt' to pd.DataFrame.kurt, be accepted and merged?

rhshadrach · 2024-10-03T19:44:57Z

I believe you can implement this similarly to corr:

SeriesGroupBy:

pandas/pandas/core/groupby/generic.py

Lines 1413 to 1423 in c47296a

    
    @doc(Series.corr.__doc__)
    def corr(
        self,
        other: Series,
        method: CorrelationMethod = "pearson",
        min_periods: int | None = None,
    ) -> Series:
        result = self._op_via_apply(
            "corr", other=other, method=method, min_periods=min_periods
        )
        return result

DataFrameGroupBy:

pandas/pandas/core/groupby/generic.py

Lines 2894 to 2904 in c47296a

    
    @doc(DataFrame.corr.__doc__)
    def corr(
        self,
        method: str | Callable[[np.ndarray, np.ndarray], float] = "pearson",
        min_periods: int = 1,
        numeric_only: bool = False,
    ) -> DataFrame:
        result = self._op_via_apply(
            "corr", method=method, min_periods=min_periods, numeric_only=numeric_only
        )
        return result

A PR would need the right arguments and tests.

saldanhad · 2024-10-09T09:42:37Z

take

Shubhank-Gyawali · 2024-11-16T04:52:20Z

take

snitish · 2024-11-27T07:29:19Z

take

snitish · 2024-12-01T05:43:41Z

@rhshadrach I raised a PR for this - #60433. Could you please review? Thanks.

rsuhada added the Bug and Needs Triage labels Mar 1, 2021

RebwarBajallan95 pushed a commit to johbess/pandas that referenced this issue Mar 2, 2021

ADD: GroupBy kurtosis - fixes pandas-dev#40139

fd0dd18

rhshadrach added the Enhancement and Groupby labels and removed the Bug label Mar 3, 2021

rhshadrach changed the title ~~BUG:AttributeError: 'SeriesGroupBy' object has no attribute 'kurtosis'~~ ENH:AttributeError: 'SeriesGroupBy' object has no attribute 'kurtosis' Mar 3, 2021

lithomas1 removed the Needs Triage label Mar 4, 2021

rhshadrach added this to the Contributions Welcome milestone Mar 5, 2021

johbess added a commit to johbess/pandas that referenced this issue Mar 10, 2021

ENH: added tests for the new kurtosis functionality - fixes pandas-de…

8356d99

…v#40139

svetakvsundhar mentioned this issue Nov 5, 2021

[BEAM-12550] Parallelizable kurtosis Implementation apache/beam#15909

Merged

4 tasks

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022

rhshadrach added the good first issue label Oct 3, 2024

github-actions bot assigned saldanhad Oct 9, 2024

github-actions bot assigned Shubhank-Gyawali Nov 16, 2024

saldanhad removed their assignment Nov 16, 2024

github-actions bot assigned snitish Nov 27, 2024

snitish mentioned this issue Nov 27, 2024

ENH: Support kurtosis (kurt) in DataFrameGroupBy and SeriesGroupBy #60433

Merged

5 tasks

mroeschke closed this as completed in #60433 Jan 10, 2025
