Skip to content

BUG: DataError: No numeric types to aggregate during pd.pivot_table #33515

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
retsyo opened this issue Apr 13, 2020 · 3 comments
Closed

BUG: DataError: No numeric types to aggregate during pd.pivot_table #33515

retsyo opened this issue Apr 13, 2020 · 3 comments
Labels
Bug Needs Info Clarification about behavior needed to assess issue

Comments

@retsyo
Copy link

retsyo commented Apr 13, 2020

I am using pandas 1.0.3 in anaconda python 3.6.6 64 bits on win7 64 bits

the following code

pd.pivot_table(df, index=['time', 'goods'], values=['price',])

says something like #17382

DataError                                 Traceback (most recent call last)
<ipython-input-62-dc402f731fc3> in <module>()
----> 1 pd.pivot_table(df.iloc[:5,:], index=['time', 'goods'], values=['price',])

L:\prg\py\Anaconda3_64\lib\site-packages\pandas\core\reshape\pivot.py in pivot_table(data, values, index, columns, aggfunc, fill_value, margins, dropna, margins_name, observed)
     99 
    100     grouped = data.groupby(keys, observed=observed)
--> 101     agged = grouped.agg(aggfunc)
    102     if dropna and isinstance(agged, ABCDataFrame) and len(agged.columns):
    103         agged = agged.dropna(how="all")

L:\prg\py\Anaconda3_64\lib\site-packages\pandas\core\groupby\generic.py in aggregate(self, func, *args, **kwargs)
    926         func = _maybe_mangle_lambdas(func)
    927 
--> 928         result, how = self._aggregate(func, *args, **kwargs)
    929         if how is None:
    930             return result

L:\prg\py\Anaconda3_64\lib\site-packages\pandas\core\base.py in _aggregate(self, arg, *args, **kwargs)
    309 
    310         if isinstance(arg, str):
--> 311             return self._try_aggregate_string_function(arg, *args, **kwargs), None
    312 
    313         if isinstance(arg, dict):

L:\prg\py\Anaconda3_64\lib\site-packages\pandas\core\base.py in _try_aggregate_string_function(self, arg, *args, **kwargs)
    265         if f is not None:
    266             if callable(f):
--> 267                 return f(*args, **kwargs)
    268 
    269             # people may try to aggregate on a non-callable attribute

L:\prg\py\Anaconda3_64\lib\site-packages\pandas\core\groupby\groupby.py in mean(self, *args, **kwargs)
   1223         nv.validate_groupby_func("mean", args, kwargs, ["numeric_only"])
   1224         return self._cython_agg_general(
-> 1225             "mean", alt=lambda x, axis: Series(x).mean(**kwargs), **kwargs
   1226         )
   1227 

L:\prg\py\Anaconda3_64\lib\site-packages\pandas\core\groupby\generic.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
    992     ) -> DataFrame:
    993         agg_blocks, agg_items = self._cython_agg_blocks(
--> 994             how, alt=alt, numeric_only=numeric_only, min_count=min_count
    995         )
    996         return self._wrap_agged_blocks(agg_blocks, items=agg_items)

L:\prg\py\Anaconda3_64\lib\site-packages\pandas\core\groupby\generic.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
   1093 
   1094         if not (agg_blocks or split_frames):
-> 1095             raise DataError("No numeric types to aggregate")
   1096 
   1097         if split_items:

DataError: No numeric types to aggregate

but if I do

df.to_excel('r:/5.xls')
dfm = pd.read_excel('r:/5.xls')
pd.pivot_table(dfm, index=['time', 'goods'], values=['price',])

then I get the pivot table as I expected

the question

so you may find that I can't make a minimal excel file to let you reproduce the DataError. On the other hand, my original df is made by concating some privacy excel files, which I can't share too.

Can anyone give me some hints? Thanks

df = pd.concat(
          [pd.read_excel(i, convertor={'ID2': int, 'pin': int}) for i in someXlsFilenames]
)
df.head(1)
_ time ID1 ID2 goods pin owener price weight from to material source
152 2013-2014_1 BS080003123 1311225 Data Transfomers 1200830216 Jack Chan 65 1.0 England USA wood 13-14-1.xls
@retsyo retsyo added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 13, 2020
@charlesdong1991
Copy link
Member

Hi, @retsyo could you please make up a df that could reproduce the issue? (understand that it is a private excel file you cannot share, but you could read it into a df and change column names and values etc)

@charlesdong1991 charlesdong1991 added Needs Info Clarification about behavior needed to assess issue and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 13, 2020
@MarcoGorelli
Copy link
Member

Thanks @retsyo for the report

Here's how you could have written a reproducible example:

>>> df = pd.pivot_table(dfm, index=['time', 'goods'], values=['price',])
>>> pd.pivot_table(dfm, index=['time', 'goods'], values=['price',])
DataError
>>> df.to_excel('temp.xlsx')
>>> pd.pivot_table(pd.read_excel('temp.xlsx'), index=['time', 'goods'], values=['price',])
            price
time goods       
2    3          1

I presume the problem is that you have some numbers formatted as strings. You can convert them to numbers with .astype.

If this doesn't solve your problem, feel free to reopen (as long as you include a reproducible example)

@linehammer
Copy link

This is due to the way GroupBy objects handle the different aggregation methods. In fact sum and mean are handled differently. GroupBy.mean function call dispatches to self._cython_agg_general which checks for numeric types and in case it doesn't find any (which is the case for your example) it raises a DataError. Though the call to self._cython_agg_general is wrapped in try/except in case of a GroupByError it just re-raises and DataError inherits from GroupByError. Thus the exception.

By applying pd.to_numeric you convert them to numeric type and the agg works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Needs Info Clarification about behavior needed to assess issue
Projects
None yet
Development

No branches or pull requests

4 participants