Summing a sparse boolean series throws an exception TypeError: sum() got an unexpected keyword argument 'min_count' #25777

Closed
tsoernes opened this issue Mar 19, 2019 · 3 comments · Fixed by #34220
Labels: Bug, Sparse (Sparse Data Type)

@tsoernes

Code Sample, a copy-pastable example if possible

In [202]: sparse_series.dtype
Out[203]: Sparse[bool, False]

In [208]: sparse_series.value_counts()
Out[208]: 
False    51386
True        13
Name: C_3D Printing, dtype: int64

In [209]: sparse_series.sum()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-209-8f6162710840> in <module>
----> 1 sparse_series.sum()

~/anaconda3/envs/idp/lib/python3.6/site-packages/pandas/core/generic.py in stat_func(self, axis, skipna, level, numeric_only, min_count, **kwargs)
  10929                                       skipna=skipna, min_count=min_count)
  10930         return self._reduce(f, name, axis=axis, skipna=skipna,
> 10931                             numeric_only=numeric_only, min_count=min_count)
  10932 
  10933     return set_function_name(stat_func, name, cls)

~/anaconda3/envs/idp/lib/python3.6/site-packages/pandas/core/series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   3613         # dispatch to ExtensionArray interface
   3614         if isinstance(delegate, ExtensionArray):
-> 3615             return delegate._reduce(name, skipna=skipna, **kwds)
   3616         elif is_datetime64_dtype(delegate):
   3617             # use DatetimeIndex implementation to handle skipna correctly

~/anaconda3/envs/idp/lib/python3.6/site-packages/pandas/core/arrays/sparse.py in _reduce(self, name, skipna, **kwargs)
   1439         kwargs.pop('numeric_only', None)
   1440         kwargs.pop('op', None)
-> 1441         return getattr(arr, name)(**kwargs)
   1442 
   1443     def all(self, axis=None, *args, **kwargs):

~/anaconda3/envs/idp/lib/python3.6/site-packages/pandas/core/arrays/sparse.py in sum(self, axis, *args, **kwargs)
   1491         sum : float
   1492         """
-> 1493         nv.validate_sum(args, kwargs)
   1494         valid_vals = self._valid_sp_values
   1495         sp_sum = valid_vals.sum()

~/anaconda3/envs/idp/lib/python3.6/site-packages/pandas/compat/numpy/function.py in __call__(self, args, kwargs, fname, max_fname_arg_count, method)
     54                 validate_args_and_kwargs(fname, args, kwargs,
     55                                          max_fname_arg_count,
---> 56                                          self.defaults)
     57             else:
     58                 raise ValueError("invalid validation method "

~/anaconda3/envs/idp/lib/python3.6/site-packages/pandas/util/_validators.py in validate_args_and_kwargs(fname, args, kwargs, max_fname_arg_count, compat_args)
    216 
    217     kwargs.update(args_dict)
--> 218     validate_kwargs(fname, kwargs, compat_args)
    219 
    220 

~/anaconda3/envs/idp/lib/python3.6/site-packages/pandas/util/_validators.py in validate_kwargs(fname, kwargs, compat_args)
    154     """
    155     kwds = kwargs.copy()
--> 156     _check_for_invalid_keys(fname, kwargs, compat_args)
    157     _check_for_default_values(fname, kwds, compat_args)
    158 

~/anaconda3/envs/idp/lib/python3.6/site-packages/pandas/util/_validators.py in _check_for_invalid_keys(fname, kwargs, compat_args)
    125         raise TypeError(("{fname}() got an unexpected "
    126                          "keyword argument '{arg}'".
--> 127                          format(fname=fname, arg=bad_arg)))
    128 
    129 

TypeError: sum() got an unexpected keyword argument 'min_count'

Problem description

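Calling sum() on a Series with dtype Sparse[bool, False] raises TypeError: sum() got an unexpected keyword argument 'min_count'. The traceback shows Series.sum() forwarding min_count through SparseArray._reduce to SparseArray.sum(), where nv.validate_sum rejects it.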
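
To make the call chain in the traceback concrete, here is a minimal, pandas-free sketch of the same failure pattern. `ToySparseArray`, `validate_kwargs_sketch`, and `series_sum` are hypothetical stand-ins, and the set of accepted keywords is an assumption for illustration only, not pandas' actual list.

```python
def validate_kwargs_sketch(fname, kwargs, compat_args):
    """Reject any keyword outside the accepted set (hypothetical stand-in)."""
    bad = [key for key in kwargs if key not in compat_args]
    if bad:
        raise TypeError(
            f"{fname}() got an unexpected keyword argument '{bad[0]}'"
        )


class ToySparseArray:
    """Toy model of an array whose sum() only accepts numpy-style keywords."""

    def __init__(self, sp_values):
        self.sp_values = list(sp_values)

    def sum(self, axis=None, **kwargs):
        # Assumed accepted keywords; 'min_count' is deliberately not among them.
        validate_kwargs_sketch("sum", kwargs, {"dtype", "out", "keepdims"})
        return sum(self.sp_values)

    def _reduce(self, name, skipna=True, **kwargs):
        # Mirrors the traceback: two keywords are dropped, the rest forwarded.
        kwargs.pop("numeric_only", None)
        kwargs.pop("op", None)
        return getattr(self, name)(**kwargs)


def series_sum(array, min_count=0):
    # The caller always passes min_count, which the array's sum() rejects.
    return array._reduce(
        "sum", skipna=True, numeric_only=None, min_count=min_count
    )


try:
    series_sum(ToySparseArray([True] * 13))
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

In this toy model, popping `min_count` alongside `numeric_only` in `_reduce`, or accepting it in `sum()`, would both avoid the error; which approach the actual fix took is not stated in this thread.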

Expected Output

sparse_series.sum() should return 13, the number of True values reported by value_counts() above.

Output of pd.show_versions()

In [198]: pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 4.20.15-200.fc29.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: nb_NO.UTF-8
LOCALE: nb_NO.UTF-8

pandas: 0.24.1+0.g1700680.dirty
pytest: None
pip: 10.0.1
setuptools: 39.0.1.post20180504
Cython: None
numpy: 1.16.1
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 7.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.8
feather: None
matplotlib: 3.0.1
openpyxl: 2.6.1
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: 2.7.7 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@tsoernes
Author

A workaround is sparse_series.value_counts()[True], which returns the number of True values (13 in the example above).
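
A slightly more defensive form of this workaround might look like the sketch below. It assumes a pandas version that exposes `pd.arrays.SparseArray`; the 23-element series is made-up stand-in data, not the reporter's column.

```python
import pandas as pd

# Stand-in data: mostly False with a few True values.
values = [False] * 20 + [True] * 3
sparse_series = pd.Series(pd.arrays.SparseArray(values, fill_value=False))

# value_counts() is indexed by the distinct values. Using .get with a default
# avoids a KeyError when the series contains no True values at all.
true_count = int(sparse_series.value_counts().get(True, 0))
print(true_count)
```

Plain indexing with `[True]` fails when no True value is present, which is why `.get(True, 0)` is used here.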

jbrockmendel added the Bug and Sparse labels on Jul 23, 2019
@ynshen

ynshen commented Oct 11, 2019

I get the same error for a pd.Series with an int or float SparseDtype. My temporary workaround is to convert the series to dense first, which is not viable for large datasets.

Example:

> sparse_dtype = pd.SparseDtype(dtype='float', fill_value=0.0)
> df = pd.DataFrame(data=[[0, 1], [2, 1]], columns=['col1', 'col2']).astype(sparse_dtype)
> df['col1'].sparse.to_dense().sum()
2.0
> df['col1'].sum()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-80-608e3bce98fd> in <module>
      1 sparse_dtype = pd.SparseDtype(dtype='float', fill_value=0.0)
      2 df = pd.DataFrame(data=[[0, 1], [2, 1]], columns=['col1', 'col2']).astype(sparse_dtype)
----> 3 df['col1'].sum()

~/.pyenv/versions/k-seq/lib/python3.7/site-packages/pandas/core/generic.py in stat_func(self, axis, skipna, level, numeric_only, min_count, **kwargs)
  11583             skipna=skipna,
  11584             numeric_only=numeric_only,
> 11585             min_count=min_count,
  11586         )
  11587 

~/.pyenv/versions/k-seq/lib/python3.7/site-packages/pandas/core/series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   4069         elif isinstance(delegate, ExtensionArray):
   4070             # dispatch to ExtensionArray interface
-> 4071             return delegate._reduce(name, skipna=skipna, **kwds)
   4072         elif is_datetime64_dtype(delegate):
   4073             # use DatetimeIndex implementation to handle skipna correctly

~/.pyenv/versions/k-seq/lib/python3.7/site-packages/pandas/core/arrays/sparse.py in _reduce(self, name, skipna, **kwargs)
   1549         kwargs.pop("numeric_only", None)
   1550         kwargs.pop("op", None)
-> 1551         return getattr(arr, name)(**kwargs)
   1552 
   1553     def all(self, axis=None, *args, **kwargs):

~/.pyenv/versions/k-seq/lib/python3.7/site-packages/pandas/core/arrays/sparse.py in sum(self, axis, *args, **kwargs)
   1601         sum : float
   1602         """
-> 1603         nv.validate_sum(args, kwargs)
   1604         valid_vals = self._valid_sp_values
   1605         sp_sum = valid_vals.sum()

~/.pyenv/versions/k-seq/lib/python3.7/site-packages/pandas/compat/numpy/function.py in __call__(self, args, kwargs, fname, max_fname_arg_count, method)
     56             elif method == "both":
     57                 validate_args_and_kwargs(
---> 58                     fname, args, kwargs, max_fname_arg_count, self.defaults
     59                 )
     60             else:

~/.pyenv/versions/k-seq/lib/python3.7/site-packages/pandas/util/_validators.py in validate_args_and_kwargs(fname, args, kwargs, max_fname_arg_count, compat_args)
    226 
    227     kwargs.update(args_dict)
--> 228     validate_kwargs(fname, kwargs, compat_args)
    229 
    230 

~/.pyenv/versions/k-seq/lib/python3.7/site-packages/pandas/util/_validators.py in validate_kwargs(fname, kwargs, compat_args)
    163     """
    164     kwds = kwargs.copy()
--> 165     _check_for_invalid_keys(fname, kwargs, compat_args)
    166     _check_for_default_values(fname, kwds, compat_args)
    167 

~/.pyenv/versions/k-seq/lib/python3.7/site-packages/pandas/util/_validators.py in _check_for_invalid_keys(fname, kwargs, compat_args)
    132             (
    133                 "{fname}() got an unexpected "
--> 134                 "keyword argument '{arg}'".format(fname=fname, arg=bad_arg)
    135             )
    136         )

TypeError: sum() got an unexpected keyword argument 'min_count'
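
For readers who cannot afford to densify, one possible alternative is to sum only the stored values through the `Series.sparse` accessor. This is a sketch under assumptions, not a confirmed fix: `sparse_sum` is a hypothetical helper, and it assumes the fill value is not NaN and the stored values contain no NaN.

```python
import pandas as pd


def sparse_sum(series):
    """Sum a sparse Series from its stored values, without densifying.

    Assumes a non-NaN fill value and no NaN among the stored values.
    """
    acc = series.sparse
    # Number of positions represented implicitly by the fill value.
    n_fill = len(series) - acc.npoints
    return acc.sp_values.sum() + acc.fill_value * n_fill


sparse_dtype = pd.SparseDtype(dtype="float", fill_value=0.0)
df = pd.DataFrame(
    data=[[0, 1], [2, 1]], columns=["col1", "col2"]
).astype(sparse_dtype)

col1_total = sparse_sum(df["col1"])
print(col1_total)
```

The helper adds `fill_value * n_fill` so that it also works for sparse data whose fill value is not zero.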

@choucavalier
Contributor

I have the same issue today. Is someone working on this?
