Closed
Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
df = pd.DataFrame({'a': [1, 1, 2], 'b': [[10], [11], [12]]})
# ok:
>>> df['b'].cumsum()
0 [10]
1 [10, 11]
2 [10, 11, 12]
Name: b, dtype: object
# error:
>>> df.groupby('a')['b'].cumsum()
DataError: No numeric types to aggregate
Problem description
Calling cumsum on a column of lists works and cumulatively concatenates them, as shown above. The same operation should also work on a groupby object, applied within each group, instead of raising DataError.
Expected Output
a
0 [10]
1 [10, 11]
2 [12]
Name: b, dtype: object
Workaround
This is only a partial workaround because it loses the index name, which is normally kept when doing a groupby.
df.groupby('a')['b'].apply(lambda col: col.cumsum())
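A rough sketch of another way around the error, bypassing both `apply` and the groupby `cumsum`, is to build the running concatenation per group by hand. `grouped_list_cumsum` is a hypothetical helper name, not a pandas function. It assumes a non-empty frame, a unique index, and that every value in the column is a list. It does not try to reproduce the index name shown under Expected Output.

```python
import operator
from itertools import accumulate

import pandas as pd


def grouped_list_cumsum(df, key, col):
    """Hypothetical helper: running list concatenation within each group."""
    pieces = []
    for _, group in df.groupby(key, sort=False)[col]:
        # operator.add on two lists concatenates them; copy each result so
        # the output never shares list objects with the input column.
        running = [list(r) for r in accumulate(group.tolist(), operator.add)]
        pieces.append(pd.Series(running, index=group.index, dtype=object))
    # Restore the original row order (assumes the index is unique).
    return pd.concat(pieces).reindex(df.index).rename(col)


df = pd.DataFrame({'a': [1, 1, 2], 'b': [[10], [11], [12]]})
result = grouped_list_cumsum(df, 'a', 'b')
print(result)
```

For the sample frame this gives `[10]`, `[10, 11]`, `[12]` on rows 0, 1, 2, with the Series still named `b`.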
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit : b5958ee1999e9aead1938c0bba2b674378807b3d
python : 3.8.7.final.0
python-bits : 64
OS : Darwin
OS-release : 20.4.0
Version : Darwin Kernel Version 20.4.0: Fri Mar 5 03:57:04 PST 2021; root:xnu-7195.101.1~4/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.1.5
numpy : 1.19.4
pytz : 2020.5
dateutil : 2.8.1
pip : None
setuptools : 50.3.1.post0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.5.2
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.3
numexpr : 2.7.2
odfpy : None
openpyxl : 3.0.5
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.6.0
sqlalchemy : 1.3.20
tables : 3.6.1
tabulate : 0.8.7
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
numba : None