Unexpected TypeError using pd.NamedAgg dict with pd.rolling
#28333
Comments
Should be doable, just not implemented yet. Interested in working on it? |
Looks like |
I suspect rolling / expanding / ewma will have a single implementation
that's shared.
…On Sat, Sep 7, 2019 at 9:52 AM Brian Bader ***@***.***> wrote:
Code Sample
import pandas as pd

animals = pd.DataFrame({'kind': ['cat', 'dog', 'cat', 'dog'],
                        'height': [9.1, 6.0, 9.5, 34.0],
                        'weight': [7.9, 7.5, 9.9, 198.0]})

# (1) - doesn't work: raises the TypeError described below
animals.groupby("kind").rolling(1).agg(**{'total weight': pd.NamedAgg(column='weight', aggfunc=sum),
                                          'min_weight': pd.NamedAgg(column='weight', aggfunc=min)})

# (2) - works
animals.groupby("kind").agg(**{'total weight': pd.NamedAgg(column='weight', aggfunc=sum),
                               'min_weight': pd.NamedAgg(column='weight', aggfunc=min)})

# works, but returns multiindex
animals.groupby("kind").rolling(1).agg({'weight': {'total_weight': 'sum', 'min_weight': 'min'}})

# (3) - this is what I'd want, expected output of (1)
animals.groupby("kind").rolling(1).agg({'weight': {'total_weight': 'sum', 'min_weight': 'min'}}).droplevel(0, axis=1)
Problem description
pandas 0.25 provides new aggregation functionality through NamedAgg
<https://pandas-docs.github.io/pandas-docs-travis/whatsnew/v0.25.0.html#groupby-aggregation-with-relabeling>.
I would expect to be able to pass these to agg after a groupby, and also
after a groupby *and* rolling.
However, when I pass NamedAgg tuples as keyword arguments, as shown in (1)
in the example code, I get the following error:
TypeError: aggregate() missing 1 required positional argument: 'arg'
Is this expected, or am I using this functionality incorrectly? Thanks in
advance!
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Darwin
OS-release : 18.7.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_US.UTF-8
pandas : 0.25.1
numpy : 1.16.1
pytz : 2018.5
dateutil : 2.7.3
pip : 19.1.1
setuptools : 41.0.1
Cython : None
pytest : 4.3.1
hypothesis : None
sphinx : 1.8.5
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : 0.9.2
psycopg2 : 2.7.7 (dt dec pq3 ext lo64)
jinja2 : 2.10.1
IPython : 7.3.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.0.3
numexpr : None
odfpy : None
openpyxl : 2.6.1
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.1.0
sqlalchemy : 1.3.3
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
This isn't implemented yet in |
A PR to implement this would still be welcome |
I would also love to see this functionality available to help with time series aggregations |
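Until this is implemented, something like the following might serve as a stopgap. It is only a sketch: `rolling_named_agg` is a hypothetical helper, not part of pandas. It runs each NamedAgg as its own rolling aggregation on the selected column and assembles the results under the requested names, which should give flat columns like (3) in the code sample. It uses the string aggregations 'sum' and 'min' rather than the builtins, on the assumption that those are the intended reductions.

```python
import pandas as pd

animals = pd.DataFrame({
    "kind": ["cat", "dog", "cat", "dog"],
    "height": [9.1, 6.0, 9.5, 34.0],
    "weight": [7.9, 7.5, 9.9, 198.0],
})


def rolling_named_agg(grouped, window, **named_aggs):
    """Hypothetical helper (not a pandas API): emulate named aggregation
    for groupby(...).rolling(window) by aggregating one column at a time."""
    columns = {}
    for out_name, spec in named_aggs.items():
        # spec is a pd.NamedAgg; read its fields by attribute name.
        rolled = grouped[spec.column].rolling(window)
        columns[out_name] = rolled.agg(spec.aggfunc)
    # Every Series shares the same (group key, original index) MultiIndex,
    # so they line up as columns of one frame.
    return pd.DataFrame(columns)


result = rolling_named_agg(
    animals.groupby("kind"),
    1,
    **{
        "total weight": pd.NamedAgg(column="weight", aggfunc="sum"),
        "min_weight": pd.NamedAgg(column="weight", aggfunc="min"),
    },
)
print(result)
```

Because each output column is a separate rolling pass, this does more work than a single shared implementation would, so it is a convenience rather than a replacement for proper support.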