BUG:Resample with groupby & agg() #35642

Closed
1 task
BassKot opened this issue Aug 9, 2020 · 4 comments
Labels
Bug Duplicate Report Duplicate issue or pull request

Comments

@BassKot

BassKot commented Aug 9, 2020

  • [x] I have checked that this issue has not already been reported.

  • [ all versions] I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

import pandas as pd
import numpy as np

data = pd.DataFrame({
    'cat': ['cat_1', 'cat_1', 'cat_2', 'cat_1', 'cat_2', 'cat_1', 'cat_2', 'cat_1'],
    'num': [5,20,22,3,4,30,10,50],
    'date': ['2019-2-1', '2018-02-03','2020-3-11','2019-2-2', '2019-2-2', '2018-12-4','2020-3-11', '2020-12-12']
})
data['date'] = pd.to_datetime(data['date'])
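# Group by category, then resample each group into yearly bins on 'date'.
aggreg = data.groupby('cat').resample('Y', on='date')
summ_ = aggreg.sum()                    # gives the expected result
agg_summ_ = aggreg.agg({'num': 'sum'})  # gives an incorrect result
print(summ_)
print(agg_summ_)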

Problem description

When I aggregate all columns by calling sum() on the grouped and resampled data (summ_), the result is correct. When I request the same aggregation through agg({'num': 'sum'}) (agg_summ_), the result is incorrect.

Expected Output

agg_summ_ should contain the same values as summ_.
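As a cross-check on the expected values, here is a minimal sketch that computes the per-category yearly sums without resample() at all, so it does not depend on the code path under discussion. It assumes the same data as the sample above (with the dates written zero-padded), and the names `year` and `reference` are illustrative only. Because every category in this data has at least one row in each year it spans, there are no empty yearly bins, and the sums should line up with the 'num' values in summ_.

```python
import pandas as pd

# Same rows as the code sample above; dates are zero-padded for unambiguous parsing.
data = pd.DataFrame({
    "cat": ["cat_1", "cat_1", "cat_2", "cat_1", "cat_2", "cat_1", "cat_2", "cat_1"],
    "num": [5, 20, 22, 3, 4, 30, 10, 50],
    "date": pd.to_datetime([
        "2019-02-01", "2018-02-03", "2020-03-11", "2019-02-02",
        "2019-02-02", "2018-12-04", "2020-03-11", "2020-12-12",
    ]),
})

# Group by category and calendar year directly, bypassing resample().
year = data["date"].dt.year.rename("year")
reference = data.groupby(["cat", year])["num"].sum()

print(reference)
```

Any 'num' value in agg_summ_ that differs from these sums for the same category and year points at the agg() path rather than at the data.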

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : d9fff27
python : 3.7.4.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Sun Jul 5 00:43:10 PDT 2020; root:xnu-6153.141.1~9/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : en_US.UTF-8
LANG : ru_RU.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.1.0
numpy : 1.18.4
pytz : 2019.3
dateutil : 2.8.0
pip : 20.0.2
setuptools : 41.4.0
Cython : 0.29.13
pytest : 5.2.1
hypothesis : None
sphinx : 2.2.0
blosc : None
feather : None
xlsxwriter : 1.2.1
lxml.etree : 4.4.1
html5lib : 1.0.1
pymysql : 0.9.3
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.8.0
pandas_datareader: None
bs4 : 4.8.0
bottleneck : 1.2.1
fsspec : 0.5.2
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.0
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.9
tables : 3.5.2
tabulate : 0.8.7
xarray : 0.16.0
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.45.1

@BassKot BassKot added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Aug 9, 2020
@jreback
Contributor

jreback commented Aug 9, 2020

pls show an actually reproducible example ; construct the input with code and show the results and what is not correct

@BassKot
Author

BassKot commented Aug 10, 2020

> pls show an actually reproducible example ; construct the input with code and show the results and what is not correct

Done

@Liam3851
Contributor

Looks like a dupe of #33548.

@simonjayhawkins
Member

> Looks like a dupe of #33548.

closing as duplicate

@simonjayhawkins simonjayhawkins added Duplicate Report Duplicate issue or pull request and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Aug 11, 2020