pd.groupby seems to mutate my pd.Grouper in-place #26564
Comments
This is probably inadvertent, so a PR to fix it would be great. Note that more typically you would do
IOW you only really would use this once.
I see now that the updates needed for this are a little bit more complex than I thought. To ensure nothing else breaks, more thought will have to go into this.
Code Sample (copy-pastable)
Problem description
The above code sample throws the following error:
The pd.Grouper object seems to be modified inside the list. This is not the behavior I expect. I can resolve this by making explicit (deep) copies of the list so that new instances of pd.Grouper are passed into both groupby methods. Like so:
Expected Output
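The original snippet is not preserved here, so the following is only a minimal sketch of the deep-copy workaround described in the problem description. The frames, column names, and the helper fresh_groupers are made up for illustration (and the frames are non-empty, unlike the case reported in this issue). The idea is simply that each groupby call receives its own pd.Grouper instances, so any state a call stores on them cannot leak into the next call.

```python
import copy

import pandas as pd

# Hypothetical shared "template" list of groupers; never passed to groupby directly.
groupers = [pd.Grouper(key="a"), pd.Grouper(key="b")]


def fresh_groupers():
    """Return independent deep copies of the template groupers (illustrative helper)."""
    return copy.deepcopy(groupers)


# Made-up example frames; the frames from the original report are not preserved.
df1 = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "y", "x"], "v": [1.0, 2.0, 3.0]})
df2 = pd.DataFrame({"a": [3, 3], "b": ["z", "z"], "v": [4.0, 5.0]})

# Each call gets new Grouper objects, so neither call can affect the other.
out1 = df1.groupby(fresh_groupers())["v"].sum()
out2 = df2.groupby(fresh_groupers())["v"].sum()

print(out1)
print(out2)
```

Whether the copies are strictly necessary depends on the pandas version in use; the sketch only shows how to avoid sharing Grouper instances between calls.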
As shown above, the expected output is an empty DataFrame instead of the ValueError. Interestingly, if I reverse the order and run df2.groupby first, then run df1.groupby, it works fine. However, running df2.groupby again throws the ValueError. There's definitely something in df1.groupby that is modifying the pd.Grouper.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-50-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.2
pytest: 4.4.0
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.16.2
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.4.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: None
xlrd: 1.2.0
xlwt: None
xlsxwriter: 1.1.6
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: 1.2.18
pymysql: None
psycopg2: 2.8.1 (dt dec pq3 ext lo64)
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
If this is expected behavior please let me know (we can close the issue). If this is not expected behavior, I'd love to take a crack at resolving this (any insight into the issue would be appreciated).