BUG: Series groupby with by as Series raises FutureWarning #41683

Closed
2 of 3 tasks
jsignell opened this issue May 26, 2021 · 4 comments
Labels
Bug; Compat (pandas objects compatibility with NumPy or Python functions); Dependencies (required and optional dependencies); Upstream issue (issue related to a pandas dependency); Warnings (warnings that appear or should be added to pandas)

Comments

@jsignell
Contributor

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


I ran into this when testing dask against pandas nightly builds. dask/dask#7710

Code Sample, a copy-pastable example

import pandas as pd
df = pd.DataFrame(
    {"a": [1, 2, 6, 4, 4, 6, 4, 3, 7], "b": [4, 2, 7, 3, 3, 1, 1, 1, 2]},
    index=[0, 1, 3, 5, 6, 8, 9, 9, 9],
)
df.a.groupby(df.b).sum()

Problem description

According to the Series.groupby docstring, I should be able to group by Series objects, but doing so emits a FutureWarning:

/home/julia/conda/envs/dask-upstream/lib/python3.8/site-packages/pandas/core/indexes/base.py:3372: FutureWarning: Promotion of numbers and bools to strings is deprecated. In the future, code such as `np.concatenate((['string'], [0]))` will raise an error, while `np.asarray(['string', 0])` will return an array with `dtype=object`.  To avoid the warning while retaining a string result use `dtype='U'` (or 'S').  To get an array of Python objects use `dtype=object`. (Warning added in NumPy 1.21)
  return self._engine.get_loc(casted_key)

Expected Output

When I call .tolist() on the Series before passing it as by, I get the expected result:

df.a.groupby(df.b.tolist()).sum()
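
To compare the two call styles side by side, the following sketch records any FutureWarning raised by each one and checks that both produce the same sums. The helper name grouped_sum_with_warnings is hypothetical and only used for illustration; whether a warning is actually recorded depends on the installed pandas and NumPy versions, so no warning count is assumed here.

```python
import warnings

import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 6, 4, 4, 6, 4, 3, 7], "b": [4, 2, 7, 3, 3, 1, 1, 1, 2]},
    index=[0, 1, 3, 5, 6, 8, 9, 9, 9],
)


def grouped_sum_with_warnings(by):
    """Hypothetical helper: return (result, FutureWarnings) for df.a.groupby(by).sum()."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of letting default filters hide repeats.
        warnings.simplefilter("always")
        result = df.a.groupby(by).sum()
    future = [w for w in caught if issubclass(w.category, FutureWarning)]
    return result, future


# Group by the Series itself (the call reported in this issue).
by_series, series_warnings = grouped_sum_with_warnings(df.b)
# Group by a plain list of the same values (the workaround).
by_list, list_warnings = grouped_sum_with_warnings(df.b.tolist())

print("FutureWarnings with Series key:", len(series_warnings))
print("FutureWarnings with list key:", len(list_warnings))
print("same sums:", by_series.tolist() == by_list.tolist())
```

Running this on both a release and a nightly environment should show whether the warning is tied to the Series key rather than to the grouped values.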

Output of pd.show_versions()

INSTALLED VERSIONS

commit : e7c1ef7
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-72-generic
Version : #80~18.04.1-Ubuntu SMP Mon Apr 12 23:26:25 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.0.dev0+1738.ge7c1ef71d8
numpy : 1.22.0.dev0+4.gb283e1632
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.2
setuptools : 46.1.3
Cython : None
pytest : 6.2.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.23.1
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : 2021.05.0+16.g07df543
fastparquet : 0.6.3
gcsfs : None
matplotlib : 3.4.2
numexpr : 2.7.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 4.0.0
pyxlsb : None
s3fs : 2021.05.0
scipy : 1.6.3
sqlalchemy : 1.3.23
tables : 3.6.1
tabulate : None
xarray : 0.18.2
xlrd : None
xlwt : None
numba : 0.53.1

jsignell added the Bug and Needs Triage labels on May 26, 2021
@Lokeshrathi

Hey @jsignell, I ran the above code unchanged and got the results without any FutureWarning.

pandas : 1.1.3
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.4
setuptools : 50.3.1.post20201107
Cython : 0.29.21
python : 3.8.5.final.0
python-bits : 64

Are you saying this is an issue with the latest pandas version?

@jsignell
Contributor Author

This is on the latest nightly build of pandas, so pandas : 1.3.0.dev0+1738.ge7c1ef71d8

@mzeitlin11
Member

Probably the same issue as #41632?

lithomas1 added the Warnings, Upstream issue, Dependencies, and Compat labels and removed the Needs Triage label on Jun 5, 2021
@mzeitlin11
Member

Closing as this is being tracked in #41632


4 participants