Bug: groupby with sort=False creates buggy MultiIndex #32259
Comments
take |
(just some notes, will come back to this)
in |
I found that stack/unstack has a similar bug:
If you do I see that stack also relies on algorithms.factorize, which probably leads to the same issue. |
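For readers unfamiliar with the factorize behavior mentioned here, a minimal sketch using the public pd.factorize may help. It assumes the public function behaves like the internal algorithms.factorize referred to above, and the input values are made up for illustration. By default the uniques come back in order of first appearance, so levels built from them are not necessarily sorted.

```python
import numpy as np
import pandas as pd

# Hypothetical input, chosen so that first-appearance order differs
# from sorted order.
values = np.array(["b", "a", "c", "b"], dtype=object)

# Default: uniques are returned in order of first appearance.
codes, uniques = pd.factorize(values)

# sort=True: uniques are sorted and the codes are remapped to match.
sorted_codes, sorted_uniques = pd.factorize(values, sort=True)

print(list(uniques), list(codes))
print(list(sorted_uniques), list(sorted_codes))
```

Any code that treats the unsorted uniques as if they were sorted levels would mislabel the resulting index as ordered.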
In this case, isn't the current result correct though?
The index is sorted by its first argument, and so it is correct to say that it's lexically sorted |
Oh you are right. I was confused a bit. Yes, this is actually correctly sorted. |
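One way to check whether an index is really ordered, without relying on the lexsort flag, is to look at is_monotonic_increasing, which is based on the index values themselves. The sketch below uses a hypothetical frame (not the reporter's data, and the column names a, b, v are invented) and only illustrates the check; it is not a reproduction of the bug.

```python
import pandas as pd

# Hypothetical data, not the frame from the original report.
df = pd.DataFrame(
    {
        "a": ["y", "x", "y", "x"],
        "b": [2, 1, 1, 2],
        "v": [1.0, 2.0, 3.0, 4.0],
    }
)

# sort=False keeps the groups in order of first appearance,
# so the resulting MultiIndex is generally not sorted.
t = df.groupby(["a", "b"], sort=False)["v"].sum()

# Value-based check of whether the index tuples are in increasing order.
index_is_sorted = t.index.is_monotonic_increasing

# sort_index() returns a copy ordered by the index values.
t_sorted = t.sort_index()

print(index_is_sorted)
print(t_sorted.index.is_monotonic_increasing)
```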
Decision has been to deprecate |
Where has this been discussed? |
Hey @jorisvandenbossche - it was decided in the call we had during the sprint on the 31st of August |
The index of t is certainly not sorted, but t.index.is_lexsorted() returns True.
Another more subtle example is
This time the lexsort flag is correct. However, calling sortlevel will not sort the new MultiIndex correctly, that is,
t.index.sortlevel(['d', 'a'])[0]
returns
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-72-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.2.0.post20200210
Cython : 0.29.15
pytest : 5.3.5
hypothesis : 5.5.4
sphinx : 2.4.0
blosc : None
feather : None
xlsxwriter : 1.2.7
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: 0.8.1
bs4 : 4.8.2
bottleneck : 1.3.2
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 3.1.3
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : 0.13.0
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.13
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.7
numba : 0.45.1