DataFrame.stack() with flat columns won't sort #18356

Open
toobaz opened this issue Nov 18, 2017 · 3 comments
Labels
Docs MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@toobaz
Member

toobaz commented Nov 18, 2017

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: df = pd.DataFrame(1, index=range(3), columns=[1, 3, 2])

In [3]: df.stack()
Out[3]: 
0  1    1
   3    1
   2    1
1  1    1
   3    1
   2    1
2  1    1
   3    1
   2    1
dtype: int64

Problem description

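The docs state "The level involved will automatically get sorted.", and this is indeed what happens if df.columns is a MultiIndex. With flat columns, as in the example above, the stacked level instead keeps the original column order (1, 3, 2).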

Related: #18310 (the opposite problem, when sorting shouldn't happen).

Expected Output

In [3]: df.stack()
Out[3]: 
0  1    1
   2    1
   3    1
1  1    1
   2    1
   3    1
2  1    1
   2    1
   3    1
dtype: int64
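
Until the behavior or the docs are settled, the ordering shown above can be obtained explicitly. The following is a minimal sketch of two possible workarounds, not a statement of how stack() is meant to behave; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(1, index=range(3), columns=[1, 3, 2])

# Option A: stack first, then sort the resulting MultiIndex by its labels.
stacked_sorted = df.stack().sort_index()

# Option B: sort the columns before stacking, so the stacked level
# inherits the already-sorted column order.
sorted_then_stacked = df.sort_index(axis=1).stack()

print(stacked_sorted)
print(sorted_then_stacked)
```

Option A sorts every index level, while Option B only touches the column order, which may matter if the row index is deliberately unsorted.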

Output of pd.show_versions()

INSTALLED VERSIONS

commit: cfad581
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.22.0.dev0+151.gcfad581e9
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.7.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

@jreback
Contributor

jreback commented Nov 19, 2017

In [1]: df = pd.DataFrame(1, index=range(3), columns=[1, 3, 2])

In [2]: df.stack()
Out[2]: 
0  1    1
   3    1
   2    1
1  1    1
   3    1
   2    1
2  1    1
   3    1
   2    1
dtype: int64

In [3]: df.stack().index.is_lexsorted()
Out[3]: True

In [5]: df.stack().index.is_monotonic
Out[5]: False

The docs should read "lex-sorted", not "sorted".
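
One way to read the distinction, sketched under an assumption: a MultiIndex stores each level's distinct labels plus integer codes pointing into them, so the codes can be in lexicographic order while the labels they refer to are not. The index below is built by hand to mimic the stacked result above; its levels and codes are an assumption for illustration, not taken from the pandas source. Plain Python comparisons are used instead of is_lexsorted() and is_monotonic, which, as far as I know, are no longer available in recent pandas releases.

```python
import pandas as pd

# Hand-built index: second level keeps the original column order [1, 3, 2].
mi = pd.MultiIndex(
    levels=[[0, 1, 2], [1, 3, 2]],
    codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]],
)

# Row-wise tuples of the integer codes, e.g. (0, 0), (0, 1), (0, 2), ...
code_tuples = [tuple(int(c) for c in row) for row in zip(*mi.codes)]

# Row-wise tuples of the actual labels, e.g. (0, 1), (0, 3), (0, 2), ...
label_tuples = list(mi)

codes_sorted = code_tuples == sorted(code_tuples)     # "lex-sorted" in this reading
labels_sorted = label_tuples == sorted(label_tuples)  # sorted by label value

print(codes_sorted, labels_sorted)  # True False
```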

@jreback jreback added Docs MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 19, 2017
@jreback jreback added this to the Next Major Release milestone Nov 19, 2017
@toobaz
Member Author

toobaz commented Nov 19, 2017

The docs should read lex-sorted.

Still, sorting takes place when there is a MultiIndex:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame(1, index=range(3), columns=pd.MultiIndex.from_product([[1, 3, 2]]*2))

In [3]: df.stack()
Out[3]: 
     1  2  3
0 1  1  1  1
  2  1  1  1
  3  1  1  1
1 1  1  1  1
  2  1  1  1
  3  1  1  1
2 1  1  1  1
  2  1  1  1
  3  1  1  1
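
To compare the two cases side by side, a small sketch like the following can be run against whichever pandas version is installed. No expected output is shown, because the ordering in the MultiIndex case may differ between pandas versions; the names flat, multi, flat_order and multi_order are illustrative only.

```python
import pandas as pd

labels = [1, 3, 2]

# Flat columns, as in the original report.
flat = pd.DataFrame(1, index=range(3), columns=labels)

# Two-level columns, as in the example above.
multi = pd.DataFrame(
    1, index=range(3), columns=pd.MultiIndex.from_product([labels] * 2)
)

# Order of the stacked level for the first row in each case.
flat_order = [int(x) for x in flat.stack().index.get_level_values(-1)[:3]]
multi_order = [int(x) for x in multi.stack().index.get_level_values(-1)[:3]]

print("flat columns:      ", flat_order)
print("MultiIndex columns:", multi_order)
```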

@toobaz
Member Author

toobaz commented Nov 19, 2017

(I recognize that "lexsorted" would be correct, since in the new index the values are themselves sorted... still, the behavior seems inconsistent to me.)

@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022