DOC: Document column order in MultiIndex.to_frame #22662

Closed
matthewgilbert opened this issue Sep 11, 2018 · 3 comments

@matthewgilbert
Contributor

conda create -q -n pandas1 python=3.6 pandas=0.23.4
source activate pandas1
>>> import pandas as pd
>>> 
>>> pd.MultiIndex.from_tuples(
...     [("B1", "A1"), ("B2", "A1")], names=["B", "A"]
... ).to_frame()
        B   A
B  A         
B1 A1  B1  A1
B2 A1  B2  A1
conda create -q -n pandas2 python=3.5 pandas=0.23.4
source activate pandas2
>>> import pandas as pd
>>> 
>>> pd.MultiIndex.from_tuples(
...     [("B1", "A1"), ("B2", "A1")], names=["B", "A"]
... ).to_frame()
        A   B
B  A         
B1 A1  A1  B1
B2 A1  A1  B2

Problem description

The column order returned by pandas.MultiIndex.to_frame differs between Python 3.5 and Python 3.6 (both with pandas 0.23.4), and the expected behaviour is not defined in the docs.

Expected Output

I would expect the columns to be in the order of the MultiIndex levels. Barring that, I would expect the columns to be lexicographically sorted, although I find that less intuitive.
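As a possible workaround until the behaviour is documented, the columns can be selected explicitly by level name, so the result follows the MultiIndex level order regardless of what to_frame() returns. This is only a sketch: the variable names `mi` and `frame` are illustrative, and it assumes every level has a unique, non-None name.

```python
import pandas as pd

mi = pd.MultiIndex.from_tuples(
    [("B1", "A1"), ("B2", "A1")], names=["B", "A"]
)

# Select the columns by level name, in level order, so the result does not
# depend on how to_frame() happens to order them.
frame = mi.to_frame()[list(mi.names)]
print(list(frame.columns))
```

The explicit selection is harmless where to_frame() already returns the columns in level order, so the same code can be used on both interpreters.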

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-33-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.23.4
pytest: None
pip: 10.0.1
setuptools: 40.2.0
Cython: None
numpy: 1.15.1
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-33-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.23.4
pytest: None
pip: 10.0.1
setuptools: 40.2.0
Cython: None
numpy: 1.15.1
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
@TomAugspurger
Contributor

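That comes from the pandas 0.23 change where instantiation from dicts preserves dict insertion order on Python 3.6+: http://pandas.pydata.org/pandas-docs/version/0.23/whatsnew.html#instantiation-from-dicts-preserves-dict-insertion-order-for-python-3-6.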

Can you make a PR updating the docstring to indicate that the column order matches what the DataFrame constructor produces, and link to those docs?
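For reference, the constructor behaviour being pointed to can be sketched as follows. This assumes Python 3.6+ and pandas 0.23 or later; the names `df` and `df_pinned` are illustrative only.

```python
import pandas as pd

# Keys are inserted as "B" then "A". On Python 3.6+ with pandas >= 0.23 the
# columns follow the dict insertion order.
df = pd.DataFrame({"B": ["B1", "B2"], "A": ["A1", "A1"]})
print(list(df.columns))

# Passing columns= pins the order explicitly, whatever order the dict
# happens to have.
df_pinned = pd.DataFrame(
    {"B": ["B1", "B2"], "A": ["A1", "A1"]}, columns=["A", "B"]
)
print(list(df_pinned.columns))
```

Per the linked whatsnew entry, this guarantee applies only on Python 3.6+, which matches the difference between the two environments in the report.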

@TomAugspurger changed the title from "MultiIndex.to_frame() column ordering not consistent" to "DOC: Document column order in MultiIndex.to_frame" on Sep 11, 2018
@matthewgilbert
Contributor Author

matthewgilbert commented Sep 11, 2018

Can do. @TomAugspurger, just to confirm: when you say "link to those docs", is the general practice to put that in a See Also section, or to actually add a hyperlink? I.e.

"""
Create a DataFrame with the levels of the MultiIndex as columns.
Column ordering is determined by the DataFrame constructor with data as a
dict.

.. versionadded:: 0.20.0

Parameters
----------
index : boolean, default True
    Set the index of the returned DataFrame as the original MultiIndex.

Returns
-------
DataFrame : a DataFrame containing the original MultiIndex data.

See Also
--------
DataFrame
"""

vs

"""
Create a DataFrame with the levels of the MultiIndex as columns.
Column ordering is determined by the DataFrame constructor with data as a
dict.

.. versionadded:: 0.20.0

Parameters
----------
index : boolean, default True
    Set the index of the returned DataFrame as the original MultiIndex.

Returns
-------
DataFrame : a DataFrame containing the original MultiIndex data.

Notes
-----
A discussion on the ordering of DataFrames can be found at
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
"""

@TomAugspurger
Contributor

TomAugspurger commented Sep 11, 2018 via email
