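This is a very obscure issue, but I think it is responsible for things getting out of whack in #20945. After `unstack` followed by `swaplevel`, `sort_index` orders the columns differently depending on whether the level is given by position or by name.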

    In [1]: mi = pd.MultiIndex.from_product([[0], ['d', 'c']], names=['bar', 'baz'])

    In [2]: df = pd.DataFrame([[0, 2], [1, 3]], index=mi, columns=['B', 'A'])

    In [3]: df.columns.name = 'foo'

    In [4]: df
    Out[4]:
    foo      B  A
    bar baz
    0   d    0  2
        c    1  3

    In [5]: df.unstack().swaplevel(axis=1)
    Out[5]:
    baz  c  d  c  d
    foo  B  B  A  A
    bar
    0    1  0  3  2

    In [6]: df.unstack().swaplevel(axis=1).sort_index(axis=1, level=0)
    Out[6]:
    baz  c     d
    foo  A  B  A  B  # Here subsequent levels get sorted
    bar
    0    3  1  2  0

    In [7]: df.unstack().swaplevel(axis=1).sort_index(axis=1, level='baz')
    Out[7]:
    baz  c     d
    foo  B  A  B  A  # Here subsequent levels aren't getting sorted
    bar
    0    1  3  0  2

If the DataFrame from step 5 above is constructed directly instead, the sort order is the same whether the level is given by position or by name:

    In [1]: mi = pd.MultiIndex.from_tuples([('c', 'B'), ('d', 'B'), ('c', 'A'), ('d', 'A')], names=['baz', 'foo'])

    In [2]: df = pd.DataFrame([[1, 0, 3, 2]], columns=mi, index=pd.Index([0], name='bar'))

    In [3]: df  # Same as step 5 in above example
    Out[3]:
    baz  c  d  c  d
    foo  B  B  A  A
    bar
    0    1  0  3  2

    In [4]: df.sort_index(axis=1, level=0)
    Out[4]:
    baz  c     d
    foo  A  B  A  B
    bar
    0    3  1  2  0

    In [5]: df.sort_index(axis=1, level='baz')
    Out[5]:
    baz  c     d
    foo  A  B  A  B  # Sort is the same as item above, regardless of using label or not
    bar
    0    3  1  2  0

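The position-versus-name comparison can be condensed into a small check, sketched here against the directly constructed frame. The helper name `sort_by_position_and_name` is made up for illustration and is not part of pandas; the same helper could be pointed at the `unstack().swaplevel(axis=1)` result to see whether the two sorts diverge on a given pandas version.

```python
import pandas as pd


def sort_by_position_and_name(frame, position, name):
    """Sort the columns on one level twice: once by position, once by name.

    Hypothetical helper for this report; returns both column orders as
    lists of tuples so they can be compared directly.
    """
    by_position = frame.sort_index(axis=1, level=position)
    by_name = frame.sort_index(axis=1, level=name)
    return list(by_position.columns), list(by_name.columns)


# Directly constructed frame, same as the second example in this report.
mi = pd.MultiIndex.from_tuples(
    [('c', 'B'), ('d', 'B'), ('c', 'A'), ('d', 'A')], names=['baz', 'foo'])
direct = pd.DataFrame([[1, 0, 3, 2]], columns=mi,
                      index=pd.Index([0], name='bar'))

by_position, by_name = sort_by_position_and_name(direct, 0, 'baz')
print(by_position == by_name)  # the two orders agree for this frame
```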
Note that this only happens when `unstack` and `swaplevel` are used together. My original thought was that `swaplevel` alone would be responsible, but I could not reproduce the issue with it alone, so I'm assuming `unstack` is mutating some kind of state of the MultiIndex?
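One way to look for such hidden state is to compare the internal `levels` and `codes` of the two column indexes, which show the same labels but may store them differently. This is only a diagnostic sketch, not a confirmed explanation of the cause. It assumes pandas 0.24 or later, where the attribute is called `codes`; on the 0.23 release candidate listed below it was `labels`. The variable names are illustrative.

```python
import pandas as pd

# Columns produced by unstack + swaplevel, as in the first example.
mi = pd.MultiIndex.from_product([[0], ['d', 'c']], names=['bar', 'baz'])
df = pd.DataFrame([[0, 2], [1, 3]], index=mi, columns=['B', 'A'])
df.columns.name = 'foo'
reshaped_cols = df.unstack().swaplevel(axis=1).columns

# Columns built directly, as in the second example.
direct_cols = pd.MultiIndex.from_tuples(
    [('c', 'B'), ('d', 'B'), ('c', 'A'), ('d', 'A')], names=['baz', 'foo'])

# Both indexes present the same labels in the same order...
same_labels = list(reshaped_cols) == list(direct_cols)
print('same labels:', same_labels)

# ...so any difference has to be in how each level is stored internally.
# (`codes` is assumed here; use `labels` on pandas older than 0.24.)
for title, cols in [('reshaped', reshaped_cols), ('direct', direct_cols)]:
    print(title)
    for name, level, codes in zip(cols.names, cols.levels, cols.codes):
        print('  ', name, list(level), list(codes))
```

If the per-level listings differ between the two indexes, that would point at the stored level order rather than at `sort_index` itself.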
INSTALLED VERSIONS
commit: eff1faf
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.0rc2+27.geff1faf27
pytest: 3.4.1
pip: 9.0.1
setuptools: 38.5.1
Cython: 0.27.3
numpy: 1.14.1
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.7.0
patsy: None
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.5
feather: None
matplotlib: 2.1.2
openpyxl: 2.5.0
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.5
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None