.align() sorts indexes when they differ, does not sort otherwise #22466

Open
ipozdeev opened this issue Aug 22, 2018 · 1 comment

@ipozdeev

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

# two frames with identical indexes
df_1 = pd.DataFrame(np.eye(4))
df_2 = pd.DataFrame(np.eye(4) * -1)

# reverse both, then align -> indexes stay reversed
df_1.iloc[::-1].align(df_2.iloc[::-1], axis=0, join="outer")

# give one frame a different index
df_2 = pd.DataFrame(np.eye(4) * -1, index=range(1, 5))

# reverse both, then align -> indexes come back sorted
df_1.iloc[::-1].align(df_2.iloc[::-1], axis=0, join="outer")
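To see the difference without reading the full frame output, it may help to compare only the resulting row labels. The sketch below is an illustration, not part of the original report: the helper name `aligned_labels` and the frame names `df_same` and `df_shifted` are hypothetical, and the label order printed for the second case may depend on the pandas version in use.

```python
import numpy as np
import pandas as pd


def aligned_labels(left, right):
    """Return the row labels of `left` after an outer align on axis 0."""
    left_aligned, right_aligned = left.align(right, axis=0, join="outer")
    # Both outputs of align() share the same index, so the left one suffices.
    assert left_aligned.index.equals(right_aligned.index)
    return list(left_aligned.index)


df_1 = pd.DataFrame(np.eye(4))
df_same = pd.DataFrame(np.eye(4) * -1)
df_shifted = pd.DataFrame(np.eye(4) * -1, index=range(1, 5))

# Equal indexes: no reindexing is needed, so the reversed order survives.
same_labels = aligned_labels(df_1.iloc[::-1], df_same.iloc[::-1])
print(same_labels)

# Different indexes: the labels are the union of both inputs; the report
# observes them coming back sorted on pandas 0.23.1.
shifted_labels = aligned_labels(df_1.iloc[::-1], df_shifted.iloc[::-1])
print(shifted_labels)
```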

Problem description

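Inconsistent behavior: whether the aligned index is sorted depends on whether the two input indexes happen to be equal. This makes it impossible to use structures of the form:

def func(x, arg):
    if arg < 0:
        # reverse the data and recurse with a non-negative arg
        return func(x.iloc[::-1], -arg)
    return ...

because the order gets switched without the user's consent.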

Expected Output

Indexes are kept in their original order in both cases.
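Until `.align()` behaves this way, one possible workaround is to build the combined labels explicitly and reindex both frames. This is only a sketch under assumptions: `align_keep_order` is a hypothetical helper, not a pandas API, and it relies on a pandas release in which `Index.union` accepts a `sort` argument, which is not the case for the 0.23.1 release used in this report.

```python
import numpy as np
import pandas as pd


def align_keep_order(left, right):
    """Outer-align two frames on the row axis without sorting the labels.

    Hypothetical helper, not a pandas API: labels of `left` come first in
    their existing order, followed by labels found only in `right`.
    """
    # sort=False asks Index.union not to sort the combined labels.
    labels = left.index.union(right.index, sort=False)
    return left.reindex(labels), right.reindex(labels)


left = pd.DataFrame(np.eye(4), index=[3, 2, 1, 0])
right = pd.DataFrame(np.eye(4) * -1, index=[4, 3, 2, 1])

left_aligned, right_aligned = align_keep_order(left, right)
print(list(left_aligned.index))
print(list(right_aligned.index))
```

Rows that exist in only one of the inputs are filled with NaN in the other, as with an outer `.align()`.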

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.1
pytest: 2.9.2
pip: 9.0.1
setuptools: 23.0.0
Cython: 0.24
numpy: 1.11.3
scipy: 0.18.1
pyarrow: None
xarray: 0.9.3
IPython: 4.2.0
sphinx: 1.3.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.1
feather: None
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
sqlalchemy: 1.1.13
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.8
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@TomAugspurger
Contributor

Haven't looked yet, but I assume that's a result of Index.union sorting its result when the two indexes aren't equal.

In [3]: pd.Index(['b', 'a']) | pd.Index(['b', 'c'])
Out[3]: Index(['a', 'b', 'c'], dtype='object')

so this is blocked by #17839 at least.
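The sorting described above can be observed directly on `Index.union`. The snippet below is a small illustration, assuming a pandas release in which `Index.union` accepts a `sort` argument (this is not available in the 0.23.1 release from the report); the variable names are made up for the example.

```python
import pandas as pd

left = pd.Index(["b", "a"])
right = pd.Index(["b", "c"])

# Default (sort=None): the combined labels are sorted when the inputs differ.
default_union = left.union(right)

# sort=False: keep the labels of `left` in order, then append the new ones.
unsorted_union = left.union(right, sort=False)

# When both indexes are equal, the default union leaves the order untouched.
equal_union = left.union(pd.Index(["b", "a"]))

print(list(default_union), list(unsorted_union), list(equal_union))
```

The first and third results mirror the two cases in the report: differing indexes come back sorted, equal ones keep their order.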
