Commit d6df87c

phofl and mroeschke authored and committed
BUG: concat not sorting mixed column names when None is included (pandas-dev#47331)
* REGR: concat not sorting columns for mixed column names
* Fix none in columns
* BUG: concat not sorting column names when None is included
* Update doc/source/whatsnew/v1.5.0.rst

Co-authored-by: Matthew Roeschke <[email protected]>

* Add gh reference

Co-authored-by: Matthew Roeschke <[email protected]>
1 parent ecc7d80 commit d6df87c

3 files changed, +9 -5 lines changed

doc/source/whatsnew/v1.5.0.rst

+1
@@ -924,6 +924,7 @@ Reshaping
 - Bug in :func:`get_dummies` that selected object and categorical dtypes but not string (:issue:`44965`)
 - Bug in :meth:`DataFrame.align` when aligning a :class:`MultiIndex` to a :class:`Series` with another :class:`MultiIndex` (:issue:`46001`)
 - Bug in concanenation with ``IntegerDtype``, or ``FloatingDtype`` arrays where the resulting dtype did not mirror the behavior of the non-nullable dtypes (:issue:`46379`)
+- Bug in :func:`concat` not sorting the column names when ``None`` is included (:issue:`47331`)
 - Bug in :func:`concat` with identical key leads to error when indexing :class:`MultiIndex` (:issue:`46519`)
 - Bug in :meth:`DataFrame.join` with a list when using suffixes to join DataFrames with duplicate column names (:issue:`46396`)
 - Bug in :meth:`DataFrame.pivot_table` with ``sort=False`` results in sorted index (:issue:`17041`)

pandas/core/algorithms.py

+5 -2
@@ -1771,9 +1771,12 @@ def safe_sort(
 def _sort_mixed(values) -> np.ndarray:
     """order ints before strings in 1d arrays, safe in py3"""
     str_pos = np.array([isinstance(x, str) for x in values], dtype=bool)
-    nums = np.sort(values[~str_pos])
+    none_pos = np.array([x is None for x in values], dtype=bool)
+    nums = np.sort(values[~str_pos & ~none_pos])
     strs = np.sort(values[str_pos])
-    return np.concatenate([nums, np.asarray(strs, dtype=object)])
+    return np.concatenate(
+        [nums, np.asarray(strs, dtype=object), np.array(values[none_pos])]
+    )
 
 
 def _sort_tuples(values: np.ndarray) -> np.ndarray:
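
The ordering that the patched `_sort_mixed` produces can be illustrated with a dependency-free sketch. This is not the pandas implementation: `sort_mixed_sketch` is a hypothetical name, and it uses plain Python lists where the real helper uses NumPy boolean masks. It only mirrors the idea of the change, which is to sort non-string values and strings separately and append `None` entries at the end, so that `None` is never compared against a number.

```python
def sort_mixed_sketch(values):
    """Hypothetical stand-in for the patched helper: non-strings, then strings, then None."""
    # Strings are sorted among themselves.
    strs = sorted(v for v in values if isinstance(v, str))
    # None entries are set aside and never compared with anything.
    nones = [v for v in values if v is None]
    # Everything else (assumed mutually comparable, e.g. ints) is sorted on its own.
    nums = sorted(v for v in values if not isinstance(v, str) and v is not None)
    return nums + strs + nones


def none_sorts_with_ints():
    """Report whether None and an int can be ordered together in Python 3."""
    try:
        sorted([None, 1])
    except TypeError:
        return False
    return True


print(sort_mixed_sketch([None, 1, "a"]))           # [1, 'a', None]
print(sort_mixed_sketch(["b", 3, None, "a", 1]))   # [1, 3, 'a', 'b', None]
print(none_sorts_with_ints())                      # False
```

The last call shows why the extra mask is needed: leaving `None` in the non-string group means ordering it against an integer, which Python 3 rejects with a `TypeError`.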

pandas/tests/reshape/concat/test_concat.py

+3 -3
@@ -469,12 +469,12 @@ def __iter__(self):
         tm.assert_frame_equal(concat(CustomIterator2(), ignore_index=True), expected)
 
     def test_concat_order(self):
-        # GH 17344
+        # GH 17344, GH#47331
         dfs = [DataFrame(index=range(3), columns=["a", 1, None])]
-        dfs += [DataFrame(index=range(3), columns=[None, 1, "a"]) for i in range(100)]
+        dfs += [DataFrame(index=range(3), columns=[None, 1, "a"]) for _ in range(100)]
 
         result = concat(dfs, sort=True).columns
-        expected = dfs[0].columns
+        expected = Index([1, "a", None])
         tm.assert_index_equal(result, expected)
 
     def test_concat_different_extension_dtypes_upcasts(self):
