Commit f6b0d15

chris-b1 authored and pcluo committed
BUG: segfault in concat of CategoricalIndex (pandas-dev#16133)
* BUG: segfault in concat of cat-idx
* lint
1 parent e650d6d commit f6b0d15

3 files changed: +27 -2 lines changed
doc/source/whatsnew/v0.20.0.txt

+1
@@ -1629,6 +1629,7 @@ Indexing
 - Bug in the display of ``.info()`` where a qualifier (+) would always be displayed with a ``MultiIndex`` that contains only non-strings (:issue:`15245`)
 - Bug in ``pd.concat()`` where the names of ``MultiIndex`` of resulting ``DataFrame`` are not handled correctly when ``None`` is presented in the names of ``MultiIndex`` of input ``DataFrame`` (:issue:`15787`)
 - Bug in ``DataFrame.sort_index()`` and ``Series.sort_index()`` where ``na_position`` doesn't work with a ``MultiIndex`` (:issue:`14784`, :issue:`16604`)
+- Bug in ``pd.concat()`` when combining objects with a ``CategoricalIndex`` (:issue:`16111`)

 I/O
 ^^^

pandas/core/indexes/category.py

+5 -2
@@ -12,6 +12,7 @@
     is_scalar)
 from pandas.core.common import _asarray_tuplesafe
 from pandas.core.dtypes.missing import array_equivalent
+from pandas.core.algorithms import take_1d


 from pandas.util.decorators import Appender, cache_readonly
@@ -470,8 +471,10 @@ def get_indexer(self, target, method=None, limit=None, tolerance=None):
             codes = target.codes
         else:
             if isinstance(target, CategoricalIndex):
-                target = target.categories
-            codes = self.categories.get_indexer(target)
+                code_indexer = self.categories.get_indexer(target.categories)
+                codes = take_1d(code_indexer, target.codes, fill_value=-1)
+            else:
+                codes = self.categories.get_indexer(target)

         indexer, _ = self._engine.get_indexer_non_unique(codes)

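As the `get_indexer` change reads, the removed branch computed `codes` from `target.categories` alone, which yields one entry per category rather than one per element of `target`. The new branch first maps each of the target's categories to its position in `self.categories`, then expands that mapping through `target.codes`. The following is a minimal pure-Python sketch of that two-step remapping, not the pandas implementation: `remap_codes` and `position` are hypothetical names, plain lists stand in for the index objects, and the sample data is made up for illustration.

```python
def remap_codes(self_categories, target_categories, target_codes):
    """Translate target's codes into positions within self_categories.

    Hypothetical helper for illustration only. Missing values (code -1)
    and categories absent from self_categories both map to -1.
    """
    position = {cat: i for i, cat in enumerate(self_categories)}
    # Step 1: one lookup per *category*, analogous to
    # self.categories.get_indexer(target.categories) in the diff.
    code_indexer = [position.get(cat, -1) for cat in target_categories]
    # Step 2: one entry per *element*, analogous to
    # take_1d(code_indexer, target.codes, fill_value=-1) in the diff.
    return [code_indexer[c] if c != -1 else -1 for c in target_codes]


# Illustrative data (an assumption, not taken from the test suite).
self_categories = [9, 0, 1, 2, 3]
target_categories = [0, 1, 2, 9]
target_codes = [0, 1, 2, 3, 3, -1]  # values 0, 1, 2, 9, 9, missing

# What a per-category lookup alone would give: length 4.
per_category = remap_codes(self_categories, target_categories,
                           range(len(target_categories)))
# What the two-step remapping gives: length 6, matching the target.
per_element = remap_codes(self_categories, target_categories, target_codes)

print(per_category)  # [1, 2, 3, 0]
print(per_element)   # [1, 2, 3, 0, 0, -1]
```

The per-element result has the same length as the target, which is what the subsequent `self._engine.get_indexer_non_unique(codes)` call is given in the patched code.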
pandas/tests/reshape/test_concat.py

+21
@@ -1928,6 +1928,27 @@ def test_concat_multiindex_dfs_with_deepcopy(self):
         result_no_copy = pd.concat(example_dict, names=['testname'])
         tm.assert_frame_equal(result_no_copy, expected)

+    def test_concat_categoricalindex(self):
+        # GH 16111, categories that aren't lexsorted
+        categories = [9, 0, 1, 2, 3]
+
+        a = pd.Series(1, index=pd.CategoricalIndex([9, 0],
+                                                   categories=categories))
+        b = pd.Series(2, index=pd.CategoricalIndex([0, 1],
+                                                   categories=categories))
+        c = pd.Series(3, index=pd.CategoricalIndex([1, 2],
+                                                   categories=categories))
+
+        result = pd.concat([a, b, c], axis=1)
+
+        exp_idx = pd.CategoricalIndex([0, 1, 2, 9])
+        exp = pd.DataFrame({0: [1, np.nan, np.nan, 1],
+                            1: [2, 2, np.nan, np.nan],
+                            2: [np.nan, 3, 3, np.nan]},
+                           columns=[0, 1, 2],
+                           index=exp_idx)
+        tm.assert_frame_equal(result, exp)
+

 @pytest.mark.parametrize('pdt', [pd.Series, pd.DataFrame, pd.Panel])
 @pytest.mark.parametrize('dt', np.sctypes['float'])
