.info() fails with CategoricalIndex in the columns #14298

TomAugspurger · 2016-09-25T13:16:59Z

This only happens when the CategoricalIndex is in the columns; a CategoricalIndex in the index doesn't cause any issues. It came up because the output of pd.get_dummies can have a CategoricalIndex in the columns.
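For context, here is a minimal sketch of how `pd.get_dummies` can produce such columns, assuming a categorical input Series. The names `s` and `dummies` are illustrative and not from the original report, and the printed type may differ between pandas versions.

```python
import pandas as pd

# Illustrative input: a Series with categorical dtype.
s = pd.Series(pd.Categorical(['a', 'b', 'a']))

# One indicator column per category.
dummies = pd.get_dummies(s)

# Inspect what kind of Index ended up in the columns.
print(type(dummies.columns).__name__)
print(list(dummies.columns))
```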

A small, complete example of the issue

In [31]: idx = pd.CategoricalIndex(['a', 'b'])

In [32]: df = pd.DataFrame(np.zeros((2, 2)), index=idx, columns=idx)

In [33]: df.info()

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-33-857958930b27> in <module>()
----> 1 df.info()

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/core/frame.py in info(self, verbose, buf, max_cols, memory_usage, null_counts)
   1790                 if 'object' in counts or is_object_dtype(self.index):
   1791                     size_qualifier = '+'
-> 1792             mem_usage = self.memory_usage(index=True, deep=deep).sum()
   1793             lines.append("memory usage: %s\n" %
   1794                          _sizeof_fmt(mem_usage, size_qualifier))

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/core/frame.py in memory_usage(self, index, deep)
   1827         if index:
   1828             result = Series(self.index.memory_usage(deep=deep),
-> 1829                             index=['Index']).append(result)
   1830         return result
   1831

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/core/series.py in append(self, to_append, ignore_index, verify_integrity)
   1593             to_concat = [self, to_append]
   1594         return concat(to_concat, ignore_index=ignore_index,
-> 1595                       verify_integrity=verify_integrity)
   1596
   1597     def _binop(self, other, func, level=None, fill_value=None):

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, copy)
   1323                        keys=keys, levels=levels, names=names,
   1324                        verify_integrity=verify_integrity,
-> 1325                        copy=copy)
   1326     return op.get_result()
   1327

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy)
   1462         self.copy = copy
   1463
-> 1464         self.new_axes = self._get_new_axes()
   1465
   1466     def get_result(self):

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in _get_new_axes(self)
   1550                 new_axes[i] = ax
   1551
-> 1552         new_axes[self.axis] = self._get_concat_axis()
   1553         return new_axes
   1554

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in _get_concat_axis(self)
   1604
   1605         if self.keys is None:
-> 1606             concat_axis = _concat_indexes(indexes)
   1607         else:
   1608             concat_axis = _make_concat_multiindex(indexes, self.keys,

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in _concat_indexes(indexes)
   1622
   1623 def _concat_indexes(indexes):
-> 1624     return indexes[0].append(indexes[1:])
   1625
   1626

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/indexes/base.py in append(self, other)
   1425             # if any of the to_concat is category
   1426             from pandas.indexes.category import CategoricalIndex
-> 1427             return CategoricalIndex._append_same_dtype(self, to_concat, name)
   1428
   1429         if len(typs) == 1:

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/indexes/category.py in _append_same_dtype(self, to_concat, name)
    578         ValueError if other is not in the categories
    579         """
--> 580         to_concat = [self._is_dtype_compat(c) for c in to_concat]
    581         codes = np.concatenate([c.codes for c in to_concat])
    582         result = self._create_from_codes(codes, name=name)

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/indexes/category.py in <listcomp>(.0)
    578         ValueError if other is not in the categories
    579         """
--> 580         to_concat = [self._is_dtype_compat(c) for c in to_concat]
    581         codes = np.concatenate([c.codes for c in to_concat])
    582         result = self._create_from_codes(codes, name=name)

AttributeError: 'Index' object has no attribute '_is_dtype_compat'
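Reading the traceback, the failure appears to reduce to appending a CategoricalIndex onto a plain object Index: `memory_usage` builds a Series indexed by `['Index']` and appends the per-column result, whose index is the frame's columns. The sketch below is an illustrative reduction under that assumption, not code from the report; the names `left`, `right`, and `combined` are made up for the example.

```python
import pandas as pd

# Plain object Index, like the ['Index'] label used inside memory_usage().
left = pd.Index(['Index'])
# Stand-in for the DataFrame's CategoricalIndex columns.
right = pd.CategoricalIndex(['a', 'b'])

try:
    combined = left.append(right)
    print(list(combined))
except AttributeError as exc:
    # Only expected on affected builds such as the 0.19.0rc1 reported here.
    combined = None
    print("append failed:", exc)
```

If this reduction is right, the bug lives in `Index.append` rather than in `.info()` itself.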

Expected Output

df.info() should print the usual summary instead of raising.

Output of `pd.show_versions()`

In [34]: pd.show_versions()

INSTALLED VERSIONS
------------------

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.0rc1+21.ge596cbf
nose: 1.3.7
pip: 8.1.2
setuptools: 26.1.1
Cython: 0.25a0
numpy: 1.11.1
scipy: 0.18.0
statsmodels: 0.8.0rc1
xarray: 0.7.2
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: 1.4.1
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.2
openpyxl: 2.3.5
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: 3.4.4
bs4: 4.4.1
html5lib: 0.9999999
httplib2: 0.9.2
apiclient: 1.5.1
sqlalchemy: 1.0.12
pymysql: 0.7.6.None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None

jreback · 2016-10-19T10:30:28Z

Similar example from #14450

import numpy as np
import pandas as pd

df = pd.DataFrame({"data": np.arange(0, 10, 0.1)})
v = pd.cut(df.data, [0, 1, 2, 5, 10], include_lowest=True).rename("cuts")
df.join(pd.get_dummies(v))
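On an affected version, one possible workaround (an assumption, not something proposed in this thread) is to replace the dummy columns with plain string labels before joining, so that no CategoricalIndex takes part in the append. A minimal sketch, where `dummies` and `joined` are illustrative names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"data": np.arange(0, 10, 0.1)})
v = pd.cut(df.data, [0, 1, 2, 5, 10], include_lowest=True).rename("cuts")

dummies = pd.get_dummies(v)
# Swap the categorical column labels for their string representations.
dummies.columns = [str(c) for c in dummies.columns]

joined = df.join(dummies)
print(joined.shape)
```

The trade-off is that the columns lose their categorical (interval) labels and become ordinary strings.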

TomAugspurger added the Output-Formatting and Categorical labels Sep 25, 2016

jreback changed the title ~~.info fails with CategoricalIndex in the columns~~ .info() fails with CategoricalIndex in the columns Sep 25, 2016

pwaller mentioned this issue Oct 19, 2016

df.join(get_dummies(cut(df.v))) fails with AttributeError _is_dtype_compat #14450

Closed

jreback added Difficulty Intermediate labels Oct 19, 2016

jreback added this to the 0.19.1 milestone Oct 19, 2016

jorisvandenbossche added the Regression label Oct 31, 2016

jorisvandenbossche mentioned this issue Oct 31, 2016

BUG/API: Index.append with mixed object/Categorical indices #14545

Merged

jorisvandenbossche closed this as completed in #14545 Nov 3, 2016
