.info() fails with CategoricalIndex in the columns #14298

Closed
TomAugspurger opened this issue Sep 25, 2016 · 1 comment · Fixed by #14545
Labels
Categorical (Categorical Data Type), Output-Formatting (__repr__ of pandas objects, to_string), Regression (Functionality that used to work in a prior pandas version)

TomAugspurger commented Sep 25, 2016

This only fails when the CategoricalIndex is in the columns; a CategoricalIndex in the index doesn't cause any issues. It came up because the output of pd.get_dummies can have a CategoricalIndex in the columns.

A small, complete example of the issue

In [31]: idx = pd.CategoricalIndex(['a', 'b'])

In [32]: df = pd.DataFrame(np.zeros((2, 2)), index=idx, columns=idx)

In [33]: df.info()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-33-857958930b27> in <module>()
----> 1 df.info()

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/core/frame.py in info(self, verbose, buf, max_cols, memory_usage, null_counts)
   1790                 if 'object' in counts or is_object_dtype(self.index):
   1791                     size_qualifier = '+'
-> 1792             mem_usage = self.memory_usage(index=True, deep=deep).sum()
   1793             lines.append("memory usage: %s\n" %
   1794                          _sizeof_fmt(mem_usage, size_qualifier))

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/core/frame.py in memory_usage(self, index, deep)
   1827         if index:
   1828             result = Series(self.index.memory_usage(deep=deep),
-> 1829                             index=['Index']).append(result)
   1830         return result
   1831

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/core/series.py in append(self, to_append, ignore_index, verify_integrity)
   1593             to_concat = [self, to_append]
   1594         return concat(to_concat, ignore_index=ignore_index,
-> 1595                       verify_integrity=verify_integrity)
   1596
   1597     def _binop(self, other, func, level=None, fill_value=None):

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, copy)
   1323                        keys=keys, levels=levels, names=names,
   1324                        verify_integrity=verify_integrity,
-> 1325                        copy=copy)
   1326     return op.get_result()
   1327

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy)
   1462         self.copy = copy
   1463
-> 1464         self.new_axes = self._get_new_axes()
   1465
   1466     def get_result(self):

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in _get_new_axes(self)
   1550                 new_axes[i] = ax
   1551
-> 1552         new_axes[self.axis] = self._get_concat_axis()
   1553         return new_axes
   1554

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in _get_concat_axis(self)
   1604
   1605         if self.keys is None:
-> 1606             concat_axis = _concat_indexes(indexes)
   1607         else:
   1608             concat_axis = _make_concat_multiindex(indexes, self.keys,

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/tools/merge.py in _concat_indexes(indexes)
   1622
   1623 def _concat_indexes(indexes):
-> 1624     return indexes[0].append(indexes[1:])
   1625
   1626

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/indexes/base.py in append(self, other)
   1425             # if any of the to_concat is category
   1426             from pandas.indexes.category import CategoricalIndex
-> 1427             return CategoricalIndex._append_same_dtype(self, to_concat, name)
   1428
   1429         if len(typs) == 1:

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/indexes/category.py in _append_same_dtype(self, to_concat, name)
    578         ValueError if other is not in the categories
    579         """
--> 580         to_concat = [self._is_dtype_compat(c) for c in to_concat]
    581         codes = np.concatenate([c.codes for c in to_concat])
    582         result = self._create_from_codes(codes, name=name)

/Users/tom.augspurger/Envs/py3/lib/python3.5/site-packages/pandas-0.19.0rc1+21.ge596cbf-py3.5-macosx-10.11-x86_64.egg/pandas/indexes/category.py in <listcomp>(.0)
    578         ValueError if other is not in the categories
    579         """
--> 580         to_concat = [self._is_dtype_compat(c) for c in to_concat]
    581         codes = np.concatenate([c.codes for c in to_concat])
    582         result = self._create_from_codes(codes, name=name)

AttributeError: 'Index' object has no attribute '_is_dtype_compat'
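Reading the traceback, the failing step appears to be `memory_usage` prepending a one-element Series labelled `'Index'` to the per-column result, which means appending a CategoricalIndex to a plain Index. The following is a minimal sketch of that step, written against current pandas, where the append succeeds. The `plain` copy at the end is a hypothetical workaround for affected versions, not the upstream fix.

```python
import io

import numpy as np
import pandas as pd

idx = pd.CategoricalIndex(["a", "b"])
df = pd.DataFrame(np.zeros((2, 2)), index=idx, columns=idx)

# The per-column memory usage is labelled by df.columns,
# so its index is the CategoricalIndex from the columns.
per_column = df.memory_usage(index=False)

# info() adds an entry labelled "Index", held in a plain Index.
# Combining the two sets of labels is the step that raised above.
combined = pd.Index(["Index"]).append(per_column.index)

# Hypothetical workaround (an assumption, not the merged fix):
# report on a copy whose columns are a plain Index with the same labels.
plain = df.copy()
plain.columns = pd.Index(list(df.columns))
buf = io.StringIO()
plain.info(buf=buf)
report = buf.getvalue()
```

On a version with the regression, only the `plain` part would be expected to run cleanly; the `combined` line mirrors the call that fails in the traceback.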

Expected Output

df.info()

Output of pd.show_versions()

In [34]: pd.show_versions()

INSTALLED VERSIONS
------------------

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.0rc1+21.ge596cbf
nose: 1.3.7
pip: 8.1.2
setuptools: 26.1.1
Cython: 0.25a0
numpy: 1.11.1
scipy: 0.18.0
statsmodels: 0.8.0rc1
xarray: 0.7.2
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: 1.4.1
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.2
openpyxl: 2.3.5
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: 3.4.4
bs4: 4.4.1
html5lib: 0.9999999
httplib2: 0.9.2
apiclient: 1.5.1
sqlalchemy: 1.0.12
pymysql: 0.7.6.None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None

TomAugspurger added the Output-Formatting and Categorical labels Sep 25, 2016
jreback changed the title from ".info fails with CategoricalIndex in the columns" to ".info() fails with CategoricalIndex in the columns" Sep 25, 2016
jreback added this to the 0.19.1 milestone Oct 19, 2016

jreback commented Oct 19, 2016

Similar example from #14450

df = pd.DataFrame({"data": np.arange(0, 10, 0.1)})
v = pd.cut(df.data, [0, 1, 2, 5, 10], include_lowest=True).rename("cuts")
df.join(pd.get_dummies(v))
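A self-contained version of the snippet above, with imports, might look like the sketch below. Relabelling the dummy columns as plain strings before the join is a hypothetical workaround for affected versions, not something proposed in this thread; the names `dummies`, `joined`, and `buf` are introduced only for illustration.

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({"data": np.arange(0, 10, 0.1)})
v = pd.cut(df.data, [0, 1, 2, 5, 10], include_lowest=True).rename("cuts")

# One indicator column per bin produced by pd.cut.
dummies = pd.get_dummies(v)

# Hypothetical workaround: replace the bin labels with their string form,
# so the columns are no longer backed by the categorical bins.
dummies.columns = [str(c) for c in dummies.columns]

joined = df.join(dummies)

# Write the info() report to a buffer instead of stdout.
buf = io.StringIO()
joined.info(buf=buf)
```

The trade-off is that the column labels become strings rather than Interval objects, so later lookups by interval would need the string form.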
