Clarify is_bool_indexer for Extension dtypes #22326

TomAugspurger · 2018-08-13T20:07:38Z

What do we want here?

In [1]: import pandas as pd

In [2]: pd.core.common.is_bool_indexer(pd.Categorical([True, True]))
Out[2]: False

I'm working around this in #22325.

TomAugspurger · 2018-08-13T20:08:41Z

Likewise for is_bool_dtype.

TomAugspurger · 2018-08-21T20:24:10Z

This manifests as failures for .loc with a CategoricalIndex holding booleans:

In [3]: pd.Series([1, 2, 3]).loc[pd.Categorical([True, False, True])]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-0532517e2922> in <module>()
----> 1 pd.Series([1, 2, 3]).loc[pd.Categorical([True, False, True])]

~/sandbox/pandas/pandas/core/indexing.py in __getitem__(self, key)
   1500
   1501             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1502             return self._getitem_axis(maybe_callable, axis=axis)
   1503
   1504     def _is_scalar_access(self, key):

~/sandbox/pandas/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1902                     raise ValueError('Cannot index with multidimensional key')
   1903
-> 1904                 return self._getitem_iterable(key, axis=axis)
   1905
   1906             # nested tuple slicing

~/sandbox/pandas/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
   1203             # A collection of keys
   1204             keyarr, indexer = self._get_listlike_indexer(key, axis,
-> 1205                                                          raise_missing=False)
   1206             return self.obj._reindex_with_indexers({axis: [keyarr, indexer]},
   1207                                                    copy=True, allow_dups=True)

~/sandbox/pandas/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
   1159         self._validate_read_indexer(keyarr, indexer,
   1160                                     o._get_axis_number(axis),
-> 1161                                     raise_missing=raise_missing)
   1162         return keyarr, indexer
   1163

~/sandbox/pandas/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1244                 raise KeyError(
   1245                     u"None of [{key}] are in the [{axis}]".format(
-> 1246                         key=key, axis=self.obj._get_axis_name(axis)))
   1247
   1248             # We (temporarily) allow for some missing keys with .loc, except in

KeyError: "None of [Index([True, False, True], dtype='object')] are in the [index]"

That should be Series([1, 3], index=[0, 2]).
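For anyone hitting this in the meantime, one possible workaround is to turn the Categorical into a plain NumPy boolean mask before passing it to .loc. This is only a sketch of the idea, not necessarily what #22325 does; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])
key = pd.Categorical([True, False, True])

# Materialize the categorical's values as a NumPy bool array so that
# .loc sees an ordinary boolean mask rather than a list of labels.
mask = np.asarray(key, dtype=bool)

result = s.loc[mask]
print(result)
```

This keeps the rows at positions 0 and 2, which is the selection the report above expects.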

jorisvandenbossche · 2018-08-22T08:56:30Z

Should we have something on the dtype similar to what we have for numeric data (the recently added ExtensionDtype._is_numeric), to indicate that a certain dtype is considered boolean?

Because otherwise those inspection functions would need to know how to inspect the dtype (for categorical checking the dtype of the categories).

I am only a bit worried about a possible proliferation of such attributes.

(Then again, a categorical with boolean categories doesn't sound that useful. We could also say that boolean indexing requires an actual boolean dtype.)
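To make the dtype-level flag idea concrete, here is a toy sketch in plain Python. Nothing in it is pandas API: BaseDtype, ToyCategoricalDtype and toy_is_bool_dtype are hypothetical stand-ins, and the _is_boolean name is simply chosen by analogy with ExtensionDtype._is_numeric.

```python
class BaseDtype:
    """Hypothetical stand-in for an extension dtype base class."""

    @property
    def _is_boolean(self) -> bool:
        # Default: a dtype is not boolean unless it opts in.
        return False


class ToyCategoricalDtype(BaseDtype):
    """Hypothetical categorical dtype that knows its own categories."""

    def __init__(self, categories):
        self.categories = list(categories)

    @property
    def _is_boolean(self) -> bool:
        # The dtype answers for itself, so callers never need to know
        # that "boolean-ness" lives in the categories.
        return bool(self.categories) and all(
            isinstance(c, bool) for c in self.categories
        )


def toy_is_bool_dtype(dtype) -> bool:
    # The inspection function only consults the flag.
    return bool(getattr(dtype, "_is_boolean", False))


print(toy_is_bool_dtype(ToyCategoricalDtype([True, False])))
print(toy_is_bool_dtype(ToyCategoricalDtype(["a", "b"])))
```

The point of the design is that the inspection function stays generic, at the cost of one more attribute on every dtype.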

TomAugspurger · 2018-08-22T10:49:32Z

I suppose this is what the .kind attribute is for?

Right now Categorical.dtype.kind is always O:

In [8]: pd.Categorical([True, False]).dtype.kind
Out[8]: 'O'

If we changed that to be .categories.dtype.kind we wouldn't need a ._is_boolean, though we would need to implement a BooleanIndex.
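A small sketch of the difference between the two levels of inspection. The kind of the categories' dtype depends on how the pandas version in use stores an Index of booleans, so it is only printed here, not assumed; is_bool_dtype from pandas.api.types is used to ask whether the categories themselves are boolean.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype

cat = pd.Categorical([True, False])

# Kind reported by the categorical dtype itself.
outer_kind = cat.dtype.kind

# Kind of the underlying categories; version-dependent, so just display it.
inner_kind = cat.categories.dtype.kind

# Ask directly whether the categories hold booleans.
categories_are_bool = is_bool_dtype(cat.categories)

print(outer_kind, inner_kind, categories_are_bool)
```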

Closes pandas-dev#22665 Closes pandas-dev#22326


TomAugspurger added the Indexing and ExtensionArray labels Aug 13, 2018

TomAugspurger mentioned this issue Aug 13, 2018

SparseArray is an ExtensionArray #22325

Merged


TomAugspurger added the Sparse label Aug 31, 2018

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Sep 11, 2018

BUG: EA-backed boolean indexers

2afcdf4

Closes pandas-dev#22665 Closes pandas-dev#22326

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Sep 11, 2018

BUG: EA-backed boolean indexers

7f05e57

Closes pandas-dev#22665 Closes pandas-dev#22326

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Sep 11, 2018

BUG: EA-backed boolean indexers

e1e6314

Closes pandas-dev#22665 Closes pandas-dev#22326

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Sep 11, 2018

BUG: EA-backed boolean indexers

04029ac

Closes pandas-dev#22665 Closes pandas-dev#22326

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Sep 11, 2018

BUG: EA-backed boolean indexers

47da6d3

Closes pandas-dev#22665 Closes pandas-dev#22326

TomAugspurger mentioned this issue Sep 11, 2018

is_bool_dtype for ExtensionArrays #22667

Merged

jreback added this to the 0.24.0 milestone Sep 13, 2018

TomAugspurger closed this as completed in #22667 Sep 20, 2018

TomAugspurger added a commit that referenced this issue Sep 20, 2018

is_bool_dtype for ExtensionArrays (#22667)

e568fb0

Closes #22665 Closes #22326

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018

is_bool_dtype for ExtensionArrays (pandas-dev#22667)

4f44f84

Closes pandas-dev#22665 Closes pandas-dev#22326
