string methods that return boolean arrays should return bool if 'nan' is in series (instead of 'nan') #1689

Closed
@bmu

In [32]: s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])

In [33]: s.str.startswith('A')
Out[33]: 
0     True
1    False
2    False
3     True
4    False
5      NaN
6    False
7    False
8    False

In [34]: s[s.str.startswith('A')]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-34-f0d46b8c76ff> in <module>()
----> 1 s[s.str.startswith('A')]

/net/home4/bmueller/.virtualenvs/myenv/lib/python2.6/site-packages/pandas/core/series.pyc in __getitem__(self, key)
    447         # special handling of boolean data with NAs stored in object
    448         # arrays. Since we can't represent NA with dtype=bool
--> 449         if _is_bool_indexer(key):
    450             key = self._check_bool_indexer(key)
    451             key = np.asarray(key, dtype=bool)

/net/home4/bmueller/.virtualenvs/myenv/lib/python2.6/site-packages/pandas/core/common.pyc in _is_bool_indexer(key)
    499         if not lib.is_bool_array(key):
    500             if isnull(key).any():
--> 501                 raise ValueError('cannot index with vector containing '
    502                                  'NA / NaN values')
    503             return False

ValueError: cannot index with vector containing NA / NaN values
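
A possible workaround, sketched under the assumption that the installed pandas version accepts the `na` keyword on `Series.str.startswith` (it may not exist in the version that produced the traceback above): passing `na=False` asks the method to report `False` for missing entries, so the result should be usable as a boolean indexer. The variable names `mask` and `selected` are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])

# na=False fills the result for missing entries with False instead of NaN,
# so the mask contains only booleans.
mask = s.str.startswith('A', na=False)

# With no NaN left in the mask, it can be used directly for indexing.
selected = s[mask]
print(selected.tolist())
```

Whether a missing value should count as `False` depends on the use case; `na=True` would keep the rows with missing values in the selection instead.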
