BUG: wrong errors when indexing with list that includes pd.NA #31948

Open
jorisvandenbossche opened this issue Feb 13, 2020 · 2 comments
Labels
Bug; Error Reporting (Incorrect or improved errors from pandas); Indexing (Related to indexing on series/frames, not to indexes themselves); NA - MaskedArrays (Related to pd.NA and nullable extension arrays)

Comments

@jorisvandenbossche
Member

jorisvandenbossche commented Feb 13, 2020

Currently, indexing with a list that includes pd.NA (the list version of indexing with a BooleanArray or IntegerArray) works on the array, but not on a Series:

("works" here means raising the correct error message)

In [21]: a = pd.array([1, 2, 3])  

In [22]: a[[True, False, pd.NA]]  
...
ValueError: Cannot mask with a boolean indexer containing NA values

In [23]: a[[0, 1, pd.NA]] 
...
ValueError: Cannot index with an integer indexer containing NA values

vs

In [24]: s = pd.Series(a)  

In [25]: s[[True, False, pd.NA]]    
...
KeyError: 'Passing list-likes to .loc or [] with any missing labels is no longer supported, see https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike'

In [26]: s[[0, 1, pd.NA]]  
...
TypeError: boolean value of NA is ambiguous

This happens because the validation of the indexer has not yet been updated to handle list-likes that include pd.NA.

Similar problems exist for setitem.
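
To illustrate the kind of validation meant here, a minimal sketch follows. The helper name `check_listlike_indexer` is hypothetical and is not a pandas function; the sketch only assumes that a list-like key containing pd.NA should be rejected with the same messages the array already raises, and it does not reflect how pandas validates indexers internally.

```python
import numpy as np
import pandas as pd


def check_listlike_indexer(key):
    """Hypothetical helper (not a pandas function): reject pd.NA in a list-like indexer."""
    values = list(key)
    non_na = [x for x in values if x is not pd.NA]
    if len(non_na) == len(values):
        return values  # no pd.NA present, nothing to reject

    # Boolean mask with missing entries: reuse the message the array raises.
    if non_na and all(isinstance(x, (bool, np.bool_)) for x in non_na):
        raise ValueError("Cannot mask with a boolean indexer containing NA values")

    # Integer positions with missing entries (bool is excluded explicitly,
    # because bool is a subclass of int in Python).
    if non_na and all(
        isinstance(x, (int, np.integer)) and not isinstance(x, bool) for x in non_na
    ):
        raise ValueError("Cannot index with an integer indexer containing NA values")

    # Mixed or all-NA keys are left to the existing label-based lookup.
    return values
```

Calling such a check before the label lookup in Series `__getitem__` and `__setitem__` is one possible way to make the Series errors match the array ones; where exactly it would belong is left open here.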

@jorisvandenbossche added the Bug, Indexing, and NA - MaskedArrays labels Feb 13, 2020
@jorisvandenbossche added this to the Contributions Welcome milestone Feb 13, 2020
@jorisvandenbossche
Member Author

Note that with an actual array or Series of "boolean" dtype, this already works fine:

In [27]: s[pd.array([True, False, pd.NA])]  
...
ValueError: cannot mask with array containing NA / NaN values

but for an integer array it is actually the same issue as for the list:

In [28]: s[pd.array([0, 1, pd.NA])]  
...
TypeError: boolean value of NA is ambiguous

@mroeschke added the Error Reporting label Jul 28, 2021
@mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@lopof
Contributor

lopof commented Feb 5, 2024

Update on the current behavior.
Array:

a = pd.array([1, 2, 3])
a[[True, False, pd.NA]]

<IntegerArray>
[1]
Length: 1, dtype: Int64

a[[0, 1, pd.NA]]

...
ValueError: Cannot index with an integer indexer containing NA values

Series:

s = pd.Series(a)
s[[True, False, pd.NA]]

...
KeyError: '[<NA>] not in index'

s[[0, 1, pd.NA]]

...
KeyError: '[<NA>] not in index'

s[pd.array([True, False, pd.NA])]

0    1
dtype: Int64

s[pd.array([0, 1, pd.NA])]

...
KeyError: '[nan] not in index'
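
Since the behavior has shifted between versions, a small script along these lines may help re-check it on whatever pandas is installed. It is only a sketch: `probe` and `set_item` are illustrative helper names, and the script reports whatever it observes rather than asserting any particular outcome. It also covers setitem, which was mentioned above but not shown.

```python
import pandas as pd


def probe(action):
    """Run a zero-argument callable and summarise the outcome as a string."""
    try:
        result = action()
    except Exception as exc:  # report whatever the installed version raises
        return f"{type(exc).__name__}: {exc}"
    return f"ok ({type(result).__name__}, length {len(result)})"


def set_item(obj, key):
    """Assign through `key` on a copy, so the original object is left untouched."""
    obj = obj.copy()
    obj[key] = 0
    return obj


a = pd.array([1, 2, 3])
s = pd.Series(a)

keys = {
    "bool list": [True, False, pd.NA],
    "int list": [0, 1, pd.NA],
    "bool array": pd.array([True, False, pd.NA]),
    "int array": pd.array([0, 1, pd.NA]),
}

report = {}
for name, key in keys.items():
    report[f"array getitem, {name}"] = probe(lambda: a[key])
    report[f"Series getitem, {name}"] = probe(lambda: s[key])
    report[f"Series setitem, {name}"] = probe(lambda: set_item(s, key))

print(f"pandas {pd.__version__}")
for label, outcome in report.items():
    print(f"{label}: {outcome}")
```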
