
API: consistency in .loc indexing when no values are found in a list-like indexer (GH7999) #8003

Merged
merged 1 commit into from
Aug 14, 2014

Conversation

jreback
Contributor

@jreback jreback commented Aug 12, 2014

closes #7999

@jreback jreback added this to the 0.15.0 milestone Aug 12, 2014
@jreback
Contributor Author

jreback commented Aug 12, 2014

@jorisvandenbossche this also fixes the case we talked about (forgot where), in a multiindex when no values are found it will now raise KeyError (I think it would return the original previously)

@jorisvandenbossche
Member

@jreback Did you also fix the inconsistency between df.loc[[1, 3]] (works as one label is in the index) and df.loc[[1, 3],:] (fails with KeyError)? (as I don't see it mentioned in the whatsnew entry, and they should both work)

Secondly, you now solve the inconsistency between df.loc[[3]] and df.loc[[3], :] in favor of the behaviour of df.loc[[3], :]. But is there a good reason? We could also opt for the behaviour of the single axis (df.loc[[3]]), which is maybe(?) more used and is in line with reindex (+ see above, this is rather a bug in the df.loc[list, :] case, as it fails even if at least 1 label is found in the index)

@jreback
Contributor Author

jreback commented Aug 12, 2014

yes, this fixed df.loc[[1,3]] and df.loc[[1,3],:] issue as well (it was the same issue):

In [1]: df = DataFrame([['a'],['b']],index=[1,2])

In [2]: df
Out[2]:
   0
1  a
2  b

In [3]: df.loc[[1, 3]]
Out[3]: 
     0
1    a
3  NaN

In [4]: df.loc[[1, 3],:]
Out[4]: 
     0
1    a
3  NaN

The reason I think that the indexer works like this is: say you have a big list of labels you are getting. It is better to have it work than raise if 1 is off (and just reindex).

Conversely if nothing is found user prob screwed up so KeyError.

This PR doesn't really change much, just makes things consistent (whether you specify all axes or not).
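
The rule described in this comment can be sketched in plain Python. This is only an illustration under stated assumptions: `loc_list_like` is a hypothetical helper, not the pandas implementation, and the wording of its error message is made up for the example.

```python
import math


def loc_list_like(index, values, labels):
    """Hypothetical stand-in for the list-like .loc rule discussed above."""
    lookup = dict(zip(index, values))
    # No requested label exists at all: treat it as a user error.
    if not any(label in lookup for label in labels):
        raise KeyError(f"no requested labels found: {list(labels)}")
    # At least one label exists: behave like a reindex, filling gaps with NaN.
    return [(label, lookup.get(label, math.nan)) for label in labels]


index, values = [1, 2], ["a", "b"]

# Label 1 is present, label 3 is not: the result keeps both, 3 maps to NaN.
partial = loc_list_like(index, values, [1, 3])

# Nothing in the request matches the index: KeyError.
try:
    loc_list_like(index, values, [3])
    raised = False
except KeyError:
    raised = True
```

The same function would be used regardless of whether the caller spells out all axes, which is the consistency this PR is after.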

@jorisvandenbossche
Member

@jreback yes, that makes sense. The only 'pity' is that it is a deviation from reindex, because if you say "df.loc with a list is just like reindex" you have to add "apart from the case if it is of length 1 ..."

For df.loc[[1,3],:] can you add that to the whatsnew entry?
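
For comparison, the reindex side of that contrast might be modelled as below. `reindex_rows` is a hypothetical, simplified helper written for this note (it is not pandas code); the point it tries to show is that a reindex never raises for unknown labels, even when none of them are found.

```python
def reindex_rows(index, values, labels, fill=None):
    """Hypothetical, simplified model of reindexing rows by label."""
    lookup = dict(zip(index, values))
    # Every requested label yields a row; unknown labels get the fill value.
    return [(label, lookup.get(label, fill)) for label in labels]


index, values = [1, 2], ["a", "b"]

some_found = reindex_rows(index, values, [1, 3])  # [(1, 'a'), (3, None)]
none_found = reindex_rows(index, values, [3])     # [(3, None)], no exception
```

Under the behaviour settled on in this PR, the second request is where .loc with a list would diverge and raise KeyError instead.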

@jreback
Contributor Author

jreback commented Aug 12, 2014

cc @ruidc

going to do a doc modification? (to include the un-documented prior behavior)

@jorisvandenbossche yeah I know this is different than reindex, but not sure what to do about it.

@jorisvandenbossche
Member

So, for the whatsnew: just something like (after the other one): There was also a difference between ``df.loc[[1,3]]`` (returns a frame reindexed by [1, 3]) and ``df.loc[[1, 3],:]`` (would raise ``KeyError`` prior to 0.15.0). Both will now return a reindexed frame.

@jreback
Contributor Author

jreback commented Aug 12, 2014

ok, sure

@jreback
Contributor Author

jreback commented Aug 12, 2014

ok, updated

@ruidc
Contributor

ruidc commented Aug 13, 2014

just a typo: line 1254 of indexing.py:
raise KeyError("Not of [%s] are in the [%s]" %
should be "Not ALL of [%s]"...

jreback added a commit that referenced this pull request Aug 14, 2014
API: consistency in .loc indexing when no values are found in a list-like indexer (GH7999)
@jreback jreback merged commit f8225c0 into pandas-dev:master Aug 14, 2014
Labels
API Design Bug Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex
Development

Successfully merging this pull request may close these issues.

Documentation/behaviour error for selecting by label
3 participants