bool(MultiIndex) seems to always return False #7897

Closed
langmore opened this issue Aug 1, 2014 · 6 comments · Fixed by #7951

langmore commented Aug 1, 2014

This means that patterns such as:

if my_frame.index:
    pass  # do something

with my_frame having a MultiIndex will never enter the if block. One can check that bool(my_frame.index) returns False. A single index (pandas.Index) will (correctly IMO) raise a ValueError warning that the truth value of an array is ambiguous. However, a MultiIndex fails silently.

This is such an awesome source of silent bugs that I'm surprised this hasn't come up before. Is there any reason why we want bool to return False every time for a MultiIndex?

Contributor

jreback commented Aug 1, 2014

Hmm, actually it should always return True. You cannot construct an empty MultiIndex at all. But more to the point, this is a very odd thing to do and is ambiguous (maybe this should ALWAYS raise). See the docs (for Series) here: http://pandas.pydata.org/pandas-docs/stable/gotchas.html

In reality this should simply raise. It's a numpy artifact that it does not.

you should do:

if len(my_frame.index):
    pass 

or

if len(pandas_object):
    pass

as the index can never be None
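For illustration, the len-based check suggested above could look something like the sketch below. The index values and the helper name `has_entries` are made up for the example and are not from the thread; the only pandas calls used are `MultiIndex.from_tuples`, `intersection`, and `len`.

```python
import pandas as pd

# Hypothetical example indexes (not the reporter's real data).
idx_1 = pd.MultiIndex.from_tuples([("A", 1), ("A", 2)])
idx_2 = pd.MultiIndex.from_tuples([("A", 1), ("A", 3)])
idx_3 = pd.MultiIndex.from_tuples([("B", 7)])

common = idx_1.intersection(idx_2)    # the two indexes share ("A", 1)
disjoint = idx_1.intersection(idx_3)  # the two indexes share nothing


def has_entries(index):
    """Explicit emptiness test that never calls bool() on the index."""
    return len(index) > 0


if has_entries(common):
    print("common index is non-empty")
if not has_entries(disjoint):
    print("disjoint index is empty")
```

Because the check goes through `len`, it does not depend on how truth-testing of the index itself behaves.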

Author

langmore commented Aug 1, 2014

The case that caused a bug for me was this:

# Reindex by a common index
new_index = frame_1.index.intersection(frame_2.index)
frame_1 = frame_1.reindex(index=new_index)
frame_2 = frame_2.reindex(index=new_index)

The natural thing to do, since you have new_index ready, is to check whether it is empty. The natural Python way to do this is with if new_index. Since the levels array in new_index is empty, I would expect bool(new_index) to be False (or to raise a ValueError, as numpy arrays do).

Contributor

jreback commented Aug 1, 2014

Can you post the starting indexes, or a facsimile?

Author

langmore commented Aug 1, 2014

My previous comment (about my bug) was confusing. The bug occurred when there was a non-empty index intersection, and I therefore expected to step into an if block. Since bool returned False, my code did not step in.

>>> import pandas
>>> frame_1 = pandas.DataFrame({'col1': [1, 2]},
...                            index=pandas.MultiIndex.from_tuples([('A', 1), ('A', 2)]))
>>> frame_2 = pandas.DataFrame({'col1': [1, 2]},
...                            index=pandas.MultiIndex.from_tuples([('A', 1), ('A', 3)]))
>>> common_index = frame_1.index.intersection(frame_2.index)
>>> common_index
MultiIndex(levels=[[u'A'], [1]],
           labels=[[0], [0]],
           sortorder=0)

And then

>>> if common_index:
...     print("Stepped in")
... else:
...     print("Did not step in")
...
Did not step in

Contributor

jreback commented Aug 1, 2014

OK, I'll have a look.

I don't think this was ever defined; it relied on numpy behavior. It was always False because of the way a MultiIndex is defined.

I think it's reasonable to make nonzero simply return the boolean of len, but let me see.
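As a rough sketch of the "nonzero returns the len boolean" idea, here is a toy class. `LenTruthyIndex` is a hypothetical stand-in written for this illustration; it is not pandas code and says nothing about how the eventual fix was implemented.

```python
class LenTruthyIndex(object):
    """Hypothetical stand-in for an index; NOT the pandas implementation."""

    def __init__(self, tuples):
        self._tuples = list(tuples)

    def __len__(self):
        return len(self._tuples)

    def __bool__(self):
        # Truthiness follows the number of entries.
        return len(self) > 0

    # Python 2 looks up __nonzero__ instead of __bool__.
    __nonzero__ = __bool__


non_empty = LenTruthyIndex([("A", 1)])
empty = LenTruthyIndex([])

if non_empty:
    print("Stepped in")
if not empty:
    print("Empty index is falsy")
```

With this definition, `if index:` behaves the same way as `if len(index):`.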

Contributor

hayd commented Aug 1, 2014

> This is such an awesome source of silent bugs

+1. IMO doing anything other than raising would be surprising (same argument as Series and DataFrame).
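The "always raise" alternative could be sketched like this. `AmbiguousTruthIndex`, its error message, and the `truthiness_outcome` helper are all hypothetical names invented for the example; none of them is taken from pandas.

```python
class AmbiguousTruthIndex(object):
    """Hypothetical stand-in for an index; NOT the pandas implementation."""

    def __init__(self, tuples):
        self._tuples = list(tuples)

    def __len__(self):
        return len(self._tuples)

    def __bool__(self):
        # Refuse truth-testing so a bare `if index:` cannot pass silently.
        raise ValueError(
            "The truth value of an index is ambiguous; use len(index) instead."
        )

    # Python 2 looks up __nonzero__ instead of __bool__.
    __nonzero__ = __bool__


def truthiness_outcome(index):
    """Report whether truth-testing the given object raises ValueError."""
    try:
        bool(index)
    except ValueError:
        return "raised"
    return "did not raise"


index = AmbiguousTruthIndex([("A", 1)])
print(truthiness_outcome(index))
print(len(index) > 0)
```

The explicit `len(index) > 0` check keeps working, so callers have to state which question they are asking.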

Wow to the other PR this inspired!
