Skip to content

TST: Groupby filter tests involved len, closing #4447 #5096

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 13, 2013

Conversation

danielballan
Copy link
Contributor

closed #4447 (tests only, so no release notes necessary)

...which was apparently fixed by #4657. More tests looking for shape-related bugs in filter would be nice. So far I have no discovered any.

@jreback
Copy link
Contributor

jreback commented Oct 3, 2013

maybe add some more functions here? that are mostly errors

e.g. return a weird shape, return different dtypes, raise an error

def f(x):
     if len(x) > 1:
           raise TypeError():
     return True

def f(x):
      return np.nan

def f(x):
       return x

def f(x):
       return x == 1

@danielballan
Copy link
Contributor Author

Thanks for those examples. In this WIP commit, the commented-out tests are not raising as they should, so some work is needed on filter itself. Also, there is some inconsistency between ValueError/TypeError depending on DataFrame/Series. Will keeping working...

@jreback
Copy link
Contributor

jreback commented Oct 3, 2013

yep....assume the user doesn't know what they are doing and don't read the docs :) then try to do something reasonable (including raising a helpful message that filter must return a boolean value!)

@jreback
Copy link
Contributor

jreback commented Oct 7, 2013

@danielballan ready to go on this?

@danielballan
Copy link
Contributor Author

Nearly. First, a question: if filter returns a non-boolean value, should we proceed to evaluate its truth? (As far as I understand, this calls __nonzero__ in 2.x, __bool__ in 3.x.) Specifcally, should grouped.filter(lambda x: 1) raise or behave like grouped.filter(lambda x: True)? What about np.nan, which also evaluates True? I lean toward raising on np.nan but tolerating 1. Opinions?

@jtratner
Copy link
Contributor

jtratner commented Oct 9, 2013

I'd expect it to accept anything bool-like, potentially with special case for nan being equivalent to False (and producing ndarrays with filter will raise as expected)

@danielballan
Copy link
Contributor Author

OK, assuming Travis does not object, this is good to go.

@jreback
Copy link
Contributor

jreback commented Oct 10, 2013

looks good

at some future point an example for docs/cookbook which used a non-trivial filter

@jreback
Copy link
Contributor

jreback commented Oct 10, 2013

@jtratner @cpcloud comments?

# Interpret np.nan as False.
def true_and_notnull(x, *args, **kwargs):
b = wrapper(x, *args, **kwargs)
return b and notnull(b)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this and be &? can b be an array/frame/series?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it has to be and in order to handle NaN, which we want to interpret as False. Thoughts, @cpcloud?

In [59]: np.nan & pd.notnull(np.nan)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-59-47e0df96d143> in <module>()
----> 1 np.nan & pd.notnull(np.nan)

TypeError: unsupported operand type(s) for &: 'float' and 'bool'

In [60]: np.nan and pd.notnull(np.nan)
Out[60]: False

@jreback
Copy link
Contributor

jreback commented Oct 10, 2013

@danielballan pls squash this down when you are ready

@danielballan
Copy link
Contributor Author

Ready, pending discussion on & vs. and with cpcloud above.

@jreback
Copy link
Contributor

jreback commented Oct 13, 2013

this looks fine....@cpcloud

@jreback
Copy link
Contributor

jreback commented Oct 13, 2013

this is ok...merging

jreback added a commit that referenced this pull request Oct 13, 2013
TST: Groupby filter tests involved len, closing #4447
@jreback jreback merged commit 9ec30ad into pandas-dev:master Oct 13, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Filter of grouped dataframe raises ValueError
4 participants