-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TST: Groupby filter tests involved len, closing #4447 #5096
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
maybe add some more functions here? that are mostly errors e.g. return a weird shape, return different dtypes, raise an error
|
Thanks for those examples. In this WIP commit, the commented-out tests are not raising as they should, so some work is needed on filter itself. Also, there is some inconsistency between ValueError/TypeError depending on DataFrame/Series. Will keeping working... |
yep....assume the user doesn't know what they are doing and don't read the docs :) then try to do something reasonable (including raising a helpful message that filter must return a boolean value!) |
@danielballan ready to go on this? |
Nearly. First, a question: if filter returns a non-boolean value, should we proceed to evaluate its truth? (As far as I understand, this calls |
I'd expect it to accept anything bool-like, potentially with special case for nan being equivalent to False (and producing ndarrays with filter will raise as expected) |
OK, assuming Travis does not object, this is good to go. |
looks good at some future point an example for docs/cookbook which used a non-trivial filter |
# Interpret np.nan as False. | ||
def true_and_notnull(x, *args, **kwargs): | ||
b = wrapper(x, *args, **kwargs) | ||
return b and notnull(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this and
be &
? can b
be an array/frame/series?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it has to be and
in order to handle NaN
, which we want to interpret as False. Thoughts, @cpcloud?
In [59]: np.nan & pd.notnull(np.nan)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-59-47e0df96d143> in <module>()
----> 1 np.nan & pd.notnull(np.nan)
TypeError: unsupported operand type(s) for &: 'float' and 'bool'
In [60]: np.nan and pd.notnull(np.nan)
Out[60]: False
@danielballan pls squash this down when you are ready |
Ready, pending discussion on & vs. and with cpcloud above. |
this looks fine....@cpcloud |
this is ok...merging |
TST: Groupby filter tests involved len, closing #4447
closed #4447 (tests only, so no release notes necessary)
...which was apparently fixed by #4657. More tests looking for shape-related bugs in filter would be nice. So far I have no discovered any.