
How should DataFrame.isin() handle DataFrame input? #7158


Closed
cpcloud opened this issue May 17, 2014 · 5 comments
Labels
API Design Enhancement Indexing Related to indexing on series/frames, not to indexes themselves
Milestone

Comments

@cpcloud
Member

cpcloud commented May 17, 2014

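from numpy.random import rand, randint, randn
from pandas import DataFrame

n = 5
# Two frames sharing the same index (0..n-1) and the same column labels
df = DataFrame({'a': randint(3, size=n),
                'b': randn(n)})
# Each entry of df2.b equals df.b or df.b + 1, chosen at random
df2 = DataFrame({'a': randint(4, size=n),
                 'b': df.b + (rand(n) > 0.5).astype(float)})
df.isin(df2)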

yields

       a      b
0  False   True
1  False   True
2  False  False
3  False  False
4  False   True

I would expect it to go column by column, calling isin on each column in the intersection of the two frames' columns. I'm not sure whether this behavior was ever defined when DataFrame.isin() was implemented.
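The column-by-column behavior described above could be sketched roughly as follows. This is only an illustration of the expectation, not how pandas implements `DataFrame.isin()`; the helper name `isin_by_column` and the sample frames are hypothetical, and the sketch assumes both frames have unique column labels.

```python
import pandas as pd


def isin_by_column(left, right):
    """Hypothetical helper: per-column membership test, ignoring the index."""
    # Only columns present in both frames are compared.
    common = left.columns.intersection(right.columns)
    # Columns missing from `right` stay False everywhere.
    result = pd.DataFrame(False, index=left.index, columns=left.columns)
    for col in common:
        # Series.isin tests membership in a set of values; no label alignment.
        result[col] = left[col].isin(right[col].to_numpy())
    return result


left = pd.DataFrame({"a": [0, 1, 2], "c": [1, 2, 3]})
right = pd.DataFrame({"a": [2, 2, 0], "b": [7, 8, 9]}, index=[10, 11, 12])
expected = isin_by_column(left, right)
```

Here column `a` is checked against the values `{2, 0}` regardless of row position, and column `c` has no counterpart in `right`, so it is all False.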

@cpcloud cpcloud added this to the 0.14.1 milestone May 17, 2014
@TomAugspurger
Contributor

We added support for isin accepting DataFrames in a second PR. I'll see if I can find the discussion.

@TomAugspurger
Contributor

Here's the PR / discussion: #5199

@jorisvandenbossche
Member

The docstring also states clearly how this case is handled: "If values is a DataFrame, then both the index and column labels must match." http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.isin.html#pandas.DataFrame.isin
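A small sketch of what that docstring sentence means in practice, using made-up frames (the names `same_labels` and `shifted` are illustrative only): an entry is reported as True only when the other frame holds an equal value at the same index and column label.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])

# Same labels: compared position by position via the shared index.
same_labels = pd.DataFrame({"a": [1, 0, 3]}, index=[0, 1, 2])
# Same values, but no index label in common with df.
shifted = pd.DataFrame({"a": [1, 2, 3]}, index=[10, 11, 12])

matched = df.isin(same_labels)
unmatched = df.isin(shifted)
```

With `same_labels`, rows 0 and 2 match and row 1 does not. With `shifted`, nothing matches even though every value of `df` appears somewhere in it, because no index labels line up.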

@hayd
Contributor

hayd commented May 30, 2014

See the new issue on this: #7258.

If you want to do it column-wise, pass a dict:

df.isin(dict(df2))
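To make the contrast concrete, here is a minimal sketch with fixed data in place of the random frames above. One deliberate deviation, labeled as an assumption: the dict is built with `df2.to_dict(orient="list")` so that its values are plain lists rather than Series, which sidesteps any label alignment on the values.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2], "b": [1.0, 2.0, 3.0]})
df2 = pd.DataFrame({"a": [2, 0, 5], "b": [1.0, 9.0, 2.0]})

# DataFrame argument: values must be equal at matching index/column labels.
aligned = df.isin(df2)

# Dict argument: each column is tested for membership in that key's values.
columnwise = df.isin(df2.to_dict(orient="list"))
```

In `aligned`, only `b` at row 0 is True, since that is the only cell where both frames agree. In `columnwise`, `a` is True for 0 and 2 and `b` is True for 1.0 and 2.0, because those values occur somewhere in the corresponding column of `df2`.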

@jreback
Contributor

jreback commented May 30, 2014

closing this in favor of #7258

@jreback jreback closed this as completed May 30, 2014