Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
DEPR: DataFrame.lookup #35224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPR: DataFrame.lookup #35224
Changes from 18 commits
5bbddce
02b50ac
ca8bf05
dea45b9
5f116ad
0c04e90
758468d
7ac3b32
2ab80cf
94a6c0f
a339131
269e4cc
0c40d69
1ca23bc
9681a3d
6342ad2
3dfe19d
187d47b
db63df7
227fad5
ce775ce
2ee7d09
4c78311
f447185
dc2d367
293bd7a
cbca163
90fa6a9
3eefd8e
4c3c163
b5a34e3
ba4fb8a
6b91db6
ff7724f
104e3cb
25e78dd
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than query, use .loc here as its more idiomatic (and plus its a single statement). Do we actually need to convert to numpy? (e.g. to avoid alignment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I used
to_numpy
is indeed for the index alignment. With.loc
and withoutto_numpy
, I would needreset_index
. So that would look like:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which one do you prefer? I can understand that this is more idiomatic, althought it's a bit more code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw now I'm typing it, I realize this method would fail if the index was not
0, .., n-1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a horrible choice as an alternative for
lookup
. Essentially,melt = df.melt('col')
duplicates the data (not to mention the extravariable
column) while one can resolve to numpy indexing.I'm not sure what's the implementation of
lookup
, but it's a real request emerging from what I could see on SO. To mepivot
andpivot_table
are more redundant thanlookup
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@quanghm my response is to the tone of the argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am not averse to the function of lookup but it's not going to be a top level api
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yet another reason why I don't go and make a pull request. I don't understand what's other than top level api that I can implement
lookup
functionality, and still don't understand performant.Can you please be more specific? The way I see it, @erfannariman was basically saying that unless I made a pull request, I cannot criticize the suggested alternative as horrible even when I spent time to analyze, test and show that it is, in fact, horrible. OK, if horrible is an offensive word, my apology, English is not my first language. Let me re-phrase it is very bad.
I do feel that my effort to contribute here is not valued. That was the best I can afford to better Pandas. Not everyone can afford to make a pull request, modify the code, run and pass all the unit tests.
Thanks for all the work maintaining and developing Pandas anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opening a ticket with everything you wrote here has more chance to get reviewed by the core devs than to write all this on a closed PR. That way the other devs can see it as well, since I am quite sure they are not seeing this whole discussion.
If you don't feel comfortable open an issue, please let me know and I wil open one, since Jeff mentioned he's open for a (discussion) for re-implementing lookup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I already opened the ticket since it's better to have the discussion there: see #40140 @quanghm @jreback