DOC-Modified documentation for the issue GH42106 #42110
Changes from 2 commits
316262f
9dd6e41
d3364d7
5e327bc
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1523,18 +1523,17 @@ Looking up values by index/column labels | |
---------------------------------------- | ||
|
||
Sometimes you want to extract a set of values given a sequence of row labels | ||
and column labels, this can be achieved by ``DataFrame.melt`` combined by filtering the corresponding | ||
rows with ``DataFrame.loc``. For instance: | ||
and column labels. This can be achieved with ``pandas.factorize``, which returns an integer code for each row together with the distinct values of the intended column; | |
the columns are reordered to match those distinct values, and the resulting array is indexed with ``numpy.arange(len(df))`` and those codes to pick one value per row. For instance: | |
|
||
.. ipython:: python | ||
|
||
df = pd.DataFrame({'col': ["A", "A", "B", "B"], | ||
'A': [80, 23, np.nan, 22], | ||
'B': [80, 55, 76, 67]}) | ||
df | ||
melt = df.melt('col') | ||
melt = melt.loc[melt['col'] == melt['variable'], 'value'] | ||
melt.reset_index(drop=True) | ||
idx, cols = pd.factorize(df['col']) | ||
df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx] | ||
The text above the example will also need to be changed?

Yes, you are right. I have modified the text above the example, as per my understanding. Can you review it?

Show the result in the PR itself, as this doesn't look to replicate.

On pandas 1.1.5:

In [1]: df = pd.DataFrame({'col': ["A", "A", "B", "B"],
   ...:                    'A': [80, 23, np.nan, 22],
   ...:                    'B': [80, 55, 76, 67]})
   ...:
In [2]: df.lookup(df.index, df['col'])
Out[2]: array([80., 23., 76., 67.])

The example on master:

In [2]: melt = df.melt('col')
   ...: melt = melt.loc[melt['col'] == melt['variable'], 'value']
   ...: melt.reset_index(drop=True)
Out[2]:
0    80.0
1    23.0
2    76.0
3    67.0
Name: value, dtype: float64

The example here:

In [3]: idx, cols = pd.factorize(df['col'])
   ...: df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
Out[3]: array([80., 23., 76., 67.]) |
||
|
||
Formerly this could be achieved with the dedicated ``DataFrame.lookup`` method | ||
which was deprecated in version 1.2.0. | ||
|
this is very complicated wording...how about
I have modified the file as per your suggestion.
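
For readers following the thread, here is a minimal sketch that breaks the one-line ``pd.factorize`` example from the diff into its intermediate steps. It reuses the same ``df`` as the PR; the names ``values`` and ``result`` are introduced here only for illustration and are not part of the proposed documentation.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': ["A", "A", "B", "B"],
                   'A': [80, 23, np.nan, 22],
                   'B': [80, 55, 76, 67]})

# factorize returns one integer code per row plus the distinct labels,
# in order of first appearance: codes [0, 0, 1, 1] and labels ['A', 'B'].
idx, cols = pd.factorize(df['col'])

# Reorder the columns so that column position j holds the label cols[j],
# then drop down to a plain 2-D NumPy array.
values = df.reindex(cols, axis=1).to_numpy()

# Integer-array indexing: for each row i, pick values[i, idx[i]].
result = values[np.arange(len(df)), idx]
print(result)
```

The row positions come from ``np.arange(len(df))`` and the column positions from the factorize codes, so each row contributes exactly one value, which appears to match what the thread reports for ``df.lookup(df.index, df['col'])`` on pandas 1.1.5.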