ENH: Add optional argument keep_index to dataframe melt method #17459
Conversation
Setting `keep_index` to True will reuse the original DataFrame index, with the names of the melted columns as an additional level. Closes issue pandas-dev#17440.
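For context, `melt` currently discards the index. A minimal sketch of the proposed behavior, approximated with current pandas (the tiling mirrors the `np.tile` call in this patch; `keep_index` itself is not yet a real parameter):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])

# Plain melt drops the original index and resets to a RangeIndex:
melted = df.melt()

# Approximating keep_index=True: repeat the original index once
# per melted value column, as the patch does with np.tile.
K = df.shape[1]  # number of value columns being melted
melted.index = np.tile(df.index.values, K)
print(melted)
```

Each original row label now appears once per melted column, so the melted values can still be traced back to their source rows.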
Codecov Report

```diff
@@            Coverage Diff             @@
##           master   #17459      +/-   ##
==========================================
- Coverage   91.15%   91.13%   -0.03%
==========================================
  Files         163      163
  Lines       49591    49599       +8
==========================================
- Hits        45207    45200       -7
- Misses       4384     4399      +15
```

Continue to review full report at Codecov.
Thanks. We'll need tests and docs as well. Tests can go in … . About the implementation, the … .
But as I write this, I wonder if the last two would ever be useful? Do we just need a better name than …?
Thank you for the comments! I agree that … . Maybe rename … .
I cannot think of a good use case for the option … . Another idea: … .
Anyway, I would go for @TomAugspurger's idea to use a keyword with multiple options. When we have decided what's best, I will challenge myself with writing tests and documentation :).
```diff
@@ -4367,6 +4367,10 @@ def unstack(self, level=-1, fill_value=None):
         Name to use for the 'value' column.
     col_level : int or string, optional
         If columns are a MultiIndex then use this level to melt.
+    keep_index : boolean, optional, default False
```
This is commonly called `index=False` everywhere else.
Also add a `versionadded` directive.
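In numpydoc style, the requested directive would look roughly like this (the wording and version number are placeholders for whichever release the PR lands in):

```
keep_index : boolean, optional, default False
    If True, keep the original index and repeat it for each melted
    column instead of resetting to a RangeIndex.

    .. versionadded:: 0.21.0
```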
So better to just name it `index`, and if True, result in the original index with duplicate entries? What about the option @TomAugspurger proposed?
```python
if keep_index:
    orig_index_values = list(np.tile(frame.index.get_values(), K))
```
This is quite awkward; you have several cases which you need to disambiguate, e.g. whether the original index is a MultiIndex or not.
Thanks @jreback for looking over my code and the comment.
I think what I wrote should work with any number of levels. E.g.:

```python
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
idx_multi = pd.MultiIndex.from_tuples(tuples)
idx_single = pd.Index(arrays[0])

# Index
print(list(np.tile(idx_single, 1)))
print(list(np.tile(idx_single, 2)))

# MultiIndex
print(list(np.tile(idx_multi, 1)))
print(list(np.tile(idx_multi, 2)))
```

But do I have to make it more explicit (= more Pythonic)? Or did I miss something else?
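One wrinkle worth noting (an observation on the discussion above, not part of the patch): `np.tile` coerces a `MultiIndex` into an object array of tuples, so the level structure has to be rebuilt afterwards, e.g. with `MultiIndex.from_tuples`:

```python
import numpy as np
import pandas as pd

mi = pd.MultiIndex.from_tuples([("bar", "one"), ("bar", "two")])

# Tiling flattens the MultiIndex into an object array of tuples:
tiled = np.tile(np.asarray(mi), 2)

# The levels/names are gone in `tiled`; rebuild them explicitly:
rebuilt = pd.MultiIndex.from_tuples(tiled)
print(rebuilt.nlevels, len(rebuilt))
```

This is presumably the disambiguation @jreback is pointing at: the plain-Index and MultiIndex cases need different reconstruction after tiling.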
Pls rebase.
Closing as stale.
@NiklasKeck @TomAugspurger What happened to this pull request? I came from #17440 and wish to contribute.
Do I have to choose one of Travis CI, AppVeyor, or CircleCI to hook onto my GitHub?
We needed to merge master into this PR to see if the tests still passed. You can see the changed files in … and then run the tests as described in the contributing docs. You don't have to do anything with the CI services.
```shell
git diff upstream/master -u -- "*.py" | flake8 --diff
```
I appreciate any corrections, comments and/or help very much, as this is my first pull request on such a big project. Thank you.