Skip to content

API: Add DataFrame.droplevel #21871

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jul 20, 2018
Merged

API: Add DataFrame.droplevel #21871

merged 10 commits into from
Jul 20, 2018

Conversation

vbarda
Copy link
Contributor

@vbarda vbarda commented Jul 12, 2018

@pep8speaks
Copy link

pep8speaks commented Jul 12, 2018

Hello @vbarda! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on July 19, 2018 at 13:18 Hours UTC

@datapythonista
Copy link
Member

Thanks for the contribution @vbarda.

From the issue, I'd say we want this method in index, not in DataFrame. @TomAugspurger can you please confirm?

Also, you're missing adding a comment to the whatsnew documentation (if you don't know how to do it, you can check closed PRs to see how it's done).

@datapythonista datapythonista added Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex API Design labels Jul 12, 2018
@codecov
Copy link

codecov bot commented Jul 12, 2018

Codecov Report

Merging #21871 into master will increase coverage by 0.05%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21871      +/-   ##
==========================================
+ Coverage   91.91%   91.96%   +0.05%     
==========================================
  Files         164      166       +2     
  Lines       49987    50334     +347     
==========================================
+ Hits        45947    46292     +345     
- Misses       4040     4042       +2
Flag Coverage Δ
#multiple 90.36% <100%> (+0.06%) ⬆️
#single 42.22% <20%> (+0.06%) ⬆️
Impacted Files Coverage Δ
pandas/core/generic.py 96.47% <100%> (+0.01%) ⬆️
pandas/core/indexes/timedeltas.py 91.53% <0%> (-0.65%) ⬇️
pandas/core/indexes/base.py 96.35% <0%> (-0.23%) ⬇️
pandas/core/indexes/datetimes.py 95.16% <0%> (-0.18%) ⬇️
pandas/core/algorithms.py 94.69% <0%> (-0.16%) ⬇️
pandas/io/packers.py 88.04% <0%> (-0.16%) ⬇️
pandas/core/reshape/merge.py 94.1% <0%> (-0.16%) ⬇️
pandas/core/common.py 92.08% <0%> (-0.12%) ⬇️
pandas/core/indexes/period.py 93.17% <0%> (-0.04%) ⬇️
pandas/core/dtypes/common.py 95.07% <0%> (ø) ⬆️
... and 28 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9d94a95...6b8f22b. Read the comment docs.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good. can you add:

  • whatsnew note in 0.24.0, new features, with a reference to the method
  • add the method in api.rst
  • move to generic as indicated.


axis : {0 or 'index', 1 or 'columns'}, default 0

.. versionadded:: 0.24.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the versionadded to right after the summary (2nd line of doc-string)

@@ -4630,6 +4630,63 @@ def sortlevel(self, level=0, axis=0, ascending=True, inplace=False,
return self.sort_index(level=level, axis=axis, ascending=ascending,
inplace=inplace, sort_remaining=sort_remaining)

def droplevel(self, level, axis=0):
"""Return DataFrame with requested index / column level(s) removed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this to generic so Series gets this as well

1 2 3 4
5 6 7 8
9 10 11 12
>>> df.droplevel('a')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank line between cases

"""
result = self.copy()
axis = self._get_axis_number(axis)
if axis == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be more generic (so works for series as well), eg.

can remove the .copy() (as set_axis copies)

labels = self._get_axis(axis)
labels = labels.droplevel(level)
result = self.set_axis(labels, axis=axis)
return result

@@ -1056,6 +1056,30 @@ def test_reindex_signature(self):
"limit", "copy", "level", "method",
"fill_value", "tolerance"}

def test_droplevel(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the issue number as a comment

@jreback
Copy link
Contributor

jreback commented Jul 12, 2018

@datapythonista

From the issue, I'd say we want this method in index, not in DataFrame. @TomAugspurger can you please confirm?

it already exists for Index, this is exposing on Series/DataFrame.

@vbarda
Copy link
Contributor Author

vbarda commented Jul 12, 2018

Thank you for the comments -- addressing the changes now

@@ -74,6 +74,7 @@ Other Enhancements
- :func:`Series.mode` and :func:`DataFrame.mode` now support the ``dropna`` parameter which can be used to specify whether NaN/NaT values should be considered (:issue:`17534`)
- :func:`to_csv` now supports ``compression`` keyword when a file handle is passed. (:issue:`21227`)
- :meth:`Index.droplevel` is now implemented also for flat indexes, for compatibility with :class:`MultiIndex` (:issue:`21115`)
- :meth:`Series.droplevel` and `DataFrame.droplevel` are now implemented
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the issue number here

@@ -716,6 +716,62 @@ def swapaxes(self, axis1, axis2, copy=True):

return self._constructor(new_values, *new_axes).__finalize__(self)

def droplevel(self, level, axis=0):
"""Return DataFrame with requested index / column level(s) removed.
.. versionadded:: 0.24.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need a blank line here? @jorisvandenbossche

----------
level : int, str, or list-like
If a string is given, must be the name of a level
If list-like, elements must be names or indexes of levels.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indexes -> positional indexes

def test_droplevel(self):
# GH20342
df = pd.DataFrame([
[1, 2, 3, 4],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add tests for series as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did, on lines 1085-1089 :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh not in THIS test. pls separate out and put in tests/series/test_alter_axes

@jreback jreback added this to the 0.24.0 milestone Jul 14, 2018
def test_droplevel(self):
# GH20342
df = pd.DataFrame([
[1, 2, 3, 4],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh not in THIS test. pls separate out and put in tests/series/test_alter_axes

names=['level_1', 'level_2'])

# test that dropping of a level in DataFrame index works
expected_df_no_level_a_in_index = df.reset_index('a', drop=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont' use these long names, expected and result are much easier to grok


# test that dropping of a level in Series index works
expected_ser_no_level_b_in_index = ser.reset_index('b', drop=True)
actual_ser_no_level_b_in_index = ser.droplevel('b')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when you move add a test that raises when axis !=0 is specified

assert_frame_equal(expected_df_no_level_a_in_index,
actual_df_no_level_a_in_index)

# test that dropping of a level in DataFrame columns works
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you parameterize these on axis and also try with axis='columns', and axis='index'

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vbarda minor comments, but have a question about the API.

@@ -74,6 +74,7 @@ Other Enhancements
- :func:`Series.mode` and :func:`DataFrame.mode` now support the ``dropna`` parameter which can be used to specify whether NaN/NaT values should be considered (:issue:`17534`)
- :func:`to_csv` now supports ``compression`` keyword when a file handle is passed. (:issue:`21227`)
- :meth:`Index.droplevel` is now implemented also for flat indexes, for compatibility with :class:`MultiIndex` (:issue:`21115`)
- :meth:`Series.droplevel` and `DataFrame.droplevel` are now implemented (:issue:`20342`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need ref for DataFrame.droplevel

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please explain more, not sure what you mean

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have

`DataFrame.droplevel`

Should be

:meth:`DataFrame.droplevel`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!

...: [5, 6, 7, 8],
...: [9, 10, 11, 12]
...: ]).set_index([0, 1]).rename_axis(['a', 'b'])
>>> df.columns = pd.MultiIndex.from_tuples([
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add blank lines in between here

@@ -716,6 +716,64 @@ def swapaxes(self, axis1, axis2, copy=True):

return self._constructor(new_values, *new_axes).__finalize__(self)

def droplevel(self, level, axis=0):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyone think we should add inplace to make this consistent, with other axis ops (e.g. reset_index / set_index)?

cc @TomAugspurger

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense to me to expose inplace!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say leave it out for now, unless someone asks for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, will do!

@jreback jreback merged commit b9c1786 into pandas-dev:master Jul 20, 2018
@jreback
Copy link
Contributor

jreback commented Jul 20, 2018

thanks @vbarda nice patch!

can you create an issue about adding inplace to .droplevel.

@vbarda
Copy link
Contributor Author

vbarda commented Jul 20, 2018

Thank you -- will add the issue now!

@vbarda
Copy link
Contributor Author

vbarda commented Jul 20, 2018

Added here: #21990

@vbarda vbarda deleted the add-droplevel branch July 20, 2018 15:14
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API: Add DataFrame.droplevel
5 participants