API: Closes #7879: (drops not nan in panel.to_frame() by default) #10908

Closed
wants to merge 1 commit into from
1 change: 1 addition & 0 deletions doc/source/release.rst
@@ -63,6 +63,7 @@ Highlights include:
- Development support for benchmarking with the `Air Speed Velocity library <https://github.com/spacetelescope/asv/>`_ (:issue:`8316`)
- Support for reading SAS xport files, see :ref:`here <whatsnew_0170.enhancements.sas_xport>`
- Removal of the automatic TimeSeries broadcasting, deprecated since 0.8.0, see :ref:`here <whatsnew_0170.prior_deprecations>`
- Deprecated ``filter_observations`` by ``dropna`` in ``Panel.to_frame`` and changed default to ``True`` (:issue:`7879`)
Contributor

Move this to the deprecation section of whatsnew/v0.17.0. It needs an example in a sub-section showing the previous usage and the new usage.

Author

Will do


See the :ref:`v0.17.0 Whatsnew <whatsnew_0170>` overview for an extensive list
of all enhancements and bugs that have been fixed in 0.17.0.
11 changes: 8 additions & 3 deletions pandas/core/panel.py
@@ -862,15 +862,20 @@ def groupby(self, function, axis='major'):
axis = self._get_axis_number(axis)
return PanelGroupBy(self, function, axis=axis)

def to_frame(self, filter_observations=True):
@deprecate_kwarg(old_arg_name='filter_observations', new_arg_name='dropna')
def to_frame(self, dropna=False):
"""
Transform wide format into long (stacked) format as DataFrame whose
columns are the Panel's items and whose index is a MultiIndex formed
of the Panel's major and minor axes.

Parameters
----------
Contributor

Leave this in (and say it's deprecated).

Author

ok - will add it

filter_observations : boolean, default True
dropna : boolean, default False
Drop (major, minor) pairs without a complete set of observations
across all the items

filter_observations : boolean, default False, [deprecated]
Contributor

The default for the original argument needs to remain as it was (i.e. True).

Author

How is that possible? If you don't pass the "filter_observations" argument, then you aren't using the deprecated argument and dropna is used instead. In that case filter_observations is implicitly False.

Drop (major, minor) pairs without a complete set of observations
across all the items
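
The exchange above about defaults can be made concrete with a small stand-alone sketch. The decorator below is a simplified stand-in written for illustration, not pandas' own `deprecate_kwarg`, and `to_frame` here is a hypothetical function that only returns the flag it receives. It shows that a pure keyword rename gives the old keyword no default of its own: a caller who passes neither keyword gets the new default.

```python
import functools
import warnings


def deprecate_kwarg(old_arg_name, new_arg_name):
    """Simplified keyword-renaming decorator (illustrative, not pandas' code)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_arg_name in kwargs:
                if new_arg_name in kwargs:
                    raise TypeError(
                        "pass only one of %r and %r" % (old_arg_name, new_arg_name)
                    )
                warnings.warn(
                    "the %r keyword is deprecated, use %r instead"
                    % (old_arg_name, new_arg_name),
                    FutureWarning,
                    stacklevel=2,
                )
                # Forward the old keyword's value under the new name.
                kwargs[new_arg_name] = kwargs.pop(old_arg_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecate_kwarg(old_arg_name="filter_observations", new_arg_name="dropna")
def to_frame(dropna=False):
    # Hypothetical stand-in for Panel.to_frame: reports the flag it received.
    return dropna


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_result = to_frame()                         # neither keyword passed
    old_kw_result = to_frame(filter_observations=True)  # deprecated spelling

print(default_result, old_kw_result, len(caught))
```

Under these assumptions the call without keywords returns `False` and emits no warning, so existing code that relied on the old implicit `filter_observations=True` would silently change behaviour. That appears to be the concern behind the request to keep the original default.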

@@ -880,7 +885,7 @@ def to_frame(self, filter_observations=True):
"""
_, N, K = self.shape

if filter_observations:
if dropna is True:
# shaped like the return DataFrame
mask = com.notnull(self.values).all(axis=0)
# size = mask.sum()
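
The mask in this hunk can be reproduced with plain NumPy. The sketch below uses made-up data and `np.isnan` in place of the internal `com.notnull` helper, so it assumes float values only; it is meant to show what "a complete set of observations across all the items" means, not to mirror the method's full implementation.

```python
import numpy as np

# Hypothetical data: 2 items, 3 major-axis labels, 2 minor-axis labels.
values = np.arange(12, dtype=float).reshape(2, 3, 2)
values[0, 1, 0] = np.nan  # item 0 lacks the observation at (major=1, minor=0)

# True where every item has a value for that (major, minor) pair.
mask = (~np.isnan(values)).all(axis=0)

# Long format: one row per (major, minor) pair, one column per item.
long_format = values.reshape(values.shape[0], -1).T

# Keep only the rows whose (major, minor) pair is complete across items.
filtered = long_format[mask.ravel()]

print(mask)
print(long_format.shape, filtered.shape)
```

With `dropna=True` only the rows selected by the mask survive; with `dropna=False` every (major, minor) pair is kept, including the one that holds a NaN.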
16 changes: 10 additions & 6 deletions pandas/tests/test_panel.py
@@ -1525,19 +1525,19 @@ def test_transpose_copy(self):

def test_to_frame(self):
# filtered
filtered = self.panel.to_frame()
filtered = self.panel.to_frame(dropna=True)
Contributor

Need a test for .to_frame(), which is equivalent to .to_frame(dropna=False).

expected = self.panel.to_frame().dropna(how='any')
assert_frame_equal(filtered, expected)

# unfiltered
unfiltered = self.panel.to_frame(filter_observations=False)
unfiltered = self.panel.to_frame(dropna=False)
assert_panel_equal(unfiltered.to_panel(), self.panel)

# names
self.assertEqual(unfiltered.index.names, ('major', 'minor'))

# unsorted, round trip
df = self.panel.to_frame(filter_observations=False)
df = self.panel.to_frame(dropna=False)
unsorted = df.take(np.random.permutation(len(df)))
pan = unsorted.to_panel()
assert_panel_equal(pan, self.panel)
@@ -1554,6 +1554,10 @@ def test_to_frame(self):
self.assertEqual(rdf.index.names, df.index.names)
self.assertEqual(rdf.columns.names, df.columns.names)

# test kw filter_observations deprecation
with tm.assert_produces_warning(Warning):
filtered = self.panel.to_frame(filter_observations=True)

def test_to_frame_mixed(self):
panel = self.panel.fillna(0)
panel['str'] = 'foo'
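
The deprecation check added to `test_to_frame` above accepts any `Warning`. A tighter variant could assert the specific category and also cover the default call that the reviewer asked for. The following is a minimal sketch using only the standard library; the `to_frame` function is a hypothetical stand-in, and `FutureWarning` is an assumption about which category the real decorator emits.

```python
import warnings


def to_frame(dropna=False, filter_observations=None):
    # Hypothetical stand-in for the method under test.
    if filter_observations is not None:
        warnings.warn(
            "filter_observations is deprecated, use dropna instead",
            FutureWarning,
            stacklevel=2,
        )
        dropna = filter_observations
    return dropna


def call_and_record(func, **kwargs):
    """Call func and return its result plus the categories of warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(**kwargs)
    return result, [w.category for w in caught]


# The deprecated keyword should warn exactly once, with the expected category.
old_result, old_categories = call_and_record(to_frame, filter_observations=True)
assert old_result is True
assert old_categories == [FutureWarning]

# The default call should be silent and behave like dropna=False.
new_result, new_categories = call_and_record(to_frame)
assert new_result is False
assert new_categories == []
```

Checking the category and the silent default path separately keeps an unrelated warning from satisfying the test by accident.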
@@ -1597,7 +1601,7 @@ def test_to_frame_multi_major(self):
assert_frame_equal(result, expected)

wp.iloc[0, 0].iloc[0] = np.nan # BUG on setting. GH #5773
result = wp.to_frame()
result = wp.to_frame(dropna=True)
assert_frame_equal(result, expected[1:])

idx = MultiIndex.from_tuples([(1, 'two'), (1, 'one'), (2, 'one'),
@@ -1651,7 +1655,7 @@ def test_to_frame_multi_drop_level(self):
idx = MultiIndex.from_tuples([(1, 'one'), (2, 'one'), (2, 'two')])
df = DataFrame({'A': [np.nan, 1, 2]}, index=idx)
wp = Panel({'i1': df, 'i2': df})
result = wp.to_frame()
result = wp.to_frame(dropna=True)
exp_idx = MultiIndex.from_tuples([(2, 'one', 'A'), (2, 'two', 'A')],
names=[None, None, 'minor'])
expected = DataFrame({'i1': [1., 2], 'i2': [1., 2]}, index=exp_idx)
@@ -2210,7 +2214,7 @@ def setUp(self):
tm.add_nans(panel)

self.panel = panel.to_frame()
self.unfiltered_panel = panel.to_frame(filter_observations=False)
self.unfiltered_panel = panel.to_frame(dropna=False)

def test_ops_differently_indexed(self):
# trying to set non-identically indexed panel