Skip to content

BUG: Fix for fillna ignoring axis=1 parameter (issues #17399 #17409) #28441

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.0.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -233,7 +233,7 @@ Other
- Trying to set the ``display.precision``, ``display.max_rows`` or ``display.max_columns`` using :meth:`set_option` to anything but a ``None`` or a positive int will raise a ``ValueError`` (:issue:`23348`)
- Using :meth:`DataFrame.replace` with overlapping keys in a nested dictionary will no longer raise, now matching the behavior of a flat dictionary (:issue:`27660`)
- :meth:`DataFrame.to_csv` and :meth:`Series.to_csv` now support dicts as ``compression`` argument with key ``'method'`` being the compression method and others as additional compression options when the compression method is ``'zip'``. (:issue:`26023`)
-
- :meth:`DataFrame.fillna` when using axis=1 previously failed to replace NaN values (:issue:`17399` and :issue:`17409`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true in general, or was it just under specific conditions?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to make a more direct test case. This is the demo that made me find it:

https://imgur.com/a/3ZrXqSv

After that I found an existing PR which had grown stale, and @jreback asked me to update it, so I thought I'd give it a try.


.. _whatsnew_1000.contributors:

Expand Down
3 changes: 2 additions & 1 deletion pandas/core/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -6274,7 +6274,8 @@ def fillna(
)

new_data = self._data.fillna(
value=value, limit=limit, inplace=inplace, downcast=downcast
value=value, limit=limit, inplace=inplace,
downcast=downcast, axis: Axis=axis
)

elif isinstance(value, (dict, ABCSeries)):
Expand Down
18 changes: 8 additions & 10 deletions pandas/core/internals/blocks.py
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,7 @@
from pandas.core.nanops import nanpercentile

from pandas.io.formats.printing import pprint_thing
from pandas_typing import Axis


class Block(PandasObject):
Expand Down Expand Up @@ -386,7 +387,7 @@ def apply(self, func, **kwargs):

return result

def fillna(self, value, limit=None, inplace=False, downcast=None):
def fillna(self, value, limit=None, inplace=False, downcast=None, axis: Axis = 0):
""" fillna on the block with the value. If we fail, then convert to
ObjectBlock and try again
"""
Expand All @@ -398,13 +399,10 @@ def fillna(self, value, limit=None, inplace=False, downcast=None):
raise ValueError("Limit must be an integer")
if limit < 1:
raise ValueError("Limit must be greater than 0")
mask[mask.cumsum(self.ndim - 1) > limit] = False
mask[mask.cumsum(int(axis == 0 and self.ndim > 1)) > limit] = False

if not self._can_hold_na:
if inplace:
return self
else:
return self.copy()
return self if inplace else self.copy()

if self._can_hold_element(value):
# equivalent: _try_coerce_args(value) would not raise
Expand Down Expand Up @@ -1842,7 +1840,7 @@ def concat_same_type(self, to_concat, placement=None):
placement = placement or slice(0, len(values), 1)
return self.make_block_same_class(values, ndim=self.ndim, placement=placement)

def fillna(self, value, limit=None, inplace=False, downcast=None):
def fillna(self, value, limit=None, inplace=False, downcast=None, axis: Axis = 0):
values = self.values if inplace else self.values.copy()
values = values.fillna(value=value, limit=limit)
return [
Expand Down Expand Up @@ -2406,15 +2404,15 @@ def concat_same_type(self, to_concat, placement=None):
return ObjectBlock(values, ndim=self.ndim, placement=placement)
return super().concat_same_type(to_concat, placement)

def fillna(self, value, limit=None, inplace=False, downcast=None):
def fillna(self, value, limit=None, inplace=False, downcast=None, axis: Axis = 0):
# We support filling a DatetimeTZ with a `value` whose timezone
# is different by coercing to object.
if self._can_hold_element(value):
return super().fillna(value, limit, inplace, downcast)
return super().fillna(value, limit, inplace, downcast, axis)

# different timezones, or a non-tz
return self.astype(object).fillna(
value, limit=limit, inplace=inplace, downcast=downcast
value, limit=limit, inplace=inplace, downcast=downcast, axis=axis
)

def setitem(self, indexer, value):
Expand Down
15 changes: 15 additions & 0 deletions pandas/tests/frame/test_missing.py
Original file line number Diff line number Diff line change
Expand Up @@ -671,6 +671,21 @@ def test_fillna_columns(self):
expected = df.astype(float).fillna(method="ffill", axis=1)
assert_frame_equal(result, expected)

def test_fillna_rows(self):
#GH17399
df = pd.DataFrame({
"a": [1,2,3,4],
"b": [5,np.nan,7,8],
"c": [9,10,11,np.nan]})

expected = pd.DataFrame({
"a": [1,2,3,4],
"b": [5,6,7,8],
"c": [9,10,11,6]})

result = df.fillna(df.mean(axis=1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see axis passed to fillna here? You pass it to mean but not fillna.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should back up and make sure the behavior I'm seeing is expected then an not a bug. In this notebook I first try to replace NaNs with the column average, then the row average. Doing the first works, but the second silently does nothing.

Setting axis=1 on fillna when using the mean where axis=1 causes an error: Currently only can fill with dict/Series column by column.

It seems like calling fillna on a frame with NaNs which doesn't fill them is an issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I also think this test might be missing the point. Can you try getting the issue directly from #17399 to work? I think that is relatively clearly laid out

tm.assert_frame_equal(result, expected)

def test_fillna_invalid_method(self, float_frame):
with pytest.raises(ValueError, match="ffil"):
float_frame.fillna(method="ffil")
Expand Down