BUG/TST: Fillna limit Error Reporting #27074

Inevitable-Marzipan · 2019-06-27T10:33:03Z

fillna method was not throwing an ValueError with a dataframe full of
integers (no NAs) and limit keyword set to a negative number. Issue can
be seen in GH27042.

Begun with writing tests for checking the behaviour of fillna
and values of limit in test/frame/test_missing.py for both negative and
non-integer values of limit.

Then wrote new test file tests/util/test_validate_fillna_kwargs.py,
where the function would now also take limit as an argument and
perform necassary checks in there.

Made changes to util/_validators.py to correctly check values of limit
as was already done in core/internal/blocks.py

Updated calls to validate_fillna_kwargs in various files
to include the limit argument.

[x ] closes fillna not throwing error when limit is not positive #27042
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
[x ] whatsnew entry

fillna method was not throwing an ValueError with a dataframe full of integers (no NAs) and limit keyword set to a negative number. Issue can be seen in GH27042. Begun with writing tests for checking the behaviour of fillna and values of limit in test/frame/test_missing.py for both negative and non-integer values of limit. Then wrote new test file tests/util/test_validate_fillna_kwargs.py, where the function would now also take limit as an argument and perform necassary checks in there. Made changes to util/_validators.py to correctly check values of limit as was already done in core/internal/blocks.py Updated calls to validate_fillna_kwargs in various files to include the limit argument.

pep8speaks · 2019-06-27T10:33:13Z

Hello @Inevitable-Marzipan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-06-27 14:19:09 UTC

codecov · 2019-06-27T11:06:18Z

Codecov Report

Merging #27074 into master will decrease coverage by 50.17%.
The diff coverage is 38.46%.

@@             Coverage Diff             @@
##           master   #27074       +/-   ##
===========================================
- Coverage   92.04%   41.86%   -50.18%     
===========================================
  Files         180      180               
  Lines       50714    50715        +1     
===========================================
- Hits        46679    21234    -25445     
- Misses       4035    29481    +25446

Flag	Coverage Δ
#multiple	`?`
#single	`41.86% <38.46%> (-0.09%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/internals/blocks.py	`52.36% <ø> (-42.79%)`	⬇️
pandas/core/arrays/base.py	`58.98% <0%> (-40.45%)`	⬇️
pandas/core/arrays/datetimelike.py	`41.79% <0%> (-56.14%)`	⬇️
pandas/core/arrays/categorical.py	`41.73% <0%> (-54.2%)`	⬇️
pandas/core/arrays/numpy_.py	`41.28% <0%> (-53.22%)`	⬇️
pandas/core/generic.py	`38.45% <100%> (-55.75%)`	⬇️
pandas/util/_validators.py	`35.35% <50%> (-61.46%)`	⬇️
pandas/io/formats/latex.py	`0% <0%> (-100%)`	⬇️
pandas/io/gcs.py	`0% <0%> (-100%)`	⬇️
pandas/io/sas/sas_constants.py	`0% <0%> (-100%)`	⬇️
... and 139 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e955515...ee72cd7. Read the comment docs.

codecov · 2019-06-27T11:06:19Z

Codecov Report

Merging #27074 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #27074      +/-   ##
==========================================
- Coverage   92.04%   92.03%   -0.01%     
==========================================
  Files         180      180              
  Lines       50714    50715       +1     
==========================================
- Hits        46679    46677       -2     
- Misses       4035     4038       +3

Flag	Coverage Δ
#multiple	`90.67% <100%> (ø)`	⬆️
#single	`41.86% <38.46%> (-0.1%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/internals/blocks.py	`95.13% <ø> (-0.02%)`	⬇️
pandas/core/arrays/base.py	`99.43% <100%> (ø)`	⬆️
pandas/core/arrays/datetimelike.py	`97.92% <100%> (ø)`	⬆️
pandas/util/_validators.py	`96.96% <100%> (+0.16%)`	⬆️
pandas/core/arrays/categorical.py	`95.92% <100%> (ø)`	⬆️
pandas/core/arrays/numpy_.py	`94.49% <100%> (ø)`	⬆️
pandas/core/generic.py	`94.2% <100%> (ø)`	⬆️
pandas/io/gbq.py	`88.88% <0%> (-11.12%)`	⬇️
pandas/core/frame.py	`96.89% <0%> (-0.12%)`	⬇️
pandas/util/testing.py	`90.94% <0%> (+0.1%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e955515...c9c39c4. Read the comment docs.

doc/source/whatsnew/v0.25.0.rst

TomAugspurger · 2019-06-27T11:46:45Z

pandas/tests/util/test_validate_fillna_kwargs.py

+from pandas.util._validators import validate_fillna_kwargs
+
+
+def test_no_value_or_method():


Was this not tested before?

I couldn't find any mention of this specific function in any test files after doing a search for it. There may be higher level tests that check the behaviour of the fillna method for this though.

TomAugspurger · 2019-06-27T11:47:19Z

pandas/util/_validators.py

@@ -322,7 +322,8 @@ def validate_axis_style_args(data, args, kwargs, arg_name, method_name):
    return out


-def validate_fillna_kwargs(value, method, validate_scalar_dict_value=True):
+def validate_fillna_kwargs(value, method, limit,


Add limit to the parameters.

WillAyd

Nice change

WillAyd · 2019-06-27T12:50:03Z

pandas/core/internals/blocks.py

@@ -352,10 +352,6 @@ def fillna(self, value, limit=None, inplace=False, downcast=None):

        mask = isna(self.values)
        if limit is not None:
-            if not is_integer(limit):


This is removing validation in this method right? Shouldn't we replace with a call to validate_fillna_kwargs?

Digging into the fillna callstack, it first calls super().fillna() in core/frame.py, which then goes to core/generic.py where validate_fillna_kwargs is called (along with validate_bool_kwarg on inplace), before it then goes into core/internalls/blocks.py. Then in fillna in blocks.py it calls validate_bool_kwarg on inplace again, but originally never did the validate_fillna_kwargs call again. I'm not sure if the checks are just being repeated needlessly or not, first in frame and then in blocks. But at the start of the function validate_fillna_kwargs(value, None, limit) could be called I guess.

WillAyd · 2019-06-27T12:51:35Z

pandas/tests/frame/test_missing.py

@@ -516,6 +516,22 @@ def test_fillna_skip_certain_blocks(self):
        # it works!
        df.fillna(np.nan)

+    @pytest.mark.parametrize("type", [int, float])


I don't think these need to be parametrized - the dtype of the calling object is unrelated to the change here

The dtype was exactly what was causing the issue though, this test failed before my change. It has to do with IntBlock and FloatBlock having _can_hold_na set to different values, causing a return in blocks.py's fillna method before it checked the validity of limit.

Maybe np.random.uniform should be used here instead of np.random.rand which returns all zeros when cast to int, although there isn't much need for the change.

pandas/tests/util/test_validate_fillna_kwargs.py

jreback

so we already have checks in algos.pyx, we should consildate these either there or higher level.

jreback · 2019-06-27T17:21:02Z

closing in favor of #27077

TimJ added 2 commits June 27, 2019 11:08

Added Whatsnew Entry

b39604b

TimJ added 2 commits June 27, 2019 11:39

Style changes

01df11f

More stylistic changes

ee72cd7

Inevitable-Marzipan changed the title ~~Fillna limit er~~ BUG/TST: Fillna limit Error Reporting Jun 27, 2019

TomAugspurger reviewed Jun 27, 2019

View reviewed changes

Make whatsnew addition more consistent

5a27113

WillAyd requested changes Jun 27, 2019

View reviewed changes

WillAyd added the Error Reporting Incorrect or improved errors from pandas label Jun 27, 2019

Removed unnecessary parametrization

c9c39c4

jreback requested changes Jun 27, 2019

View reviewed changes

Inevitable-Marzipan mentioned this pull request Jun 27, 2019

BUG/TST: fillna limit checks and tests #27077

Merged

1 task

jreback closed this Jun 27, 2019

Inevitable-Marzipan deleted the fillna_limit_er branch June 27, 2019 18:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG/TST: Fillna limit Error Reporting #27074

BUG/TST: Fillna limit Error Reporting #27074

Inevitable-Marzipan commented Jun 27, 2019

pep8speaks commented Jun 27, 2019 •

edited

Loading

codecov bot commented Jun 27, 2019

codecov bot commented Jun 27, 2019 •

edited

Loading

TomAugspurger Jun 27, 2019

Inevitable-Marzipan Jun 27, 2019

TomAugspurger Jun 27, 2019

WillAyd left a comment

WillAyd Jun 27, 2019

Inevitable-Marzipan Jun 27, 2019

WillAyd Jun 27, 2019

Inevitable-Marzipan Jun 27, 2019

Inevitable-Marzipan Jun 27, 2019

jreback left a comment

jreback commented Jun 27, 2019

		from pandas.util._validators import validate_fillna_kwargs


		def test_no_value_or_method():

BUG/TST: Fillna limit Error Reporting #27074

BUG/TST: Fillna limit Error Reporting #27074

Conversation

Inevitable-Marzipan commented Jun 27, 2019

pep8speaks commented Jun 27, 2019 • edited Loading

Comment last updated at 2019-06-27 14:19:09 UTC

codecov bot commented Jun 27, 2019

Codecov Report

codecov bot commented Jun 27, 2019 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

WillAyd left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jreback left a comment

Choose a reason for hiding this comment

jreback commented Jun 27, 2019

pep8speaks commented Jun 27, 2019 •

edited

Loading

codecov bot commented Jun 27, 2019 •

edited

Loading