Skip to content

ENH: Validation to only allow positive integers for options #27382

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Aug 5, 2019

Conversation

Adam-Klaum
Copy link
Contributor

@Adam-Klaum Adam-Klaum commented Jul 13, 2019

I looked at several different options for providing this functionality. In the end it seemed best to create a a new function that returns a validation function to only allow positive integers. So now we have is_pos_int available along side the existing is_int and is_text function. display.max_rows has been modified to use this new validation. I went through the other integer based display options and tested each with a negative int to see what else might be broken. display.precision and display.max_columns both had issues when set to a negative number prior to displaying a dataframe. They have been changed as well.

set up so that they can be either a None type or a positive int. Updated
the option creation code for display options that failed when set to
negative numbers.
@WillAyd WillAyd added the Output-Formatting __repr__ of pandas objects, to_string label Jul 15, 2019
returning an additional function. This also allows it to be used in an
identical fasion to is_int, is_text etc.
Modified the validation calls to is_pos_int to remove ().
Modified the tests using is_pos_int to remove ().
Updated whatsnew documentation to be more user focused.

Parameters
----------
value - the value to be checked
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback comment for returns section is lost in diff so re-commenting here but I'm surprised this isn't failing the CI either

@datapythonista do we ignore private docstrings in the validation? Vaguely recall some comment around that but didn't see code to confirm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, shall I add the following?

Returns
----------
Nothing 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd yes, we don't validate docstrings of private functions. Ideally we should validate everything, but with all the pending work in the public API that has an impact in the web docs, I don't think we should spend time in fixing the private docstrings yet (and if we don't fix the existing cases, we can't easily validate the new ones).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification - do you know where that exclusion happens? Didn't see it in code_checks and couldn't find on quick glance in validate_docstrings.py

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups sorry, was going to mention and I forgot. It's not that we exclude them, it's that we get the docstrings to consider from the .rst files of the API docs. The code here is a good starting point to understand that: https://github.com/pandas-dev/pandas/blob/master/scripts/validate_docstrings.py#L873

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Adam-Klaum these need to be updated to follow the doctoring stander. See https://dev.pandas.io/development/contributing.html#updating-a-pandas-docstring

@Adam-Klaum
Copy link
Contributor Author

Adam-Klaum commented Jul 16, 2019

py37_np_dev test is failing with the error below. Not sure why. I notice this is happening on other pull requests as well:

File "/home/vsts/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/pytest_nunit/plugin.py", line 276, in <genexpr> list(case for case in self.cases.values() if case["outcome"] == "passed") KeyError:  outcome'

@Adam-Klaum
Copy link
Contributor Author

Is there anything else I need to do on this one? It still shows as Changes requested, but those changes have been integrated into later commits. I just want to make sure nobody is waiting on me for this one.


Other
^^^^^
- Trying to set the display.precision, display.max_rows or display.max_columns using set_option to anything but a None or a positive int will raise a ``ValueError`` (:issue:`23348`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put the option names like display.precision in double-backticks.

set_option -> :func:`pandas.set_option`

a None -> None

int -> integer.


Parameters
----------
value - the value to be checked
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Adam-Klaum these need to be updated to follow the doctoring stander. See https://dev.pandas.io/development/contributing.html#updating-a-pandas-docstring


"""

if not (value is None or isinstance(value, int)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use isistance(values, numbers.Integral)

Can you restructure this a bit?

if value is None:
    return value

elif isinstance(value, numbers.Integral):
    if value < 0:
         raise
    return value

raise ValueError(...)

"""

if not (value is None or isinstance(value, int)):
msg = "Value must be an instance of <class 'NoneType'>|<class 'int'>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most people won't be familiar with NoneType. Should probably say like 'value' must be a positive integer or None.

Simplified is_pos_int function and updated tests accordingly
Rewrote is_pos_int docstring to conform to standards
Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comment on the docstring.

General question: it seems like 0 is allowed? Should we call this is_nonnegative_int then, and changes references to "positive int" to "non-negative integer"?


Raises
------
ValueError if the value is not equal to a positive integer, 0 or None.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be formatted like

Raises
-------
ValueError
    When the value is not a positive integer or None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, 0 is allowed, and is necessary for existing internal code. I think you are right, at this point "is_pos_int" is technically not an accurate name. I'm fine with "is_nonnegative_int". The docstring change seems fine to me as well. I'll make the changes and do another push.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc-string, ping on green.


Other
^^^^^
- Trying to set the ``display.precision``, ``display.max_rows`` or ``display.max_columns`` using set_option to anything but a None or a positive int will raise a ``ValueError`` (:issue:`23348`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use double backticks around None. use :meth:`set_option`

@jreback jreback added this to the 1.0 milestone Jul 31, 2019
@jreback
Copy link
Contributor

jreback commented Aug 5, 2019

lgtm. over to you @WillAyd

@WillAyd WillAyd merged commit f9f95c0 into pandas-dev:master Aug 5, 2019
@WillAyd
Copy link
Member

WillAyd commented Aug 5, 2019

thanks @Adam-Klaum !

quintusdias pushed a commit to quintusdias/pandas_dev that referenced this pull request Aug 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Output-Formatting __repr__ of pandas objects, to_string
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pandas should validate display.max_rows option
5 participants