Skip to content

PEP: pandas/core round 2 #11951

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 16, 2016
Merged

PEP: pandas/core round 2 #11951

merged 1 commit into from
Jan 16, 2016

Conversation

rockg
Copy link
Contributor

@rockg rockg commented Jan 3, 2016

Fix layout and pep8 compliance for categorical, common, and frame. Used YAPF with default settings and then manual modifications including:
-fixing line length of comments/docstrings for pep8 compliance
-converting lambdas to functions for pep8 compliance
-change YAPF default formatting of error strings
-change some YAPF line break choices to make them more natural in my eyes. For example converting

if (function1(argument) and function2(
         argument)):

to

if (function1(argument) and
   function2(argument)):

@@ -179,7 +192,8 @@ class Categorical(PandasObject):
[a, b, c, a, b, c]
Categories (3, object): [a < b < c]

>>> a = Categorical(['a','b','c','a','b','c'], ['c', 'b', 'a'], ordered=True)
>>> a = Categorical(['a','b','c','a','b','c'], ['c', 'b', 'a'], \
ordered=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be indented

@jreback jreback added the Code Style Code style, linting, code_checks label Jan 4, 2016
@jreback jreback added this to the 0.18.0 milestone Jan 4, 2016
@shoyer
Copy link
Member

shoyer commented Jan 4, 2016

Per PEP8, we should prefer using parenthesis to explicit continuations with \. I see a lot of \ uses here -- some existing, some new. Maybe you could take this as an opportunity to clean these up?

@jorisvandenbossche
Copy link
Member

@jreback Can we wait with merging these style PRs before the discussion has settled a bit? xref #11954

@wesm
Copy link
Member

wesm commented Jan 4, 2016

+1 @shoyer -- per #11954 I am in favor of outlawing \ completely for the codebase (except in the exception that I noted with triple-quote strings)

@jreback
Copy link
Contributor

jreback commented Jan 4, 2016

@jorisvandenbossche yes more than happy for others to deal with the formatting nit-picking :)

I agree a single tool is prob the best (for both usage) and validation via Travis. (and fix the docs to do the same).

@rockg rockg force-pushed the pep8-round2 branch 4 times, most recently from f816fae to 6a77470 Compare January 7, 2016 01:23
@hayd
Copy link
Contributor

hayd commented Jan 7, 2016

Please could you include the command you used in the commit messages/PR post? :)

@@ -179,13 +191,13 @@ class Categorical(PandasObject):
[a, b, c, a, b, c]
Categories (3, object): [a < b < c]

>>> a = Categorical(['a','b','c','a','b','c'], ['c', 'b', 'a'], ordered=True)
>>> a = Categorical(['a','b','c','a','b','c'], ['c', 'b', 'a'], \
ordered=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Examples shouldn't use \ for continuation. This line should just be formatted as you would do with normal python code, so something like

>>> a = Categorical(['a','b','c','a','b','c'], ['c', 'b', 'a'],
                    ordered=True)

There are also some cases of this in other examples I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unsure of those...good to know.

@rockg rockg force-pushed the pep8-round2 branch 2 times, most recently from 46f8158 to ab8ae80 Compare January 7, 2016 12:40
@rockg
Copy link
Contributor Author

rockg commented Jan 7, 2016

I'm not completely sure I like this. It looks nice, but maybe a bit too expansive. I know the PyDev debugger would force you to go over every single line before stepping into DataFrameFormatter which would get annoying after awhile.

@jreback
Copy link
Contributor

jreback commented Jan 7, 2016

@rockg personally I find 1 kw arg per line MUCH more clear. Yes debugging is a tiny bit trickier, but its all laid out better.

@rockg
Copy link
Contributor Author

rockg commented Jan 7, 2016

@jreback I agree that it looks nicer just thinking of some practical considerations (one could think of some rule that if it's >5 lines of arguments do something different, but that won't look as nice and will be difficult to automatically format).

@jreback
Copy link
Contributor

jreback commented Jan 7, 2016

@rockg yeh, certainly someone could play with the knobs on YAPF (and set in a config file).

@jreback
Copy link
Contributor

jreback commented Jan 7, 2016

-converting lambdas to functions for pep8 compliance

this is currently disabled in the flake8 checking, this is the E731. as I find lambda functions pretty intuitve to use in assignment. If you have already changed them ok, then.

as an aside can you put the notes you have above in contributing.rst at some point.

warn("Creating a 'Categorical' with 'levels' is deprecated, use 'categories' instead",
FutureWarning, stacklevel=2)
if levels is not None:
warn(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no line break here

@rockg
Copy link
Contributor Author

rockg commented Jan 11, 2016

I brought this up above. That is an output of yapf. I think the algorithm design is essentially to move all the arguments to one line and then apply the rules to that (so even though the line is not 80 characters now, moving up the second line of arguments would make it so, causing it to be split as you see it). I think the only option would be:

SPLIT_PENALTY_FOR_ADDED_LINE_SPLIT
The penalty incurred by adding a line split to the unwrapped line. The more line splits added the higher the penalty.

However, I think that would not wrap lines with really long arguments and split lines with shorter argument lists or not wrap at all. Is there any case where we want to wrap (e.g., many arguments)?

@rockg
Copy link
Contributor Author

rockg commented Jan 11, 2016

@wesm Does "as long as it fits within 80 characters" mean the entire signature or separate lines of the signature. Should the example given by @jorisvandenbossche be modified or left as is?

@hayd
Copy link
Contributor

hayd commented Jan 11, 2016

Please could you include the command you used in the commit messages/PR post? :)

Would be good to see what knobs you used, so we can play along at home/tweak them.

@wesm
Copy link
Member

wesm commented Jan 11, 2016

@rockg each line of code (signature can be multiple lines)

@rockg
Copy link
Contributor Author

rockg commented Jan 11, 2016

This version you see used no knobs, but in spirit a knob of SPLIT_PENALTY_AFTER_OPENING_BRACKET=1000000 was used (it was applied manually rather than lose all the other work) after it was decided to change it. I think the only other one that may be needed is SPLIT_PENALTY_FOR_ADDED_LINE_SPLIT to be larger in order to not split function signatures.

@rockg
Copy link
Contributor Author

rockg commented Jan 11, 2016

@jreback What are your opinions on this? You were in favor of the multiple lines.

@jorisvandenbossche
Copy link
Member

@hayd you are somewhat more familiar with yapf?
Becuase I tried to experiment a little bit with the knobs, but as far as I can see, there seems to be no way to specify that once a signature is, if is put on one line, longer as 80 chars to not have this split into 'each keyword its own line'. Even the example from above, which is perfectly within 80 chars (by splitting it in two lines) gets reformatted.
I also commented about it here: google/yapf#193 (comment)

@rockg
Copy link
Contributor Author

rockg commented Jan 11, 2016

Seems like this does it for this case:

gfr@beta ~ $ yapf --style='{based_on_style: pep8}' test.py
def TestClass(object):
    def __init__(self,
                 values,
                 categories=None,
                 ordered=False,
                 name=None,
                 fastpath=False,
                 levels=None):
gfr@beta ~ $ yapf --style='{based_on_style: pep8, SPLIT_BEFORE_NAMED_ASSIGNS=0}' test.py
def TestClass(object):
    def __init__(self, values, categories=None, ordered=False, name=None,
                 fastpath=False, levels=None):

@rockg
Copy link
Contributor Author

rockg commented Jan 11, 2016

I'll try these two knobs on the files so we can see how it looks.

@jorisvandenbossche
Copy link
Member

@rockg ah yes indeed! Didn't try that one (hadn't guessed it from the name)

@rockg
Copy link
Contributor Author

rockg commented Jan 12, 2016

@jorisvandenbossche Please take a look. I ran with:

yapf -i --style='{based_on_style: pep8, SPLIT_BEFORE_NAMED_ASSIGNS=False, SPLIT_PENALTY_AFTER_OPENING_BRACKET=3000, SPLIT_PENALTY_IMPORT_NAMES=3000, SPLIT_PENALTY_LOGICAL_OPERATOR=30}'

@hayd
Copy link
Contributor

hayd commented Jan 12, 2016

That's looking great! I think this addresses most/all of @wesm's concerns ?

The couple of awkward looking lines I think will look awkward no matter what (without a refactor)...

We should add a [yapf] section to setup.cfg with those options.

@wesm
Copy link
Member

wesm commented Jan 12, 2016

Cool that's looking good.

@jorisvandenbossche
Copy link
Member

Indeed, looks good! @rockg the result we see now, is this fully the output of the yapf config you gave above, or is this still some manual clean-up as well?

@rockg
Copy link
Contributor Author

rockg commented Jan 12, 2016

There is still manual work including docstrings/error strings beyond 79 chars and then fixing any formatting yapf couldn't fix (this was quite small). I played with docformatter but that did not work very well as it ended up eliminating all the numpydoc formatting. Also, keeping split_penalty_after_opening_bracket to unbreakable would be preferred but then the imports became indented to match the opening bracket. I could not get the both of these to work together, but maybe that level of indentation in imports is fine and the open parentheses formatting is more important.

@rockg
Copy link
Contributor Author

rockg commented Jan 12, 2016

For example, in the settings above or split_penalty_after_opening_bracket set to unbreakable, instances like this still need fixing (both not splitting on the parenthesis and then splitting the error line). These were all fixed manually. We can try and open an issue with yapf and see if they can fix this.

        if isinstance(other, Categorical):
            # Two Categoricals can only be be compared if the categories are the same
            if (len(self.categories) != len(other.categories)) or \
                    not ((self.categories == other.categories).all()):
                raise TypeError(
                    "Categoricals can only be compared if 'categories' are the same")
            if not (self.ordered == other.ordered):
                raise TypeError(
                    "Categoricals can only be compared if 'ordered' is the same")

@jreback
Copy link
Contributor

jreback commented Jan 12, 2016

@rockg ok, can you update setup.cfg with a [yapf] section

and rebase.

@rockg rockg force-pushed the pep8-round2 branch 2 times, most recently from 4e0e52a to 00df805 Compare January 13, 2016 01:19
@jreback
Copy link
Contributor

jreback commented Jan 13, 2016

all ok with this?

@jorisvandenbossche
@wesm

ideally want to merge this and then have @rockg finish up the rest, so we can actually turn on the checking in travis (for pandas/core). I will add advisory checking (e.g. won't fail it) progressively to all other files as well.

@jreback
Copy link
Contributor

jreback commented Jan 15, 2016

@rockg can you rebase again, ping when green.

any final comments?

@rockg rockg force-pushed the pep8-round2 branch 2 times, most recently from 3c45f56 to 5419f10 Compare January 16, 2016 04:01
@rockg
Copy link
Contributor Author

rockg commented Jan 16, 2016

@jreback All set.

jreback added a commit that referenced this pull request Jan 16, 2016
@jreback jreback merged commit 1945eed into pandas-dev:master Jan 16, 2016
@jreback
Copy link
Contributor

jreback commented Jan 16, 2016

@rockg thank you sir!

of course followups welcome

  • amend docs to show using yapf (and add to the ci/* where flake8 is installed)
  • more core peps

@TomAugspurger
Copy link
Contributor

FYI I'm mucking around in core/reshape.py right now working on from-dummies, so I can clean that up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants