-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Add Styler.pipe() method (#23229) #23384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @nmusolino! Thanks for updating the PR.
Comment last updated on November 18, 2018 at 02:04 Hours UTC |
2047052
to
c6a92cd
Compare
Codecov Report
Continue to review full report at Codecov.
|
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -214,6 +214,7 @@ Other Enhancements | |||
- Compatibility with Matplotlib 3.0 (:issue:`22790`). | |||
- Added :meth:`Interval.overlaps`, :meth:`IntervalArray.overlaps`, and :meth:`IntervalIndex.overlaps` for determining overlaps between interval-like objects (:issue:`21998`) | |||
- :meth:`Timestamp.tz_localize`, :meth:`DatetimeIndex.tz_localize`, and :meth:`Series.tz_localize` have gained the ``nonexistent`` argument for alternative handling of nonexistent times. See :ref:`timeseries.timezone_nonexsistent` (:issue:`8917`) | |||
- :meth:`Styler.pipe` method added, to simplify application of user-defined functions that operate on stylers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reference the issue number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, you should create a mini-section explaining the enhancement in more detail so that end users can understand the benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can do that.
pandas/io/formats/style.py
Outdated
@@ -1222,6 +1222,35 @@ class MyStyler(cls): | |||
|
|||
return MyStyler | |||
|
|||
def pipe(self, func, *args, **kwargs): | |||
""" | |||
Apply func(self, *args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a little more descriptive. Apply func
to what? Also, for what purpose?
(yes, as reviewers, we can see what the rationale was from your PR, so just put that rationale into the docstring)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll change the doctoring, and try to provide some additional context/background.
def g(**kwargs): | ||
assert 'styler' in kwargs | ||
return kwargs['styler'].data | ||
assert self.df.style.pipe((g, 'styler')) is self.df |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you comment here explaining why you're using is
instead of tm.assert_frame_equal
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, add a comment explaining why you're creating this g
function in the first place (perhaps a more descriptive function name is needed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can rewrite this a bit to simplify it. Essentially, I wanted to test that when pipe()
is called with a (callable, string) tuple, it works as advertised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @gfyoung, for the prompt review and feedback. All of your comments make sense, and I'll try to make the suggested changes.
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -214,6 +214,7 @@ Other Enhancements | |||
- Compatibility with Matplotlib 3.0 (:issue:`22790`). | |||
- Added :meth:`Interval.overlaps`, :meth:`IntervalArray.overlaps`, and :meth:`IntervalIndex.overlaps` for determining overlaps between interval-like objects (:issue:`21998`) | |||
- :meth:`Timestamp.tz_localize`, :meth:`DatetimeIndex.tz_localize`, and :meth:`Series.tz_localize` have gained the ``nonexistent`` argument for alternative handling of nonexistent times. See :ref:`timeseries.timezone_nonexsistent` (:issue:`8917`) | |||
- :meth:`Styler.pipe` method added, to simplify application of user-defined functions that operate on stylers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can do that.
def g(**kwargs): | ||
assert 'styler' in kwargs | ||
return kwargs['styler'].data | ||
assert self.df.style.pipe((g, 'styler')) is self.df |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can rewrite this a bit to simplify it. Essentially, I wanted to test that when pipe()
is called with a (callable, string) tuple, it works as advertised.
pandas/io/formats/style.py
Outdated
@@ -1222,6 +1222,35 @@ class MyStyler(cls): | |||
|
|||
return MyStyler | |||
|
|||
def pipe(self, func, *args, **kwargs): | |||
""" | |||
Apply func(self, *args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll change the doctoring, and try to provide some additional context/background.
I'll summarize the changes made since the original push:
I wasn't able to test how the whatsnew will be rendered. At this point, the main questions are:
|
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -180,6 +180,26 @@ array, but rather an ``ExtensionArray``: | |||
This is the same behavior as ``Series.values`` for categorical data. See | |||
:ref:`whatsnew_0240.api_breaking.interval_values` for more. | |||
|
|||
New ``Styler.pipe()`` method | |||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |||
A new method :meth:`pandas.formats.style.Styler.pipe()` was added (:issue:`#23229`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Styler as gained a pipe method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments about the docstring.
pandas/io/formats/style.py
Outdated
@@ -1222,6 +1222,76 @@ class MyStyler(cls): | |||
|
|||
return MyStyler | |||
|
|||
def pipe(self, func, *args, **kwargs): | |||
""" | |||
Apply func(self, \*args, \*\*kwargs), and return the result. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add double backticks around func(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pandas/io/formats/style.py
Outdated
``callable`` that expects the Styler. | ||
args : iterable, optional | ||
positional arguments passed into ``func``. | ||
kwargs : mapping, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the *
before args and kwargs. I'd prefer dict
intead of mapping. Parameter descriptions should start with a capital letter.
If you run ./scripts/validate_docstrings.py pandas.Styler.pipe
you should get a report with most of the docstring issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. As mentioned above, now validating.
pandas/io/formats/style.py
Outdated
|
||
Returns | ||
------- | ||
object : the value returned by ``func``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not always returning a pandas.Styler
object?
The description should go in the next line (indented), and without the colon. And it should start by a capital letter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type is determined by func
, the user-provided function.
pandas/io/formats/style.py
Outdated
-------- | ||
Styler.apply | ||
Styler.applymap | ||
pandas.DataFrame.pipe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas.
prefix not needed. Please add a description on why those are relevant in the context of Styler.pipe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pandas/io/formats/style.py
Outdated
... ''' Highlight the indicated element, and its containing row. ''' | ||
... cell, row = pd.IndexSlice[row_label, column], pd.IndexSlice[row_label, :] | ||
... return (styler.set_properties(cell, **{'background-color': '#ffffcc'}) | ||
... .set_properties(row, **{'background-color': '#ffccc0'})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be great if you can find a simpler example to illustrate .pipe()
. Something like adding quotes around the values, or something as simple as that, so the function does not distract on what .pipe()
does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good suggestion. I chose this to illustrate functionality that seems to belong in a function, and to use a function that could sensibly be applied twice.
BUT that's not the clearest example for the docs. I'll use a much simpler function as you suggest.
1f99d7b
to
a0e758e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/io/formats/style.py
Outdated
@@ -1222,6 +1222,76 @@ class MyStyler(cls): | |||
|
|||
return MyStyler | |||
|
|||
def pipe(self, func, *args, **kwargs): | |||
""" | |||
Apply func(self, \*args, \*\*kwargs), and return the result. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pandas/io/formats/style.py
Outdated
``callable`` that expects the Styler. | ||
args : iterable, optional | ||
positional arguments passed into ``func``. | ||
kwargs : mapping, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. As mentioned above, now validating.
pandas/io/formats/style.py
Outdated
|
||
Returns | ||
------- | ||
object : the value returned by ``func``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type is determined by func
, the user-provided function.
pandas/io/formats/style.py
Outdated
-------- | ||
Styler.apply | ||
Styler.applymap | ||
pandas.DataFrame.pipe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pandas/io/formats/style.py
Outdated
... ''' Highlight the indicated element, and its containing row. ''' | ||
... cell, row = pd.IndexSlice[row_label, column], pd.IndexSlice[row_label, :] | ||
... return (styler.set_properties(cell, **{'background-color': '#ffffcc'}) | ||
... .set_properties(row, **{'background-color': '#ffccc0'})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good suggestion. I chose this to illustrate functionality that seems to belong in a function, and to use a function that could sensibly be applied twice.
BUT that's not the clearest example for the docs. I'll use a much simpler function as you suggest.
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -180,6 +180,26 @@ array, but rather an ``ExtensionArray``: | |||
This is the same behavior as ``Series.values`` for categorical data. See | |||
:ref:`whatsnew_0240.api_breaking.interval_values` for more. | |||
|
|||
New ``Styler.pipe()`` method | |||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |||
A new method :meth:`pandas.formats.style.Styler.pipe()` was added (:issue:`#23229`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -180,6 +180,27 @@ array, but rather an ``ExtensionArray``: | |||
This is the same behavior as ``Series.values`` for categorical data. See | |||
:ref:`whatsnew_0240.api_breaking.interval_values` for more. | |||
|
|||
New ``Styler.pipe()`` method | |||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |||
The dataframe Styler class has gained a :meth:`~pandas.io.formats.style.Styler.pipe` method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Newline before this line.
You should be able to link to Styler
class here with :class:`pandas.io.formats.style.Styler`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can make those changes. I’ll also (a) add an anchor, like .. _whatsnew_0240.enhancements.styler_pipe:
and (b) change the code block style to match everything else in the file.
I'm fine with skipping the example with f and g. That could also be made
into a ..code-block:: python directive.
…On Thu, Nov 1, 2018 at 8:03 AM Nicholas Musolino ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In pandas/io/formats/style.py
<#23384 (comment)>:
> @@ -1222,6 +1222,76 @@ class MyStyler(cls):
return MyStyler
+ def pipe(self, func, *args, **kwargs):
I am traveling at the moment and am not able to copy-paste the output at
this moment. But my recollection is:
1. There were no docstring style issues identified by the
validate_docstrings.py script, in my most recent version.
2. The initial doctests on line 1260 (namely >>>
f(g(df.style.set_precision(3), arg1=a), arg2=b, arg3=c)) and the
subsequent rewritten version fail because the functions f and g are not
defined. I suppose the same thing happens in the analogous DataFrame.pipe
doctest
<https://github.com/pandas-dev/pandas/blob/0409521665bd436a10aea7e06336066bf07ff057/pandas/core/generic.py#L4230>
.
3. The later doctest on line 1287 executes as intended, but fails when
the actual output (a string like <pandas.io.formats.style.Styler at
0xaabcd538364>) does not match the listed output in the doctest (no
output included). I could change this to make it formally pass, but it
would require a doctest: +ELLIPSIS directive (to accommodate the
memory address in the string), and it would be inconsistent with other
doctests in this module (which don’t show the styler’s string
representation as output).
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#23384 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABQHImhjtI1ukK57kFlbGC6WyVYQWrNxks5uqw0lgaJpZM4X9o-F>
.
|
Merge conflict in 0.24.0, if you could update. |
I can resolve the merge conflict and turn the first example into a |
Here is the updated method documentation: [Large, outdated image removed for brevity] Here is the output from the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments, still needs some work.
Make sure that the docstring validation does not have any error (and that the CI is green).
Thanks for working on this!
pandas/io/formats/style.py
Outdated
*args : iterable, optional | ||
Positional arguments passed into ``func``. | ||
**kwargs : dict, optional | ||
Dictionary of keyword arguments passed into ``func``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can merge args and kwargs in a single line, and you don't need types or optional, as those are always the caseÑ:
*args, **kwargs
Arguments passed to `func`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed as suggested. As a result of these changes, the validate_docstring.py
script reports an error, but I agree this is better.
pandas/io/formats/style.py
Outdated
The user-defined highlight function above can be called within a | ||
sequence of other style modifications: | ||
|
||
>>> df = pd.DataFrame({'A': list(range(-1, 4)), 'X': np.arange(0.2, 1.2, 0.2)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use something that looks real, no range
fuctions, and no a
, x
, foo
columns. We start to have many examples with animal names (e.g. cat
, penguin
, falcon
...). I'd use that for consistency (hopefully at some point we have all the examples with similar data). Just have couple of columns that are percentages, and 2 or 3 rows.
Then rename set_standard_formatting
to percentage_format
(if I understand correctly, that's what it is, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some changes here, and fixed the reference in the text, which was incorrect. I changed the function name as you suggested, while trying to make it clear that the function was user-defined and application-specific.
I'll be honest, using animal names in examples seems a little silly to me, but that's up to the pandas developers.
In this case, using numeric data rather than strings made more sense to me, since numeric data presents styling/formatting choices, and special values (min/max/null) can be highlighted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/io/formats/style.py
Outdated
*args : iterable, optional | ||
Positional arguments passed into ``func``. | ||
**kwargs : dict, optional | ||
Dictionary of keyword arguments passed into ``func``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed as suggested. As a result of these changes, the validate_docstring.py
script reports an error, but I agree this is better.
pandas/io/formats/style.py
Outdated
The user-defined highlight function above can be called within a | ||
sequence of other style modifications: | ||
|
||
>>> df = pd.DataFrame({'A': list(range(-1, 4)), 'X': np.arange(0.2, 1.2, 0.2)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some changes here, and fixed the reference in the text, which was incorrect. I changed the function name as you suggested, while trying to make it clear that the function was user-defined and application-specific.
I'll be honest, using animal names in examples seems a little silly to me, but that's up to the pandas developers.
In this case, using numeric data rather than strings made more sense to me, since numeric data presents styling/formatting choices, and special values (min/max/null) can be highlighted.
@nmusolino a few lines are too long: |
doc/source/whatsnew/v0.24.0.txt
Outdated
return (styler.format({'N': '{:,}', 'X': '{:.1%}'}) | ||
.set_properties(**{'text-align': 'right'})) | ||
|
||
(df.style.set_precision(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is executed. I think the exception at https://travis-ci.org/pandas-dev/pandas/jobs/453316565#L2108 is from this. I'd recommend defining df
in this code block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@TomAugspurger, I've fixed the line length issues, and made sure the "whatsnew" documentation builds correctly, and looks reasonable. Any other changes you would suggest? And would you like me to squash, rebase, and force-push to clean up the commit history? An updated docstring rendering can be found below. |
@jreback @TomAugspurger , could you please take another look at this? I have rebased twice in order to keep up with changes in the whatsnew file, and I'd like to get this PR ready for completion before rebasing again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, if you could fix the conflict and ping when the CI passes I'll merge.
@TomAugspurger , I resolved the conflict, and CI is passing. Would you be able to merge? |
Looks good, thanks @nmusolino! |
* Add Styler.pipe() method, akin to DataFrame.pipe()
* Add Styler.pipe() method, akin to DataFrame.pipe()
Added
Styler.pipe()
method. This allows users to easily apply and compose functions that operate on Styler objects, just like theDataFrame.pipe()
method does for dataframes.pipe()
method, akin to DataFrame.pipe() #23229git diff upstream/master -u -- "*.py" | flake8 --diff