Skip to content

EHN: add ability to format index and col names to Styler #57880

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Mar 26, 2024

Conversation

quangngd
Copy link
Contributor

@quangngd quangngd commented Mar 18, 2024

xref #57362

Stage one. Part of an multi-stages effort: #57880 (comment)

Adding a method to Styler and ensuring it works for the default HTML cases with tests in the appropriate pages.

@quangngd quangngd requested a review from attack68 as a code owner March 18, 2024 05:58
@attack68
Copy link
Contributor

Hi thanks for this PR.

A Styler.format_names or Styler.format_index_names method has been on the requested agenda for a while now.
This PR mixes three concepts, however.

  • One is adding a method to Styler and ensuring it works for the default HTML cases with tests in the appropriate pages.

  • Two is extending it to the Styler.to_latex implementation and adding specific to_latex tests.

  • Three is ensuring that the legacy DataFrame.to_latex implementation is capable of using it (in its limited form).

  • (Four would be about checking if Styler.to_string and Styler.to_excel will function accordingly)

  • (Five is ensuring documentation and examples exist in Styler.format_names )

Since a method addition is quite a significant undertaking. I would ask that you split this PR into multiple stages.
The first stage I would look to review is part One. Adding the method just to Styler and adding tests in generic places, including testing the default implementation for HTML output.

From what I can see you have followed the template of previous code and that is a great start.

@attack68 attack68 added Styler conditional formatting using DataFrame.style Enhancement labels Mar 18, 2024
@quangngd quangngd force-pushed the styler_format_names branch from 9062cff to 3cbe658 Compare March 19, 2024 03:29
@quangngd
Copy link
Contributor Author

Hi @attack68,

I've updated my code. It's now my attempt at the first point in your list.
Please help me take a look.

@attack68
Copy link
Contributor

I think we should probably call this method format_index_names. Even though its longer I think it is more obvious what it does.

I think we also need greater test coverage to cover lines.
Take a look at https://github.com/pandas-dev/pandas/blob/main/pandas/tests/io/formats/style/test_format.py

You should probably look to replicate the tests for the specific case :

  • test_format_index_level
  • test_format_clear
  • test_format_callable
  • test_format_dict
  • test_display_format_index

@attack68
Copy link
Contributor

Since your code also involves cases where things are hidden you should also include a test to show that, say if a multi-index has a level hidden, the format function still works correctly on the levels that are still shown.

@quangngd
Copy link
Contributor Author

@attack68 added a bunch of new tests.

don't immediately see the point of test_display_format_index for this case so i didnt adapt it.

@attack68
Copy link
Contributor

Are you familiar with coverage ? The idea was that that collection of tests should pretty much cover all your code branches. Some of the re-used material like _maybe_wrap_formatter is already well covered.


Notes
-----
This method has a similar signature to :meth:`Styler.format_index`. Since `names` are generally label
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls fix this indentation. my fault from previous i expect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@quangngd
Copy link
Contributor Author

quangngd commented Mar 19, 2024

Are you familiar with coverage ? The idea was that that collection of tests should pretty much cover all your code branches. Some of the re-used material like _maybe_wrap_formatter is already well covered.

Yeah. I ran coverage and see my new method is well covered. Mostly because i reuse a lot of things, as you have pointed out.

So i don't really understand what your suggestion is. Can you elaborate? Or maybe most of the material is reuse so you suggest we could achieve the same coverage without so much test? I hope not.

@attack68
Copy link
Contributor

My point is you need to cover your code with your tests. Aim for 100%. It seems you have achieved that. Do you agree?

@quangngd
Copy link
Contributor Author

Thought so too.

If there is nothing major, I think i would fix up whats left and wrap this up. Then ill start to work on the next point in a new PR.

In case if things go stale, this PR can still be merged as an enhancement on its own. (ofc versionadded and whatsnew have to be filled before merge).

Does that sound good to you?

@attack68
Copy link
Contributor

Indeed, is what I would suggest. Thanks for the good work here and following the structure. Hopefully the other reviewers will recognise this as quite a tidy addition and be fairly quick approve also. I'll look to add my approval when all the bits in and tests pass.

There is one additional file you have to add to: doc/source/reference/style.rst

@quangngd quangngd requested a review from mroeschke as a code owner March 20, 2024 04:01
Comment on lines 1038 to 1055
<thead>
<tr>
<th class="blank" >&nbsp;</th>
<th class="index_name level0" >{expected_columns[0]}</th>
<th id="T_test_level0_col0" class="col_heading level0 col0" colspan="2">A</th>
<th id="T_test_level0_col2" class="col_heading level0 col2" colspan="2">B</th>
</tr>
<tr>
<th class="blank" >&nbsp;</th>
<th class="index_name level1" >{expected_columns[1]}</th>
<th id="T_test_level1_col0" class="col_heading level1 col0" >a</th>
<th id="T_test_level1_col1" class="col_heading level1 col1" >b</th>
<th id="T_test_level1_col2" class="col_heading level1 col2" >a</th>
<th id="T_test_level1_col3" class="col_heading level1 col3" >b</th>
</tr>
<tr>
<th class="index_name level0" >{expected_index[0]}</th>
<th class="index_name level1" >{expected_index[1]}</th>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its probably sufficient to assert expected in result rather than equate the large bulk of text. This also future proofs the test so that if other changes are implemented which dont affect the specific format_index_names function this test wont need updating.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Contributor

@attack68 attack68 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a whatsnew entry.
Otherwise LGTM.

@quangngd
Copy link
Contributor Author

Do you know the target version? Or should we wait for an active member?

Comment on lines 1643 to 1644
When using a ``formatter`` string the dtypes must be compatible, otherwise a
`ValueError` will be raised.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put this in its own

Raises
------

section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a whatsnew note in v3.0.0.rst under Other Enhancements?

@quangngd
Copy link
Contributor Author

Could you add a whatsnew note in v3.0.0.rst under Other Enhancements?

Added

@mroeschke mroeschke added this to the 3.0 milestone Mar 26, 2024
@mroeschke mroeschke merged commit 010328f into pandas-dev:main Mar 26, 2024
46 checks passed
@mroeschke
Copy link
Member

Thanks @quangngd

@attack68
Copy link
Contributor

@quangngd are you interested to work on stage 2 of this. I think it would be good to get in so that there is no disconnect for users between these functions and the LaTeX output.

To my knowledge there are currently no undocumented issues between the styler and the LaTeX output so good for me to keep an eye on.

@quangngd
Copy link
Contributor Author

@attack68
Sure! Been a bit busy at work lately but I'll find some time to do it in the upcoming weekends.

pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
…57880)

* add new method to styler

* add html test

* fix type

* rename to format_index_names

* Update pandas/io/formats/style_render.py

Co-authored-by: JHM Darbyshire <[email protected]>

* Update pandas/io/formats/style_render.py

Co-authored-by: JHM Darbyshire <[email protected]>

* add tests

* add test

* more doc

* doc

* update code_checks

* add example

* update test

* Update pandas/io/formats/style_render.py

Co-authored-by: Matthew Roeschke <[email protected]>

* Update pandas/io/formats/style_render.py

Co-authored-by: Matthew Roeschke <[email protected]>

* update doc

---------

Co-authored-by: JHM Darbyshire <[email protected]>
Co-authored-by: Matthew Roeschke <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Styler conditional formatting using DataFrame.style
Projects
None yet
3 participants