ENH: add kwargs rename to Styler.format_index for overwriting index labels #45288

Closed
wants to merge 38 commits into from

Conversation

attack68
Contributor

@attack68 attack68 commented Jan 9, 2022

In methods such as DataFrame.to_latex and DataFrame.to_html the header keyword arg was implemented to provide aliases for column headers, instead of using the underlying DataFrame labels.

Currently Styler.format_index allows adding a generic formatter to change the display of index labels, but this is not an easy way to simply provide a list of string aliases to use instead. This PR adds that capability, which will simplify the transition of the above methods to the Styler implementation.

@jreback
Contributor

jreback commented Jan 10, 2022

ok so is the plan to deprecate header? why are you adding this rather than we just use the given labels?

@attack68
Contributor Author

ok so is the plan to deprecate header? why are you adding this rather than we just use the given labels?

3 small arguments for use case:

First, progressing Styler as a visualisation tool that allows full customisation of the display, independent of the data and label values. This is already partly possible: you can now apply a formatting function (1 variable input -> str) to indexes, which is very useful, for example, when formatting dates. But a generic function cannot simply overwrite the (visible) labels with user-specified values; this aliases arg will allow that.

Second, the existing IO methods have this header arg, which allows overwriting column labels with aliases. Personally I think it's not sensible to try and squish all the Styler functionality (via chaining) into a single DataFrame.to_html and to_latex call, but the path of least resistance (community approvals wise) seems to be to not take away functionality from existing kwargs. I.e. we would likely keep header=xx and, in the Styler implementation, just refactor it as a call to format_index(aliases=xx).

Finally, in a more specialised scenario you might want a formatter that uses a loop index, or combines all level values simultaneously to determine an alias, which you then pass directly here. Note that the indexing of the data can still be maintained underneath via the MultiIndex.

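from pandas import DataFrame, MultiIndex

multiindex = MultiIndex.from_product([["Mac", "Windows"], ["2GB", "4GB"]])
styler = DataFrame({"data": ["x"] * 4}, index=multiindex).style

def relabel(os, ram, loop):
    return f"{loop}_{os}_{ram}"

aliases = [relabel(idx[0], idx[1], i) for i, idx in enumerate(multiindex)]
# `aliases` is the keyword proposed in this PR, not an existing pandas argument
styler.hide(level=0).format_index(aliases=aliases)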

      level 1           data
0_Mac_2GB                  x
1_Mac_4GB                  x
2_Windows_2GB              x
3_Windows_4GB              x

@attack68
Contributor Author

[Five screenshots attached, taken 2022-01-20 between 19:01:25 and 19:02:16]

Contributor

@jreback jreback left a comment

not objecting to the need to do this, i get it. but we don't use alias anywhere; why isn't this just .rename()?

@jreback jreback added the Styler conditional formatting using DataFrame.style label Feb 9, 2022
@attack68
Contributor Author

attack68 commented Feb 9, 2022

not objecting to the need to do this, i get it. but we don't use alias anywhere; why isn't this just .rename()?

rename loosely suggests the data is modified, when really only the display changes. Hence it is included within the format_index method.

Didn't really feel this warranted its own method when all other index formatting is done within format_index.
(and it's not possible to combine the two, so there was that also)

But I'm not tied to anything; you tell me which you prefer and I'll make changes...

…_multi

# Conflicts:
#	doc/source/whatsnew/v1.5.0.rst
@attack68
Contributor Author

@jreback what about a different kwarg other than alias, how about names or labels?

@attack68 attack68 changed the title ENH: add kwargs aliases to Styler.format_index for overwriting index labels ENH: add kwargs rename to Styler.format_index for overwriting index labels Mar 4, 2022
@@ -1203,6 +1207,17 @@ def format_index(
Convert string patterns containing https://, http://, ftp:// or www. to
HTML <a> tags as clickable URL hyperlinks if "html", or LaTeX \href
commands if "latex".
rename : list of str, list of list of str
Contributor

umm the user can simply .rename_axis on the frame no? why adding api here that is duplicative

Contributor Author

my design objective with styler has been to customise the display of tables without altering the underlying dataframe object (though of course many of the styler functions can be worked around by altering the frame directly).

you can lose the power of indexing if you relabel for the purpose of printout.

additionally, methods like DataFrame.to_html and DataFrame.to_latex use a header argument which does this relabelling. So in order to replicate that behaviour with the styler code, this PR is needed.
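
The point about losing indexing can be illustrated with a small sketch. The frame and the display labels below are hypothetical; only DataFrame.set_axis is an existing pandas call.

```python
import pandas as pd

# Hypothetical working frame with short, code-friendly column labels.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Relabelled copy intended only for printing (display labels are made up).
printable = df.set_axis(["Alpha (kg)", "Beta (m)"], axis=1)

# The original frame still supports lookups by its working labels...
total = df["a"].sum()

# ...whereas the relabelled copy no longer recognises them.
lost = "a" not in printable.columns
```

Relabelling through the frame therefore means either keeping two objects around or giving up the working labels, which is the trade-off a display-only relabel is meant to avoid.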

Contributor Author

here's a good example:
[Screenshot attached, taken 2022-03-07 at 18:14:58]

@attack68
Contributor Author

@jreback feels like this one has stalled and is getting pushback anyway. If possible, please take one last look, since it factors into the PRs where I convert DataFrame.to_latex and DataFrame.to_html to use the Styler implementation. If this one is closed then I have to come up with another way of refactoring them. thx

…_multi

# Conflicts:
#	pandas/tests/io/formats/style/test_format.py
@jreback
Contributor

jreback commented Apr 27, 2022

@attack68 circling back around on this.

@attack68
Contributor Author

In order to be able to completely re-write DataFrame.to_latex with the Styler implementation, this is the last piece of functionality needed on the Styler side. Then issues being faced by users, such as stackoverflow q, can be automatically addressed.

@jreback
Contributor

jreback commented Jul 24, 2022

@attack68 i am sympathetic here as i know you want to make this fully compatible with the existing formatters. The issues i have here are:

  • adding yet another keyword to the api, thus bloating the api even more.
  • we are merging concerns here: formatting output data and replacing the data, which we already have a good way of doing via .rename_axis / .set_axis. This leads to more than one way to do something.

I agree that this could be viewed as on the line, but it's much nicer to keep this api nice and tight and use the existing fluent api to solve this problem.

I see that we already have styler methods to do some index manipulation e.g. https://pandas.pydata.org/pandas-docs/dev/reference/api/pandas.io.formats.style.Styler.format_index.html?highlight=styler (.apply_index, format_index). I guess adding a .rename_index method here might be ok (this preserves the fluent api and doesn't add yet another keyword).
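
As a rough sketch of the existing frame-level methods named in the second bullet, assuming a small made-up frame: rename_axis changes the name of an axis, while set_axis replaces the labels themselves, so only the latter relabels what is printed. Both return new objects and can be chained.

```python
import pandas as pd

# Hypothetical frame; labels and names are illustrative only.
df = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])

# rename_axis sets the *name* of the index; the labels stay "x", "y".
named = df.rename_axis("key")

# set_axis replaces the labels themselves on the returned copy.
relabelled = df.set_axis(["X", "Y"], axis=0)

# Both calls leave the original frame untouched.
original_labels = list(df.index)
```

Either result could then be passed on to .style, at the cost of the styled object no longer carrying the original labels.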

@attack68
Contributor Author

@jreback, OK, yes I see your point, rename_index could also serve an additional purpose, which I will explain in the docs write up. Will look to do that instead.

@attack68 attack68 closed this Jul 25, 2022