Skip to content

BUG/ENH: Styler.format() always inherits from .set_na_rep() #40060

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 15 commits into from

Conversation

attack68
Copy link
Contributor

@attack68 attack68 commented Feb 25, 2021

  • closes BUG: Styler format na_rep arg overwrites existing when not given. #40032
  • tests added: test_display_format_subset_interaction: why: see below.
  • tests removed: test_format_bad_na_rep: why: na_rep works if not string in some cases - changed to duck-typing.
  • tests lefts unchanged that pass: test_display_format, test_display_set_precision, test_format_subset, test_display_dict
  • tests expanded: test_display_format_raises
  • Ensure all linting tests pass, see here for how to run them
  • whatsnew entry

The issue is that there are currently two competing methods in Styler for missing value represenatation:

  • .set_na_rep('x') and
  • .format(na_rep='y').

The two do not interact well together dynamically. This PR brings the format wrapping into the class so that format can be updated and inherit values in set_na_rep after it has been called. The image below highlights the impact this PR has and the added test replicates this pattern:

Screen Shot 2021-02-25 at 20 38 45

Also at the bottom here, we add a KeyError to remove the ambiguity between what happens when a dict is provided with a column Key that is not specifically included within the provided subset slice.

@attack68 attack68 changed the title Fix formatter na rep BUG/ENH: Styler.format() always inherits from .set_na_rep() Feb 25, 2021
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a number of questions

def format(
self,
formatter: Optional[
Union[Dict[Any, Optional[Union[str, Callable]]], str, Callable]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you type this at the top / in conftest.py (note that we can probably generally type these formatters across IO functions, but that's for another day)

"""
Format the text display value of cells.

Parameters
----------
formatter : str, callable, dict or None
If ``formatter`` is None, the default formatter is used.
Format specification to use for displaying values. If ``None``, the default
formatter is used. If ``dict``, keys should corresponcd to column names,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correspond

Representation for missing values.
If ``na_rep`` is None, no special formatting is applied.
Representation for missing values. If ``None``, will revert to using
``Styler.na_rep``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm why are you adding this special function? we don't do this anywhere else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your comment about the fact that nowhere else is there a set_na_rep method that lends itself to the answer here. It is possible for there to be a solution where both functions (that currently exists) operate consistently, as opposed to somewhat inconsistently and inconveniently at present

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add that this might not even be interpreted as a change. no special formatting applied might be expected to be that `Styler.na_rep`` is used as default, it is ambiguous. I was just clarifying what the bug fix did here.

@@ -1294,6 +1307,40 @@ def hide_columns(self, subset) -> Styler:
self.hidden_columns = self.columns.get_indexer_for(hidden_df.columns)
return self

def _default_formatter(self, x):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this not a module level function?

return f"{x:.{self.precision}f}"
return x

def _maybe_wrap_formatter(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this moved? what is the point of self.na_rep?

Copy link
Contributor Author

@attack68 attack68 Feb 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its moved because as a module level it can't act dynamically and update self.na_rep if it is later changed. At class level it can so this goes to inheritance.

@jreback
Copy link
Contributor

jreback commented Feb 27, 2021

.set_na_rep('x') and
.format(na_rep='y').

if there are 2 let's deprecate one. we use .format() everywhere else, I don't really see having a specific method na rep as all that useful (we also have set_precision as well).

we need one canoncial way to do things.

@jreback jreback added API - Consistency Internal Consistency of API/Behavior Code Style Code style, linting, code_checks labels Feb 27, 2021
@jreback
Copy link
Contributor

jreback commented Feb 27, 2021

try not to mix bug fixes with api changes in a single pr. keep them as focused as possible.

@attack68
Copy link
Contributor Author

@jreback I agree somewhat that maybe multiple different set_na_rep and set_precision and format methods are not ideal and maybe incorporating everything to a format function is preferable. I've never suggested a deprecation previously (and all of these are legacy decisions from my perspective) but am happy to work on a solution otherwise, if you think thats better.

@attack68
Copy link
Contributor Author

attack68 commented Mar 1, 2021

I think the PR above is better both in terms of API and code maintenance. Do close this PR if you agree.

@attack68 attack68 closed this Mar 3, 2021
@attack68 attack68 deleted the fix_formatter_na_rep branch March 5, 2021 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API - Consistency Internal Consistency of API/Behavior Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Styler format na_rep arg overwrites existing when not given.
2 participants