BUG-17280 to_html follows display.precision for column numbers in notebooks #25914

Merged
merged 9 commits into from
Apr 4, 2019
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.25.0.rst
@@ -353,6 +353,7 @@ I/O
 - Bug in :meth:`DataFrame.to_string` and :meth:`DataFrame.to_latex` that would lead to incorrect output when the ``header`` keyword is used (:issue:`16718`)
 - Bug in :func:`read_csv` not properly interpreting the UTF8 encoded filenames on Windows on Python 3.6+ (:issue:`15086`)
 - Improved performance in :meth:`pandas.read_stata` and :class:`pandas.io.stata.StataReader` when converting columns that have missing values (:issue:`25772`)
+- Bug in :meth:`DataFrame.to_html` where header numbers would ignore display options when rounding (:issue:`17280`)


 Plotting
17 changes: 16 additions & 1 deletion pandas/io/formats/html.py
@@ -36,7 +36,7 @@ def __init__(self, formatter, classes=None, border=None):
         self.classes = classes

         self.frame = self.fmt.frame
-        self.columns = self.fmt.tr_frame.columns
+        self.columns = self._get_columns_formatted_values()
         self.elements = []
         self.bold_rows = self.fmt.kwds.get('bold_rows', False)
         self.escape = self.fmt.kwds.get('escape', True)
@@ -70,6 +70,9 @@ def row_levels(self):
             # not showing (row) index
             return 0

+    def _get_columns_formatted_values(self):
+        return self.fmt.tr_frame.columns
+
     @property
     def is_truncated(self):
         return self.fmt.is_truncated
@@ -491,9 +494,21 @@ class NotebookFormatter(HTMLFormatter):
     DataFrame._repr_html_() and DataFrame.to_html(notebook=True)
     """

+    def __init__(self, formatter, classes=None, border=None):
Contributor

Does this really need to be in __init__? I'd rather have this call a method in the base class which is overridden here, e.g.

self.columns = self._get_columns_formatted_values()
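
A minimal sketch of the pattern suggested here, assuming hypothetical stand-in classes (`BaseFormatter`, `RoundingFormatter`) rather than the actual pandas formatters: the base `__init__` calls the hook once, and the subclass overrides only the hook, so it needs no `__init__` of its own.

```python
# Hypothetical stand-ins for HTMLFormatter / NotebookFormatter; not pandas code.
class BaseFormatter:
    def __init__(self, columns):
        self.raw_columns = list(columns)
        # The base __init__ calls the hook; subclasses override only the hook.
        self.columns = self._get_columns_formatted_values()

    def _get_columns_formatted_values(self):
        # Default behaviour: leave column labels untouched.
        return self.raw_columns


class RoundingFormatter(BaseFormatter):
    precision = 3  # assumed stand-in for the 'display.precision' option

    def _get_columns_formatted_values(self):
        # Format float labels only; other labels pass through unchanged.
        return [
            "{:.{p}f}".format(c, p=self.precision) if isinstance(c, float) else c
            for c in self.raw_columns
        ]


plain = BaseFormatter([0.55555, "a"])
rounded = RoundingFormatter([0.55555, "a"])
```

Here `plain.columns` keeps the original labels while `rounded.columns` holds the float label formatted to three decimals.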

Member

rather have this call a method in the base class which is overridden here

In #24651, I separated out the notebook functionality from HTMLFormatter into NotebookFormatter, using HTMLFormatter as the base class. The base class of HTMLFormatter is TableFormatter, which is also the base class of DataFrameFormatter (used for to_string and to_latex) and LatexFormatter.

I envisaged that at some point in the development cycle it may become desirable to also create a ToHTMLFormatter using HTMLFormatter as the base.

IMO HTMLFormatter and NotebookFormatter are not the appropriate location for non-markup related formatting issues that are common across output-formatting methods. I think this issue is in that category and should ideally be in DataFrameFormatter or TableFormatter.

However, if the fix that closes the open issue ends up somewhere in io/formats/html.py, then I think a TODO to remove the code at a later date should be added.

Contributor Author

and the bug was present in to_html(notebook=False) as well

I would argue it is not a bug here, since the display options should not affect the to_html() output. Compare with the display options max_rows, max_columns, show_dimensions, max_colwidth, etc., which only apply to the notebook display.

I think I'm getting a bit confused here. Doesn't this mean that this is not an issue that is common across formatters? Because to_html should only check display preferences when used for display purposes in a notebook and not when generating HTML for a web page?

Member

Yeah, I can understand why you're confused; there is a slight problem with the current class hierarchy. The workaround for max_colwidth was to use with option_context to ignore the display option for to_html, see #24841.

Member

I think the ideal solution would be to have, say, a use_display_options attribute in DataFrameFormatter. Then in HTMLFormatter we would just have self.fmt.use_display_options = False, in NotebookFormatter we would have self.fmt.use_display_options = True, and all the display formatting could be handled in io/formats/format.py.

But this is way outside the scope of this PR. So a with option_context work-around would be fine for now IMO.
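
A rough sketch of how such a flag could work, under stated assumptions: `use_display_options` is the attribute proposed above and does not exist in pandas, and `FrameFormatter`, `HtmlWriter`, `NotebookWriter` and `DISPLAY_OPTIONS` are hypothetical stand-ins, not the real formatter classes or option registry.

```python
# Hypothetical sketch only: none of these names are the real pandas classes.
DISPLAY_OPTIONS = {"precision": 3}  # assumed stand-in for pandas display options


class FrameFormatter:
    """Owns all label formatting, playing the role suggested for format.py."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.use_display_options = False

    def formatted_columns(self):
        if not self.use_display_options:
            return list(self.columns)
        p = DISPLAY_OPTIONS["precision"]
        return [
            "{:.{p}f}".format(c, p=p) if isinstance(c, float) else c
            for c in self.columns
        ]


class HtmlWriter:
    use_display_options = False  # plain HTML output ignores display options

    def __init__(self, fmt):
        self.fmt = fmt
        # The writer only flips the flag; formatting stays in FrameFormatter.
        self.fmt.use_display_options = self.use_display_options

    def header_cells(self):
        return ["<th>{}</th>".format(c) for c in self.fmt.formatted_columns()]


class NotebookWriter(HtmlWriter):
    use_display_options = True  # notebook display honours display options


html_cells = HtmlWriter(FrameFormatter([0.55555])).header_cells()
notebook_cells = NotebookWriter(FrameFormatter([0.55555])).header_cells()
```

With this layout the markup classes differ only in the flag they set, which is the separation of concerns the comment argues for.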

+        super(NotebookFormatter, self).__init__(formatter,
+                                                classes=classes,
+                                                border=border)
+        self.columns = self._get_columns_formatted_values()
+
     def _get_formatted_values(self):
         return {i: self.fmt._format_col(i) for i in range(self.ncols)}

+    def _get_columns_formatted_values(self):
+        precision = get_option('display.precision')
+        fmt = lambda f: '{float:.{p}f}'.format(float=f, p=precision)
+        fmt_floats = lambda f: fmt(f) if isinstance(f, float) else f
+        return self.fmt.tr_frame.columns.map(fmt_floats)
+
     def write_style(self):
         # We use the "scoped" attribute here so that the desired
         # style properties for the data frame are not then applied
9 changes: 9 additions & 0 deletions pandas/tests/io/formats/test_to_html.py
@@ -633,3 +633,12 @@ def test_to_html_invalid_classes_type(classes):

     with pytest.raises(TypeError, match=msg):
         df.to_html(classes=classes)
+
+
+def test_to_html_round_column_headers():
+    df = DataFrame([1], columns=[0.55555])
Contributor

add the issue number

+    with pd.option_context('display.precision', 3):
+        html = df.to_html(notebook=False)
+        notebook = df.to_html(notebook=True)
+    assert "0.55555" in html
+    assert "0.556" in notebook
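
The label formatting that this test exercises can be sketched without pandas. `format_label` below is a hypothetical helper that mirrors the two lambdas in `_get_columns_formatted_values`, with the precision passed in explicitly instead of being read from `display.precision`.

```python
def format_label(label, precision):
    """Format float labels to a fixed number of decimals; pass others through."""
    if isinstance(label, float):
        # Same format string as in the diff: fixed-point with `precision` digits.
        return "{float:.{p}f}".format(float=label, p=precision)
    return label


labels = [0.55555, 0.5, 1, "a"]
formatted = [format_label(label, 3) for label in labels]
```

Fixed-point formatting pads as well as rounds, so a float label such as `0.5` becomes `0.500` at precision 3, while the integer and string labels are returned unchanged.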