Skip to content

CLN: io/formats/html.py: refactor #22726

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Nov 18, 2018
54 changes: 9 additions & 45 deletions pandas/io/formats/html.py
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ def _write_cell(self, s, kind='td', indent=0, tags=None):
self.write(u'{start}{rs}</{kind}>'
.format(start=start_tag, rs=rs, kind=kind), indent)

def write_tr(self, line, indent=0, indent_delta=4, header=False,
def write_tr(self, line, indent=0, indent_delta=0, header=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the default argument is an API change. Could you explain why you are doing this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the only public method of HTMLFormatter should be write_result().

The other methods are currently a mix of private and public. There does not seem to be any consistency.

If write_tr should be private, the API change would not be an issue. The indent_delta is not left to the default by any calls in the current implementation. indent_delta=0 would be a cleaner default.

Following on from the discussion regarding making indent a stateful entity rather than leaving it as a parameter to individual methods: The write_tr method accesses the bold_rows instance variable. The value should probably be passed through the method's nindex_levels parameter which was probably the intention of adding that parameter to the method.

The other HTML formatting methods write, write_th, write_td and _write_cell do not access instance variables with the exception that write needs access to elements which effectively the output buffer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should have said: the use of bold_rows in write_tr is why I stopped refactoring at that point for the moment. I'm not sure how many changes should be in one PR.

I've added some additional tests, but not all paths can be tested until the fixes for #22655 have been implemented.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this is ok, as its internal. I agree its less that clear what is external / internal (and we don't usually use _ prefixed). If you want to indicate in the doc-string of an external API would be ok.

align=None, tags=None, nindex_levels=0):
if tags is None:
tags = {}
Expand Down Expand Up @@ -201,26 +201,6 @@ def _write_header(self, indent):
# write nothing
return indent

def _column_header():
if self.fmt.index:
row = [''] * (self.frame.index.nlevels - 1)
else:
row = []

if isinstance(self.columns, ABCMultiIndex):
if self.fmt.has_column_names and self.fmt.index:
row.append(single_column_table(self.columns.names))
else:
row.append('')
style = "text-align: {just};".format(just=self.fmt.justify)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relatively nuanced but in the case of a MultiIndex isn't this formatting the very first column row different from the rest? If so is that accounted for in the refactor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if clause is unreachable since _column_header() was only called from the else clause of another if isinstance(self.columns, ABCMultiIndex): so the styling is not accounted for in this refactor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense thanks for clarifying

row.extend([single_column_table(c, self.fmt.justify, style)
for c in self.columns])
else:
if self.fmt.index:
row.append(self.columns.name or '')
row.extend(self.columns)
return row

self.write('<thead>', indent)

indent += self.indent_delta
Expand Down Expand Up @@ -302,14 +282,19 @@ def _column_header():
self.write_tr(row, indent, self.indent_delta, tags=tags,
header=True)
else:
col_row = _column_header()
if self.fmt.index:
row = [''] * (self.frame.index.nlevels - 1)
row.append(self.columns.name or '')
else:
row = []
row.extend(self.columns)
align = self.fmt.justify

if truncate_h:
ins_col = row_levels + self.fmt.tr_col_num
col_row.insert(ins_col, '...')
row.insert(ins_col, '...')

self.write_tr(col_row, indent, self.indent_delta, header=True,
self.write_tr(row, indent, self.indent_delta, header=True,
align=align)

if all((self.fmt.has_index_names,
Expand Down Expand Up @@ -484,24 +469,3 @@ def _write_hierarchical_rows(self, fmt_values, indent):
row.insert(row_levels + self.fmt.tr_col_num, '...')
self.write_tr(row, indent, self.indent_delta, tags=None,
nindex_levels=frame.index.nlevels)


def single_column_table(column, align=None, style=None):
table = '<table'
if align is not None:
table += (' align="{align}"'.format(align=align))
if style is not None:
table += (' style="{style}"'.format(style=style))
table += '><tbody>'
for i in column:
table += ('<tr><td>{i!s}</td></tr>'.format(i=i))
table += '</tbody></table>'
return table


def single_row_table(row): # pragma: no cover
table = '<table><tbody><tr>'
for i in row:
table += ('<td>{i!s}</td>'.format(i=i))
table += '</tr></tbody></table>'
return table