
BUG: to_html() contains deprecated and poorly supported HTML #39951


Open
attack68 opened this issue Feb 21, 2021 · 2 comments
Labels
Bug; IO HTML (read_html, to_html, Styler.apply, Styler.applymap)

Comments

@attack68
Contributor

import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], index=pd.MultiIndex.from_product([['a'], [1, 2]]), columns=pd.MultiIndex.from_product([['b'], [3, 4]]))
df.to_html().split('\n')

['<table border="1" class="dataframe">',
 '  <thead>',
 '    <tr>',
 '      <th></th>',
 '      <th></th>',
 '      <th colspan="2" halign="left">b</th>',
 '    </tr>',
 '    <tr>',
 '      <th></th>',
 '      <th></th>',
 '      <th>3</th>',
 '      <th>4</th>',
 '    </tr>',
 '  </thead>',
 '  <tbody>',
 '    <tr>',
 '      <th rowspan="2" valign="top">a</th>',
 '      <th>1</th>',
 '      <td>1</td>',
 '      <td>2</td>',
 '    </tr>',
 '    <tr>',
 '      <th>2</th>',
 '      <td>3</td>',
 '      <td>4</td>',
 '    </tr>',
 '  </tbody>',
 '</table>']

valign is a deprecated HTML attribute, and halign is reported to have been poorly supported even before it was deprecated.
CSS solutions are recommended instead, but inline CSS such as 'style="text-align:center;"' is not, because inline styles can be blocked by stricter website CSPs (content security policies).

See the MDN reference for the td element:
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/td
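
For anyone wanting to check their own output for these attributes, here is a minimal sketch that uses only the standard library's html.parser. The names FLAGGED, AttrAudit and audit are hypothetical helpers for illustration, not part of pandas, and the sample string is a shortened form of the output above.

```python
from html.parser import HTMLParser

# Attributes called out in this issue; extend the set as needed (assumption).
FLAGGED = {"valign", "halign"}


class AttrAudit(HTMLParser):
    """Collect (tag, attribute, value) triples for flagged attributes."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if name in FLAGGED:
                self.hits.append((tag, name, value))


def audit(html):
    """Return every flagged attribute found in the given HTML string."""
    parser = AttrAudit()
    parser.feed(html)
    parser.close()
    return parser.hits


# Shortened form of the to_html() output shown in this issue.
sample = (
    '<table border="1" class="dataframe"><thead><tr>'
    '<th colspan="2" halign="left">b</th></tr></thead>'
    '<tbody><tr><th rowspan="2" valign="top">a</th><td>1</td></tr></tbody>'
    "</table>"
)
print(audit(sample))
```

Passing df.to_html() instead of the sample string should work the same way, since audit only needs a string.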

attack68 added the Bug and Needs Triage labels on Feb 21, 2021
rhshadrach added the IO HTML label and removed the Needs Triage label on Feb 21, 2021
@jorisvandenbossche
Member

> but inline css like 'style="text-align:center;"' is not recommended

Does the style formatter "solve" this by adding unique identifiers to the table, so that the CSS can be targeted at only that table? (I'm no web expert though, so I don't know if this is sufficient for CSPs.)

@attack68
Contributor Author

The style formatter doesn't use inline styles, i.e. it never does this:
<tbody><tr><td style="text-align: center;">some text</td></tr></tbody>

It will either do this with identifiers as you mention:

<style> #xyz_row0_col0 {text-align: center;} </style>
...
<tbody><tr><td id="xyz_row0_col0">some text</td></tr></tbody>

or this with css classes:

<style> .my-cls {text-align: center;} </style>
...
<tbody><tr><td class="my-cls">some text</td></tr></tbody>

Both of these solutions would be allowed under a CSP rule that forbids inline styles.
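
To make the class-based variant concrete, here is a rough sketch of a post-processing step that moves the two alignment attributes into CSS classes plus a style block. It is not how pandas or the style formatter works; the function name, the class names cell-top and cell-left, and the attribute-to-rule mapping are all assumptions made up for this example. It also assumes the affected cells carry no class attribute already, which holds for the to_html() output shown in this issue.

```python
import re

# Hypothetical mapping from (attribute, value) to (class name, CSS rule).
ATTR_TO_CLASS = {
    ("valign", "top"): ("cell-top", "vertical-align: top;"),
    ("halign", "left"): ("cell-left", "text-align: left;"),
}


def move_alignment_to_classes(html):
    """Replace valign/halign attributes with classes and prepend a style block."""
    used = {}  # class name -> CSS rule, in first-seen order

    def swap(match):
        key = (match.group(1), match.group(2))
        if key not in ATTR_TO_CLASS:
            return match.group(0)  # leave unknown values untouched
        cls, rule = ATTR_TO_CLASS[key]
        used[cls] = rule
        return f'class="{cls}"'

    body = re.sub(r'\b(valign|halign)="([^"]*)"', swap, html)
    if not used:
        return body
    rules = " ".join(f".{cls} {{{rule}}}" for cls, rule in used.items())
    return f"<style> {rules} </style>\n{body}"


print(move_alignment_to_classes('<th rowspan="2" valign="top">a</th>'))
```

The collected rules could equally be written to an external stylesheet rather than a style block, if that suits the page's policy better.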

Currently, anyone using df.to_html() avoids all the extra HTML that style needs in order to function, and df.to_html() seems to have developed its own set of kwargs.
