ENH: add environment, e.g. "longtable", to Styler.to_latex #41866

Merged
merged 36 commits into from
Jul 28, 2021
Changes from 26 commits
cacb041
add latex environment variable and longtable template
attack68 May 29, 2021
a4f75ab
add latex environment variable and longtable template
attack68 May 29, 2021
1962c58
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 1, 2021
24d0e9c
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 7, 2021
084b2f6
longtable with captions
attack68 Jun 7, 2021
3a054c3
add tests for longtable
attack68 Jun 8, 2021
bcf1168
add tests for longtable
attack68 Jun 8, 2021
2529b1d
mypy fix
attack68 Jun 8, 2021
c6e608b
test fix
attack68 Jun 8, 2021
bd7c5f8
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 9, 2021
dbe4154
improve docs, and whats new
attack68 Jun 9, 2021
b412a12
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 11, 2021
ea7f956
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 12, 2021
38fa8c4
Merge branch 'rls1.3.0' into longtable_to_latex
attack68 Jun 15, 2021
cf70df1
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 16, 2021
7ad8802
merge into master and add extra needed tests
attack68 Jun 16, 2021
ab7e6f1
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 18, 2021
e2ea44a
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 20, 2021
ec6ebc4
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 24, 2021
06962c6
parametrize multindex columns (ivan request)
attack68 Jun 24, 2021
8e3ff5b
parametrize caption label (ivan request)
attack68 Jun 24, 2021
67ffe24
more readable (ivan request)
attack68 Jun 24, 2021
903923b
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 27, 2021
17e090f
ivan requests
attack68 Jun 28, 2021
4ada8ba
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 29, 2021
78159fe
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 30, 2021
8164c2e
imporve tests (ivan request)
attack68 Jun 30, 2021
9d4972c
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jun 30, 2021
175e1e3
ValueError on position_float tests (simon request)
attack68 Jun 30, 2021
8081971
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jul 5, 2021
42151ac
whatsnew 1.4.0
attack68 Jul 5, 2021
2311db2
whatsnew 1.4.0
attack68 Jul 5, 2021
3da4ed7
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jul 7, 2021
4cd8263
add to doc packages
attack68 Jul 7, 2021
3b7427d
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jul 20, 2021
3c5a2e9
Merge remote-tracking branch 'upstream/master' into longtable_to_latex
attack68 Jul 24, 2021
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.3.0.rst
@@ -136,7 +136,7 @@ which has been revised and improved (:issue:`39720`, :issue:`39317`, :issue:`404
- Many features of the :class:`.Styler` class are now either partially or fully usable on a DataFrame with a non-unique indexes or columns (:issue:`41143`)
- One has greater control of the display through separate sparsification of the index or columns using the :ref:`new styler options <options.available>`, which are also usable via :func:`option_context` (:issue:`41142`)
- Added the option ``styler.render.max_elements`` to avoid browser overload when styling large DataFrames (:issue:`40712`)
-- Added the method :meth:`.Styler.to_latex` (:issue:`21673`), which also allows some limited CSS conversion (:issue:`40731`)
+- Added the method :meth:`.Styler.to_latex` (:issue:`21673`, :issue:`41866`), which also allows some limited CSS conversion (:issue:`40731`)
- Added the method :meth:`.Styler.to_html` (:issue:`13379`)
- Added the method :meth:`.Styler.set_sticky` to make index and column headers permanently visible in scrolling HTML frames (:issue:`29072`)

7 changes: 7 additions & 0 deletions pandas/io/formats/style.py
@@ -428,6 +428,7 @@ def to_latex(
        multirow_align: str = "c",
        multicol_align: str = "r",
        siunitx: bool = False,
        environment: str | None = None,
        encoding: str | None = None,
        convert_css: bool = False,
    ):
@@ -484,6 +485,11 @@ def to_latex(
the left, centrally, or at the right.
siunitx : bool, default False
Set to ``True`` to structure LaTeX compatible with the {siunitx} package.
environment : str, optional
Contributor

not sure i love this name, maybe template: and this should be 'longtable', 'table' ?

your comment on position_float is hard to interpret here

Contributor Author

environment is the proper LaTeX name for these blocks: LaTeX environments

'longtable' and 'table' are common environments for this but there are others. e.g. see #37443

some arguments are nullified by the use of the 'longtable' environment, such as 'position_float'. can try and rephrase this.

Member

maybe should raise ValueError if invalid combinations of parameters are given

Contributor Author

there was already validation on position_float, so this was a non-contentious addition.
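
For illustration, the kind of check being discussed might look like the sketch below. The helper name `check_latex_options`, the error messages, and the tuple of accepted `position_float` values are assumptions and are not taken from this diff; only the rule itself (no `position_float` together with `environment="longtable"`) comes from the thread above.

```python
from __future__ import annotations

# Assumed set of accepted values; not shown in this diff.
VALID_POSITION_FLOATS = ("raggedright", "raggedleft", "centering")


def check_latex_options(environment: str | None, position_float: str | None) -> None:
    """Hypothetical helper: reject option combinations that cannot be rendered."""
    if position_float is None:
        return
    if environment == "longtable":
        # The longtable template never emits the position_float command,
        # so accepting it silently would hide a user error.
        raise ValueError("`position_float` cannot be used with the 'longtable' environment")
    if position_float not in VALID_POSITION_FLOATS:
        raise ValueError(f"`position_float` should be one of {VALID_POSITION_FLOATS}")


check_latex_options("table", "centering")  # accepted
check_latex_options("longtable", None)     # accepted
```

Raising early keeps the docstring simple: instead of describing an argument as "nullified", the invalid combination is just not allowed.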

If given, the environment that will replace 'table' in ``\\begin{table}``.
If 'longtable' is specified then a more suitable template is
Contributor

small comment here. can you list the valid values? (or provide a link to them). can be a followon.

rendered for which the ``position_float`` argument is nullified and does not
impact the result.
encoding : str, default "utf-8"
Character encoding setting.
convert_css : bool, default False
@@ -787,6 +793,7 @@ def to_latex(
sparse_columns=sparse_columns,
multirow_align=multirow_align,
multicol_align=multicol_align,
environment=environment,
convert_css=convert_css,
)

55 changes: 4 additions & 51 deletions pandas/io/formats/templates/latex.tpl
@@ -1,52 +1,5 @@
-{% if parse_wrap(table_styles, caption) %}
-\begin{table}
-{%- set position = parse_table(table_styles, 'position') %}
-{%- if position is not none %}
-[{{position}}]
-{%- endif %}
-
-{% set position_float = parse_table(table_styles, 'position_float') %}
-{% if position_float is not none%}
-\{{position_float}}
-{% endif %}
-{% if caption and caption is string %}
-\caption{% raw %}{{% endraw %}{{caption}}{% raw %}}{% endraw %}
-
-{% elif caption and caption is sequence %}
-\caption[{{caption[1]}}]{% raw %}{{% endraw %}{{caption[0]}}{% raw %}}{% endraw %}
-
-{% endif %}
-{% for style in table_styles %}
-{% if style['selector'] not in ['position', 'position_float', 'caption', 'toprule', 'midrule', 'bottomrule', 'column_format'] %}
-\{{style['selector']}}{{parse_table(table_styles, style['selector'])}}
-{% endif %}
-{% endfor %}
-{% endif %}
-\begin{tabular}
-{%- set column_format = parse_table(table_styles, 'column_format') %}
-{% raw %}{{% endraw %}{{column_format}}{% raw %}}{% endraw %}
-
-{% set toprule = parse_table(table_styles, 'toprule') %}
-{% if toprule is not none %}
-\{{toprule}}
-{% endif %}
-{% for row in head %}
-{% for c in row %}{%- if not loop.first %} & {% endif %}{{parse_header(c, multirow_align, multicol_align, True)}}{% endfor %} \\
-{% endfor %}
-{% set midrule = parse_table(table_styles, 'midrule') %}
-{% if midrule is not none %}
-\{{midrule}}
-{% endif %}
-{% for row in body %}
-{% for c in row %}{% if not loop.first %} & {% endif %}
-  {%- if c.type == 'th' %}{{parse_header(c, multirow_align, multicol_align)}}{% else %}{{parse_cell(c.cellstyle, c.display_value, convert_css)}}{% endif %}
-{%- endfor %} \\
-{% endfor %}
-{% set bottomrule = parse_table(table_styles, 'bottomrule') %}
-{% if bottomrule is not none %}
-\{{bottomrule}}
-{% endif %}
-\end{tabular}
-{% if parse_wrap(table_styles, caption) %}
-\end{table}
+{% if environment == "longtable" %}
+{% include "latex_longtable.tpl" %}
+{% else %}
+{% include "latex_table.tpl" %}
{% endif %}
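
The new `latex.tpl` is a single branch on `environment`. Mirrored in plain Python as a reading aid, with a hypothetical function name that does not exist in pandas:

```python
from __future__ import annotations


def select_latex_template(environment: str | None) -> str:
    """Hypothetical mirror of the if/else in latex.tpl."""
    if environment == "longtable":
        return "latex_longtable.tpl"
    # None, "table", "figure*", ... all share the {tabular}-based template.
    return "latex_table.tpl"


for env in (None, "table", "figure*", "longtable"):
    print(env, "->", select_latex_template(env))
```

Only the exact string `"longtable"` selects the second template; any other environment name is handled inside `latex_table.tpl`.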
73 changes: 73 additions & 0 deletions pandas/io/formats/templates/latex_longtable.tpl
@@ -0,0 +1,73 @@
\begin{longtable}
{%- set position = parse_table(table_styles, 'position') %}
{%- if position is not none %}
[{{position}}]
{%- endif %}
{%- set column_format = parse_table(table_styles, 'column_format') %}
{% raw %}{{% endraw %}{{column_format}}{% raw %}}{% endraw %}

{% for style in table_styles %}
{% if style['selector'] not in ['position', 'position_float', 'caption', 'toprule', 'midrule', 'bottomrule', 'column_format', 'label'] %}
\{{style['selector']}}{{parse_table(table_styles, style['selector'])}}
{% endif %}
{% endfor %}
{% if caption and caption is string %}
\caption{% raw %}{{% endraw %}{{caption}}{% raw %}}{% endraw %}
{%- set label = parse_table(table_styles, 'label') %}
{%- if label is not none %}
\label{{label}}
{%- endif %} \\
{% elif caption and caption is sequence %}
\caption[{{caption[1]}}]{% raw %}{{% endraw %}{{caption[0]}}{% raw %}}{% endraw %}
{%- set label = parse_table(table_styles, 'label') %}
{%- if label is not none %}
\label{{label}}
{%- endif %} \\
{% endif %}
{% set toprule = parse_table(table_styles, 'toprule') %}
{% if toprule is not none %}
\{{toprule}}
{% endif %}
{% for row in head %}
{% for c in row %}{%- if not loop.first %} & {% endif %}{{parse_header(c, multirow_align, multicol_align, True)}}{% endfor %} \\
{% endfor %}
{% set midrule = parse_table(table_styles, 'midrule') %}
{% if midrule is not none %}
\{{midrule}}
{% endif %}
\endfirsthead
{% if caption and caption is string %}
\caption[]{% raw %}{{% endraw %}{{caption}}{% raw %}}{% endraw %} \\
{% elif caption and caption is sequence %}
\caption[]{% raw %}{{% endraw %}{{caption[0]}}{% raw %}}{% endraw %} \\
{% endif %}
{% if toprule is not none %}
\{{toprule}}
{% endif %}
{% for row in head %}
{% for c in row %}{%- if not loop.first %} & {% endif %}{{parse_header(c, multirow_align, multicol_align, True)}}{% endfor %} \\
{% endfor %}
{% if midrule is not none %}
\{{midrule}}
{% endif %}
\endhead
{% if midrule is not none %}
\{{midrule}}
{% endif %}
\multicolumn{% raw %}{{% endraw %}{{column_format|length}}{% raw %}}{% endraw %}{r}{Continued on next page} \\
{% if midrule is not none %}
\{{midrule}}
{% endif %}
\endfoot
{% set bottomrule = parse_table(table_styles, 'bottomrule') %}
{% if bottomrule is not none %}
\{{bottomrule}}
{% endif %}
\endlastfoot
{% for row in body %}
{% for c in row %}{% if not loop.first %} & {% endif %}
{%- if c.type == 'th' %}{{parse_header(c, multirow_align, multicol_align)}}{% else %}{{parse_cell(c.cellstyle, c.display_value, convert_css)}}{% endif %}
{%- endfor %} \\
{% endfor %}
\end{longtable}
{% raw %}{% endraw %}
53 changes: 53 additions & 0 deletions pandas/io/formats/templates/latex_table.tpl
@@ -0,0 +1,53 @@
{% if environment or parse_wrap(table_styles, caption) %}
\begin{% raw %}{{% endraw %}{{environment if environment else "table"}}{% raw %}}{% endraw %}
{%- set position = parse_table(table_styles, 'position') %}
{%- if position is not none %}
[{{position}}]
{%- endif %}

{% set position_float = parse_table(table_styles, 'position_float') %}
{% if position_float is not none%}
\{{position_float}}
{% endif %}
{% if caption and caption is string %}
\caption{% raw %}{{% endraw %}{{caption}}{% raw %}}{% endraw %}

{% elif caption and caption is sequence %}
\caption[{{caption[1]}}]{% raw %}{{% endraw %}{{caption[0]}}{% raw %}}{% endraw %}

{% endif %}
{% for style in table_styles %}
{% if style['selector'] not in ['position', 'position_float', 'caption', 'toprule', 'midrule', 'bottomrule', 'column_format'] %}
\{{style['selector']}}{{parse_table(table_styles, style['selector'])}}
{% endif %}
{% endfor %}
{% endif %}
\begin{tabular}
{%- set column_format = parse_table(table_styles, 'column_format') %}
{% raw %}{{% endraw %}{{column_format}}{% raw %}}{% endraw %}

{% set toprule = parse_table(table_styles, 'toprule') %}
{% if toprule is not none %}
\{{toprule}}
{% endif %}
{% for row in head %}
{% for c in row %}{%- if not loop.first %} & {% endif %}{{parse_header(c, multirow_align, multicol_align, True)}}{% endfor %} \\
{% endfor %}
{% set midrule = parse_table(table_styles, 'midrule') %}
{% if midrule is not none %}
\{{midrule}}
{% endif %}
{% for row in body %}
{% for c in row %}{% if not loop.first %} & {% endif %}
{%- if c.type == 'th' %}{{parse_header(c, multirow_align, multicol_align)}}{% else %}{{parse_cell(c.cellstyle, c.display_value, convert_css)}}{% endif %}
{%- endfor %} \\
{% endfor %}
{% set bottomrule = parse_table(table_styles, 'bottomrule') %}
{% if bottomrule is not none %}
\{{bottomrule}}
{% endif %}
\end{tabular}
{% if environment or parse_wrap(table_styles, caption) %}
\end{% raw %}{{% endraw %}{{environment if environment else "table"}}{% raw %}}{% endraw %}

{% endif %}
126 changes: 126 additions & 0 deletions pandas/tests/io/formats/style/test_to_latex.py
@@ -484,8 +484,134 @@ def test_parse_latex_css_conversion(css, expected):
assert result == expected


@pytest.mark.parametrize("environment", ["tabular", "longtable"])
Member

tabular or table?

Contributor Author

"tabular" is correct, this is a quirk of the structure of longtable to parametrize the test.

Member

OK, but should it work with table as well?

Contributor Author

attack68 Jun 27, 2021

no because {tabular} is a sub environment of any other environment including {table}, except {longtable}, i.e. we have:

\begin{table}
\begin{tabular}
...           <<-- CSS conversion only done here
\end{tabular}
\end{table}

this tests the inner environment.
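
The nesting described here can be sketched in plain Python. `latex_skeleton` and its `wrap` flag are hypothetical stand-ins (the flag plays the role of `parse_wrap(table_styles, caption)` in `latex_table.tpl`), and `...` is a placeholder for the header and data rows.

```python
from __future__ import annotations


def latex_skeleton(environment: str | None = None, wrap: bool = False) -> str:
    """Hypothetical outline of the blocks each template emits."""
    if environment == "longtable":
        # longtable is both the outer and the inner environment
        return "\\begin{longtable}\n...\n\\end{longtable}\n"
    inner = "\\begin{tabular}\n...\n\\end{tabular}\n"
    if environment or wrap:
        outer = environment if environment else "table"
        return f"\\begin{{{outer}}}\n{inner}\\end{{{outer}}}\n"
    return inner


print(latex_skeleton("figure*"))
print(latex_skeleton("longtable"))
```

Because the data rows always end just before `\end{tabular}` or `\end{longtable}`, a test that checks only the tail of the inner block covers both templates.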

Member

In this case I think it is necessary to add one end-to-end test for table with inner tabular.
For now this particular test function test_parse_latex_css_convert_minimal tests only a portion of the output (end of inner environment).

Contributor Author

This isn't necessary. There is already test_comprehensive which performs a full output check.

This test is designed to selectively test only the relevant parts of the template(s) which convert css to latex styles, which only occurs in data-cells. Since there are two templates, where the inner environment is defined as either longtable or tabular environment, the test covers both cases.

Member

Right, but in test_comprehensive you do not test environment kwarg.

My point is as follows.
You have a good test for environment='longtable', which compares the full output.
Meanwhile for environment='tabular' there is only a check for the final portion of the output.
How can we ensure that some future changes would not break the upper part of the output when environment='tabular'?

Contributor Author

OK, I can see some confusion has arisen here, since environment="tabular" should never be used by a user. It would result in an outer and inner {tabular} environment erroneously. I have used it here as a shortcut to not needing 2 variables. However, I have now amended this test to avoid the confusion.

But with respect to broader testing I think cases are covered through dependencies, for example:

  • test_minimal_latex_tabular: tests just the core {tabular} structure without additions.
  • test_longtable_minimal: tests just the core {longtable} structure without additions.
  • test_comprehensive: tests the core outer {table} and inner {tabular} structure.
  • test_longtable_comprehensive: tests the core {longtable} structure with features.
  • test_latex_environment: checks {table} is properly substituted for another value in the outer structure.

However, reflecting on your comments, I have removed test_latex_environment and instead incorporated it into test_comprehensive as parameters, which I think you will prefer. This also increases the power of the test.

@pytest.mark.parametrize(
    "convert, exp", [(True, "bfseries"), (False, "font-weightbold")]
)
def test_parse_latex_css_convert_minimal(styler, environment, convert, exp):
    # parameters ensure longtable template is also tested
    styler.highlight_max(props="font-weight:bold;")
    result = styler.to_latex(convert_css=convert, environment=environment)
    expected = dedent(
        f"""\
        0 & 0 & \\{exp} -0.61 & ab \\\\
        1 & \\{exp} 1 & -1.22 & \\{exp} cd \\\\
        \\end{{{environment}}}
    """
    )
    assert expected in result


def test_parse_latex_css_conversion_option():
    css = [("command", "option--latex--wrap")]
    expected = [("command", "option--wrap")]
    result = _parse_latex_css_conversion(css)
    assert result == expected


def test_longtable_comprehensive(styler):
    result = styler.to_latex(
        environment="longtable", hrules=True, label="fig:A", caption=("full", "short")
    )
    expected = dedent(
        """\
        \\begin{longtable}{lrrl}
        \\caption[short]{full} \\label{fig:A} \\\\
        \\toprule
        {} & {A} & {B} & {C} \\\\
        \\midrule
        \\endfirsthead
        \\caption[]{full} \\\\
        \\toprule
        {} & {A} & {B} & {C} \\\\
        \\midrule
        \\endhead
        \\midrule
        \\multicolumn{4}{r}{Continued on next page} \\\\
        \\midrule
        \\endfoot
        \\bottomrule
        \\endlastfoot
        0 & 0 & -0.61 & ab \\\\
        1 & 1 & -1.22 & cd \\\\
        \\end{longtable}
    """
    )
    assert result == expected


def test_longtable_minimal(styler):
    result = styler.to_latex(environment="longtable")
    expected = dedent(
        """\
        \\begin{longtable}{lrrl}
        {} & {A} & {B} & {C} \\\\
        \\endfirsthead
        {} & {A} & {B} & {C} \\\\
        \\endhead
        \\multicolumn{4}{r}{Continued on next page} \\\\
        \\endfoot
        \\endlastfoot
        0 & 0 & -0.61 & ab \\\\
        1 & 1 & -1.22 & cd \\\\
        \\end{longtable}
    """
    )
    assert result == expected
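
As a reading aid for the expected string above, the following plain-Python sketch assembles the same longtable sections in order: `\endfirsthead` (header on the first page), `\endhead` (header on later pages), `\endfoot` (footer on every page but the last) and `\endlastfoot` (footer on the last page). The function `longtable_minimal` is hypothetical and is not how pandas renders the table, which goes through the Jinja2 template.

```python
from __future__ import annotations


def longtable_minimal(column_format: str, header: list[str], rows: list[list[str]]) -> str:
    """Hypothetical builder reproducing the minimal longtable layout."""
    head = " & ".join(header) + " \\\\"
    lines = [
        f"\\begin{{longtable}}{{{column_format}}}",
        head,
        "\\endfirsthead",
        head,  # the header is repeated for continuation pages
        "\\endhead",
        # like the template, the column count is the length of column_format
        f"\\multicolumn{{{len(column_format)}}}{{r}}{{Continued on next page}} \\\\",
        "\\endfoot",
        "\\endlastfoot",
    ]
    lines += [" & ".join(row) + " \\\\" for row in rows]
    lines.append("\\end{longtable}")
    return "\n".join(lines) + "\n"


out = longtable_minimal(
    "lrrl",
    ["{}", "{A}", "{B}", "{C}"],
    [["0", "0", "-0.61", "ab"], ["1", "1", "-1.22", "cd"]],
)
print(out)
```

Taking the column count from the length of `column_format` works for a plain format such as `lrrl`; a format containing separators such as `|` would need a different count.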


@pytest.mark.parametrize(
    "sparse, exp",
    [
        (True, "{} & \\multicolumn{2}{r}{A} & {B}"),
        (False, "{} & {A} & {A} & {B}"),
    ],
)
def test_longtable_multiindex_columns(df, sparse, exp):
    cidx = MultiIndex.from_tuples([("A", "a"), ("A", "b"), ("B", "c")])
    df.columns = cidx
    expected = dedent(
        f"""\
        \\begin{{longtable}}{{lrrl}}
        {exp} \\\\
        {{}} & {{a}} & {{b}} & {{c}} \\\\
        \\endfirsthead
        {exp} \\\\
        {{}} & {{a}} & {{b}} & {{c}} \\\\
        \\endhead
    """
    )
    assert expected in df.style.to_latex(environment="longtable", sparse_columns=sparse)


@pytest.mark.parametrize(
    "caption, cap_exp",
    [
        ("full", ("{full}", "")),
        (("full", "short"), ("{full}", "[short]")),
    ],
)
@pytest.mark.parametrize("label, lab_exp", [(None, ""), ("tab:A", " \\label{tab:A}")])
def test_longtable_caption_label(styler, caption, cap_exp, label, lab_exp):
    cap_exp1 = f"\\caption{cap_exp[1]}{cap_exp[0]}"
    cap_exp2 = f"\\caption[]{cap_exp[0]}"

    expected = dedent(
        f"""\
        {cap_exp1}{lab_exp} \\\\
        {{}} & {{A}} & {{B}} & {{C}} \\\\
        \\endfirsthead
        {cap_exp2} \\\\
    """
    )
    assert expected in styler.to_latex(
        environment="longtable", caption=caption, label=label
    )


def test_latex_environment(styler):
    result = styler.to_latex(environment="figure*")
    assert "\\begin{table}" not in result
    assert "\\end{table}" not in result
    assert "\\begin{figure*}" in result
    assert "\\end{figure*}" in result