REF: Use Styler implementation for DataFrame.to_latex #47970

Merged 28 commits on Jan 19, 2023
Changes from 11 commits
Commits
02d7d0c
Base implementation
attack68 Aug 3, 2022
0c12f3a
Base implementation
attack68 Aug 3, 2022
d644065
test fix up
attack68 Aug 3, 2022
9dcd254
test fix up
attack68 Aug 3, 2022
cd05038
test fix up
attack68 Aug 4, 2022
c625610
doc change
attack68 Aug 4, 2022
493b884
doc change
attack68 Aug 4, 2022
41fa426
doc change
attack68 Aug 4, 2022
2d5c419
mypy fixes
attack68 Aug 5, 2022
c9c61c3
ivanov doc comment
attack68 Aug 9, 2022
ab6d3ec
ivanov doc comment
attack68 Aug 9, 2022
b913583
rhshadrach reduction
attack68 Aug 11, 2022
801bf42
Merge branch 'main' into to_latex_styler_implement
attack68 Nov 18, 2022
c803b73
change text from 1.5.0 to 2.0.0
attack68 Nov 18, 2022
df0b334
remove argument col_space and add whatsnew
attack68 Nov 18, 2022
f4e4bf7
Merge remote-tracking branch 'upstream/main' into to_latex_styler_imp…
attack68 Dec 7, 2022
9774341
mroeschke requests
attack68 Dec 7, 2022
10b5ec9
mroeschke requests
attack68 Dec 7, 2022
2c42a4e
pylint fix
attack68 Dec 8, 2022
3c2f964
Merge remote-tracking branch 'upstream/main' into to_latex_styler_imp…
attack68 Jan 5, 2023
0b4144a
Whats new text improvements and description added
attack68 Jan 5, 2023
fae1b3f
Update doc/source/whatsnew/v2.0.0.rst
attack68 Jan 13, 2023
9c7d780
Update doc/source/whatsnew/v2.0.0.rst
attack68 Jan 13, 2023
c0bdc6a
remove trailing whitespace
attack68 Jan 16, 2023
c740279
remove trailing whitespace
attack68 Jan 16, 2023
9b6bba8
Merge remote-tracking branch 'upstream/main' into to_latex_styler_imp…
attack68 Jan 17, 2023
6b314cd
Whats new linting fixes
attack68 Jan 17, 2023
d6333cb
mroeschke requests
attack68 Jan 18, 2023
281 changes: 234 additions & 47 deletions pandas/core/generic.py
@@ -171,7 +171,6 @@
Window,
)

-from pandas.io.formats import format as fmt
from pandas.io.formats.format import (
DataFrameFormatter,
DataFrameRenderer,
@@ -2121,7 +2120,7 @@ def _repr_latex_(self):
Returns a LaTeX representation for a particular object.
Mainly for use with nbconvert (jupyter notebook conversion to pdf).
"""
-if config.get_option("display.latex.repr"):
+if config.get_option("styler.render.repr") == "latex":
return self.to_latex()
else:
return None
@@ -3225,7 +3224,6 @@ def to_latex(
...

@final
-@doc(returns=fmt.return_docstring)
def to_latex(
self,
buf: FilePath | WriteBuffer[str] | None = None,
@@ -3264,6 +3262,9 @@ def to_latex(
.. versionchanged:: 1.2.0
Added position argument, changed meaning of caption argument.

+.. versionchanged:: 1.5.0
+Refactored to use the Styler implementation via jinja2 templating.

Parameters
----------
buf : str, Path or StringIO-like, optional, default None
@@ -3272,6 +3273,9 @@ def to_latex(
The subset of columns to write. Writes all columns by default.
col_space : int, optional
The minimum width of each column.

+.. deprecated:: 1.5.0
+Whitespace does not affect a rendered LaTeX file and is ignored.
header : bool or list of str, default True
Write out the column names. If a list of strings is given,
it is assumed to be aliases for the column names.
@@ -3345,37 +3349,86 @@ def to_latex(
``\begin{{}}`` in the output.

.. versionadded:: 1.2.0
{returns}

Returns
-------
str or None
If buf is None, returns the result as a string. Otherwise returns None.

See Also
--------
Styler.to_latex : Render a DataFrame to LaTeX with conditional formatting.
DataFrame.to_string : Render a DataFrame to a console-friendly
tabular output.
DataFrame.to_html : Render a DataFrame as an HTML table.

Notes
-----

.. note::
As of v1.5.0 this method has changed to use the ``Styler`` implementation of
``to_latex`` via ``jinja2`` templating. It is advised that
users switch to using Styler, since this implementation is more frequently
updated and contains much more flexibility with the output. The following
examples indicate how this method now replicates the Styler implementation
for its legacy arguments.

.. code-block:: python

styler = df.style

Styler methods are designed to be chained, so we can build complex combinations
of displays. To hide ``index`` and ``columns`` headers we use,

.. code-block:: python

styler.hide(axis="index").hide(axis="columns")

To use ``formatters``, ``na_rep``, ``decimal``, ``float_format``, and
``escape`` we use,

.. code-block:: python

styler.format(
formatter={"name": str.upper}, na_rep="-", precision=1,
escape="latex", decimal=","
)

To control other aspects we use the ``Styler.to_latex`` arguments, as
documented, such as,

.. code-block:: python

styler.to_latex(
column_format="lrr", caption="my table", environment="longtable"
)

Examples
--------
Convert a general DataFrame to LaTeX with formatting:

>>> df = pd.DataFrame(dict(name=['Raphael', 'Donatello'],
... mask=['red', 'purple'],
... weapon=['sai', 'bo staff']))
>>> print(df.to_latex(index=False)) # doctest: +SKIP
\begin{{tabular}}{{lll}}
\toprule
name & mask & weapon \\
\midrule
Raphael & red & sai \\
Donatello & purple & bo staff \\
... age=[26, 45],
... height=[181.23, 177.65]))
>>> print(df.to_latex(index=False,
... formatters={"name": str.upper},
... float_format="{:.1f}".format,
... ) # doctest: +SKIP
\begin{tabular}{lrr}
\toprule
name & age & height \\
\midrule
RAPHAEL & 26 & 181.2 \\
DONATELLO & 45 & 177.7 \\
\bottomrule
\end{{tabular}}
\end{tabular}
"""
msg = (
"In future versions `DataFrame.to_latex` is expected to utilise the base "
"implementation of `Styler.to_latex` for formatting and rendering. "
"The arguments signature may therefore change. It is recommended instead "
"to use `DataFrame.style.to_latex` which also contains additional "
"functionality."
"`col_space` is deprecated. Whitespace in LaTeX does not impact "
"the rendered version, and this argument is ignored."
)
warnings.warn(msg, FutureWarning, stacklevel=find_stack_level())
if col_space is not None:
warnings.warn(msg, DeprecationWarning, stacklevel=find_stack_level())

# Get defaults from the pandas config
if self.ndim == 1:
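
Note on the `col_space` change in the hunk above: the blanket `FutureWarning` is replaced by a `DeprecationWarning` that fires only when the argument is actually passed. A minimal sketch of that pattern using only the standard library follows. The function `to_latex_sketch` and its toy row output are hypothetical stand-ins, not pandas code, and `stacklevel=2` is used here in place of pandas' `find_stack_level()`.

```python
import warnings


def to_latex_sketch(data, col_space=None):
    """Hypothetical stand-in for a renderer that accepts a deprecated argument."""
    if col_space is not None:
        # Warn only when the caller actually passes the deprecated argument.
        warnings.warn(
            "`col_space` is deprecated. Whitespace in LaTeX does not impact "
            "the rendered version, and this argument is ignored.",
            DeprecationWarning,
            stacklevel=2,
        )
    # col_space is deliberately unused from here on.
    return " & ".join(str(v) for v in data) + r" \\"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    quiet = to_latex_sketch([1, 2])
    noisy = to_latex_sketch([1, 2], col_space=10)
```

Both calls return the same string, and only the second one records a warning.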
@@ -3391,35 +3444,169 @@ def to_latex(
if multirow is None:
multirow = config.get_option("display.latex.multirow")

self = cast("DataFrame", self)
formatter = DataFrameFormatter(
self,
columns=columns,
col_space=col_space,
na_rep=na_rep,
header=header,
index=index,
formatters=formatters,
float_format=float_format,
bold_rows=bold_rows,
sparsify=sparsify,
index_names=index_names,
escape=escape,
decimal=decimal,
)
return DataFrameRenderer(formatter).to_latex(
buf=buf,
column_format=column_format,
longtable=longtable,
encoding=encoding,
multicolumn=multicolumn,
multicolumn_format=multicolumn_format,
multirow=multirow,
caption=caption,
label=label,
position=position,
if column_format is not None and not isinstance(column_format, str):
raise ValueError("`column_format` must be str or unicode")
length = len(self.columns) if columns is None else len(columns)
if isinstance(header, (list, tuple)) and len(header) != length:
raise ValueError(f"Writing {length} cols but got {len(header)} aliases")

# Refactor formatters/float_format/decimal/na_rep/escape to Styler structure
base_format_ = {
"na_rep": na_rep,
"escape": "latex" if escape else None,
"decimal": decimal,
}
index_format_: dict[str, Any] = {"axis": 0, **base_format_}
column_format_: dict[str, Any] = {"axis": 1, **base_format_}

if isinstance(float_format, str):
float_format_: Callable | None = lambda x: float_format % x
else:
float_format_ = float_format

def _wrap(x, alt_format_):
if isinstance(x, (float, complex)) and float_format_ is not None:
return float_format_(x)
else:
return alt_format_(x)

formatters_: list | tuple | dict | Callable | None = None
if isinstance(formatters, list):
formatters_ = {
c: functools.partial(_wrap, alt_format_=formatters[i])
for i, c in enumerate(self.columns)
}
elif isinstance(formatters, dict):
index_formatter = formatters.pop("__index__", None)
column_formatter = formatters.pop("__columns__", None)
if index_formatter is not None:
index_format_.update({"formatter": index_formatter})
if column_formatter is not None:
column_format_.update({"formatter": column_formatter})

formatters_ = formatters
float_columns = self.select_dtypes(include="float").columns
for col in [c for c in float_columns if c not in formatters.keys()]:
formatters_.update({col: float_format_})
elif formatters is None and float_format is not None:
formatters_ = functools.partial(_wrap, alt_format_=lambda v: v)
format_index_ = [index_format_, column_format_]

# Deal with hiding indexes and relabelling column names
hide_: list[dict] = []
relabel_index_: list[dict] = []
if columns:
hide_.append(
{
"subset": [c for c in self.columns if c not in columns],
"axis": "columns",
}
)
if header is False:
hide_.append({"axis": "columns"})
elif isinstance(header, (list, tuple)):
relabel_index_.append({"labels": header, "axis": "columns"})
format_index_ = [index_format_] # column_format is overwritten

if index is False:
hide_.append({"axis": "index"})
if index_names is False:
hide_.append({"names": True, "axis": "index"})

render_kwargs_ = {
"hrules": True,
"sparse_index": sparsify,
"sparse_columns": sparsify,
"environment": "longtable" if longtable else None,
"column_format": column_format,
"multicol_align": multicolumn_format
if multicolumn
else f"naive-{multicolumn_format}",
"multirow_align": "t" if multirow else "naive",
"encoding": encoding,
"caption": caption,
"label": label,
"position": position,
"column_format": column_format,
"clines": "skip-last;data" if multirow else None,
"bold_rows": bold_rows,
}

return self._to_latex_via_styler(
buf,
hide=hide_,
relabel_index=relabel_index_,
format={"formatter": formatters_, **base_format_},
format_index=format_index_,
render_kwargs=render_kwargs_,
)
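
Note on the `float_format` handling in `to_latex` above: a string is applied with the `%` operator, and `_wrap` routes float and complex values through `float_format` before any per-column formatter is consulted. The sketch below isolates the list and `None` branches under stated assumptions. `build_formatters` is a hypothetical helper written for illustration, not a pandas function, and the dict branch of the diff is not reproduced.

```python
import functools


def build_formatters(columns, formatters=None, float_format=None):
    """Hypothetical helper mirroring the list/None branches of the diff."""
    # A printf-style string such as "%.1f" is turned into a callable.
    if isinstance(float_format, str):
        float_fn = lambda x: float_format % x
    else:
        float_fn = float_format

    def wrap(x, alt_format):
        # Floats go through float_format when given; everything else falls back.
        if isinstance(x, (float, complex)) and float_fn is not None:
            return float_fn(x)
        return alt_format(x)

    if isinstance(formatters, list):
        # One formatter per column, matched by position.
        return {
            col: functools.partial(wrap, alt_format=formatters[i])
            for i, col in enumerate(columns)
        }
    if formatters is None and float_format is not None:
        # A single callable applied to every cell; non-floats pass through.
        return functools.partial(wrap, alt_format=lambda v: v)
    return formatters


per_column = build_formatters(["name", "height"], [str.upper, str], float_format="%.1f")
everywhere = build_formatters(["a"], None, float_format="{:.2f}".format)
```

One consequence worth noticing in this sketch: when both a list formatter and `float_format` are given, a float cell is formatted by `float_format` and the list entry for that column is never called for it.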

def _to_latex_via_styler(
self,
buf=None,
*,
hide: dict | list[dict] | None = None,
relabel_index: dict | list[dict] | None = None,
format: dict | list[dict] | None = None,
format_index: dict | list[dict] | None = None,
render_kwargs: dict = {},
):
"""
Render object to a LaTeX tabular, longtable, or nested table.

Uses the ``Styler`` implementation with the following, ordered, method chaining:

.. code-block:: python
styler = Styler(DataFrame)
styler.hide(**hide)
styler.relabel_index(**relabel_index)
styler.format(**format)
styler.format_index(**format_index)
styler.to_latex(buf=buf, **render_kwargs)

Parameters
----------
buf : str, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.
hide : dict, list of dict
Keyword args to pass to the method call of ``Styler.hide``. If a list will
call the method numerous times.
relabel_index : dict, list of dict
Keyword args to pass to the method of ``Styler.relabel_index``. If a list
will call the method numerous times.
format : dict, list of dict
Keyword args to pass to the method call of ``Styler.format``. If a list will
call the method numerous times.
format_index : dict, list of dict
Keyword args to pass to the method call of ``Styler.format_index``. If a
list will call the method numerous times.
render_kwargs : dict
Keyword args to pass to the method call of ``Styler.to_latex``.

Returns
-------
str or None
If buf is None, returns the result as a string. Otherwise returns None.
"""
from pandas.io.formats.style import Styler

self = cast("DataFrame", self)
styler = Styler(self, uuid="")

for kw_name in ["hide", "relabel_index", "format", "format_index"]:
kw = vars()[kw_name]
if isinstance(kw, dict):
getattr(styler, kw_name)(**kw)
elif isinstance(kw, list):
for sub_kw in kw:
getattr(styler, kw_name)(**sub_kw)

# bold_rows is not a direct kwarg of Styler.to_latex
if render_kwargs.pop("bold_rows"):
styler.applymap_index(lambda v: "textbf:--rwrap;")

return styler.to_latex(buf=buf, **render_kwargs)
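
Note on the dispatch loop in `_to_latex_via_styler`: each of `hide`, `relabel_index`, `format` and `format_index` is applied in that fixed order, once for a dict and once per entry for a list. The sketch below reproduces the loop against a hypothetical `RecordingStyler` class that only records calls, so the ordering can be checked without pandas. It reads the keyword arguments from `**method_kwargs` rather than `vars()`, which is a simplification of this sketch and not what the diff does.

```python
class RecordingStyler:
    """Hypothetical stand-in for Styler that records each chained call."""

    def __init__(self):
        self.calls = []

    def _record(self, name, kwargs):
        self.calls.append((name, kwargs))
        return self

    def hide(self, **kwargs):
        return self._record("hide", kwargs)

    def relabel_index(self, **kwargs):
        return self._record("relabel_index", kwargs)

    def format(self, **kwargs):
        return self._record("format", kwargs)

    def format_index(self, **kwargs):
        return self._record("format_index", kwargs)


def apply_in_order(styler, **method_kwargs):
    # Fixed order, as in the diff: hide, relabel_index, format, format_index.
    for name in ["hide", "relabel_index", "format", "format_index"]:
        kw = method_kwargs.get(name)
        if isinstance(kw, dict):
            getattr(styler, name)(**kw)  # a dict means one call
        elif isinstance(kw, list):
            for sub_kw in kw:  # a list means one call per entry
                getattr(styler, name)(**sub_kw)
        # None (or any other type) skips the method entirely.
    return styler


styler = apply_in_order(
    RecordingStyler(),
    format={"na_rep": "-"},
    hide=[{"axis": "index"}, {"axis": "columns", "subset": ["b"]}],
)
```

The order in which the keyword arguments are passed does not matter here: `hide` runs twice before `format` because the loop, not the caller, fixes the sequence.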

@overload
def to_csv(
self,
Expand Down
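
Note on `render_kwargs: dict = {}` combined with `render_kwargs.pop("bold_rows")` in `_to_latex_via_styler`: in the diff the only caller shown builds a fresh dict that always contains `bold_rows`, so the code works as written. The general Python behaviour is still worth spelling out, because `pop` without a default raises `KeyError` on a missing key and also removes the key from the caller's dict. The sketch below contrasts that shape with a defensive variant. Both functions are hypothetical illustrations and neither is the merged pandas code.

```python
def render_fragile(render_kwargs: dict = {}):
    # Mirrors the shape in the diff: a shared default dict and a required key.
    bold = render_kwargs.pop("bold_rows")
    return bold, render_kwargs


def render_defensive(render_kwargs=None):
    # Copy so the caller's dict is untouched, and tolerate a missing key.
    kwargs = dict(render_kwargs or {})
    bold = kwargs.pop("bold_rows", False)
    return bold, kwargs


caller_kwargs = {"bold_rows": True, "hrules": True}
render_fragile(caller_kwargs)
mutated = "bold_rows" not in caller_kwargs  # the caller's dict lost the key

try:
    render_fragile()
    raised = False
except KeyError:
    raised = True

safe_input = {"bold_rows": True, "hrules": True}
bold, rest = render_defensive(safe_input)
```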
8 changes: 4 additions & 4 deletions pandas/io/formats/style.py
@@ -236,7 +236,7 @@ def __init__(
precision: int | None = None,
table_styles: CSSStyles | None = None,
uuid: str | None = None,
-caption: str | tuple | None = None,
+caption: str | tuple | list | None = None,
table_attributes: str | None = None,
cell_ids: bool = True,
na_rep: str | None = None,
@@ -2336,13 +2336,13 @@ def set_uuid(self, uuid: str) -> Styler:
self.uuid = uuid
return self

-def set_caption(self, caption: str | tuple) -> Styler:
+def set_caption(self, caption: str | tuple | list) -> Styler:
"""
Set the text added to a ``<caption>`` HTML element.

Parameters
----------
-caption : str, tuple
+caption : str, tuple, list
For HTML output either the string input is used or the first element of the
tuple. For LaTeX the string input provides a caption and the additional
tuple input allows for full captions and short captions, in that order.
@@ -2352,7 +2352,7 @@ def set_caption(self, caption: str | tuple) -> Styler:
self : Styler
"""
msg = "`caption` must be either a string or 2-tuple of strings."
-if isinstance(caption, tuple):
+if isinstance(caption, (list, tuple)):
if (
len(caption) != 2
or not isinstance(caption[0], str)
Expand Down
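
Note on the `set_caption` change above: widening the `isinstance` check lets a 2-element list pass the same validation as a 2-tuple. The sketch below is a standalone version of that check. `validate_caption` is a hypothetical name, and because the hunk is cut off after the first element's type test, the check on the second element and the final `elif` branch are assumptions of this sketch rather than lines taken from the diff.

```python
def validate_caption(caption):
    """Hypothetical standalone version of the check in set_caption."""
    msg = "`caption` must be either a string or 2-tuple of strings."
    if isinstance(caption, (list, tuple)):
        # Assumption: both elements must be strings. The diff shows only the
        # length test and the first element's test before the hunk ends.
        if len(caption) != 2 or not all(isinstance(c, str) for c in caption):
            raise ValueError(msg)
    elif not isinstance(caption, str):
        # Assumption: anything that is not a str, list or tuple is rejected.
        raise ValueError(msg)
    return caption


full_and_short = validate_caption(["A long caption for the table", "Short"])
plain = validate_caption("my table")
```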
2 changes: 1 addition & 1 deletion pandas/io/formats/style_render.py
@@ -85,7 +85,7 @@ def __init__(
uuid_len: int = 5,
table_styles: CSSStyles | None = None,
table_attributes: str | None = None,
-caption: str | tuple | None = None,
+caption: str | tuple | list | None = None,
cell_ids: bool = True,
precision: int | None = None,
) -> None: