diff --git a/doc/source/user_guide/io.rst b/doc/source/user_guide/io.rst
index 9faef9b15bfb4..2448a00d7288f 100644
--- a/doc/source/user_guide/io.rst
+++ b/doc/source/user_guide/io.rst
@@ -21,7 +21,7 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
     text;`CSV `__;:ref:`read_csv`;:ref:`to_csv`
     text;Fixed-Width Text File;:ref:`read_fwf`
     text;`JSON `__;:ref:`read_json`;:ref:`to_json`
-    text;`HTML `__;:ref:`read_html`;:ref:`to_html`
+    text;`HTML `__;:ref:`read_html`;:ref:`Styler.to_html`
     text;`LaTeX `__;;:ref:`Styler.to_latex`
     text;`XML `__;:ref:`read_xml`;:ref:`to_xml`
     text; Local clipboard;:ref:`read_clipboard`;:ref:`to_clipboard`
@@ -2682,8 +2682,8 @@ Read in pandas ``to_html`` output (with some loss of floating point precision):
 .. code-block:: python
 
    df = pd.DataFrame(np.random.randn(2, 2))
-   s = df.to_html(float_format="{0:.40g}".format)
-   dfin = pd.read_html(s, index_col=0)
+   s = df.style.format("{0:.40g}").to_html()
+   dfin = pd.read_html(s, index_col=0)[0]
 
 The ``lxml`` backend will raise an error on a failed parse if that is the only
 parser you provide. If you only have a single parser you can provide just a
@@ -2714,156 +2714,34 @@ succeeds, the function will return*.
 Writing to HTML files
 ''''''''''''''''''''''
 
-``DataFrame`` objects have an instance method ``to_html`` which renders the
-contents of the ``DataFrame`` as an HTML table. The function arguments are as
-in the method ``to_string`` described above.
-
 .. note::
 
-   Not all of the possible options for ``DataFrame.to_html`` are shown here for
-   brevity's sake. See :func:`~pandas.core.frame.DataFrame.to_html` for the
-   full set of options.
+   DataFrame *and* Styler objects currently have a ``to_html`` method. We recommend
+   using the :meth:`Styler.to_html ` method
+   over :meth:`DataFrame.to_html` due to the former's greater flexibility with
+   conditional styling, and the latter's possible argument signature change and/or future deprecation.
 
-.. ipython:: python
-   :suppress:
+Review the documentation for :meth:`Styler.to_html `,
+which gives examples of conditional styling and explains the operation of its keyword
+arguments. The ``to_html`` methods render the contents of the ``DataFrame`` as an HTML table.
 
-   def write_html(df, filename, *args, **kwargs):
-       static = os.path.abspath(os.path.join("source", "_static"))
-       with open(os.path.join(static, filename + ".html"), "w") as f:
-           df.to_html(f, *args, **kwargs)
+For simple application the following pattern is sufficient:
 
 .. ipython:: python
 
    df = pd.DataFrame(np.random.randn(2, 2))
    df
-   print(df.to_html())  # raw html
-
-.. ipython:: python
-   :suppress:
-
-   write_html(df, "basic")
-
-HTML:
-
-.. raw:: html
-   :file: ../_static/basic.html
+   print(df.style.to_html())  # raw html
 
-The ``columns`` argument will limit the columns shown:
+To format values before output, chain the :meth:`Styler.format `
+and :meth:`Styler.format_index ` methods.
 
 .. ipython:: python
 
-   print(df.to_html(columns=[0]))
-
-.. ipython:: python
-   :suppress:
-
-   write_html(df, "columns", columns=[0])
-
-HTML:
-
-.. raw:: html
-   :file: ../_static/columns.html
-
-``float_format`` takes a Python callable to control the precision of floating
-point values:
-
-.. ipython:: python
-
-   print(df.to_html(float_format="{0:.10f}".format))
-
-.. ipython:: python
-   :suppress:
-
-   write_html(df, "float_format", float_format="{0:.10f}".format)
-
-HTML:
-
-.. raw:: html
-   :file: ../_static/float_format.html
-
-``bold_rows`` will make the row labels bold by default, but you can turn that
-off:
-
-.. ipython:: python
-
-   print(df.to_html(bold_rows=False))
-
-.. ipython:: python
-   :suppress:
-
-   write_html(df, "nobold", bold_rows=False)
-
-.. raw:: html
-   :file: ../_static/nobold.html
-
-The ``classes`` argument provides the ability to give the resulting HTML
-table CSS classes. Note that these classes are *appended* to the existing
-``'dataframe'`` class.
-
-.. ipython:: python
-
-   print(df.to_html(classes=["awesome_table_class", "even_more_awesome_class"]))
-
-The ``render_links`` argument provides the ability to add hyperlinks to cells
-that contain URLs.
-
-.. ipython:: python
-
-   url_df = pd.DataFrame(
-       {
-           "name": ["Python", "pandas"],
-           "url": ["https://www.python.org/", "https://pandas.pydata.org"],
-       }
-   )
-   print(url_df.to_html(render_links=True))
-
-.. ipython:: python
-   :suppress:
-
-   write_html(url_df, "render_links", render_links=True)
-
-HTML:
-
-.. raw:: html
-   :file: ../_static/render_links.html
-
-Finally, the ``escape`` argument allows you to control whether the
-"<", ">" and "&" characters escaped in the resulting HTML (by default it is
-``True``). So to get the HTML without escaped characters pass ``escape=False``
-
-.. ipython:: python
-
-   df = pd.DataFrame({"a": list("&<>"), "b": np.random.randn(3)})
-
-
-.. ipython:: python
-   :suppress:
-
-   write_html(df, "escape")
-   write_html(df, "noescape", escape=False)
-
-Escaped:
-
-.. ipython:: python
-
-   print(df.to_html())
-
-.. raw:: html
-   :file: ../_static/escape.html
-
-Not escaped:
-
-.. ipython:: python
-
-   print(df.to_html(escape=False))
-
-.. raw:: html
-   :file: ../_static/noescape.html
-
-.. note::
+   print(df.style.format("€ {}").to_html())
 
-   Some browsers may not show a difference in the rendering of the previous two
-   HTML tables.
+Some browsers or browser applications may process and add css class styling by default to alter the appearance
+of HTML tables, such as Jupyter Notebook and Google Colab.
 
 .. _io.html.gotchas:
 
diff --git a/doc/source/user_guide/scale.rst b/doc/source/user_guide/scale.rst
index 71aef4fdd75f6..edeebe2e8678c 100644
--- a/doc/source/user_guide/scale.rst
+++ b/doc/source/user_guide/scale.rst
@@ -275,6 +275,7 @@ column names and dtypes. That's because Dask hasn't actually read the data yet.
 Rather than executing immediately, doing operations build up a **task graph**.
 
 .. ipython:: python
+   :okwarning:
 
    ddf
    ddf["name"]
@@ -333,6 +334,7 @@ known automatically. In this case, since we created the parquet files manually,
 we need to supply the divisions manually.
 
 .. ipython:: python
+   :okwarning:
 
    N = 12
    starts = [f"20{i:>02d}-01-01" for i in range(N)]
diff --git a/doc/source/whatsnew/v1.4.0.rst b/doc/source/whatsnew/v1.4.0.rst
index 5e74cf57e8718..a25117f878ce5 100644
--- a/doc/source/whatsnew/v1.4.0.rst
+++ b/doc/source/whatsnew/v1.4.0.rst
@@ -621,7 +621,7 @@ Other Deprecations
 - Deprecated the behavior of :func:`to_datetime` with the string "now" with ``utc=False``; in a future version this will match ``Timestamp("now")``, which in turn matches :meth:`Timestamp.now` returning the local time (:issue:`18705`)
 - Deprecated :meth:`DateOffset.apply`, use ``offset + other`` instead (:issue:`44522`)
 - Deprecated parameter ``names`` in :meth:`Index.copy` (:issue:`44916`)
-- A deprecation warning is now shown for :meth:`DataFrame.to_latex` indicating the arguments signature may change and emulate more the arguments to :meth:`.Styler.to_latex` in future versions (:issue:`44411`)
+- A deprecation warning is now shown for both :meth:`DataFrame.to_html` and :meth:`DataFrame.to_latex` indicating the arguments signature may change and emulate more the arguments in :meth:`.Styler.to_html` and :meth:`.Styler.to_latex`, respectively, in future versions (:issue:`44411`, :issue:`44451`)
 - Deprecated behavior of :func:`concat` between objects with bool-dtype and numeric-dtypes; in a future version these will cast to object dtype instead of coercing bools to numeric values (:issue:`39817`)
 - Deprecated :meth:`Categorical.replace`, use :meth:`Series.replace` instead (:issue:`44929`)
 - Deprecated passing ``set`` or ``dict`` as indexer for :meth:`DataFrame.loc.__setitem__`, :meth:`DataFrame.loc.__getitem__`, :meth:`Series.loc.__setitem__`, :meth:`Series.loc.__getitem__`, :meth:`DataFrame.__getitem__`, :meth:`Series.__getitem__` and :meth:`Series.__setitem__` (:issue:`42825`)
diff --git a/pandas/core/frame.py b/pandas/core/frame.py
index cbc40c5d0aa75..1a817f2572e7e 100644
--- a/pandas/core/frame.py
+++ b/pandas/core/frame.py
@@ -2842,22 +2842,12 @@ def to_parquet(
             **kwargs,
         )
 
-    @Substitution(
-        header_type="bool",
-        header="Whether to print column labels, default True",
-        col_space_type="str or int, list or dict of int or str",
-        col_space="The minimum width of each column in CSS length "
-        "units. An int is assumed to be px units.\n\n"
-        " .. versionadded:: 0.25.0\n"
-        " Ability to use str",
-    )
-    @Substitution(shared_params=fmt.common_docstring, returns=fmt.return_docstring)
     def to_html(
         self,
         buf: FilePath | WriteBuffer[str] | None = None,
         columns: Sequence[str] | None = None,
         col_space: ColspaceArgType | None = None,
-        header: bool | Sequence[str] = True,
+        header: bool = True,
         index: bool = True,
         na_rep: str = "NaN",
         formatters: FormattersType | None = None,
@@ -2867,75 +2857,652 @@ def to_html(
         justify: str | None = None,
         max_rows: int | None = None,
         max_cols: int | None = None,
-        show_dimensions: bool | str = False,
+        show_dimensions: bool | str | None = None,
         decimal: str = ".",
-        bold_rows: bool = True,
+        bold_rows: bool | None = None,
         classes: str | list | tuple | None = None,
         escape: bool = True,
-        notebook: bool = False,
+        notebook: bool | None = None,
         border: int | None = None,
         table_id: str | None = None,
-        render_links: bool = False,
+        render_links: bool | None = None,
         encoding: str | None = None,
+        *,
+        table_attributes: str | None = None,
+        sparse_index: bool | None = None,
+        sparse_columns: bool | None = None,
+        caption: str | None = None,
+        max_columns: int | None = None,
+        doctype_html: bool | None = None,
+        formatter=None,
+        precision: int | None = None,
+        thousands: str | None = None,
+        hyperlinks: bool | None = None,
+        bold_headers: bool | None = None,
+        **kwargs,
     ):
         """
         Render a DataFrame as an HTML table.
-        %(shared_params)s
+
+        .. versionchanged:: 1.5.0
+            An alternative `Styler` implementation is invoked in certain cases.
+            See notes.
+
+        Parameters
+        ----------
+        buf : str, Path or StringIO-like, optional, default None
+            Buffer to write to. If None, the output is returned as a string.
+        columns : sequence, optional, default None
+            The subset of columns to write. Writes all by default.
+        col_space : str or int, list or dict of int or str, optional
+            The minimum width of each column in CSS length units. An int is assumed
+            to be px units.
+
+            .. deprecated:: 1.5.0
+                See notes for using CSS to control column width.
+        header : bool, optional, default True
+            Whether to print column labels.
+        index : bool, optional, default True
+            Whether to print index (row) labels.
+        na_rep : str, optional
+            String representation of `NaN` to use.
+        formatters : list, tuple or dict of one-parameter functions, optional
+            Formatter functions to apply to columns' elements by position or
+            name. The result of each function must be a unicode string. List or
+            tuple must be equal to the number of columns.
+
+            .. deprecated:: 1.5.0
+                The future ``Styler`` implementation will use the new ``formatter``, and
+                associated arguments. See notes.
+        float_format : one-parameter function, optional
+            Formatter function to apply to columns's elements if they are floats.
+            This function must return a unicode string and will be applied only to
+            the non-NaN elements, with NaN being handled by ``na_rep``.
+
+            .. versionchanged:: 1.2.0
+
+            .. deprecated:: 1.5.0
+                The ``Styler`` implementation will use the new ``precision`` and
+                associated arguments. See notes.
+        sparsify : bool, optional, default True
+            Set to `False` for a DataFrame with a hierarchical index to print
+            every multiindex key at each row.
+
+            .. deprecated:: 1.5.0
+                The ``Styler`` implementation will use the new ``sparse_index`` and
+                ``sparse_columns`` arguments. See notes.
+        index_names : bool, optional, default True
+            Whether to display the names of the indexes.
+        justify : str, default None
+            How to justify the column labels. If None uses the option from the
+            print configuration (controlled by set_option), `right` by default.
+            Valid values are:
+
+            - left
+            - right
+            - center
+            - justify
+            - justify-all
+            - start
+            - end
+            - inherit
+            - match-parent
+            - initial
+            - unset
+
+            .. deprecated:: 1.5.0
+                See notes on using CSS to control aspects of text positioning.
+        max_rows : int, optional
+            Maximum number of rows to display in the console.
+        show_dimensions : bool, default False
+            Display the DataFrame dimensions (number or rows by columns).
+
+            .. deprecated:: 1.5.0
+                See notes for the recommendation to add a ``caption``.
+        decimal : str, default "."
+            Character recognized as the decimal separator, e.g. `,` in Europe.
         bold_rows : bool, default True
             Make the row labels bold in the output.
+
+            .. deprecated:: 1.5.0
+                Replaced by ``bold_headers``, in the `Styler` implementation.
         classes : str or list or tuple, default None
             CSS class(es) to apply to the resulting html table.
+
+            .. deprecated:: 1.5.0
+                Replaced by ``table_attributes``. See notes.
         escape : bool, default True
             Convert the characters <, >, and & to HTML-safe sequences.
         notebook : {True, False}, default False
             Whether the generated HTML is for IPython Notebook.
+
+            .. deprecated:: 1.5.0
         border : int
             A ``border=border`` attribute is included in the opening
             `<table>` tag. Default ``pd.options.display.html.border``.
+
+            .. deprecated:: 1.5.0
+                This produces deprecated HTML. See notes.
         table_id : str, optional
             A css id is included in the opening `<table>` tag if specified.
         render_links : bool, default False
             Convert URLs to HTML links.
+
+            .. deprecated:: 1.5.0
+                Replaced by ``hyperlinks`` in the `Styler` implementation.
         encoding : str, default "utf-8"
             Set character encoding.
 
             .. versionadded:: 1.0
-        %(returns)s
+        table_attributes : str, optional
+            Attributes to assign within the `<table>` HTML element in the format:
+
+            ``<table .. <table_attributes> >``.
+
+            .. versionadded:: 1.5.0
+        sparse_index : bool, optional
+            Whether to sparsify the display of a hierarchical index. Setting to False
+            will display each explicit level element in a hierarchical key for each row.
+            Defaults to ``pandas.options.styler.sparse.index`` value.
+
+            .. versionadded:: 1.5.0
+        sparse_columns : bool, optional
+            Whether to sparsify the display of a hierarchical index. Setting to False
+            will display each explicit level element in a hierarchical key for each
+            column. Defaults to ``pandas.options.styler.sparse.columns`` value.
+
+            .. versionadded:: 1.5.0
+        caption : str, optional
+            Set the HTML caption on Styler.
+
+            .. versionadded:: 1.5.0
+        max_columns : int, optional
+            The maximum number of columns that will be rendered. Defaults to
+            ``pandas.options.styler.render.max_columns``, which is None.
+
+            Rows and columns may be reduced if the number of total elements is
+            large. This value is set to ``pandas.options.styler.render.max_elements``,
+            which is 262144 (18 bit browser rendering).
+
+            .. versionadded:: 1.5.0
+        doctype_html : bool, default False
+            Whether to output a fully structured HTML file including all
+            HTML elements, or just the core ``