DEPR: Remove literal string input for read_html #53805

Merged
merged 6 commits into from
Jun 28, 2023

Conversation

rmhowe425
Contributor

@rmhowe425 rmhowe425 commented Jun 22, 2023

@rmhowe425
Contributor Author

@mroeschke PR is ready for inspection!

@mroeschke mroeschke added the "IO HTML" (read_html, to_html, Styler.apply, Styler.applymap) and "Deprecate" (Functionality to remove in pandas) labels Jun 23, 2023
@@ -298,13 +298,15 @@ Deprecations
- Deprecated constructing :class:`SparseArray` from scalar data, pass a sequence instead (:issue:`53039`)
- Deprecated falling back to filling when ``value`` is not specified in :meth:`DataFrame.replace` and :meth:`Series.replace` with non-dict-like ``to_replace`` (:issue:`33302`)
- Deprecated literal json input to :func:`read_json`. Wrap literal json string input in ``io.StringIO`` instead. (:issue:`53409`)
- Deprecated literal string/bytes input to :func:`read_html`. Wrap literal string/bytes input in ``io.StringIO`` instead. (:issue:`53767`)
Member

Suggested change
- Deprecated literal string/bytes input to :func:`read_html`. Wrap literal string/bytes input in ``io.StringIO`` instead. (:issue:`53767`)
- Deprecated literal string/bytes input to :func:`read_html`. Wrap literal string/bytes input in ``io.StringIO``/``io.BytesIO`` instead. (:issue:`53767`)
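A minimal sketch of the wrapping this changelog entry describes, using only the standard library. The helper name `as_html_buffer` is hypothetical and not part of pandas; it simply picks `io.StringIO` for `str` input and `io.BytesIO` for `bytes` input.

```python
import io


def as_html_buffer(html):
    """Wrap literal HTML in a file-like object.

    Hypothetical helper for illustration only; not a pandas API.
    """
    if isinstance(html, bytes):
        return io.BytesIO(html)
    if isinstance(html, str):
        return io.StringIO(html)
    raise TypeError(f"expected str or bytes, got {type(html).__name__}")


literal = "<table><tr><th>a</th></tr><tr><td>1</td></tr></table>"
buffer = as_html_buffer(literal)
# The buffer, rather than the raw string, is what would be passed on:
#     pandas.read_html(buffer)
```

Passing the buffer instead of the raw string should avoid the deprecation warning, since the input is then an explicit file-like object rather than something `read_html` has to guess about.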

@@ -1178,6 +1183,15 @@ def read_html(

io = stringify_path(io)

if isinstance(io, str) and "\n" in io:
Member

Is \n a reliable way to detect if it's literal html?
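To illustrate the concern, here is a small sketch of the condition from the diff above, pulled out into a standalone function. The name `newline_check` is hypothetical. It shows that valid HTML written on a single line would not be flagged as literal input.

```python
def newline_check(io_arg):
    """Same condition as in the diff: a str that contains a newline."""
    return isinstance(io_arg, str) and "\n" in io_arg


multi_line = "<table>\n<tr><td>1</td></tr>\n</table>"
single_line = "<table><tr><td>1</td></tr></table>"

print(newline_check(multi_line))   # True
print(newline_check(single_line))  # False: literal HTML, but no newline
```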

@rmhowe425 rmhowe425 requested a review from mroeschke June 27, 2023 00:26
@@ -1178,6 +1185,15 @@ def read_html(

io = stringify_path(io)

if isinstance(io, str) and not is_file_like(io) and "\n" in io:
Member

Could this use the same type of checks as in the xml PR, and remove the \n check?
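One way to read this suggestion, as a rough sketch: treat a string as literal HTML only when it is neither a URL nor an existing path, without looking for a newline at all. The function below is a standard-library approximation written for illustration. The name `looks_like_literal_html` and the list of URL prefixes are assumptions; the checks actually used in pandas or in the xml PR may differ.

```python
import os

# Assumed set of URL prefixes, chosen for illustration only.
URL_PREFIXES = ("http://", "https://", "ftp://", "file://")


def looks_like_literal_html(io_arg):
    """Guess whether io_arg is literal markup rather than a URL or path.

    Hypothetical helper; not the pandas implementation.
    """
    if not isinstance(io_arg, str):
        return False  # buffers, bytes and path objects are out of scope here
    if io_arg.lower().startswith(URL_PREFIXES):
        return False  # looks like a URL
    if os.path.exists(io_arg):
        return False  # an existing file or directory on disk
    return True       # nothing else matched, so assume literal content


print(looks_like_literal_html("<table><tr><td>1</td></tr></table>"))
print(looks_like_literal_html("https://example.com/page.html"))
```

Unlike the newline test, this flags single-line HTML as literal input. The trade-off is that a mistyped file path would also be classified as literal content.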

@rmhowe425
Contributor Author

@mroeschke PR is ready for inspection.

@mroeschke mroeschke merged commit efc9f0d into pandas-dev:main Jun 28, 2023
@mroeschke mroeschke added this to the 2.1 milestone Jun 28, 2023
@mroeschke
Member

Thanks @rmhowe425

Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
* Updating documentation and adding deprecation logic for read_html.

* Fixing formatting errors

* Fixing documentation errors

* Updating deprecation logic and documentation per reviewer recommendations.

* Updating implementation per reviewer recommendations.
@rmhowe425 rmhowe425 deleted the dev/depr/literal-str-read_html branch February 17, 2024 17:20
debnath-d added a commit to debnath-d/ISLP that referenced this pull request Apr 30, 2024
See: pandas-dev/pandas#53805

Passing html literal strings is deprecated.

Wrap literal string/bytes input in ``io.StringIO``/``io.BytesIO`` instead.
jonathan-taylor pushed a commit to intro-stat-learning/ISLP that referenced this pull request Jun 4, 2024
See: pandas-dev/pandas#53805

Passing html literal strings is deprecated.

Wrap literal string/bytes input in ``io.StringIO``/``io.BytesIO`` instead.
Labels
"Deprecate" (Functionality to remove in pandas), "IO HTML" (read_html, to_html, Styler.apply, Styler.applymap)
Development

Successfully merging this pull request may close these issues.

[DEPR]: Remove literal string/bytes input from read_excel, read_html, and read_xml
2 participants