CI fix ci-failure from bs4 new version #46692

Merged (1 commit) on Apr 8, 2022

MarcoGorelli
Member

@MarcoGorelli MarcoGorelli commented Apr 8, 2022

New BS4 release is throwing a deprecation warning

_____________ TestReadHtml.test_banklist_url_positional_match[bs4] _____________
[gw1] linux -- Python 3.8.13 /usr/share/miniconda/envs/pandas-dev/bin/python

self = <pandas.tests.io.test_html.TestReadHtml object at 0x7f7fd333a040>

    @pytest.mark.network
    @tm.network(
        url=(
            "https://www.fdic.gov/resources/resolutions/"
            "bank-failures/failed-bank-list/index.html"
        ),
        check_before_test=True,
    )
    def test_banklist_url_positional_match(self):
        url = "https://www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/index.html"  # noqa E501
        # Passing match argument as positional should cause a FutureWarning.
        with tm.assert_produces_warning(FutureWarning):
>           df1 = self.read_html(
                # lxml cannot find attrs leave out for now
                url,
                "First Federal Bank of Florida",  # attrs={"class": "dataTable"}
            )

pandas/tests/io/test_html.py:147: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/share/miniconda/envs/pandas-dev/lib/python3.8/contextlib.py:120: in __exit__
    next(self.gen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    def _assert_caught_no_extra_warnings(
        *,
        caught_warnings: Sequence[warnings.WarningMessage],
        expected_warning: type[Warning] | bool | None,
    ) -> None:
        """Assert that no extra warnings apart from the expected ones are caught."""
        extra_warnings = []
    
        for actual_warning in caught_warnings:
            if _is_unexpected_warning(actual_warning, expected_warning):
                # GH#38630 pytest.filterwarnings does not suppress these.
                if actual_warning.category == ResourceWarning:
                    # GH 44732: Don't make the CI flaky by filtering SSL-related
                    # ResourceWarning from dependencies
                    unclosed_ssl = (
                        "unclosed transport <asyncio.sslproto._SSLProtocolTransport",
                        "unclosed <ssl.SSLSocket",
                    )
                    if any(msg in str(actual_warning.message) for msg in unclosed_ssl):
                        continue
                    # GH 44844: Matplotlib leaves font files open during the entire process
                    # upon import. Don't make CI flaky if ResourceWarning raised
                    # due to these open files.
                    if any("matplotlib" in mod for mod in sys.modules):
                        continue
    
                extra_warnings.append(
                    (
                        actual_warning.category.__name__,
                        actual_warning.message,
                        actual_warning.filename,
                        actual_warning.lineno,
                    )
                )
    
        if extra_warnings:
>           raise AssertionError(f"Caused unexpected warning(s): {repr(extra_warnings)}")
E           AssertionError: Caused unexpected warning(s): [('DeprecationWarning', DeprecationWarning("The 'text' argument to find()-type methods is deprecated. Use 'string' instead."), '/usr/share/miniconda/envs/pandas-dev/lib/python3.8/site-packages/bs4/element.py', 784)]
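
A minimal sketch of why the test above fails, using only the standard library. The functions `find`, `read_html_like`, and `unexpected_warnings` are hypothetical stand-ins written for illustration; they are not the bs4 or pandas implementations. The idea is that the test expects only a `FutureWarning`, so any additional warning raised by a dependency is reported as unexpected.

```python
import warnings


def find(text=None, string=None):
    """Hypothetical stand-in for a find()-type method with a renamed keyword."""
    if text is not None:
        # The old keyword still works but emits a DeprecationWarning.
        warnings.warn(
            "The 'text' argument to find()-type methods is deprecated. "
            "Use 'string' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        string = text
    return string


def read_html_like(match, use_string):
    """Hypothetical caller: emits the expected FutureWarning, then calls find()."""
    warnings.warn("positional 'match' is deprecated", FutureWarning, stacklevel=2)
    return find(string=match) if use_string else find(text=match)


def unexpected_warnings(func, expected):
    """Run func and return the category names of warnings other than `expected`."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, no de-duplication
        func()
    return [w.category.__name__ for w in caught if not issubclass(w.category, expected)]


old = unexpected_warnings(lambda: read_html_like("Bank", use_string=False), FutureWarning)
new = unexpected_warnings(lambda: read_html_like("Bank", use_string=True), FutureWarning)
print(old)  # the deprecated keyword leaks an extra warning
print(new)  # the renamed keyword does not
```

With the old keyword the extra `DeprecationWarning` shows up next to the expected `FutureWarning`; switching the call to `string=` leaves only the expected one.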

From the docs

The string argument is new in Beautiful Soup 4.4.0. In earlier versions it was called text.

@phofl phofl added this to the 1.4.3 milestone Apr 8, 2022
@@ -577,7 +577,7 @@ def _parse_tables(self, doc, match, attrs):
            for elem in table.find_all(style=re.compile(r"display:\s*none")):
                elem.decompose()

            if table not in unique_tables and table.find(text=match) is not None:
Contributor

is this backwards compatible?

Member Author

The `string` argument was introduced in Beautiful Soup 4.4.0, and the minimum version pandas supports is 4.8.2, so it should be fine.
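
The compatibility argument can be sketched as a simple version comparison. The helper `parse_version` and the two constants are hypothetical names introduced for illustration; the version numbers are the ones quoted in this thread. This assumes purely numeric dotted versions and is not a general version parser.

```python
def parse_version(version):
    """Turn a dotted release string such as '4.8.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


STRING_ARG_INTRODUCED = "4.4.0"  # per the Beautiful Soup docs quoted in this PR
PANDAS_MINIMUM_BS4 = "4.8.2"     # minimum bs4 version stated in this thread

# Every supported bs4 release accepts `string=` if the minimum is new enough.
supports_string = parse_version(PANDAS_MINIMUM_BS4) >= parse_version(STRING_ARG_INTRODUCED)
print(supports_string)
```

Comparing tuples of integers rather than raw strings matters here: as strings, "4.10.0" would sort before "4.8.2".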

@jreback
Contributor

jreback commented Apr 8, 2022

can you close and reopen this, as the doc build failed to run

@phofl
Member

phofl commented Apr 8, 2022

It seems that the doc build also fails on master, but this seems to be unrelated

@MarcoGorelli
Member Author

yes that's unrelated, this only addresses the Posix / actions-38.yaml job

@jreback jreback merged commit 9b4f03e into pandas-dev:main Apr 8, 2022
meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Apr 8, 2022
pllim added a commit to pllim/astropy that referenced this pull request Apr 8, 2022
pllim added a commit to pllim/astropy that referenced this pull request Apr 8, 2022
pllim added a commit to pllim/astropy that referenced this pull request Apr 8, 2022
jreback pushed a commit that referenced this pull request Apr 8, 2022
@simonjayhawkins simonjayhawkins added the CI (Continuous Integration), IO HTML (read_html, to_html, Styler.apply, Styler.applymap), and Dependencies (Required and optional dependencies) labels and removed the CI (Continuous Integration) label Apr 9, 2022
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
Labels
Dependencies (Required and optional dependencies), IO HTML (read_html, to_html, Styler.apply, Styler.applymap)
4 participants