
AttributeError: module 'html5lib.treebuilders' has no attribute '_base' (Python3.9) #528

Closed
@schlpr0k-redbot


I'm not a programmer by any sense of the word, so I apologize up front if this is out of place. That said, I was trying to resurrect another researcher's project by porting it from Python 2.7 to Python 3.9. There was a lot of cleanup, and I ran into this error when executing MyProject:

Traceback (most recent call last):
  File "/home/user/MyProject.py", line 25, in <module>
    import argparse,requests,sys,os,threading,bs4,warnings,random
  File "/usr/local/lib/python3.9/dist-packages/bs4/__init__.py", line 30, in <module>
    from .builder import builder_registry, ParserRejectedMarkup
  File "/usr/local/lib/python3.9/dist-packages/bs4/builder/__init__.py", line 314, in <module>
    from . import _html5lib
  File "/usr/local/lib/python3.9/dist-packages/bs4/builder/_html5lib.py", line 70, in <module>
    class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
AttributeError: module 'html5lib.treebuilders' has no attribute '_base'

The project's libraries include:

  • bs4 (BeautifulSoup4 version 4.4.1)
  • html5lib (version 1.1)

First, I tried removing and reinstalling python3-bs4 and python3-html5lib, but that didn't resolve the issue. I then found two ways to make the error go away, though I'm not sure what impact they may have on other applications on my system in the future:

Resolution 1: I modified "_base" to "base" throughout file "/usr/local/lib/python3.9/dist-packages/bs4/builder/_html5lib.py"

Resolution 2: Downgraded the html5lib package to version 1.0b8 (python3.9 -m pip install --upgrade html5lib==1.0b8)
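
Resolution 1 hard-codes one attribute name, so it only works with html5lib releases that expose `base`. A minimal sketch of a lookup that tolerates both layouts is below. The helper `first_attr` and the two stand-in objects are hypothetical and exist only for illustration; this is not how bs4 itself handles the rename.

```python
from types import SimpleNamespace


def first_attr(obj, *names):
    """Return the first attribute of obj that exists among names."""
    for name in names:
        if hasattr(obj, name):
            return getattr(obj, name)
    raise AttributeError(f"{obj!r} has none of the attributes {names}")


# Stand-ins for the two html5lib.treebuilders layouts described in this
# issue (hypothetical objects, not the real modules).
old_layout = SimpleNamespace(_base="module exposed as _base")
new_layout = SimpleNamespace(base="module exposed as base")

# The same call works against either layout.
print(first_attr(old_layout, "base", "_base"))
print(first_attr(new_layout, "base", "_base"))
```

With the real library the call would look like `first_attr(html5lib.treebuilders, "base", "_base")` after `import html5lib`. Editing an installed copy of bs4 by hand is lost on the next reinstall, so this is a stopgap rather than a fix.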


gsnedders (Member) commented on Jan 13, 2021

Without looking too closely, I suspect you need a more up-to-date version of beautifulsoup4. Can you try using the latest BS4 release and see if it works?

schlpr0k-redbot (Author) commented on Jan 14, 2021

Same issue, but this is Kali Linux, so that's not saying much. I'll set up an Ubuntu 20.04 instance this weekend and try again.

root@kali:~# apt install python3-bs4 python-bs4 python3-html5lib python-html5lib
Reading package lists... Done
Building dependency tree
Reading state information... Done
python3-bs4 is already the newest version (4.9.3-1).
python3-html5lib is already the newest version (1.1-2).
python-bs4 is already the newest version (4.8.2-1).
python-html5lib is already the newest version (1.0.1-1).

Also tried: apt reinstall

Note: I would purge and reinstall, but Kali will try to remove major portions of the OS's applications, so I can't do that.
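
The apt output reports python3-bs4 4.9.3-1, yet the traceback loads bs4 from /usr/local/lib/python3.9/dist-packages and the issue lists bs4 4.4.1. On Debian-based systems apt installs Python 3 packages under /usr/lib/python3/dist-packages, so one possible explanation (an assumption, not confirmed in this thread) is that an older locally installed copy shadows the apt package. The sketch below reports which file each module would be imported from; the function names are hypothetical.

```python
import importlib.util


def find_module_path(name):
    """Return the file a top-level module would load from, without importing it."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def install_origin(path):
    """Guess how a file was installed from its directory (Debian layout assumed)."""
    if path is None:
        return "not found"
    if path.startswith("/usr/local/lib/"):
        return "local install (for example pip run as root)"
    if path.startswith("/usr/lib/python3/dist-packages/"):
        return "apt package"
    return "other (virtualenv, user site, ...)"


if __name__ == "__main__":
    for name in ("bs4", "html5lib"):
        path = find_module_path(name)
        print(f"{name}: {path} -> {install_origin(path)}")
```

Run it with the same interpreter that raises the error (python3.9 here). If bs4 resolves under /usr/local/lib, reinstalling the apt package alone will not change which copy is imported.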

ambv (Member) commented on Mar 1, 2023

This is the bug on the BeautifulSoup side:
https://bugs.launchpad.net/beautifulsoup/+bug/1603299

The solution is to use bs4 4.5.0 or newer.
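
To check whether the interpreter that raises the error actually sees bs4 4.5.0 or newer, something like the following sketch can help. It assumes Python 3.8+ (for `importlib.metadata`) and that the distribution is registered as `beautifulsoup4`; the helper names are hypothetical.

```python
import re
from importlib import metadata  # available from Python 3.8

MIN_BS4 = (4, 5, 0)  # threshold quoted in this thread


def version_tuple(version):
    """Turn '4.4.1' or '4.9.3-1' into (4, 4, 1) or (4, 9, 3); suffixes are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def bs4_is_new_enough(version):
    """True when the given bs4 version is at least MIN_BS4."""
    return version_tuple(version) >= MIN_BS4


if __name__ == "__main__":
    try:
        installed = metadata.version("beautifulsoup4")
    except metadata.PackageNotFoundError:
        print("beautifulsoup4 is not installed for this interpreter")
    else:
        verdict = "ok" if bs4_is_new_enough(installed) else "too old"
        print(f"beautifulsoup4 {installed}: {verdict}")
```

The version reported here is the one visible to this interpreter, which may differ from what apt lists.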


          AttributeError: module 'html5lib.treebuilders' has no attribute '_base' (Python3.9) · Issue #528 · html5lib/html5lib-python