Description
I'm not a programmer by any sense of the word, so I apologize upfront if this is out of place. That said, I was trying to resurrect a project from another researcher, switching it over from Python 2.7 to Python 3.9. Lots of cleanup, but I stumbled upon this error when executing MyProject:
Traceback (most recent call last):
  File "/home/user/MyProject.py", line 25, in <module>
    import argparse,requests,sys,os,threading,bs4,warnings,random
  File "/usr/local/lib/python3.9/dist-packages/bs4/__init__.py", line 30, in <module>
    from .builder import builder_registry, ParserRejectedMarkup
  File "/usr/local/lib/python3.9/dist-packages/bs4/builder/__init__.py", line 314, in <module>
    from . import _html5lib
  File "/usr/local/lib/python3.9/dist-packages/bs4/builder/_html5lib.py", line 70, in <module>
    class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
AttributeError: module 'html5lib.treebuilders' has no attribute '_base'
The project's libraries include:
- bs4 (BeautifulSoup4 version 4.4.1)
- html5lib (version 1.1)
First, I tried removing and reinstalling python3-bs4 and python3-html5lib, but that didn't resolve the issue. I then found two changes that each make the error go away, though I'm not sure what impact they might have on other applications on my system in the future:
Resolution 1: I changed "_base" to "base" throughout the file "/usr/local/lib/python3.9/dist-packages/bs4/builder/_html5lib.py".
Resolution 2: I downgraded the html5lib package to version 1.0b8 (python3.9 -m pip install --upgrade html5lib==1.0b8).
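For reference, the rename behind Resolution 1 can be handled without hard-coding either name. The following is only a minimal sketch, not the code bs4 itself uses: first_importable is a hypothetical helper, and the two candidate module paths are assumptions based on the traceback above (older html5lib releases exposing "_base", newer ones exposing "base").

```python
import importlib


def first_importable(names):
    """Return the first module in `names` that imports cleanly, else None.

    Hypothetical helper for illustration; not part of bs4 or html5lib.
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError,
            # so a missing module just moves us to the next candidate.
            continue
    return None


# Assumed module paths: newer html5lib exposes `base`, older ones `_base`.
CANDIDATES = (
    "html5lib.treebuilders.base",
    "html5lib.treebuilders._base",
)

if __name__ == "__main__":
    module = first_importable(CANDIDATES)
    print(module.__name__ if module else "html5lib tree builder base not found")
```

Unlike editing the installed file by hand, a fallback like this would keep working if html5lib is later upgraded or downgraded, though it still has to live somewhere that a package reinstall will not overwrite.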
Activity
gsnedders commented on Jan 13, 2021
Without looking too closely, I suspect you need a more up-to-date version of beautifulsoup4. Can you try using the latest BS4 release and see if it works?
schlpr0k-redbot commented on Jan 14, 2021
Same issue, but this is Kali Linux, so that's not saying much. I'll set up an Ubuntu 20.04 instance this weekend and try again.
root@kali:~# apt install python3-bs4 python-bs4 python3-html5lib python-html5lib
Reading package lists... Done
Building dependency tree
Reading state information... Done
python3-bs4 is already the newest version (4.9.3-1).
python3-html5lib is already the newest version (1.1-2).
python-bs4 is already the newest version (4.8.2-1).
python-html5lib is already the newest version (1.0.1-1).
Also tried: apt reinstall
Note: I would purge and reinstall, but Kali will try to remove major portions of the OS's applications, so I can't do that.
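One thing that may be worth checking: the traceback loads bs4 from /usr/local/lib/python3.9/dist-packages, while apt reports python3-bs4 4.9.3, so a separately installed copy might be taking precedence over the packaged one. This is a guess, not a confirmed diagnosis. A minimal sketch to see which file Python would pick, without actually importing the package (module_origin is a hypothetical helper name):

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None.

    For a top-level name, find_spec only searches the import path;
    it does not execute the package, so a broken bs4 will not crash here.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    for name in ("bs4", "html5lib"):
        print(name, "->", module_origin(name))
```

If the printed path for bs4 is under /usr/local rather than the directory apt installs into, reinstalling through apt alone would not change which copy gets imported.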
ambv commented on Mar 1, 2023
This is the bug on the BeautifulSoup side:
https://bugs.launchpad.net/beautifulsoup/+bug/1603299
The solution is to use bs4 4.5.0 or newer.
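To check an environment against that threshold, something like the sketch below could be used. It assumes the distribution is registered under the name "beautifulsoup4" and that Python 3.8 or newer is available for importlib.metadata; release_tuple, installed_version and meets_minimum are hypothetical helpers, and the version parsing is deliberately simplistic (pre-release suffixes such as "b8" are ignored).

```python
import re
from importlib import metadata

# Threshold taken from the comment above.
MINIMUM_BS4 = (4, 5, 0)


def release_tuple(version):
    """Turn '4.4.1' into (4, 4, 1); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(version, minimum=MINIMUM_BS4):
    """Compare release numbers, padding short versions with zeros."""
    release = release_tuple(version)
    padded = release + (0,) * (len(minimum) - len(release))
    return padded >= minimum


if __name__ == "__main__":
    found = installed_version("beautifulsoup4")  # assumed distribution name
    if found is None:
        print("beautifulsoup4 is not installed")
    else:
        print(found, "ok" if meets_minimum(found) else "older than 4.5.0")
```

Note that this reports the first matching distribution on the import path, which is not necessarily the one a system package manager lists.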