Commit 776590b

Switch linkchecker to use html5ever for html parsing.
The existing regex-based HTML parsing was just too primitive to correctly handle HTML content. Some books have legitimate `href="…"` text which should not be validated because it is part of the text, not actual HTML.
1 parent bf6a1b1 commit 776590b
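
The false positive described in the commit message can be illustrated with a small, std-only sketch. This is not the code from this commit and it does not use html5ever; both function names (`naive_hrefs`, `tag_hrefs`) and the sample input are hypothetical. `naive_hrefs` mimics a pattern-based scan that matches `href="..."` anywhere, while `tag_hrefs` is a deliberately simplified stand-in for a real parser that only looks inside `<...>` tags. A real HTML5 parser handles far more cases (comments, `>` inside attribute values, unquoted attributes) than this toy does.

```rust
/// Pattern-style scan: report every `href="..."` found anywhere in the
/// input, whether it sits inside a tag or in ordinary text.
fn naive_hrefs(html: &str) -> Vec<String> {
    const NEEDLE: &str = "href=\"";
    let mut out = Vec::new();
    let mut rest = html;
    while let Some(start) = rest.find(NEEDLE) {
        let after = &rest[start + NEEDLE.len()..];
        match after.find('"') {
            Some(end) => {
                out.push(after[..end].to_string());
                rest = &after[end + 1..];
            }
            // Unterminated attribute value: stop scanning.
            None => break,
        }
    }
    out
}

/// Simplified tag-aware scan (an assumption for illustration, not a real
/// HTML parser): only the text between `<` and the next `>` is searched,
/// so `href="..."` appearing in prose or in escaped markup is ignored.
fn tag_hrefs(html: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut rest = html;
    while let Some(open) = rest.find('<') {
        let after_open = &rest[open + 1..];
        match after_open.find('>') {
            Some(close) => {
                out.extend(naive_hrefs(&after_open[..close]));
                rest = &after_open[close + 1..];
            }
            // No closing `>`: treat the remainder as text.
            None => break,
        }
    }
    out
}
```

Given an input such as `<p>See <a href="real.html">this</a>. Write href="not-a-link" or &lt;a href="example.html"&gt;.</p>`, the pattern-style scan reports all three values, whereas the tag-aware scan reports only `real.html`, which is the only actual link in the document.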

3 files changed (+233, -194 lines)

Cargo.lock (+1)

@@ -2274,6 +2274,7 @@ dependencies = [
 name = "linkchecker"
 version = "0.1.0"
 dependencies = [
+ "html5ever",
  "once_cell",
  "regex",
 ]

src/tools/linkchecker/Cargo.toml (+1)

@@ -10,3 +10,4 @@ path = "main.rs"
 [dependencies]
 regex = "1"
 once_cell = "1"
+html5ever = "0.26.0"
