Commit 6982dd4

Add improved docs
1 parent 92ef96e commit 6982dd4

1 file changed

+115
-65
lines changed

readme.md

Lines changed: 115 additions & 65 deletions
@@ -8,24 +8,67 @@
88
[![Backers][backers-badge]][collective]
99
[![Chat][chat-badge]][chat]
1010

11-
[**hast**][hast] utility to transform to [**nlcst**][nlcst].
11+
[hast][] utility to transform to [nlcst][].
1212

13-
> **Note**: You probably want to use [`rehype-retext`][rehype-retext].
13+
## Contents
1414

15-
## Install
15+
* [What is this?](#what-is-this)
16+
* [When should I use this?](#when-should-i-use-this)
17+
* [Install](#install)
18+
* [Use](#use)
19+
* [API](#api)
20+
* [`toNlcst(tree, file, Parser)`](#tonlcsttree-file-parser)
21+
* [Types](#types)
22+
* [Compatibility](#compatibility)
23+
* [Security](#security)
24+
* [Related](#related)
25+
* [Contribute](#contribute)
26+
* [License](#license)
27+
28+
## What is this?
29+
30+
This package is a utility that takes a [hast][] (HTML) syntax tree as input and
31+
turns it into [nlcst][] (natural language).
32+
33+
## When should I use this?
34+
35+
This project is useful when you want to deal with ASTs and inspect the natural
36+
language inside HTML.
37+
Unfortunately, there is no way yet to apply changes to the nlcst back into
38+
hast.
1639

17-
This package is [ESM only](https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c):
18-
Node 12+ is needed to use it and it must be `import`ed instead of `require`d.
40+
The mdast utility [`mdast-util-to-nlcst`][mdast-util-to-nlcst] does the same but
41+
uses a markdown tree as input.
1942

20-
[npm][]:
43+
The rehype plugin [`rehype-retext`][rehype-retext] wraps this utility to do the
44+
same at a higher-level (easier) abstraction.
45+
46+
## Install
47+
48+
This package is [ESM only][esm].
49+
In Node.js (version 12.20+, 14.14+, or 16.0+), install with [npm][]:
2150

2251
```sh
2352
npm install hast-util-to-nlcst
2453
```
2554

55+
In Deno with [`esm.sh`][esmsh]:
56+
57+
```js
58+
import {toNlcst} from "https://esm.sh/hast-util-to-nlcst@2"
59+
```
60+
61+
In browsers with [`esm.sh`][esmsh]:
62+
63+
```html
64+
<script type="module">
65+
import {toNlcst} from "https://esm.sh/hast-util-to-nlcst@2?bundle"
66+
</script>
67+
```
68+
2669
## Use
2770

28-
Say we have the following `example.html`:
71+
Say our document `example.html` contains:
2972

3073
```html
3174
<article>
@@ -35,64 +78,58 @@ Say we have the following `example.html`:
3578
</article>
3679
```
3780

38-
…and next to it, `index.js`:
81+
…and our module `example.js` looks as follows:
3982

4083
```js
41-
import {readSync} from 'to-vfile'
84+
import {read} from 'to-vfile'
4285
import {inspect} from 'unist-util-inspect'
4386
import {toNlcst} from 'hast-util-to-nlcst'
4487
import {ParseEnglish} from 'parse-english'
45-
import rehype from 'rehype'
88+
import {rehype} from 'rehype'
4689

47-
const file = readSync('example.html')
90+
const file = await read('example.html')
4891
const tree = rehype().parse(file)
4992

5093
console.log(inspect(toNlcst(tree, file, ParseEnglish)))
5194
```
5295

53-
Which, when running, yields:
96+
…now running `node example.js` yields:
5497

5598
```txt
5699
RootNode[2] (1:1-6:1, 0-134)
57-
├─ ParagraphNode[3] (1:10-3:3, 9-24)
58-
│  ├─ WhiteSpaceNode: "\n " (1:10-2:3, 9-12)
59-
│  ├─ SentenceNode[2] (2:3-2:12, 12-21)
60-
│  │  ├─ WordNode[1] (2:3-2:11, 12-20)
61-
│  │  │  └─ TextNode: "Implicit" (2:3-2:11, 12-20)
62-
│  │  └─ PunctuationNode: "." (2:11-2:12, 20-21)
63-
│  └─ WhiteSpaceNode: "\n " (2:12-3:3, 21-24)
64-
└─ ParagraphNode[1] (3:7-3:43, 28-64)
65-
   └─ SentenceNode[4] (3:7-3:43, 28-64)
66-
      ├─ WordNode[1] (3:7-3:15, 28-36)
67-
      │  └─ TextNode: "Explicit" (3:7-3:15, 28-36)
68-
      ├─ PunctuationNode: ":" (3:15-3:16, 36-37)
69-
      ├─ WhiteSpaceNode: " " (3:16-3:17, 37-38)
70-
      └─ WordNode[4] (3:25-3:43, 46-64)
71-
         ├─ TextNode: "foo" (3:25-3:28, 46-49)
72-
         ├─ TextNode: "s" (3:37-3:38, 58-59)
73-
         ├─ PunctuationNode: "-" (3:38-3:39, 59-60)
74-
         └─ TextNode: "ball" (3:39-3:43, 60-64)
100+
├─0 ParagraphNode[3] (1:10-3:3, 9-24)
101+
│   ├─0 WhiteSpaceNode "\n " (1:10-2:3, 9-12)
102+
│   ├─1 SentenceNode[2] (2:3-2:12, 12-21)
103+
│   │   ├─0 WordNode[1] (2:3-2:11, 12-20)
104+
│   │   │   └─0 TextNode "Implicit" (2:3-2:11, 12-20)
105+
│   │   └─1 PunctuationNode "." (2:11-2:12, 20-21)
106+
│   └─2 WhiteSpaceNode "\n " (2:12-3:3, 21-24)
107+
└─1 ParagraphNode[1] (3:7-3:43, 28-64)
108+
    └─0 SentenceNode[4] (3:7-3:43, 28-64)
109+
        ├─0 WordNode[1] (3:7-3:15, 28-36)
110+
        │   └─0 TextNode "Explicit" (3:7-3:15, 28-36)
111+
        ├─1 PunctuationNode ":" (3:15-3:16, 36-37)
112+
        ├─2 WhiteSpaceNode " " (3:16-3:17, 37-38)
113+
        └─3 WordNode[4] (3:25-3:43, 46-64)
114+
            ├─0 TextNode "foo" (3:25-3:28, 46-49)
115+
            ├─1 TextNode "s" (3:37-3:38, 58-59)
116+
            ├─2 PunctuationNode "-" (3:38-3:39, 59-60)
117+
            └─3 TextNode "ball" (3:39-3:43, 60-64)
75118
```
76119

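To illustrate what can be done with such a tree, here is a minimal sketch that collects the words from a hand-written nlcst fragment shaped like the second paragraph in the output. The helpers `nodeToString` and `collectWords` are hypothetical names used only for this example and are not part of `hast-util-to-nlcst`; in practice the tree would come from `toNlcst`.

```js
// A hand-written nlcst fragment mirroring the second paragraph of the
// output (positional info omitted). This is illustrative data, not the
// literal return value of `toNlcst`.
const paragraph = {
  type: 'ParagraphNode',
  children: [
    {
      type: 'SentenceNode',
      children: [
        {type: 'WordNode', children: [{type: 'TextNode', value: 'Explicit'}]},
        {type: 'PunctuationNode', value: ':'},
        {type: 'WhiteSpaceNode', value: ' '},
        {
          type: 'WordNode',
          children: [
            {type: 'TextNode', value: 'foo'},
            {type: 'TextNode', value: 's'},
            {type: 'PunctuationNode', value: '-'},
            {type: 'TextNode', value: 'ball'}
          ]
        }
      ]
    }
  ]
}

// Hypothetical helper: concatenate the `value` of every literal below `node`.
function nodeToString(node) {
  if (typeof node.value === 'string') return node.value
  return (node.children || []).map(nodeToString).join('')
}

// Hypothetical helper: collect the text of every `WordNode` in document order.
function collectWords(node, words = []) {
  if (node.type === 'WordNode') {
    words.push(nodeToString(node))
  } else if (node.children) {
    for (const child of node.children) collectWords(child, words)
  }
  return words
}

console.log(collectWords(paragraph)) // logs [ 'Explicit', 'foos-ball' ]
```

Note that a `WordNode` can hold several children (text and inner punctuation), which is why the word is rebuilt by joining its descendants rather than reading a single `value`.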
77120
## API
78121

79-
This package exports the following identifiers: `toNlcst`.
122+
This package exports the identifier `toNlcst`.
80123
There is no default export.
81124

82125
### `toNlcst(tree, file, Parser)`
83126

84-
Transform the given [**hast**][hast] [*tree*][tree] to [**nlcst**][nlcst].
85-
86-
##### Parameters
127+
[hast][] utility to transform to [nlcst][].
87128

88-
* `tree` ([`HastNode`][hast-node])
89-
[*Tree*][tree] with [positional info][positional-information]
90-
([`HastNode`][hast-node])
91-
* `file` ([`VFile`][vfile])
92-
— Virtual file
93-
* `parser` (`Function`)
94-
[**nlcst**][nlcst] parser, such as [`parse-english`][english],
95-
[`parse-dutch`][dutch], or [`parse-latin`][latin]
129+
> 👉 **Note**: `tree` must have positional info, `file` must be a [vfile][]
130+
> corresponding to `tree`, and `Parser` must be a parser such as
131+
> [`parse-english`][parse-english], [`parse-dutch`][parse-dutch], or
132+
> [`parse-latin`][parse-latin].
96133
97134
##### Returns
98135

@@ -117,7 +154,7 @@ more info).
117154
###### Ignored nodes
118155

119156
Some elements are ignored and their content will not be present in
120-
[**nlcst**][nlcst]: `<script>`, `<style>`, `<svg>`, `<math>`, `<del>`.
157+
**[nlcst][]**: `<script>`, `<style>`, `<svg>`, `<math>`, `<del>`.
121158

122159
To ignore other elements, add a `data-nlcst` attribute with a value of `ignore`:
123160

@@ -128,7 +165,8 @@ To ignore other elements, add a `data-nlcst` attribute with a value of `ignore`:
128165

129166
###### Source nodes
130167

131-
`<code>` elements are mapped to [`Source`][source] nodes in [**nlcst**][nlcst].
168+
`<code>` elements are mapped to [`Source`][nlcst-source] nodes in
169+
**[nlcst][]**.
132170

133171
To mark other elements as source, add a `data-nlcst` attribute with a value
134172
of `source`:
@@ -138,6 +176,18 @@ of `source`:
138176
<p data-nlcst="source">Completely marked.</p>
139177
```
140178

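The rules for ignored and source nodes can be summarized in a small sketch. This is not the package's implementation: `classify` is a hypothetical helper, the tag lists are copied from the text above, and it assumes the `data-nlcst` attribute appears on hast elements as the camel-cased property `dataNlcst`. It also assumes that `ignore` wins when both rules could apply.

```js
// Tag names taken from the readme text; treated here as fixed sets.
const ignoredTags = new Set(['script', 'style', 'svg', 'math', 'del'])
const sourceTags = new Set(['code'])

// Hypothetical helper: decide how a hast element would be treated.
// Returns 'ignore', 'source', or 'text' (regular natural language).
function classify(element) {
  // Assumption: `data-nlcst` is exposed as `properties.dataNlcst`.
  const hint = element.properties ? element.properties.dataNlcst : undefined
  // Assumption: `ignore` takes precedence over `source`.
  if (hint === 'ignore' || ignoredTags.has(element.tagName)) return 'ignore'
  if (hint === 'source' || sourceTags.has(element.tagName)) return 'source'
  return 'text'
}

// Hand-written hast elements for illustration.
const examples = [
  {type: 'element', tagName: 'del', properties: {}, children: []},
  {type: 'element', tagName: 'p', properties: {dataNlcst: 'ignore'}, children: []},
  {type: 'element', tagName: 'code', properties: {}, children: []},
  {type: 'element', tagName: 'p', properties: {dataNlcst: 'source'}, children: []},
  {type: 'element', tagName: 'p', properties: {}, children: []}
]

for (const element of examples) {
  console.log(element.tagName, classify(element))
}
```
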
179+
## Types
180+
181+
This package is fully typed with [TypeScript][].
182+
It exports the additional types `ParserConstructor` and `ParserInstance`.
183+
184+
## Compatibility
185+
186+
Projects maintained by the unified collective are compatible with all maintained
187+
versions of Node.js.
188+
As of now, that is Node.js 12.20+, 14.14+, and 16.0+.
189+
Our projects sometimes work with older versions, but this is not guaranteed.
190+
141191
## Security
142192

143193
`hast-util-to-nlcst` does not change the original syntax tree so there are no
@@ -147,19 +197,15 @@ openings for [cross-site scripting (XSS)][xss] attacks.
147197

148198
* [`mdast-util-to-nlcst`](https://github.com/syntax-tree/mdast-util-to-nlcst)
149199
— transform mdast to nlcst
150-
* [`mdast-util-to-hast`](https://github.com/syntax-tree/mdast-util-to-hast)
151-
— transform mdast to hast
152200
* [`hast-util-to-mdast`](https://github.com/syntax-tree/hast-util-to-mdast)
153201
— transform hast to mdast
154202
* [`hast-util-to-xast`](https://github.com/syntax-tree/hast-util-to-xast)
155203
— transform hast to xast
156-
* [`hast-util-sanitize`](https://github.com/syntax-tree/hast-util-sanitize)
157-
— sanitize hast nodes
158204

159205
## Contribute
160206

161-
See [`contributing.md` in `syntax-tree/.github`][contributing] for ways to get
162-
started.
207+
See [`contributing.md`][contributing] in [`syntax-tree/.github`][health] for
208+
ways to get started.
163209
See [`support.md`][support] for ways to get help.
164210

165211
This project has a [code of conduct][coc].
@@ -200,38 +246,42 @@ abide by its terms.
200246

201247
[npm]: https://docs.npmjs.com/cli/install
202248

203-
[license]: license
249+
[esm]: https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c
204250

205-
[author]: https://wooorm.com
251+
[esmsh]: https://esm.sh
206252

207-
[contributing]: https://github.com/syntax-tree/.github/blob/HEAD/contributing.md
253+
[typescript]: https://www.typescriptlang.org
254+
255+
[license]: license
208256

209-
[support]: https://github.com/syntax-tree/.github/blob/HEAD/support.md
257+
[author]: https://wooorm.com
210258

211-
[coc]: https://github.com/syntax-tree/.github/blob/HEAD/code-of-conduct.md
259+
[health]: https://github.com/syntax-tree/.github
212260

213-
[english]: https://github.com/wooorm/parse-english
261+
[contributing]: https://github.com/syntax-tree/.github/blob/main/contributing.md
214262

215-
[latin]: https://github.com/wooorm/parse-latin
263+
[support]: https://github.com/syntax-tree/.github/blob/main/support.md
216264

217-
[dutch]: https://github.com/wooorm/parse-dutch
265+
[coc]: https://github.com/syntax-tree/.github/blob/main/code-of-conduct.md
218266

219267
[rehype-retext]: https://github.com/rehypejs/rehype-retext
220268

221-
[tree]: https://github.com/syntax-tree/unist#tree
222-
223-
[positional-information]: https://github.com/syntax-tree/unist#positional-information
269+
[vfile]: https://github.com/vfile/vfile
224270

225271
[hast]: https://github.com/syntax-tree/hast
226272

227-
[hast-node]: https://github.com/syntax-tree/hast#nodes
228-
229273
[nlcst]: https://github.com/syntax-tree/nlcst
230274

231275
[nlcst-node]: https://github.com/syntax-tree/nlcst#nodes
232276

233-
[vfile]: https://github.com/vfile/vfile
277+
[nlcst-source]: https://github.com/syntax-tree/nlcst#source
234278

235-
[source]: https://github.com/syntax-tree/nlcst#source
279+
[mdast-util-to-nlcst]: https://github.com/syntax-tree/mdast-util-to-nlcst
236280

237281
[xss]: https://en.wikipedia.org/wiki/Cross-site_scripting
282+
283+
[parse-english]: https://github.com/wooorm/parse-english
284+
285+
[parse-latin]: https://github.com/wooorm/parse-latin
286+
287+
[parse-dutch]: https://github.com/wooorm/parse-dutch
