Commit b4b7fb2

Update README, and move to using reST. Fixes #5, #22.
1 parent ef92bf7 commit b4b7fb2

2 files changed

+113
-39
lines changed

README

Lines changed: 0 additions & 39 deletions
This file was deleted.

README.rst

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
html5lib
========

html5lib is a pure-Python library for parsing HTML. It is designed to
conform to the HTML specification, as implemented by all major web
browsers.


Requirements
------------

Python 2.6 and later, as well as Python 3.0 and later, are
supported. Implementations known to work are CPython (as the reference
implementation) and PyPy. Jython is known *not* to work due to various
bugs in its implementation of the language. Others such as IronPython
may or may not work; if you wish to try, you are strongly encouraged
to run the test suite and report back!

The only required library dependency is ``six``, which is packaged on
PyPI.

Optionally:

- ``datrie`` can be used to improve parsing performance (though in
  almost all cases the improvement is marginal);

- ``lxml`` is supported as a tree format (for both building and
  walking) under CPython (but *not* PyPy, where it is known to cause
  segfaults);

- ``genshi`` has a treewalker (but not a builder); and

- ``chardet`` can be used as a fallback when the character encoding
  cannot be determined (note that this is currently only packaged on
  PyPI for Python 2, though several package managers include
  unofficial ports to Python 3).

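As a quick way to see which of the optional packages above are present,
the following minimal sketch probes for them by import name. It assumes
a recent Python 3 (``importlib.util.find_spec`` is not available on
Python 2), and the ``OPTIONAL`` tuple and ``available`` helper are
hypothetical names used only for illustration; they are not part of
html5lib::

  import importlib.util

  # Import names of the optional packages listed above.
  OPTIONAL = ("datrie", "lxml", "genshi", "chardet")

  def available(names=OPTIONAL):
      """Map each top-level module name to whether it can be imported."""
      return {name: importlib.util.find_spec(name) is not None
              for name in names}

  for name, found in sorted(available().items()):
      print("%-8s %s" % (name, "installed" if found else "missing"))
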

Installation
------------

html5lib is packaged with distutils. To install it use::

  $ python setup.py install


Usage
-----

Simple usage follows this pattern::

  import html5lib
  # Open in binary mode so that html5lib can detect the encoding itself.
  with open("mydocument.html", "rb") as fp:
      document = html5lib.parse(fp)

or::

  import html5lib
  document = html5lib.parse("<p>Hello World!")

More documentation is available in the docstrings.

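As a further sketch of the pattern above, the parsed tree can be
inspected with the standard ``xml.etree.ElementTree`` API, since
``html5lib.parse`` builds an etree by default. The ``paragraph_texts``
helper is a hypothetical name for illustration, and passing
``namespaceHTMLElements=False`` is an assumption made here so that
elements can be looked up by their plain tag names::

  import html5lib

  def paragraph_texts(markup):
      """Return the leading text of every <p> element in the markup."""
      # Without a namespace, tags are plain names such as "p" rather
      # than "{http://www.w3.org/1999/xhtml}p".
      root = html5lib.parse(markup, treebuilder="etree",
                            namespaceHTMLElements=False)
      return [p.text for p in root.iter("p")]

  print(paragraph_texts("<p>Hello World!<p>Goodbye"))
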

Bugs
----

Please report any bugs on the `issue tracker
<https://github.com/html5lib/html5lib-python/issues>`_.


Tests
-----

The tests are contained in the html5lib-tests repository, which is
included as a submodule. For git checkouts the submodule must therefore
be initialized (for release tarballs this is unneeded)::

  $ git submodule init
  $ git submodule update

With ``nose`` installed, the tests can then be run using the
``nosetests`` command in the root directory. All should pass.


Contributing
------------

Pull requests are more than welcome, both to the library and to the
documentation. Some useful information:

- We aim to follow PEP 8 in the library, except that we ignore the
  79-character-per-line limit in favor of a soft limit of 99, and
  allow longer lines where that is the more readable thing to do.

- We keep pyflakes reporting no errors or warnings at all times.

- We keep the master branch passing all tests at all times on all
  supported versions.

Travis CI is run against all pull requests and should enforce all of
the above.

We also use an external code-review tool, which uses your GitHub login
to authenticate. You'll get emails for changes on the review.


Questions?
----------

There's a mailing list available for support on Google Groups,
`html5lib-discuss <http://groups.google.com/group/html5lib-discuss>`_,
though you may have more success (and get a far quicker response)
asking on IRC in #whatwg on irc.freenode.net.
