DOC: Use nbsphinx #15581

Merged
merged 1 commit into from
Apr 8, 2017
1 change: 1 addition & 0 deletions ci/requirements-3.5_DOC.run
@@ -5,6 +5,7 @@ nbconvert
 nbformat
 notebook
 matplotlib
+seaborn
 scipy
 lxml
 beautifulsoup4
2 changes: 1 addition & 1 deletion ci/requirements-3.5_DOC.sh
@@ -6,6 +6,6 @@ echo "[install DOC_BUILD deps]"

 pip install pandas-gbq

-conda install -n pandas -c conda-forge feather-format
+conda install -n pandas -c conda-forge feather-format nbsphinx pandoc

 conda install -n pandas -c r r rpy2 --yes
2 changes: 2 additions & 0 deletions ci/requirements_all.txt
@@ -3,6 +3,7 @@ pytest-cov
 pytest-xdist
 flake8
 sphinx

Contributor

Hmm, is this our catch-all for what people should install to get a dev env? We should add seaborn then.

Contributor Author

We also have requirements_dev.txt, which I think is more appropriate. I read requirements_all.txt as all the optional deps for using pandas (xlrd, xlwt, tables, etc.). Maybe sphinx and the test libs should go in requirements_dev.txt?

Contributor

Yeah, I think it should be clearer what the purpose of each of these files is. You can add a comment (or two) at the top of the file, I think (lines starting with # are ignored).

+nbsphinx
 ipython
 python-dateutil
 pytz
@@ -19,6 +20,7 @@ scipy
 numexpr
 pytables
 matplotlib
+seaborn
 lxml
 sqlalchemy
 bottleneck
4 changes: 3 additions & 1 deletion doc/README.rst
@@ -81,7 +81,9 @@ have ``sphinx`` and ``ipython`` installed. `numpydoc
 <https://github.com/numpy/numpydoc>`_ is used to parse the docstrings that
 follow the Numpy Docstring Standard (see above), but you don't need to install
 this because a local copy of ``numpydoc`` is included in the pandas source
-code.
+code. `nbsphinx <https://nbsphinx.readthedocs.io/>`_ is used to convert
+Jupyter notebooks. You will need to install it if you intend to modify any of

Contributor

We should probably add this to the requirements_docs (which should also have conda-forge as a channel in the instructions).

+the notebooks included in the documentation.

 Furthermore, it is recommended to have all `optional dependencies
 <http://pandas.pydata.org/pandas-docs/dev/install.html#optional-dependencies>`_
122 changes: 29 additions & 93 deletions doc/make.py
@@ -106,106 +106,42 @@ def clean():


 @contextmanager
-def cleanup_nb(nb):
-    try:
-        yield
-    finally:
-        try:
-            os.remove(nb + '.executed')
-        except OSError:
-            pass
-
-
-def get_kernel():
-    """Find the kernel name for your python version"""
-    return 'python%s' % sys.version_info.major
-
-
-def execute_nb(src, dst, allow_errors=False, timeout=1000, kernel_name=''):
-    """
-    Execute notebook in `src` and write the output to `dst`
-
-    Parameters
-    ----------
-    src, dst: str
-        path to notebook
-    allow_errors: bool
-    timeout: int
-    kernel_name: str
-        defualts to value set in notebook metadata
-
-    Returns
-    -------
-    dst: str
-    """
-    import nbformat
-    from nbconvert.preprocessors import ExecutePreprocessor
-
-    with io.open(src, encoding='utf-8') as f:
-        nb = nbformat.read(f, as_version=4)
-
-    ep = ExecutePreprocessor(allow_errors=allow_errors,
-                             timeout=timeout,
-                             kernel_name=kernel_name)
-    ep.preprocess(nb, resources={})
-
-    with io.open(dst, 'wt', encoding='utf-8') as f:
-        nbformat.write(nb, f)
-    return dst
-
-
-def convert_nb(src, dst, to='html', template_file='basic'):
+def maybe_exclude_notebooks():
     """
-    Convert a notebook `src`.
-
-    Parameters
-    ----------
-    src, dst: str
-        filepaths
-    to: {'rst', 'html'}
-        format to export to
-    template_file: str
-        name of template file to use. Default 'basic'
+    Skip building the notebooks if pandoc is not installed.
+    This assumes that nbsphinx is installed.
     """
-    from nbconvert import HTMLExporter, RSTExporter
-
-    dispatch = {'rst': RSTExporter, 'html': HTMLExporter}
-    exporter = dispatch[to.lower()](template_file=template_file)
-
-    (body, resources) = exporter.from_filename(src)
-    with io.open(dst, 'wt', encoding='utf-8') as f:
-        f.write(body)
-    return dst
+    base = os.path.dirname(__file__)
+    notebooks = [os.path.join(base, 'source', nb)
+                 for nb in ['style.ipynb']]
+    contents = {}
+    try:
+        import nbconvert
+        nbconvert.utils.pandoc.get_pandoc_version()
+    except (ImportError, nbconvert.utils.pandoc.PandocMissing):
+        print("Warning: Pandoc is not installed. Skipping Notebooks.")
+        for nb in notebooks:
+            with open(nb, 'rt') as f:
+                contents[nb] = f.read()
+            os.remove(nb)
+    yield
+    for nb, content in contents.items():
+        with open(nb, 'wt') as f:
+            f.write(content)

Reviewer

Am I seeing correctly that you are writing the file(s) each time after Sphinx is run?
This will force re-parsing each time you run it again ...
Otherwise, Sphinx would simply skip over the already built (and up-to-date) notebooks.

Also, if somebody is editing the notebook(s) at the same time, this might cause problems, right?

Contributor Author

> Am I seeing correctly that you are writing the file(s) each time after Sphinx is run?

If nbsphinx is installed then contents will be empty. So we should only be writing stuff if nbsphinx is not installed, and sphinx isn't parsing these files anyway.

For editing the notebooks at the same time, I don't think there are any additional problems with this approach.

Reviewer

OK, I see now. It still seems a bit brutal to me and I'm generally skeptical about writing to the source directory, but I guess it's OK.
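
A side note on the thread above: in the function as written in this diff, the restore loop runs only if the body of the `with` block finishes normally, so the `SystemExit` raised on a failed build would skip it and leave the notebook removed from the source directory. Below is a minimal sketch of a variant that restores the files in a `finally` block. It assumes Python 3. The names `hide_files`, `hide`, and `pandoc_available` are hypothetical and not part of this PR, and looking for a `pandoc` executable with `shutil.which` is a simplification of the `nbconvert`-based check used in the diff.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


def pandoc_available():
    """Return True if a `pandoc` executable is on PATH (simplified check)."""
    return shutil.which("pandoc") is not None


@contextmanager
def hide_files(paths, hide=True):
    """Temporarily remove `paths`, putting them back on exit.

    The restore step sits in `finally`, so it also runs when the
    body of the `with` block raises (for example SystemExit).
    """
    contents = {}
    if hide:
        for path in paths:
            with open(path, "rb") as f:
                contents[path] = f.read()
            os.remove(path)
    try:
        yield
    finally:
        # Nothing to do when `hide` was False: `contents` is empty.
        for path, content in contents.items():
            with open(path, "wb") as f:
                f.write(content)
```

A caller could wrap the build as `with hide_files(notebooks, hide=not pandoc_available()): ...`. When pandoc is present nothing is read or written, which matches the behaviour described by the Contributor Author above.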



 def html():
     check_build()

-    notebooks = [
-        'source/html-styling.ipynb',
-    ]
-
-    for nb in notebooks:
-        with cleanup_nb(nb):
-            try:
-                print("Converting %s" % nb)
-                kernel_name = get_kernel()
-                executed = execute_nb(nb, nb + '.executed', allow_errors=True,
-                                      kernel_name=kernel_name)
-                convert_nb(executed, nb.rstrip('.ipynb') + '.html')
-            except (ImportError, IndexError) as e:
-                print(e)
-                print("Failed to convert %s" % nb)
-
-    if os.system('sphinx-build -P -b html -d build/doctrees '
-                 'source build/html'):
-        raise SystemExit("Building HTML failed.")
-    try:
-        # remove stale file
-        os.remove('source/html-styling.html')
-        os.remove('build/html/pandas.zip')
-    except:
-        pass
+    with maybe_exclude_notebooks():
+        if os.system('sphinx-build -P -b html -d build/doctrees '
+                     'source build/html'):
+            raise SystemExit("Building HTML failed.")
+        try:
+            # remove stale file
+            os.remove('build/html/pandas.zip')
+        except:
+            pass


 def zip_html():
14 changes: 10 additions & 4 deletions doc/source/conf.py
@@ -52,14 +52,16 @@
'numpydoc', # used to parse numpy-style docstrings for autodoc
'ipython_sphinxext.ipython_directive',
'ipython_sphinxext.ipython_console_highlighting',
'IPython.sphinxext.ipython_console_highlighting', # lowercase didn't work

Member

Note there is a difference here in `_` vs `.`. The one already present with the `_` is our vendored version (and the directory is lower case, so that works). What was the reason you needed this?
(Although we should maybe migrate to just using IPython itself instead of our own version, but we should look again at why exactly we were doing this.)

'sphinx.ext.intersphinx',
'sphinx.ext.coverage',
'sphinx.ext.mathjax',
'sphinx.ext.ifconfig',
'sphinx.ext.linkcode',
'nbsphinx',
]


exclude_patterns = ['**.ipynb_checkpoints']

with open("index.rst") as f:
index_rst_lines = f.readlines()
@@ -70,15 +72,16 @@
 # JP: added from sphinxdocs
 autosummary_generate = False

-if any([re.match("\s*api\s*",l) for l in index_rst_lines]):
+if any([re.match("\s*api\s*", l) for l in index_rst_lines]):
     autosummary_generate = True

 files_to_delete = []
 for f in os.listdir(os.path.dirname(__file__)):
-    if not f.endswith('.rst') or f.startswith('.') or os.path.basename(f) == 'index.rst':
+    if (not f.endswith(('.ipynb', '.rst')) or
+        f.startswith('.') or os.path.basename(f) == 'index.rst'):
         continue

-    _file_basename = f.split('.rst')[0]
+    _file_basename = os.path.splitext(f)[0]
     _regex_to_match = "\s*{}\s*$".format(_file_basename)
     if not any([re.match(_regex_to_match, line) for line in index_rst_lines]):
         files_to_delete.append(f)
@@ -261,6 +264,9 @@
 # Output file base name for HTML help builder.
 htmlhelp_basename = 'pandas'

+# -- Options for nbsphinx ------------------------------------------------
+
+nbsphinx_allow_errors = True
+
 # -- Options for LaTeX output --------------------------------------------

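
The `conf.py` change from `f.split('.rst')[0]` to `os.path.splitext(f)[0]` matters because notebooks are now matched against `index.rst` as well. The sketch below illustrates the difference on made-up input. `is_listed` and the sample `index_rst_lines` are hypothetical, and the raw string and `re.escape` are additions for this sketch rather than what the diff does. The last line shows why the removed `nb.rstrip('.ipynb')` call in `doc/make.py` only worked by accident for `html-styling.ipynb`: `str.rstrip` strips a set of characters, not a suffix.

```python
import os
import re

# Made-up toctree lines standing in for the contents of index.rst.
index_rst_lines = ["   style\n", "   api\n"]


def is_listed(filename, lines):
    """Return True if a .rst or .ipynb source file is named in `lines`."""
    if not filename.endswith((".ipynb", ".rst")):
        return False
    # splitext drops whichever extension is present.
    basename = os.path.splitext(filename)[0]
    # re.escape and the raw string are additions for this sketch.
    pattern = r"\s*{}\s*$".format(re.escape(basename))
    return any(re.match(pattern, line) for line in lines)


# The old expression leaves notebook names untouched.
print("style.ipynb".split(".rst")[0])       # style.ipynb
print(os.path.splitext("style.ipynb")[0])   # style
print(is_listed("style.ipynb", index_rst_lines))  # True
# rstrip removes trailing characters from the set {., i, p, y, n, b}.
print("happy.ipynb".rstrip(".ipynb"))       # ha
```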
5 changes: 2 additions & 3 deletions doc/source/contributing.rst
@@ -347,15 +347,14 @@ have ``sphinx`` and ``ipython`` installed. `numpydoc
 <https://github.com/numpy/numpydoc>`_ is used to parse the docstrings that
 follow the Numpy Docstring Standard (see above), but you don't need to install
 this because a local copy of numpydoc is included in the *pandas* source
-code.
-`nbconvert <https://nbconvert.readthedocs.io/en/latest/>`_ and
-`nbformat <https://nbformat.readthedocs.io/en/latest/>`_ are required to build
+code. `nbsphinx <https://nbsphinx.readthedocs.io/>`_ is required to build
 the Jupyter notebooks included in the documentation.

 If you have a conda environment named ``pandas_dev``, you can install the extra
 requirements with::

     conda install -n pandas_dev sphinx ipython nbconvert nbformat
+    conda install -n pandas_dev -c conda-forge nbsphinx

Contributor

We should just always add -c conda-forge (as defaults is always added), and things just work.


 Furthermore, it is recommended to have all :ref:`optional dependencies <install.optional_dependencies>`.
 installed. This is not strictly necessary, but be aware that you will see some error