
Commit b7fb324

Merge pull request #786 from ESMValGroup/version2_documentation
Version 2: Add chapter on adding new recipes/diagnostics (with provenance)
2 parents a36c2de + 01ada52 commit b7fb324

5 files changed

+200
-4
lines changed

doc/sphinx/source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -157,7 +157,7 @@
157157
# Add any paths that contain custom static files (such as style sheets) here,
158158
# relative to this directory. They are copied after the builtin static files,
159159
# so a file named "default.css" will overwrite the builtin "default.css".
160-
html_static_path = ['_static']
160+
html_static_path = []
161161

162162
# Add any extra paths that contain custom files (such as robots.txt or
163163
# .htaccess) here, relative to this directory. These files are copied

doc/sphinx/source/developer_guide2/git_repository.inc

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
.. _git_repository:
22

3+
**************
34
Git repository
45
**************
56

@@ -161,7 +162,7 @@ Do-s
161162
* Comment your code as much as possible and in English.
162163
* Use short but self-explanatory variable names (e.g., model_input and reference_input instead of xm and xr).
163164
* Consider a modular/functional programming style. This often makes code easier to read and deletes intermediate variables immediately. If possible, separate diagnostic calculations from plotting routines.
164-
* Consider reusing or extending existing code. General-purpose code can be found in diag_scripts/lib/ and in plot_scripts/.
165+
* Consider reusing or extending existing code. General-purpose code can be found in esmvaltool/diag_scripts/shared/.
165166
* Comment all switches and parameters including a list of all possible settings/options in the header section of your code (see also Section :ref:`std_diag`).
166167
* Use templates for recipes (Section :ref:`std_recipe`) and diagnostics (Section :ref:`std_diag`) to help with proper documentation.
167168
* Keep your *FEATURE BRANCH* regularly synchronized with the *DEVELOPMENT BRANCH* (git merge).
@@ -174,5 +175,5 @@ Don't-s
174175
* Do not develop without proper version control (see do-s above).
175176
* Avoid large (memory, disk space) intermediate results. Delete intermediate files/variables or see modular/functional programming style.
176177
* Do not use hard-coded pathnames or filenames.
177-
* Do not mix developments / modifications of the ESMValTool framework and developments / modifications of diagnotics in the same *FEATURE BRANCH*.
178+
* Do not mix developments / modifications of the ESMValTool framework and developments / modifications of diagnostics in the same *FEATURE BRANCH*.
178179

doc/sphinx/source/developer_guide2/index.rst

Lines changed: 2 additions & 1 deletion
@@ -2,6 +2,7 @@
22
Developer's Guide
33
#################
44

5+
.. include:: new_diagnostic.inc
56
.. include:: porting.inc
6-
.. include:: core_team.inc
77
.. include:: git_repository.inc
8+
.. include:: core_team.inc
doc/sphinx/source/developer_guide2/new_diagnostic.inc

Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
1+
.. _new_diagnostic:
2+
3+
***************************************
4+
Contributing a new diagnostic or recipe
5+
***************************************
6+
7+
Getting started
8+
===============
9+
10+
Please discuss your idea for a new diagnostic or recipe with the development team before getting started,
11+
to avoid disappointment later. A good way to do this is to open an
12+
`issue on GitHub <https://github.com/ESMValGroup/ESMValTool/issues>`_.
13+
This is also a good way to get help.
14+
15+
Creating a recipe and diagnostic script(s)
16+
==========================================
17+
First create a recipe in esmvaltool/recipes to define the input data your analysis script needs
18+
and optionally preprocessing and other settings. Also create a script in the esmvaltool/diag_scripts directory
19+
and make sure it is referenced from your recipe. The easiest way to do this is probably to copy the example recipe
20+
and diagnostic script and adjust those to your needs.
21+
A good example recipe is esmvaltool/recipes/examples/recipe_python.yml
22+
and a good example diagnostic is esmvaltool/diag_scripts/examples/diagnostic.py.
23+
24+
If you have no preferred programming language yet, Python 3 is highly recommended, because it is the best supported.
25+
However, NCL, R, and Julia scripts are also supported.
26+
27+
Unfortunately not much documentation is available at this stage,
28+
so have a look at the other recipes and diagnostics for further inspiration.
29+
30+
Re-using existing code
31+
======================
32+
Always make sure your code is or can be released under a license that is compatible with the Apache 2 license.
33+
34+
If you have existing code in a supported scripting language, you have two options for re-using it. If it is fairly
35+
mature and a large amount of code, the preferred way is to package and publish it on the
36+
official package repository for that language and add it as a dependency of esmvaltool.
37+
If it is just a few simple scripts or packaging is not possible (e.g. for NCL), you can simply copy
38+
and paste the source code into the esmvaltool/diag_scripts directory.
39+
40+
If you have existing code in a compiled language like
41+
C, C++, or Fortran that you want to re-use, the recommended way to proceed is to add Python bindings and publish
42+
the package on PyPI so it can be installed as a Python dependency. You can then call the functions it provides
43+
using a Python diagnostic.
44+
45+
Interfaces and provenance
46+
=========================
47+
When ESMValTool runs a recipe, it will first find all data and run the default preprocessor steps plus any
48+
additional preprocessing steps defined in the recipe. Next it will run the diagnostic script defined in the recipe
49+
and finally it will store provenance information. Provenance information is stored in the
50+
`W3C PROV XML format <https://www.w3.org/TR/prov-xml/>`_
51+
and also plotted in an SVG file for human inspection. In addition to provenance information, a caption is also added
52+
to the plots.
53+
54+
In order to communicate with the diagnostic script, two interfaces have been defined, which are described below.
55+
Note that for Python and NCL diagnostics much more convenient methods are available than
56+
directly reading and writing the interface files. For other languages these are not implemented yet.
57+
58+
Using the interfaces from Python
59+
--------------------------------
60+
Always use :meth:`esmvaltool.diag_scripts.shared.run_diagnostic` to start your script and make use of a
61+
:class:`esmvaltool.diag_scripts.shared.ProvenanceLogger` to log provenance. Have a look at the example
62+
Python diagnostic in esmvaltool/diag_scripts/examples/diagnostic.py for a complete example.
63+
64+
Using the interfaces from NCL
65+
-----------------------------
66+
TODO: write this
67+
68+
Generic interface between backend and diagnostic
69+
------------------------------------------------
70+
To provide the diagnostic script with the information it needs to run (e.g. location of input data, various settings),
71+
the backend creates a YAML file called settings.yml and provides the path to this file as the first command line
72+
argument to the diagnostic script.
73+
74+
The most interesting settings provided in this file are
75+
76+
.. code:: yaml
77+
78+
run_dir: /path/to/recipe_output/run/diagnostic_name/script_name
79+
work_dir: /path/to/recipe_output/work/diagnostic_name/script_name
80+
plot_dir: /path/to/recipe_output/plots/diagnostic_name/script_name
81+
input_files:
82+
- /path/to/recipe_output/preproc/diagnostic_name/ta/metadata.yml
83+
- /path/to/recipe_output/preproc/diagnostic_name/pr/metadata.yml
84+
85+
Custom settings in the script section of the recipe will also be made available in this file.
86+
87+
There are three directories defined:
88+
89+
- :code:`run_dir` use this for storing temporary files
90+
- :code:`work_dir` use this for storing NetCDF files containing the data used to make a plot
91+
- :code:`plot_dir` use this for storing plots
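
The use of the three directories can be sketched as follows. This is a minimal sketch, assuming the content of settings.yml has already been parsed into a dictionary (for example with a YAML library). The helper name ``output_paths``, the file suffixes, and the shortened paths are hypothetical and only serve as illustration.

```python
import os


def output_paths(settings, basename):
    """Return hypothetical output locations for one result.

    `settings` is assumed to be the parsed content of settings.yml.
    """
    return {
        # temporary files belong in run_dir
        "scratch": os.path.join(settings["run_dir"], basename + ".tmp"),
        # NetCDF data used to make a plot belongs in work_dir
        "data": os.path.join(settings["work_dir"], basename + ".nc"),
        # the plot itself belongs in plot_dir
        "plot": os.path.join(settings["plot_dir"], basename + ".png"),
    }


# Illustrative values only; real paths are provided by the backend.
settings = {
    "run_dir": "out/run",
    "work_dir": "out/work",
    "plot_dir": "out/plots",
}
paths = output_paths(settings, "pr_mean")
print(paths["data"])
```

Keeping the three locations in one small helper avoids hard-coded path names in the rest of the script.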
92+
93+
Finally :code:`input_files` is a list of YAML files, containing a description of the preprocessed data. Each entry in these
94+
YAML files is a path to a preprocessed file in NetCDF format, with a list of various attributes.
95+
An example preprocessor metadata.yml file could look like this
96+
97+
.. code:: yaml
98+
99+
? /path/to/recipe_output/preproc/diagnostic_name/pr/CMIP5_GFDL-ESM2G_Amon_historical_r1i1p1_T2Ms_pr_2000-2002.nc
100+
: cmor_table: CMIP5
101+
dataset: GFDL-ESM2G
102+
diagnostic: diagnostic_name
103+
end_year: 2002
104+
ensemble: r1i1p1
105+
exp: historical
106+
filename: /path/to/recipe_output/preproc/diagnostic_name/pr/CMIP5_GFDL-ESM2G_Amon_historical_r1i1p1_T2Ms_pr_2000-2002.nc
107+
frequency: mon
108+
institute: [NOAA-GFDL]
109+
long_name: Precipitation
110+
mip: Amon
111+
modeling_realm: [atmos]
112+
preprocessor: preprocessor_name
113+
project: CMIP5
114+
recipe_dataset_index: 1
115+
reference_dataset: MPI-ESM-LR
116+
short_name: pr
117+
standard_name: precipitation_flux
118+
start_year: 2000
119+
units: kg m-2 s-1
120+
variable_group: pr
121+
? /path/to/recipe_output/preproc/diagnostic_name/pr/CMIP5_MPI-ESM-LR_Amon_historical_r1i1p1_T2Ms_pr_2000-2002.nc
122+
: cmor_table: CMIP5
123+
dataset: MPI-ESM-LR
124+
diagnostic: diagnostic_name
125+
end_year: 2002
126+
ensemble: r1i1p1
127+
exp: historical
128+
filename: /path/to/recipe_output/preproc/diagnostic_name/pr/CMIP5_MPI-ESM-LR_Amon_historical_r1i1p1_T2Ms_pr_2000-2002.nc
129+
frequency: mon
130+
institute: [MPI-M]
131+
long_name: Precipitation
132+
mip: Amon
133+
modeling_realm: [atmos]
134+
preprocessor: preprocessor_name
135+
project: CMIP5
136+
recipe_dataset_index: 2
137+
reference_dataset: MPI-ESM-LR
138+
short_name: pr
139+
standard_name: precipitation_flux
140+
start_year: 2000
141+
units: kg m-2 s-1
142+
variable_group: pr
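
Once such a metadata.yml file has been parsed, it is a mapping from file path to attributes, which makes it easy to select or group the input data. The following is a minimal sketch in plain Python, assuming the file has already been loaded into a dictionary. The helper names ``group_by`` and ``reference_files`` are hypothetical, and the paths and attributes are abbreviated for illustration.

```python
from collections import defaultdict


def group_by(metadata, key):
    """Group preprocessed file paths by the value of one attribute."""
    groups = defaultdict(list)
    for path, attributes in metadata.items():
        groups[attributes[key]].append(path)
    return dict(groups)


def reference_files(metadata):
    """Return the paths whose dataset is named as the reference dataset."""
    return [
        path
        for path, attributes in metadata.items()
        if attributes.get("reference_dataset") == attributes["dataset"]
    ]


# Abbreviated stand-in for a parsed metadata.yml file.
metadata = {
    "/preproc/pr/GFDL-ESM2G_pr.nc": {
        "dataset": "GFDL-ESM2G",
        "short_name": "pr",
        "reference_dataset": "MPI-ESM-LR",
    },
    "/preproc/pr/MPI-ESM-LR_pr.nc": {
        "dataset": "MPI-ESM-LR",
        "short_name": "pr",
        "reference_dataset": "MPI-ESM-LR",
    },
}

by_dataset = group_by(metadata, "dataset")
reference = reference_files(metadata)
print(sorted(by_dataset))
print(reference)
```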
143+
144+
Generic interface between diagnostic and backend
145+
------------------------------------------------
146+
147+
After the diagnostic script has finished running, the backend will try to store provenance information. In order to
148+
link the produced files to input data, the diagnostic script needs to store a file called diagnostic_provenance.yml
149+
in its :code:`run_dir`.
150+
151+
For each output file produced by the diagnostic script, there should be an entry in the diagnostic_provenance.yml file.
152+
The name of each entry should be the path to the output file.
153+
Each file entry should at least contain the following items
154+
155+
- :code:`ancestors` a list of input files used to create the plot
156+
- :code:`caption` a caption text for the plot
157+
- :code:`plot_file` the path to the plot file, if the diagnostic also created one, e.g. in .png format.
158+
159+
Each file entry can also contain items from the categories defined in the file esmvaltool/config_references.yml.
160+
The short entries will automatically be replaced by their longer equivalent in the final provenance records.
161+
It is possible to add custom provenance information by adding custom items to entries.
162+
163+
An example diagnostic_provenance.yml file could look like this
164+
165+
.. code:: yaml
166+
167+
? /path/to/recipe_output/work/diagnostic_name/script_name/CMIP5_GFDL-ESM2G_Amon_historical_r1i1p1_T2Ms_pr_2000-2002_mean.nc
168+
: ancestors:
169+
- /path/to/recipe_output/preproc/diagnostic_name/pr/CMIP5_GFDL-ESM2G_Amon_historical_r1i1p1_T2Ms_pr_2000-2002.nc
170+
authors: [ande_bo, righ_ma]
171+
caption: Average Precipitation between 2000 and 2002 according to GFDL-ESM2G.
172+
domains: [global]
173+
plot_file: /path/to/recipe_output/plots/diagnostic_name/script_name/CMIP5_GFDL-ESM2G_Amon_historical_r1i1p1_T2Ms_pr_2000-2002_mean.png
174+
plot_type: zonal
175+
references: [acknow_project]
176+
statistics: [mean]
177+
? /path/to/recipe_output/work/diagnostic_name/script_name/CMIP5_MPI-ESM-LR_Amon_historical_r1i1p1_T2Ms_pr_2000-2002_mean.nc
178+
: ancestors:
179+
- /path/to/recipe_output/preproc/diagnostic_name/pr/CMIP5_MPI-ESM-LR_Amon_historical_r1i1p1_T2Ms_pr_2000-2002.nc
180+
authors: [ande_bo, righ_ma]
181+
caption: Average Precipitation between 2000 and 2002 according to MPI-ESM-LR.
182+
domains: [global]
183+
plot_file: /path/to/recipe_output/plots/diagnostic_name/script_name/CMIP5_MPI-ESM-LR_Amon_historical_r1i1p1_T2Ms_pr_2000-2002_mean.png
184+
plot_type: zonal
185+
references: [acknow_project]
186+
statistics: [mean]
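
The structure of such an entry can be sketched in plain Python. This is a minimal sketch under stated assumptions: the helper name ``make_record`` is hypothetical, the paths are shortened, and writing the resulting mapping to diagnostic_provenance.yml in the :code:`run_dir` (which needs a YAML library) is not shown. For Python diagnostics, the provenance logger mentioned above is the more convenient route.

```python
def make_record(ancestors, caption, plot_file=None, **extra):
    """Build one entry for diagnostic_provenance.yml as a plain dict."""
    if not ancestors:
        raise ValueError("at least one ancestor file is required")
    record = {"ancestors": list(ancestors), "caption": caption}
    # plot_file is only present if the diagnostic also created a plot
    if plot_file is not None:
        record["plot_file"] = plot_file
    # additional items, e.g. statistics or domains, are passed through
    record.update(extra)
    return record


# Keys are output file paths, values are the provenance records.
provenance = {
    "/work/pr_mean.nc": make_record(
        ancestors=["/preproc/pr/GFDL-ESM2G_pr.nc"],
        caption="Average Precipitation between 2000 and 2002 "
                "according to GFDL-ESM2G.",
        plot_file="/plots/pr_mean.png",
        statistics=["mean"],
        domains=["global"],
    ),
}
print(sorted(provenance["/work/pr_mean.nc"]))
```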
187+
188+
You can check whether your diagnostic script successfully provided the provenance information to the backend by
189+
verifying that
190+
191+
- for each output file in the :code:`work_dir`, a file with the same name, but ending with _provenance.xml is created
192+
- any NetCDF files created by your diagnostic script contain a 'provenance' global attribute
193+
- any PNG plots created by your diagnostic script contain the provenance information in the 'Image History' attribute
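
The first of these checks can be automated with a few lines of Python. This is a minimal sketch, assuming that the provenance file name is formed by replacing the extension of the output file with _provenance.xml; that naming rule and the helper name ``missing_provenance`` are assumptions, so adjust them if your output looks different.

```python
import os
import tempfile


def missing_provenance(work_dir):
    """List output files in work_dir without a matching *_provenance.xml."""
    missing = []
    for name in sorted(os.listdir(work_dir)):
        if name.endswith("_provenance.xml"):
            continue  # this is a provenance file itself
        stem = os.path.splitext(name)[0]
        expected = os.path.join(work_dir, stem + "_provenance.xml")
        if not os.path.exists(expected):
            missing.append(name)
    return missing


# Demonstration with a throwaway directory instead of a real work_dir.
with tempfile.TemporaryDirectory() as work_dir:
    for name in ("a.nc", "a_provenance.xml", "b.nc"):
        open(os.path.join(work_dir, name), "w").close()
    result = missing_provenance(work_dir)
print(result)
```

An empty list means every output file in the directory has a provenance record next to it.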

doc/sphinx/source/recipes/index.rst

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ Recipes
66

77
recipe_clouds
88
recipe_crem
9+
recipe_cvdp
910
recipe_flato13ipcc
1011
recipe_perfmetrics
1112
recipe_runoff_et
