GPs are a natural choice as priors over functions due to the marginalization and conditioning
properties of the multivariate normal distribution. Usually, the marginal
distribution over :math:`f(x)` is evaluated during the inference step. The
conditional distribution is then used for predicting the function values
:math:`f(x_*)` at new points, :math:`x_*`.

The joint distribution of :math:`f(x)` and :math:`f(x_*)` is multivariate
normal,

.. math::

   \begin{bmatrix} f(x) \\ f(x_*) \\ \end{bmatrix} \sim
   \text{N}\left(
   \begin{bmatrix} m(x) \\ m(x_*) \\ \end{bmatrix} \,,
   \begin{bmatrix} k(x, x)   & k(x, x_*)   \\
                   k(x_*, x) & k(x_*, x_*) \\ \end{bmatrix}
   \right) \,.

The conditional distribution is

.. math::

   f(x_*) \mid f(x) \sim \text{N}\left( k(x_*, x) k(x, x)^{-1} [f(x) - m(x)] + m(x_*) ,\,
   k(x_*, x_*) - k(x_*, x) k(x, x)^{-1} k(x, x_*) \right) \,.
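
To make the linear algebra concrete, here is a minimal NumPy sketch of this
conditional (the kernel function :code:`k`, mean function :code:`m`, and all
variable names are hypothetical stand-ins, not PyMC3 API)::

    import numpy as np

    def gp_conditional(k, m, x, x_star, f):
        # k(x, x), k(x_*, x), and k(x_*, x_*) from the formula above
        Kxx = k(x, x)
        Ksx = k(x_star, x)
        Kss = k(x_star, x_star)
        # k(x, x)^{-1} [f(x) - m(x)], computed with a solve rather than an inverse
        alpha = np.linalg.solve(Kxx, f - m(x))
        mean = Ksx @ alpha + m(x_star)
        cov = Kss - Ksx @ np.linalg.solve(Kxx, Ksx.T)
        return mean, cov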

.. note::

   For more information on GPs, check out the book `Gaussian Processes for
   Machine Learning <http://www.gaussianprocess.org/gpml/>`_ by Rasmussen &
   Williams, or `this introduction <https://www.ics.uci.edu/~welling/teaching/KernelsICS273B/gpB.pdf>`_
   by D. MacKay.

PyMC3 is a great environment for working with fully Bayesian Gaussian Process
models. GPs in PyMC3 have a clear syntax and are highly composable. Many
predefined covariance functions (or kernels) and mean functions are included,
along with several GP implementations. GPs are treated as distributions that
can be used within larger or hierarchical models, not just as standalone
regression models.

Mean and covariance functions
=============================

We need to specify :code:`input_dim`, the total number of columns of
:code:`X`, and :code:`active_dims`, which of those columns or dimensions the
covariance function will act on, because :code:`cov_func` hasn't actually seen
the input data yet. The :code:`active_dims` argument is optional, and defaults
to all columns of the matrix of inputs.
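
For example (a sketch; the kernel choice and lengthscale values here are
placeholders, not from the original), a covariance function that acts only on
the first and third columns of a three-column input matrix could be written
as::

    import pymc3 as pm

    # 3 total input columns; the kernel acts on columns 0 and 2,
    # with one lengthscale per active dimension
    cov_func = pm.gp.cov.ExpQuad(input_dim=3, ls=[1.0, 2.0], active_dims=[0, 2])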

Covariance functions in PyMC3 closely follow the algebraic rules for kernels,
which allows users to combine covariance functions into new ones, for example
(a fuller sketch with concrete parameters follows this list):

- The product of two covariance functions is a covariance function::

    cov_func = pm.gp.cov.ExpQuad(...) * pm.gp.cov.Periodic(...)

- The product (or sum) of a covariance function with a scalar is a
  covariance function::

    cov_func = eta**2 * pm.gp.cov.Matern32(...)
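
As a sketch of how the :code:`(...)` placeholders above might be filled in
(the priors and parameter values here are illustrative choices, not from the
original)::

    import pymc3 as pm

    with pm.Model() as model:
        eta = pm.HalfCauchy("eta", beta=3)
        ell = pm.Gamma("ell", alpha=2, beta=1)
        # scaled Matern 3/2 kernel; eta**2 sets the prior variance
        cov_func = eta**2 * pm.gp.cov.Matern32(input_dim=1, ls=ell)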

After the covariance function is defined, it is now a function that is
evaluated by calling :code:`cov_func(x, x)` (or :code:`mean_func(x)`).
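
Using :code:`gp.Latent` as the example implementation, the GP object is first
specified with its mean and covariance functions (this mirrors the
:code:`pm.gp.Marginal` constructor calls shown in the additive section
below)::

    gp = pm.gp.Latent(mean_func, cov_func)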

The first argument is the mean function and the second is the covariance
function. We've made the GP object, but we haven't made clear which function
it is to be a prior for, what the inputs are, or what parameters it will be
conditioned on.

Calling the `prior` method will create a PyMC3 random variable that represents
the latent function :math:`f(x) = \mathbf{f}`::

    f = gp.prior("f", X)

:code:`f` is a random variable that can be used within a PyMC3 model like any
other type of random variable. The first argument is the name of the random
variable representing the function we are placing the prior over. The second
argument is the inputs to the function that the prior is over, :code:`X`. The
inputs are usually known and present in the data, but they can also be PyMC3
random variables. If the inputs are a Theano tensor or a PyMC3 random
variable, the :code:`shape` needs to be given.
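
For instance (a sketch, assuming :code:`shape` is forwarded to the underlying
distribution; :code:`X_shared` and :code:`n` are hypothetical names)::

    # X_shared is a Theano tensor, so the number of rows of the input
    # cannot be inferred from it; n gives the size of f explicitly
    f = gp.prior("f", X_shared, shape=n)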

Usually at this point, inference is performed on the model. The
:code:`conditional` method creates the conditional, or predictive,
distribution over the latent function at arbitrary :math:`x_*` input points,
:math:`f(x_*)`. To construct the conditional distribution we write::
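
    f_star = gp.conditional("f_star", X_star)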

Additive GPs
============

The GP implementation in PyMC3 is constructed so that it is easy to define
additive GPs and sample from individual GP components. We can write::

    gp1 = pm.gp.Marginal(mean_func1, cov_func1)
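    gp2 = pm.gp.Marginal(mean_func2, cov_func2)
    gp = gp1 + gp2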

Consider two independent GP distributed functions, :math:`f_1(x) \sim
\text{GP}(m_1,\, k_1)` and :math:`f_2(x) \sim \text{GP}(m_2,\, k_2)`. The
joint distribution of :math:`f_1,\, f_1^*,\, f_2,\, f_2^*,\, f_1 + f_2` and
:math:`f_1^* + f_2^*` is

.. math::

   \begin{bmatrix} f_1 \\ f_1^* \\ f_2 \\ f_2^*
   \\ f_1 + f_2 \\ f_1^* + f_2^* \end{bmatrix} \sim
   \text{N}\left(
   \begin{bmatrix} m_1 \\ m_1^* \\ m_2 \\ m_2^* \\
   m_1 + m_2 \\ m_1^* + m_2^* \\ \end{bmatrix} \,,\,
   \begin{bmatrix}
   K_1       & K_1^*    & 0         & 0        & K_1                 & K_1^*             \\
   K_1^{*^T} & K_1^{**} & 0         & 0        & K_1^{*^T}           & K_1^{**}          \\
   0         & 0        & K_2       & K_2^*    & K_2                 & K_2^{*}           \\
   0         & 0        & K_2^{*^T} & K_2^{**} & K_2^{*^T}           & K_2^{**}          \\
   K_1       & K_1^{*}  & K_2       & K_2^{*}  & K_1 + K_2           & K_1^{*} + K_2^{*} \\
   K_1^{*^T} & K_1^{**} & K_2^{*^T} & K_2^{**} & K_1^{*^T}+K_2^{*^T} & K_1^{**}+K_2^{**}
   \end{bmatrix}
   \right) \,.

The syntax for constructing the conditionals of the component GPs differs
slightly from the other implementations. The first block fits the GP prior. We
denote the sum :math:`f_1 + f_2` by :code:`gp`::

    with pm.Model() as model:
        gp1 = pm.gp.Marginal(mean_func1, cov_func1)
        gp2 = pm.gp.Marginal(mean_func2, cov_func2)

        # gp represents f1 + f2.
        gp = gp1 + gp2

        f = gp.marginal_likelihood("f", X, y, noise)

        trace = pm.sample(1000)

To construct the conditional distribution of :code:`gp1` or :code:`gp2`, we
also need to include the additional arguments :code:`X`, :code:`y`, and
:code:`noise`::

    with model:
        # conditional distributions of f1 and f2
        f1_star = gp1.conditional("f1_star", X_star,
                                  given={"X": X, "y": y, "noise": noise, "gp": gp})
        f2_star = gp2.conditional("f2_star", X_star,
                                  given={"X": X, "y": y, "noise": noise, "gp": gp})

        # conditional of f1 + f2, `given` not required
        f_star = gp.conditional("f_star", X_star)

This second block produces the conditional distributions. Notice that extra
arguments are required for conditionals of :math:`f_1` and :math:`f_2`, but not
:math:`f`. This is because those arguments are cached when
:code:`.marginal_likelihood` is called on :code:`gp`.

.. note::

   When constructing conditionals, the additional arguments :code:`X`, :code:`y`,
   :code:`noise` and :code:`gp` must be provided as a dict called :code:`given`!

Since the :code:`marginal_likelihood` method wasn't called on :code:`gp1` or
:code:`gp2`, their conditionals need to be provided with the required inputs. In the
same fashion as the prior, :code:`f_star`, :code:`f1_star` and :code:`f2_star` are
random variables that can now be used like any other random variable in PyMC3.
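
For example (a sketch, assuming a PyMC3 version that provides
:code:`pm.sample_posterior_predictive`), predictive draws of the individual
components and their sum can be generated with::

    with model:
        pred_samples = pm.sample_posterior_predictive(
            trace, var_names=["f_star", "f1_star", "f2_star"]
        )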

Check the notebooks for detailed demonstrations of the usage of GP functionality
in PyMC3.