Update GP Marginal and Latent notebooks to v5 #549

Merged
merged 19 commits, Jul 13, 2023
1 change: 0 additions & 1 deletion .pre-commit-config.yaml
@@ -20,7 +20,6 @@ repos:
(?x)^
^examples/ode_models/ODE_with_manual_gradients\.ipynb$
|examples/samplers/DEMetropolisZ_EfficiencyComparison\.ipynb$
-|examples/gaussian_processes/GP-Latent\.ipynb$
|examples/gaussian_processes/GP-MaunaLoa2\.ipynb$
|examples/samplers/MLDA_gravity_surveying\.ipynb$
|examples/howto/sampling_callback\.ipynb$
507 changes: 182 additions & 325 deletions examples/gaussian_processes/GP-Latent.ipynb

Large diffs are not rendered by default.

16 changes: 11 additions & 5 deletions examples/gaussian_processes/GP-Latent.myst.md
@@ -5,15 +5,15 @@ jupytext:
format_name: myst
format_version: 0.13
kernelspec:
-  display_name: pymc-dev
+  display_name: Python 3 (ipykernel)
language: python
-  name: pymc-dev
+  name: python3
---

(gp_latent)=
# Gaussian Processes: Latent Variable Implementation

-:::{post} Sept 28, 2022
+:::{post} June 6, 2023
:tags: gaussian processes, time series
:category: reference, intermediate
:author: Bill Engels
Expand Down Expand Up @@ -80,6 +80,11 @@ with latent_gp_model:

The following is an example showing how to specify a simple model with a GP prior using the {class}`gp.Latent` class. We use a GP to generate the data so we can verify that the inference we perform is correct. Note that the likelihood is not normal, but IID Student-T. For a more efficient implementation when the likelihood is Gaussian, use {class}`gp.Marginal`.

+```{code-cell} ipython3
+# If not already installed, install JAX and NumPyro by uncommenting the following line:
+# %pip install -U -q jax numpyro
+```
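Since most of the updated notebook is collapsed in this diff, here is a minimal sketch of the pattern the paragraph above describes: a `gp.Latent` prior paired with a Student-T likelihood. The data shapes and priors are illustrative assumptions, not the notebook's exact code.

```python
import numpy as np
import pymc as pm

# Assumed toy data; the notebook simulates its own
X = np.linspace(0, 10, 50)[:, None]  # inputs must be a column vector
y = np.sin(X).ravel() + 0.5 * np.random.standard_t(df=4, size=50)

with pm.Model() as latent_sketch:
    ell = pm.Gamma("ell", alpha=2, beta=1)  # lengthscale prior
    cov = pm.gp.cov.ExpQuad(1, ls=ell)

    gp = pm.gp.Latent(cov_func=cov)  # latent GP, no marginalization
    f = gp.prior("f", X=X)  # one latent function value per input row

    sigma = pm.HalfNormal("sigma", sigma=1)
    nu = 1 + pm.Gamma("nu", alpha=2, beta=0.1)  # keep degrees of freedom above 1
    pm.StudentT("y", mu=f, lam=1.0 / sigma, nu=nu, observed=y)
```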

```{code-cell} ipython3
import arviz as az
import matplotlib.pyplot as plt
@@ -158,7 +163,7 @@ with pm.Model() as model:
) # add one because student t is undefined for degrees of freedom less than one
y_ = pm.StudentT("y", mu=f, lam=1.0 / sigma, nu=nu, observed=y)

-    idata = pm.sample(1000, tune=1000, chains=2, cores=1)
+    idata = pm.sample(1000, tune=1000, chains=2, cores=2, nuts_sampler="numpyro")
```
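A note on the switch to `nuts_sampler="numpyro"` in this hunk: it routes NUTS through the JAX/NumPyro backend, which is why the install cell was added above. A usage sketch reusing the model from the earlier sketch (backend names other than "numpyro" are stated from the `pm.sample` API and should be treated as an assumption):

```python
with latent_sketch:
    # nuts_sampler: "pymc" (default), "numpyro", "blackjax", or "nutpie"
    idata = pm.sample(1000, tune=1000, chains=2, cores=2, nuts_sampler="numpyro")
```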

```{code-cell} ipython3
@@ -349,7 +354,7 @@ with pm.Model() as model:
p = pm.Deterministic("p", pm.math.invlogit(f))
y_ = pm.Bernoulli("y", p=p, observed=y)

-    idata = pm.sample(1000, chains=2, cores=1)
+    idata = pm.sample(1000, chains=2, cores=2, nuts_sampler="numpyro")
```

```{code-cell} ipython3
@@ -429,6 +434,7 @@ plt.legend(loc=(0.32, 0.65), frameon=True);
* Created by [Bill Engels](https://github.com/bwengals) in 2017 ([pymc#1674](https://github.com/pymc-devs/pymc/pull/1674))
* Reexecuted by [Colin Carroll](https://github.com/ColCarroll) in 2019 ([pymc#3397](https://github.com/pymc-devs/pymc/pull/3397))
* Updated for V4 by Bill Engels in September 2022 ([pymc-examples#237](https://github.com/pymc-devs/pymc-examples/pull/237))
+* Updated for V5 by Chris Fonnesbeck in July 2023 ([pymc-examples#549](https://github.com/pymc-devs/pymc-examples/pull/549))

+++

389 changes: 201 additions & 188 deletions examples/gaussian_processes/GP-Marginal.ipynb

Large diffs are not rendered by default.

98 changes: 60 additions & 38 deletions examples/gaussian_processes/GP-Marginal.myst.md
@@ -10,8 +10,15 @@ kernelspec:
name: python3
---

+(gp_marginal)=
# Marginal Likelihood Implementation

+:::{post} June 4, 2023
+:tags: gaussian processes, time series
+:category: reference, intermediate
+:author: Bill Engels, Chris Fonnesbeck
+:::

The `gp.Marginal` class implements the more common case of GP regression: the observed data are the sum of a GP and Gaussian noise. `gp.Marginal` has a `marginal_likelihood` method, a `conditional` method, and a `predict` method. Given a mean and covariance function, the function $f(x)$ is modeled as,

$$
f(x) \sim \mathcal{GP}\big(m(x),\, k(x, x')\big)
$$
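Because the diff collapses the surrounding derivation, it is worth restating the standard result that gives the class its name: with Gaussian noise $\epsilon \sim \mathcal{N}(0, \Sigma)$, the latent function integrates out analytically, leaving

$$
y \mid X \sim \mathcal{N}\big(m(X),\; K(X, X) + \Sigma\big)
$$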
@@ -68,19 +75,19 @@ with pm.Model() as marginal_gp_model:

# The scale of the white noise term can be provided,
sigma = pm.HalfCauchy("sigma", beta=5)
y_ = gp.marginal_likelihood("y", X=X, y=y, noise=sigma)
y_ = gp.marginal_likelihood("y", X=X, y=y, sigma=sigma)

# OR a covariance function for the noise can be given
# noise_l = pm.Gamma("noise_l", alpha=2, beta=2)
# cov_func_noise = pm.gp.cov.Exponential(1, noise_l) + pm.gp.cov.WhiteNoise(sigma=0.1)
# y_ = gp.marginal_likelihood("y", X=X, y=y, noise=cov_func_noise)
# y_ = gp.marginal_likelihood("y", X=X, y=y, sigma=cov_func_noise)
```

+++

## The `.conditional` distribution

-The `.conditional` has an optional flag for `pred_noise`, which defaults to `False`. When `pred_noise=False`, the `conditional` method produces the predictive distribution for the underlying function represented by the GP. When `pred_noise=True`, the `conditional` method produces the predictive distribution for the GP plus noise. Using the same `gp` object defined above,
+The `.conditional` has an optional flag, `pred_sigma`, which defaults to `False`. When `pred_sigma=False`, the `conditional` method produces the predictive distribution for the underlying function represented by the GP. When `pred_sigma=True`, the `conditional` method produces the predictive distribution for the GP plus noise. Using the same `gp` object defined above,

```python
# vector of new X points we want to predict the function at
@@ -90,7 +97,7 @@ with marginal_gp_model:
f_star = gp.conditional("f_star", Xnew=Xnew)

# or to predict the GP plus noise
y_star = gp.conditional("y_star", Xnew=Xnew, pred_noise=True)
y_star = gp.conditional("y_star", Xnew=Xnew, pred_sigma=True)
```
If using an additive GP model, the conditional distribution for individual components can be constructed by setting the optional argument `given`. For more information on building additive GPs, see the main documentation page. For an example, see the Mauna Loa CO$_2$ notebook.
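As a companion to the additive-GP pointer above, here is a minimal sketch of a per-component conditional; the data, lengthscales, and the exact `given` keys are assumptions drawn from the PyMC GP documentation rather than code from this PR:

```python
import numpy as np
import pymc as pm

X = np.linspace(0, 10, 100)[:, None]  # assumed training inputs
y = np.random.randn(100)  # assumed observations
Xnew = np.linspace(0, 15, 200)[:, None]

with pm.Model() as additive_model:
    gp_long = pm.gp.Marginal(cov_func=pm.gp.cov.ExpQuad(1, ls=2.0))
    gp_short = pm.gp.Marginal(cov_func=pm.gp.cov.Matern52(1, ls=0.5))
    gp_sum = gp_long + gp_short  # additive GPs: covariance functions add

    y_ = gp_sum.marginal_likelihood("y", X=X, y=y, sigma=1.0)

    # conditional for one component; `given` carries the full additive model
    given = {"X": X, "y": y, "sigma": 1.0, "gp": gp_sum}
    f_long = gp_long.conditional("f_long", Xnew=Xnew, given=given)
```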

@@ -108,7 +115,7 @@ mu, cov = gp.predict(Xnew, point=trace[-1])
mu, var = gp.predict(Xnew, point=trace[-1], diag=True)

# With noise included
-mu, var = gp.predict(Xnew, point=trace[-1], diag=True, pred_noise=True)
+mu, var = gp.predict(Xnew, point=trace[-1], diag=True, pred_sigma=True)
```

+++
@@ -120,10 +127,11 @@ mu, var = gp.predict(Xnew, point=trace[-1], diag=True, pred_sigma=True)
jupyter:
outputs_hidden: true
---
+import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
-import pymc3 as pm
+import pymc as pm
import scipy as sp
%matplotlib inline
@@ -137,9 +145,9 @@ n = 100  # The number of data points
X = np.linspace(0, 10, n)[:, None] # The inputs to the GP, they must be arranged as a column vector
# Define the true covariance function and its parameters
-ℓ_true = 1.0
-η_true = 3.0
-cov_func = η_true**2 * pm.gp.cov.Matern52(1, ℓ_true)
+ell_true = 1.0
+eta_true = 3.0
+cov_func = eta_true**2 * pm.gp.cov.Matern52(1, ell_true)
# A mean function that is zero everywhere
mean_func = pm.gp.mean.Zero()
@@ -152,8 +160,8 @@ f_true = np.random.multivariate_normal(
# The observed data is the latent function plus a small amount of IID Gaussian noise
# The standard deviation of the noise is `sigma`
-σ_true = 2.0
-y = f_true + σ_true * np.random.randn(n)
+sigma_true = 2.0
+y = f_true + sigma_true * np.random.randn(n)
## Plot the data and the unobserved latent function
fig = plt.figure(figsize=(12, 5))
@@ -167,31 +175,26 @@ plt.legend();

```{code-cell} ipython3
with pm.Model() as model:
-    ℓ = pm.Gamma("ℓ", alpha=2, beta=1)
-    η = pm.HalfCauchy("η", beta=5)
+    ell = pm.Gamma("ell", alpha=2, beta=1)
+    eta = pm.HalfCauchy("eta", beta=5)
-    cov = η**2 * pm.gp.cov.Matern52(1, ℓ)
+    cov = eta**2 * pm.gp.cov.Matern52(1, ell)
gp = pm.gp.Marginal(cov_func=cov)
-    σ = pm.HalfCauchy("σ", beta=5)
-    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=σ)
+    sigma = pm.HalfCauchy("sigma", beta=5)
+    y_ = gp.marginal_likelihood("y", X=X, y=y, sigma=sigma)
-    mp = pm.find_MAP()
+with model:
+    marginal_post = pm.sample(500, tune=2000, nuts_sampler="numpyro", chains=1)
```
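The hunk above trades the `pm.find_MAP` point estimate for full NUTS sampling. As a hedged aside, since this PR drops it, a point estimate remains available in PyMC v5 if a quick check is all that is needed:

```python
with model:
    mp = pm.find_MAP()  # dict mapping variable names to MAP values
```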

```{code-cell} ipython3
-# collect the results into a pandas dataframe to display
-# "mp" stands for marginal posterior
-pd.DataFrame(
-    {
-        "Parameter": ["ℓ", "η", "σ"],
-        "Value at MAP": [float(mp["ℓ"]), float(mp["η"]), float(mp["σ"])],
-        "True value": [ℓ_true, η_true, σ_true],
-    }
-)
+summary = az.summary(marginal_post, var_names=["ell", "eta", "sigma"], round_to=2, kind="stats")
+summary["True value"] = [ell_true, eta_true, sigma_true]
+summary
```

-The MAP values are close to their true values.
+The estimated values are close to their true values.

+++

@@ -205,9 +208,10 @@ X_new = np.linspace(0, 20, 600)[:, None]
with model:
f_pred = gp.conditional("f_pred", X_new)
-# To use the MAP values, you can just replace the trace with a length-1 list with `mp`
with model:
-    pred_samples = pm.sample_posterior_predictive([mp], vars=[f_pred], samples=2000)
+    pred_samples = pm.sample_posterior_predictive(
+        marginal_post.sel(draw=slice(0, 20)), var_names=["f_pred"]
+    )
```
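One more note on `draw=slice(0, 20)` above: predicting 600 new points for every posterior draw is costly, so the update thins the posterior before calling `sample_posterior_predictive`. Any thinning scheme works; an illustrative alternative (not from this PR):

```python
# keep every 25th draw instead of the first 20
thinned = marginal_post.sel(draw=slice(None, None, 25))
```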

```{code-cell} ipython3
@@ -216,9 +220,10 @@ fig = plt.figure(figsize=(12, 5))
ax = fig.gca()
# plot the samples from the gp posterior with samples and shading
from pymc3.gp.util import plot_gp_dist
from pymc.gp.util import plot_gp_dist
plot_gp_dist(ax, pred_samples["f_pred"], X_new)
f_pred_samples = az.extract(pred_samples, group="posterior_predictive", var_names=["f_pred"])
plot_gp_dist(ax, samples=f_pred_samples.T, x=X_new)
# plot the data and the true latent function
plt.plot(X, f_true, "dodgerblue", lw=3, label="True f")
@@ -238,19 +243,22 @@ The `conditional` method of `gp.Marginal` contains the flag `pred_noise` whose default value is `False`.
```{code-cell} ipython3
with model:
y_pred = gp.conditional("y_pred", X_new, pred_noise=True)
-    y_samples = pm.sample_posterior_predictive([mp], vars=[y_pred], samples=2000)
+    y_samples = pm.sample_posterior_predictive(
+        marginal_post.sel(draw=slice(0, 50)), var_names=["y_pred"]
+    )
```

```{code-cell} ipython3
fig = plt.figure(figsize=(12, 5))
ax = fig.gca()
# posterior predictive distribution
plot_gp_dist(ax, y_samples["y_pred"], X_new, plot_samples=False, palette="bone_r")
y_pred_samples = az.extract(y_samples, group="posterior_predictive", var_names=["y_pred"])
plot_gp_dist(ax, y_pred_samples.T, X_new, plot_samples=False, palette="bone_r")
# overlay a scatter of one draw of random points from the
# posterior predictive distribution
plt.plot(X_new, y_samples["y_pred"][800, :].T, "co", ms=2, label="Predicted data")
+plt.plot(X_new, y_pred_samples.sel(sample=1), "co", ms=2, label="Predicted data")
# plot original data and true function
plt.plot(X, y, "ok", ms=3, alpha=1.0, label="observed data")
@@ -268,18 +276,21 @@ Notice that the posterior predictive density is wider than the conditional distribution.

### Using `.predict`

-We can use the `.predict` method to return the mean and variance given a particular `point`. Since we used `find_MAP` in this example, `predict` returns the same mean and covariance that the distribution of `.conditional` has.
+We can use the `.predict` method to return the mean and variance given a particular `point`.

```{code-cell} ipython3
# predict
-mu, var = gp.predict(X_new, point=mp, diag=True)
+with model:
+    mu, var = gp.predict(
+        X_new, point=az.extract(marginal_post.posterior.sel(draw=[0])).squeeze(), diag=True
+    )
sd = np.sqrt(var)
# draw plot
fig = plt.figure(figsize=(12, 5))
ax = fig.gca()
-# plot mean and intervals
+# plot mean and 2sigma intervals
plt.plot(X_new, mu, "r", lw=2, label="mean and 2σ region")
plt.plot(X_new, mu + 2 * sd, "r", lw=1)
plt.plot(X_new, mu - 2 * sd, "r", lw=1)
@@ -291,10 +302,21 @@ plt.plot(X, f_true, "dodgerblue", lw=3, label="true f")
plt.xlabel("x")
plt.ylim([-13, 13])
plt.title("predictive mean and interval")
plt.title("predictive mean and 2sigma interval")
plt.legend();
```

## Authors

* Created by [Bill Engels](https://github.com/bwengals) in 2017 ([pymc#1674](https://github.com/pymc-devs/pymc/pull/1674))
* Reexecuted by [Colin Carroll](https://github.com/ColCarroll) in 2019 ([pymc#3397](https://github.com/pymc-devs/pymc/pull/3397))
* Updated for V4 by Bill Engels in September 2022 ([pymc-examples#237](https://github.com/pymc-devs/pymc-examples/pull/237))
+* Updated for V5 by Chris Fonnesbeck in July 2023 ([pymc-examples#549](https://github.com/pymc-devs/pymc-examples/pull/549))

+++

## Watermark

```{code-cell} ipython3
%load_ext watermark
%watermark -n -u -v -iv -w
```