Commit 684e985 (merge of 2 parents 4699e1d + 76fae3f)

Merge pull request #2 from pymc-devs/main

Rebasing

File tree: 8 files changed, +1385 / -1058 lines

examples/case_studies/stochastic_volatility.ipynb (+276, -125; large diff not rendered by default)

examples/diagnostics_and_criticism/Diagnosing_biased_Inference_with_Divergences.ipynb (+805, -724; large diff not rendered by default)

examples/gaussian_processes/gaussian_process.ipynb (+141, -132; large diff not rendered by default)

examples/references.bib (+10)

@@ -156,6 +156,16 @@ @book{hayes2017introduction
   year = {2017},
   publisher = {Guilford publications}
 }
+@article{hoffman2014nuts,
+  title = {The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo},
+  author = {Hoffman, Matthew and Gelman, Andrew},
+  year = {2014},
+  journal = {Journal of Machine Learning Research},
+  volume = {15},
+  issue = {1},
+  pages = {1593--1623},
+  url = {https://dl.acm.org/doi/10.5555/2627435.2638586}
+}
 @misc{hogg2010data,
   title = {Data analysis recipes: Fitting a model to data},
   author = {David W. Hogg and Jo Bovy and Dustin Lang},

myst_nbs/case_studies/stochastic_volatility.myst.md (+39, -9)

@@ -6,13 +6,20 @@ jupytext:
     format_version: 0.13
     jupytext_version: 1.13.7
 kernelspec:
-  display_name: Python 3 (ipykernel)
+  display_name: pymc_env
   language: python
-  name: python3
+  name: pymc_env
 ---

+(stochastic_volatility)=
 # Stochastic Volatility model

+:::{post} June 17, 2022
+:tags: time series, case study
+:category: beginner
+:author: John Salvatier
+:::
+
 ```{code-cell} ipython3
 import os
@@ -26,15 +33,15 @@ rng = np.random.RandomState(1234)
 az.style.use("arviz-darkgrid")
 ```

-Asset prices have time-varying volatility (variance of day over day `returns`). In some periods, returns are highly variable, while in others very stable. Stochastic volatility models model this with a latent volatility variable, modeled as a stochastic process. The following model is similar to the one described in the No-U-Turn Sampler paper, Hoffman (2011) p21.
+Asset prices have time-varying volatility (variance of day over day `returns`). In some periods, returns are highly variable, while in others very stable. Stochastic volatility models model this with a latent volatility variable, modeled as a stochastic process. The following model is similar to the one described in the No-U-Turn Sampler paper, {cite:p}`hoffman2014nuts`.

 $$ \sigma \sim Exponential(50) $$

 $$ \nu \sim Exponential(.1) $$

 $$ s_i \sim Normal(s_{i-1}, \sigma^{-2}) $$

-$$ log(r_i) \sim t(\nu, 0, exp(-2 s_i)) $$
+$$ \log(r_i) \sim t(\nu, 0, \exp(-2 s_i)) $$

 Here, $r$ is the daily return series and $s$ is the latent log volatility process.

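The four formulas above map almost line for line onto a PyMC model. As a minimal sketch, assuming a 1-D array `returns` of daily log returns and current (v4) distribution names; the notebook's actual priors and variable names may differ:

```python
import numpy as np
import pymc as pm

# Hypothetical stand-in for the notebook's data: daily log returns.
returns = np.random.default_rng(0).normal(scale=0.01, size=200)

with pm.Model() as sv_model:
    sigma = pm.Exponential("sigma", 50.0)  # sigma ~ Exponential(50)
    nu = pm.Exponential("nu", 0.1)         # nu ~ Exponential(.1)
    # s_i ~ Normal(s_{i-1}, ...): latent log-volatility random walk
    s = pm.GaussianRandomWalk("s", sigma=sigma, shape=len(returns))
    # log(r_i) ~ t(nu, 0, exp(-2 s_i)), written here with precision lam
    r = pm.StudentT("r", nu=nu, lam=pm.math.exp(-2 * s), observed=returns)
```

The Student-t likelihood with low `nu` gives the heavy tails that daily returns exhibit, while the random walk lets volatility drift slowly between regimes.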
@@ -180,11 +187,34 @@ axes[0].set_title("True log returns (black) and posterior predictive log returns
 axes[1].set_title("Posterior volatility");
 ```

-## References
-
-1. Hoffman & Gelman. (2011). [The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo](http://arxiv.org/abs/1111.4246).
-
 ```{code-cell} ipython3
 %load_ext watermark
-%watermark -n -u -v -iv -w -p aesara,xarray
+%watermark -n -u -v -iv -w -p aesara,aeppl,xarray
 ```
+
+## References
+
+:::{bibliography}
+:filter: docname in docnames
+:::
+
++++
+
+## Authors
+
++++
+
+* Written by John Salvatier
+* Updated by Kyle Meyer
+* Updated by Thomas Wiecki
+* Updated by Chris Fonnesbeck
+* Updated by Aaron Maxwell on May 18, 2018 ([pymc#2978](https://github.com/pymc-devs/pymc/pull/2978))
+* Updated by Colin Carroll on November 16, 2019 ([pymc#3682](https://github.com/pymc-devs/pymc/pull/3682))
+* Updated by Abhipsha Das on July 24, 2021 ([pymc-examples#155](https://github.com/pymc-devs/pymc-examples/pull/155))
+* Updated by Michael Osthege on June 1, 2022 ([pymc-examples#343](https://github.com/pymc-devs/pymc-examples/pull/343))
+* Updated by Christopher Krapu on June 17, 2022 ([pymc-examples#378](https://github.com/pymc-devs/pymc-examples/pull/378))
+
++++
+
+:::{include} ../page_footer.md
+:::

myst_nbs/diagnostics_and_criticism/Diagnosing_biased_Inference_with_Divergences.myst.md (+45, -10)

@@ -6,14 +6,20 @@ jupytext:
     format_version: 0.13
     jupytext_version: 1.13.7
 kernelspec:
-  display_name: Python PyMC3 (Dev)
+  display_name: Python 3 (ipykernel)
   language: python
-  name: pymc3-dev-py38
+  name: python3
 ---

 (diagnosing_with_divergences)=
 # Diagnosing Biased Inference with Divergences

+:::{post} Feb, 2018
+:tags: hierarchical model, diagnostics
+:category: intermediate
+:author: Agustina Arroyuelo
+:::
+
 ```{code-cell} ipython3
 from collections import defaultdict
@@ -23,7 +29,7 @@ import numpy as np
 import pandas as pd
 import pymc3 as pm

-print(f"Running on PyMC3 v{pm.__version__}")
+print(f"Running on PyMC3 v{pm.__version__}")
 ```

```{code-cell} ipython3
@@ -32,13 +38,13 @@ az.style.use("arviz-darkgrid")
 SEED = [20100420, 20134234]
 ```

-This notebook is a PyMC3 port of [Michael Betancourt's post on ms-stan](http://mc-stan.org/documentation/case-studies/divergences_and_bias.html). For detailed explanation of the underlying mechanism please check [the original post](http://mc-stan.org/documentation/case-studies/divergences_and_bias.html) and Betancourt's [excellent paper](https://arxiv.org/abs/1701.02434).
+This notebook is a PyMC3 port of [Michael Betancourt's post on mc-stan](http://mc-stan.org/documentation/case-studies/divergences_and_bias.html). For detailed explanation of the underlying mechanism please check the original post, [Diagnosing Biased Inference with Divergences](http://mc-stan.org/documentation/case-studies/divergences_and_bias.html) and Betancourt's excellent paper, [A Conceptual Introduction to Hamiltonian Monte Carlo](https://arxiv.org/abs/1701.02434).

 +++

-Bayesian statistics is all about building a model and estimating the parameters in that model. However, a naive or direct parameterization of our probability model can sometimes be ineffective, you can check out [Thomas Wiecki's blog post](http://twiecki.github.io/blog/2017/02/08/bayesian-hierchical-non-centered/) on the same issue in PyMC3. Suboptimal parameterization often leads to slow sampling, and more problematic, biased MCMC estimators.
+Bayesian statistics is all about building a model and estimating the parameters in that model. However, a naive or direct parameterization of our probability model can sometimes be ineffective, you can check out Thomas Wiecki's blog post, [Why hierarchical models are awesome, tricky, and Bayesian](http://twiecki.github.io/blog/2017/02/08/bayesian-hierchical-non-centered/) on the same issue in PyMC3. Suboptimal parameterization often leads to slow sampling, and more problematic, biased MCMC estimators.

-More formally, as explained in [the original post](http://mc-stan.org/documentation/case-studies/divergences_and_bias.html):
+More formally, as explained in the original post, [Diagnosing Biased Inference with Divergences](http://mc-stan.org/documentation/case-studies/divergences_and_bias.html):

 Markov chain Monte Carlo (MCMC) approximates expectations with respect to a given target distribution,

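Spelled out, the estimator this passage refers to: for a target density $\pi$ and draws $q_1, \ldots, q_N$ produced by the chain,

$$ \mathbb{E}_{\pi}[f] = \int f(q) \, \pi(q) \, \mathrm{d}q \approx \frac{1}{N} \sum_{n=1}^{N} f(q_n) $$

Divergences matter precisely because they flag regions of the target that the chain fails to explore; when that happens, the sample average above converges to the wrong value even while summary diagnostics look healthy, which is the bias demonstrated in the hunks below.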
@@ -136,11 +142,27 @@ The Gelman-Rubin diagnostic $\hat{R}$ doesn’t indicate any problem (values are
 az.summary(short_trace).round(2)
 ```

-Moreover, the trace plots all look fine. Let's consider, for example, the hierarchical standard deviation $\tau$, or more specifically, its logarithm, $log(\tau)$. Because $\tau$ is constrained to be positive, its logarithm will allow us to better resolve behavior for small values. Indeed the chains seems to be exploring both small and large values reasonably well,
+Moreover, the trace plots all look fine. Let's consider, for example, the hierarchical standard deviation $\tau$, or more specifically, its logarithm, $log(\tau)$. Because $\tau$ is constrained to be positive, its logarithm will allow us to better resolve behavior for small values. Indeed the chains seems to be exploring both small and large values reasonably well.

 ```{code-cell} ipython3
+---
+mystnb:
+  figure:
+    caption: Trace plot of log(tau)
+    name: nb-divergence-traceplot
+  image:
+    alt: log-tau
+---
 # plot the trace of log(tau)
-az.plot_trace({"log(tau)": short_trace.get_values(varname="tau_log__", combine=False)});
+ax = az.plot_trace(
+    {"log(tau)": short_trace.get_values(varname="tau_log__", combine=False)}, legend=True
+)
+ax[0, 1].set_xlabel("Draw")
+ax[0, 1].set_ylabel("log(tau)")
+ax[0, 1].set_title("")
+
+ax[0, 0].set_xlabel("log(tau)")
+ax[0, 0].set_title("Probability density function of log(tau)");
 ```

 Unfortunately, the resulting estimate for the mean of $log(\tau)$ is strongly biased away from the true value, here shown in grey.
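A note on the axes indexing in the added lines: `az.plot_trace` returns a 2-D NumPy array of matplotlib axes, one row per variable, with the density panel in column 0 and the trace panel in column 1, which is what `ax[0, 0]` and `ax[0, 1]` select. A minimal sketch against one of ArviZ's bundled example datasets:

```python
import arviz as az

# "centered_eight" is an example dataset shipped with ArviZ.
idata = az.load_arviz_data("centered_eight")
axes = az.plot_trace(idata, var_names=["tau"])

axes[0, 0].set_title("Posterior density of tau")  # column 0: density panel
axes[0, 1].set_title("Sampling trace of tau")     # column 1: draws per chain
```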
@@ -495,7 +517,7 @@ As shown above, the effective sample size per iteration has drastically improved
 report_trace(fit_ncp80)
 ```

-As expected of false positives, we can remove the divergences entirely by decreasing the step size,
+As expected of false positives, we can remove the divergences entirely by decreasing the step size.

 ```{code-cell} ipython3
 with NonCentered_eight:
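The code cell opened above is truncated by the hunk context. In PyMC3 the step size is reduced indirectly, by raising NUTS's target acceptance rate during tuning; a hedged sketch of what the `fit_ncp90` call referenced in the next hunk presumably looks like (arguments assumed, not shown in this diff):

```python
# Sketch only: a higher target_accept makes NUTS tune a smaller step size,
# which is how the remaining (false-positive) divergences are removed.
with NonCentered_eight:
    fit_ncp90 = pm.sample(5000, chains=2, target_accept=0.90, random_seed=SEED)
```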
@@ -506,7 +528,7 @@ divergent = fit_ncp90["diverging"]
 print("Number of Divergent %d" % divergent.nonzero()[0].size)
 ```

-The more agreeable geometry of the non-centered implementation allows the Markov chain to explore deep into the neck of the funnel, capturing even the smallest values of $\tau$ that are consistent with the measurements. Consequently, MCMC estimators from the non-centered chain rapidly converge towards their true expectation values.
+The more agreeable geometry of the non-centered implementation allows the Markov chain to explore deep into the neck of the funnel, capturing even the smallest values of `tau` ($\tau$) that are consistent with the measurements. Consequently, MCMC estimators from the non-centered chain rapidly converge towards their true expectation values.

 ```{code-cell} ipython3
 _, ax = plt.subplots(1, 1, figsize=(10, 6))
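For readers who have not opened the notebook: the funnel arises in the centered parameterization, where each group effect is drawn directly as $\theta_j \sim Normal(\mu, \tau)$; the non-centered version samples a standardized offset and rebuilds $\theta$ deterministically, which is the geometry change this paragraph describes. A sketch of the idea using the classic eight-schools data (variable names assumed, not copied from the notebook):

```python
import numpy as np
import pymc3 as pm

# Eight-schools data: estimated treatment effects and their standard errors.
y = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])

with pm.Model() as NonCentered_eight:
    mu = pm.Normal("mu", mu=0.0, sigma=5.0)
    tau = pm.HalfCauchy("tau", beta=5.0)
    # Draw a standardized offset, then scale and shift it. The prior geometry
    # of theta no longer depends on tau, which removes the funnel.
    theta_tilde = pm.Normal("theta_tilde", mu=0.0, sigma=1.0, shape=8)
    theta = pm.Deterministic("theta", mu + tau * theta_tilde)
    obs = pm.Normal("obs", mu=theta, sigma=sigma, observed=y)
```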
@@ -538,7 +560,20 @@ plt.title("MCMC estimation of log(tau)")
 plt.legend();
 ```

+## Authors
+* Adapted from Michael Betancourt's post January 2017, [Diagnosing Biased Inference with Divergences](https://mc-stan.org/users/documentation/case-studies/divergences_and_bias.html)
+* Updated by Agustina Arroyuelo in February 2018, ([pymc#2861](https://github.com/pymc-devs/pymc/pull/2861))
+* Updated by [@CloudChaoszero](https://github.com/CloudChaoszero) in January 2021, ([pymc-examples#25](https://github.com/pymc-devs/pymc-examples/pull/25))
+* Updated Markdown and styling by @reshamas in August 2022, ([pymc-examples#402](https://github.com/pymc-devs/pymc-examples/pull/402))
+
 ```{code-cell} ipython3
 %load_ext watermark
 %watermark -n -u -v -iv -w
 ```
+
+:::{include} ../page_footer.md
+:::
+
+```{code-cell} ipython3
+
+```
