Updated the GLM hierarchical binomial notebook to run in v4 #362

Closed · wants to merge 2 commits
@@ -6,14 +6,14 @@ jupytext:
format_version: 0.13
jupytext_version: 1.13.7
kernelspec:
- display_name: Python PyMC3 (Dev)
+ display_name: Python PyMC (Dev)
language: python
name: python3
---

# Hierarchical Binomial Model: Rat Tumor Example
:::{post} Nov 11, 2021
- :tags: generalized linear model, hierarchical model, pymc3.Beta, pymc3.Binomial, pymc3.Data, pymc3.Deterministic, pymc3.HalfNormal, pymc3.Model, pymc3.Potential
+ :tags: generalized linear model, hierarchical model, pymc.Beta, pymc.Binomial, pymc.Data, pymc.Deterministic, pymc.HalfNormal, pymc.Model, pymc.Potential
:category: intermediate
:author: Demetri Pananos, Junpeng Lao, Raúl Maldonado, Farhan Reynaldo
:::
@@ -23,20 +23,20 @@ import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
- import pymc3 as pm
- import theano.tensor as tt
+ import pymc as pm
+ import aesara.tensor as at

from scipy.special import gammaln

print(f"Running on PyMC3 v{pm.__version__}")
print(f"Running on PyMC v{pm.__version__}")
```

```{code-cell} ipython3
%config InlineBackend.figure_format = 'retina'
az.style.use("arviz-darkgrid")
```

- This short tutorial demonstrates how to use PyMC3 to do inference for the rat tumour example found in chapter 5 of *Bayesian Data Analysis 3rd Edition* {cite:p}`gelman2013bayesian`. Readers should already be familiar with the PyMC3 API.
+ This short tutorial demonstrates how to use PyMC to do inference for the rat tumour example found in chapter 5 of *Bayesian Data Analysis 3rd Edition* {cite:p}`gelman2013bayesian`. Readers should already be familiar with the PyMC API.

Suppose we are interested in the probability that a lab rat develops endometrial stromal polyps. We have data from 71 previously performed trials and would like to use this data to perform inference.
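
For orientation, the dataset amounts to two aligned arrays per experiment: the number of rats tested and the number that developed a tumour. A minimal sketch with made-up values (the real 71-group data appears in the collapsed "rat data" cell below):

```python
import numpy as np

# Made-up illustration only; the actual values live in the
# "rat data (BDA3, p. 102)" cell that this diff collapses.
y = np.array([0, 1, 2, 5, 4])       # rats with tumours in each group
n = np.array([20, 19, 25, 24, 23])  # rats tested in each group

print(y / n)  # observed tumour proportions per group
```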

@@ -76,7 +76,7 @@ p(\alpha, \beta)

See BDA3 pg. 110 for more information on deriving the marginal posterior distribution. With a little determination, we can plot the marginal posterior and estimate the means of $\alpha$ and $\beta$ without having to resort to MCMC. We will see, however, that this requires considerable effort.

- The authors of BDA3 choose to plot the surface under the parameterization $(\log(\alpha/\beta), \log(\alpha+\beta))$. We do so as well. Through the remainder of the example let $x = \log(\alpha/\beta)$ and $z = \log(\alpha+\beta)$.
+ The authors of BDA3 choose to plot the surface under the parameterization $(\log(\alpha/\beta), \log(\alpha+\beta))$. We do so as well. Through the remainder of the example let $x = \log(\alpha/\beta)$ and $z = \log(\alpha+\beta)$.
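
The grid evaluation itself lives in cells that this diff collapses. As a rough sketch of what it involves (the function names and the Jacobian bookkeeping here are reconstructed from BDA3, not copied from the notebook), the transform back to $(\alpha, \beta)$ and the unnormalised log marginal posterior could be written as:

```python
import numpy as np
from scipy.special import gammaln


def trans_to_alpha_beta(x, z):
    """Map (x, z) = (log(alpha/beta), log(alpha+beta)) back to (alpha, beta)."""
    alpha = np.exp(z) * np.exp(x) / (1 + np.exp(x))
    beta = np.exp(z) / (1 + np.exp(x))
    return alpha, beta


def log_marginal_posterior(x, z, y, n):
    """Unnormalised log p(x, z | y) for the beta-binomial hierarchy.

    Combines the (alpha + beta)^(-5/2) prior, the beta-binomial marginal
    likelihood, and the Jacobian of the (alpha, beta) -> (x, z) change of
    variables, which contributes log(alpha) + log(beta).
    """
    alpha, beta = trans_to_alpha_beta(x, z)
    log_prior = -2.5 * np.log(alpha + beta) + np.log(alpha) + np.log(beta)
    log_like = np.sum(
        gammaln(alpha + beta) - gammaln(alpha) - gammaln(beta)
        + gammaln(alpha + y) + gammaln(beta + n - y) - gammaln(alpha + beta + n)
    )
    return log_prior + log_like
```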

```{code-cell} ipython3
# rat data (BDA3, p. 102)
@@ -196,11 +196,11 @@ $$ \operatorname{E}(\beta \lvert y) \text{ is estimated by }
(df.beta * df.normed_exp_trans).sum().round(3)
```
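
The `normed_exp_trans` weights used in the cell above come from collapsed cells as well; presumably they are the posterior probabilities of the grid points, obtained by exponentiating the gridded log posterior after subtracting its maximum and renormalising. Continuing the sketch above (the grid limits and every column name other than `beta` and `normed_exp_trans` are guesses):

```python
import numpy as np
import pandas as pd

# Coarse grid over x = log(alpha/beta) and z = log(alpha + beta);
# y and n are the tumour counts and group sizes defined earlier.
xs, zs = np.meshgrid(np.linspace(-2.3, -1.3, 100), np.linspace(1.0, 5.0, 100))
df = pd.DataFrame({"x": xs.ravel(), "z": zs.ravel()})
df["alpha"], df["beta"] = trans_to_alpha_beta(df["x"].values, df["z"].values)
df["log_post"] = [log_marginal_posterior(x, z, y, n) for x, z in zip(df["x"], df["z"])]

# Exponentiate after subtracting the max for numerical stability, then normalise
df["normed_exp_trans"] = np.exp(df["log_post"] - df["log_post"].max())
df["normed_exp_trans"] /= df["normed_exp_trans"].sum()

# Posterior means as probability-weighted averages over the grid
print((df.alpha * df.normed_exp_trans).sum().round(3))
print((df.beta * df.normed_exp_trans).sum().round(3))
```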

- ## Computing the Posterior using PyMC3
+ ## Computing the Posterior using PyMC

Computing the marginal posterior directly is a lot of work, and is not always possible for sufficiently complex models.

- On the other hand, creating hierarchical models in PyMC3 is simple. We can use the samples obtained from the posterior to estimate the means of $\alpha$ and $\beta$.
+ On the other hand, creating hierarchical models in PyMC is simple. We can use the samples obtained from the posterior to estimate the means of $\alpha$ and $\beta$.

```{code-cell} ipython3
coords = {
@@ -212,7 +212,7 @@
```{code-cell} ipython3
def logp_ab(value):
    """prior density"""
-     return tt.log(tt.pow(tt.sum(value), -5 / 2))
+     return at.log(at.pow(at.sum(value), -5 / 2))


with pm.Model(coords=coords) as model:
@@ -221,13 +221,13 @@
    ab = pm.HalfNormal("ab", sigma=10, dims="param")
    pm.Potential("p(a, b)", logp_ab(ab))

-     X = pm.Deterministic("X", tt.log(ab[0] / ab[1]))
-     Z = pm.Deterministic("Z", tt.log(tt.sum(ab)))
+     X = pm.Deterministic("X", at.log(ab[0] / ab[1]))
+     Z = pm.Deterministic("Z", at.log(at.sum(ab)))

    theta = pm.Beta("theta", alpha=ab[0], beta=ab[1], dims="obs_id")

    p = pm.Binomial("y", p=theta, observed=y, n=n_val)
-     trace = pm.sample(draws=1000, tune=2000, target_accept=0.95, return_inferencedata=True)
+     trace = pm.sample(draws=1000, tune=2000, target_accept=0.95)
```
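
One change worth flagging here: `return_inferencedata=True` was dropped because `pm.sample` in PyMC v4 returns an `arviz.InferenceData` object by default, so the downstream ArviZ calls work unchanged. For example:

```python
import arviz as az

# trace is an arviz.InferenceData: groups are attributes, variables are keys
az.summary(trace, var_names=["ab", "X", "Z"])
trace.posterior["ab"].mean(("chain", "draw"))  # posterior means of (alpha, beta)
```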

```{code-cell} ipython3
@@ -254,7 +254,7 @@ trace.posterior["ab"].mean(("chain", "draw"))

## Conclusion

- Analytically calculating statistics for posterior distributions is difficult if not impossible for some models. PyMC3 provides an easy way drawing samples from your model's posterior with only a few lines of code. Here, we used PyMC3 to obtain estimates of the posterior mean for the rat tumor example in chapter 5 of BDA3. The estimates obtained from PyMC3 are encouragingly close to the estimates obtained from the analytical posterior density.
+ Analytically calculating statistics for posterior distributions is difficult if not impossible for some models. PyMC provides an easy way to draw samples from your model's posterior with only a few lines of code. Here, we used PyMC to obtain estimates of the posterior mean for the rat tumor example in chapter 5 of BDA3. The estimates obtained from PyMC are encouragingly close to the estimates obtained from the analytical posterior density.

+++
