Commit 61b0e1a

Benjamin T. Vincent (drbenvincent) authored
update GLM negative binomial to v4 (#368)
* create truncated regression example
* delete truncated regression example from main branch
* create truncated regression example
* delete truncated regression example from main branch
* create truncated regression example
* delete truncated regression example from main branch
* fix incorrect statement about pm.NormalMixture
* update to v4 + switch from bambi to PyMC

Co-authored-by: Benjamin T. Vincent <[email protected]>
1 parent c813d0f commit 61b0e1a

File tree

2 files changed (+374, -113 lines changed)

examples/generalized_linear_models/GLM-negative-binomial-regression.ipynb (+331, -98)
Large diffs are not rendered by default.

myst_nbs/generalized_linear_models/GLM-negative-binomial-regression.myst.md (+43, -15)
````diff
@@ -6,27 +6,28 @@ jupytext:
     format_version: 0.13
     jupytext_version: 1.13.7
 kernelspec:
-  display_name: Python 3
+  display_name: Python 3.9.12 ('pymc-dev-py39')
   language: python
   name: python3
 ---
 
+(GLM-negative-binomial-regression)=
 # GLM: Negative Binomial Regression
 
-```{code-cell} ipython3
-import re
+:::{post} June, 2022
+:tags: negative binomial regression, generalized linear model,
+:category: beginner
+:author: Ian Ozsvald, Abhipsha Das, Benjamin Vincent
+:::
 
+```{code-cell} ipython3
 import arviz as az
-import bambi as bmb
-import matplotlib.pyplot as plt
 import numpy as np
 import pandas as pd
-import pymc3 as pm
+import pymc as pm
 import seaborn as sns
 
 from scipy import stats
-
-print(f"Running on PyMC3 v{pm.__version__}")
 ```
 
 ```{code-cell} ipython3
````
````diff
@@ -37,7 +38,7 @@ rng = np.random.default_rng(RANDOM_SEED)
 az.style.use("arviz-darkgrid")
 ```
 
-This notebook demos negative binomial regression using the `bambi` library. It closely follows the GLM Poisson regression example by [Jonathan Sedar](https://github.com/jonsedar) (which is in turn inspired by [a project by Ian Osvald](http://ianozsvald.com/2016/05/07/statistically-solving-sneezes-and-sniffles-a-work-in-progress-report-at-pydatalondon-2016/)) except the data here is negative binomially distributed instead of Poisson distributed.
+This notebook closely follows the GLM Poisson regression example by [Jonathan Sedar](https://github.com/jonsedar) (which is in turn inspired by [a project by Ian Osvald](http://ianozsvald.com/2016/05/07/statistically-solving-sneezes-and-sniffles-a-work-in-progress-report-at-pydatalondon-2016/)) except the data here is negative binomially distributed instead of Poisson distributed.
 
 Negative binomial regression is used to model count data for which the variance is higher than the mean. The [negative binomial distribution](https://en.wikipedia.org/wiki/Negative_binomial_distribution) can be thought of as a Poisson distribution whose rate parameter is gamma distributed, so that rate parameter can be adjusted to account for the increased variance.
 
````
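The gamma-Poisson interpretation in the paragraph above can be illustrated with a quick simulation. The sketch below is editorial commentary rather than part of the commit; the values of `mu` and `alpha` are made up for illustration and follow the `mu`/`alpha` parameterisation that `pm.NegativeBinomial` uses later in this diff.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values only: mean mu and dispersion alpha.
mu, alpha = 10.0, 2.0

# Gamma-distributed rates with mean mu and variance mu**2 / alpha ...
rates = rng.gamma(shape=alpha, scale=mu / alpha, size=100_000)

# ... then Poisson counts given those rates; marginally these are
# NegativeBinomial(mu, alpha) draws.
counts = rng.poisson(rates)

# Sample variance exceeds the mean, approaching mu + mu**2 / alpha = 60.
print(counts.mean(), counts.var())
```

With these assumed values the sample mean lands near 10 while the sample variance lands near 60, which is the overdispersion relative to a plain Poisson that motivates the negative binomial likelihood.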

````diff
@@ -148,6 +149,7 @@ df = pd.DataFrame(
         ),
     }
 )
+df
 ```
 
 ```{code-cell} ipython3
````
````diff
@@ -177,21 +179,37 @@ ax.set_xticklabels(labels[::5]);
 ### Create GLM Model
 
 ```{code-cell} ipython3
-fml = "nsneeze ~ alcohol + nomeds + alcohol:nomeds"
+COORDS = {"regressor": ["nomeds", "alcohol", "nomeds:alcohol"], "obs_idx": df.index}
+
+with pm.Model(coords=COORDS) as m_sneeze_inter:
+    a = pm.Normal("intercept", mu=0, sigma=5)
+    b = pm.Normal("slopes", mu=0, sigma=1, dims="regressor")
+    alpha = pm.Exponential("alpha", 0.5)
+
+    M = pm.ConstantData("nomeds", df.nomeds.to_numpy(), dims="obs_idx")
+    A = pm.ConstantData("alcohol", df.alcohol.to_numpy(), dims="obs_idx")
+    S = pm.ConstantData("nsneeze", df.nsneeze.to_numpy(), dims="obs_idx")
+
+    λ = pm.math.exp(a + b[0] * M + b[1] * A + b[2] * M * A)
+
+    y = pm.NegativeBinomial("y", mu=λ, alpha=alpha, observed=S, dims="obs_idx")
 
-model = bmb.Model(fml, df, family="negativebinomial")
-trace = model.fit(draws=1000, tune=1000, cores=2)
+    idata = pm.sample()
 ```
 
 ### View Results
 
 ```{code-cell} ipython3
-az.plot_trace(trace);
+az.plot_trace(idata, compact=False);
 ```
 
 ```{code-cell} ipython3
 # Transform coefficients to recover parameter values
-az.summary(np.exp(trace.posterior), kind="stats", var_names="~nsneeze_alpha")
+az.summary(np.exp(idata.posterior), kind="stats", var_names=["intercept", "slopes"])
+```
+
+```{code-cell} ipython3
+az.summary(idata.posterior, kind="stats", var_names="alpha")
 ```
 
 The mean values are close to the values we specified when generating the data:
````
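A brief note on the `np.exp` call in the summary above: the linear predictor enters the likelihood through `pm.math.exp`, so the coefficients live on the log scale, and exponentiating them converts additive effects into multiplicative rate factors. As a hedged sketch (not part of the commit), one could recover the implied mean sneeze count for a single condition from the `idata` object and the `regressor` coordinate defined in the model, roughly as follows.

```python
import numpy as np

# Sketch only: implied mean sneeze count for the "alcohol, on meds" condition
# (alcohol = 1, nomeds = 0), assuming `idata` from pm.sample() above.
post = idata.posterior
rate = np.exp(post["intercept"] + post["slopes"].sel(regressor="alcohol"))

print(float(rate.mean()))  # posterior-mean rate for that condition
```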
````diff
@@ -206,7 +224,17 @@ Finally, the mean of `nsneeze_alpha` is also quite close to its actual value of
 
 See also, [`bambi's` negative binomial example](https://bambinos.github.io/bambi/master/notebooks/negative_binomial.html) for further reference.
 
++++
+
+## Authors
+- Created by [Ian Ozsvald](https://github.com/ianozsvald)
+- Updated by [Abhipsha Das](https://github.com/chiral-carbon) in August 2021
+- Updated by [Benjamin Vincent](https://github.com/drbenvincent) to PyMC v4 in June 2022
+
 ```{code-cell} ipython3
 %load_ext watermark
-%watermark -n -u -v -iv -w -p theano,xarray
+%watermark -n -u -v -iv -w -p aesara,aeppl,xarray
 ```
+
+:::{include} ../page_footer.md
+:::
````
