Commit 57d3f19

Updates to binning case study to work with PyMC v4 (#366)
* Updates to binning case study to work with PyMC v4
* Added PR number
* Clean up
* Re-ran all cells
* Updated myst
* Removed pymc tags
* Removed warning and pymc tags
* Change at.exp/concat to pm.math.exp/concat; updated ppc text
* Fixed typos
* Updated concatenate calls
1 parent 75e0b9a commit 57d3f19

File tree

2 files changed: +958 -732 lines

examples/case_studies/binning.ipynb: +889 -662 (large diffs are not rendered by default)

myst_nbs/case_studies/binning.myst.md: +69 -70
@@ -6,15 +6,15 @@ jupytext:
     format_version: 0.13
     jupytext_version: 1.13.7
 kernelspec:
-  display_name: Python 3 (ipykernel)
+  display_name: Python [conda env:pymc_env]
   language: python
-  name: python3
+  name: conda-env-pymc_env-py
 ---

 (awkward_binning)=
 # Estimating parameters of a distribution from awkwardly binned data
 :::{post} Oct 23, 2021
-:tags: binned data, case study, parameter estimation, pymc3.Bound, pymc3.Deterministic, pymc3.Gamma, pymc3.HalfNormal, pymc3.Model, pymc3.Multinomial, pymc3.Normal
+:tags: binned data, case study, parameter estimation
 :category: intermediate
 :author: Eric Ma, Benjamin T. Vincent
 :::
@@ -70,31 +70,33 @@ In ordinal regression, the cutpoints are treated as latent variables and the par
 We are now in a position to sketch out a generative PyMC model:

 ```python
+import aesara.tensor as at
+
 with pm.Model() as model:
     # priors
     mu = pm.Normal("mu")
     sigma = pm.HalfNormal("sigma")
     # generative process
-    probs = pm.Normal.dist(mu=mu, sigma=sigma).cdf(cutpoints)
-    probs = theano.tensor.concatenate([[0], probs, [1]])
-    probs = theano.tensor.extra_ops.diff(probs)
+    probs = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), cutpoints))
+    probs = pm.math.concatenate([[0], probs, [1]])
+    probs = at.extra_ops.diff(probs)
     # likelihood
     pm.Multinomial("counts", p=probs, n=sum(counts), observed=counts)
 ```

 The exact way we implement the models below differs only very slightly from this, but let's decompose how this works.
 Firstly we define priors over the `mu` and `sigma` parameters of the latent distribution. Then we have 3 lines which calculate the probability that any observed datum falls in a given bin. The first line of this
 ```python
-probs = pm.Normal.dist(mu=mu, sigma=sigma).cdf(cutpoints)
+probs = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), cutpoints))
 ```
 calculates the cumulative density at each of the cutpoints. The second line
 ```python
-probs = theano.tensor.concatenate([[0], probs, [1]])
+probs = pm.math.concatenate([[0], probs, [1]])
 ```
 simply concatenates the cumulative density at $-\infty$ (which is zero) and at $\infty$ (which is 1).
 The third line
 ```python
-probs = theano.tensor.extra_ops.diff(probs)
+probs = at.extra_ops.diff(probs)
 ```
 calculates the difference between consecutive cumulative densities to give the actual probability of a datum falling in any given bin.

@@ -107,13 +109,17 @@ The approach was illustrated with a Gaussian distribution, and below we show a n
 ```{code-cell} ipython3
 :tags: []

+import warnings
+
+import aesara.tensor as at
 import arviz as az
 import matplotlib.pyplot as plt
 import numpy as np
 import pandas as pd
-import pymc3 as pm
+import pymc as pm
 import seaborn as sns
-import theano.tensor as aet
+
+warnings.filterwarnings(action="ignore", category=UserWarning)
 ```

 ```{code-cell} ipython3
@@ -220,8 +226,8 @@ with pm.Model() as model1:
     sigma = pm.HalfNormal("sigma")
     mu = pm.Normal("mu")

-    probs1 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([[0], probs1, [1]]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([[0], probs1, [1]]))
     pm.Multinomial("counts1", p=probs1, n=c1.sum(), observed=c1.values)
 ```

@@ -233,7 +239,7 @@ pm.model_to_graphviz(model1)
 :tags: []

 with model1:
-    trace1 = pm.sample(return_inferencedata=True)
+    trace1 = pm.sample()
 ```

 ### Checks on model
@@ -246,8 +252,7 @@ we should be able to generate observations that look close to what we observed.
 :tags: []

 with model1:
-    ppc1 = pm.sample_posterior_predictive(trace1)
-    ppc = az.from_pymc3(posterior_predictive=ppc1)
+    ppc = pm.sample_posterior_predictive(trace1)
 ```

 We can do this graphically.
@@ -326,16 +331,16 @@ with pm.Model() as model2:
     sigma = pm.HalfNormal("sigma")
     mu = pm.Normal("mu")

-    probs2 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([[0], probs2, [1]]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([[0], probs2, [1]]))
     pm.Multinomial("counts2", p=probs2, n=c2.sum(), observed=c2.values)
 ```

 ```{code-cell} ipython3
 :tags: []

 with model2:
-    trace2 = pm.sample(return_inferencedata=True)
+    trace2 = pm.sample()
 ```

 ```{code-cell} ipython3
@@ -352,11 +357,10 @@ Let's run a PPC check to ensure we are generating data that are similar to what
 :tags: []

 with model2:
-    ppc2 = pm.sample_posterior_predictive(trace2)
-    ppc = az.from_pymc3(posterior_predictive=ppc2)
+    ppc = pm.sample_posterior_predictive(trace2)
 ```

-Note that `ppc2` is not in xarray format. It is a dictionary where the keys are the parameters and the values are arrays of samples. So the line below looks at the mean bin posterior predictive bin counts, averaged over samples.
+We calculate the mean posterior predictive bin counts, averaged over samples.

 ```{code-cell} ipython3
 :tags: []
@@ -422,12 +426,12 @@ with pm.Model() as model3:
     sigma = pm.HalfNormal("sigma")
     mu = pm.Normal("mu")

-    probs1 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
     probs1 = pm.Deterministic("normal1_cdf", probs1)

-    probs2 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
     probs2 = pm.Deterministic("normal2_cdf", probs2)

     pm.Multinomial("counts1", p=probs1, n=c1.sum(), observed=c1.values)
@@ -442,7 +446,7 @@ pm.model_to_graphviz(model3)
 :tags: []

 with model3:
-    trace3 = pm.sample(return_inferencedata=True)
+    trace3 = pm.sample()
 ```

 ```{code-cell} ipython3
@@ -453,8 +457,7 @@ az.plot_pair(trace3, var_names=["mu", "sigma"], divergences=True);

 ```{code-cell} ipython3
 with model3:
-    ppc3 = pm.sample_posterior_predictive(trace3)
-    ppc = az.from_pymc3(posterior_predictive=ppc3)
+    ppc = pm.sample_posterior_predictive(trace3)
 ```

 ```{code-cell} ipython3
@@ -516,8 +519,8 @@ with pm.Model() as model4:
     sigma = pm.HalfNormal("sigma")
     mu = pm.Normal("mu")
     # study 1
-    probs1 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
     probs1 = pm.Deterministic("normal1_cdf", probs1)
     pm.Multinomial("counts1", p=probs1, n=c1.sum(), observed=c1.values)
     # study 2
@@ -530,15 +533,14 @@ pm.model_to_graphviz(model4)

 ```{code-cell} ipython3
 with model4:
-    trace4 = pm.sample(return_inferencedata=True)
+    trace4 = pm.sample()
 ```

 ### Posterior predictive checks

 ```{code-cell} ipython3
 with model4:
-    ppc4 = pm.sample_posterior_predictive(trace4)
-    ppc = az.from_pymc3(posterior_predictive=ppc4)
+    ppc = pm.sample_posterior_predictive(trace4)
 ```

 ```{code-cell} ipython3
@@ -556,14 +558,16 @@ ax[0].set_xticklabels([f"bin {n}" for n in range(len(c1))])
 ax[0].set_title("Posterior predictive: Study 1")

 # Study 2 ----------------------------------------------------------------
-ax[1].hist(ppc4["y"].flatten(), 50, density=True, alpha=0.5)
+ax[1].hist(ppc.posterior_predictive.y.values.flatten(), 50, density=True, alpha=0.5)
 ax[1].set(title="Posterior predictive: Study 2", xlabel="$x$", ylabel="density");
 ```

 We can calculate the mean and standard deviation of the posterior predictive distribution for study 2 and see that they are close to our true parameters.

 ```{code-cell} ipython3
-np.mean(ppc4["y"].flatten()), np.std(ppc4["y"].flatten())
+np.mean(ppc.posterior_predictive.y.values.flatten()), np.std(
+    ppc.posterior_predictive.y.values.flatten()
+)
 ```

 ### Recovering parameters
@@ -598,23 +602,23 @@ with pm.Model(coords=coords) as model5:
     mu_pop_mean = pm.Normal("mu_pop_mean", 0.0, 1.0)
     mu_pop_variance = pm.HalfNormal("mu_pop_variance", sigma=1)

-    BoundedNormal = pm.Bound(pm.Normal, lower=0.0)
-    sigma_pop_mean = BoundedNormal("sigma_pop_mean", mu=0, sigma=1)
+    sigma_pop_mean = pm.HalfNormal("sigma_pop_mean", sigma=1)
     sigma_pop_sigma = pm.HalfNormal("sigma_pop_sigma", sigma=1)

     # Study level priors
     mu = pm.Normal("mu", mu=mu_pop_mean, sigma=mu_pop_variance, dims="study")
-    # sigma = pm.HalfCauchy("sigma", beta=sigma_pop_sigma, dims='study')
-    sigma = BoundedNormal("sigma", mu=sigma_pop_mean, sigma=sigma_pop_sigma, dims="study")
+    sigma = pm.TruncatedNormal(
+        "sigma", mu=sigma_pop_mean, sigma=sigma_pop_sigma, lower=0, dims="study"
+    )

     # Study 1
-    probs1 = aet.exp(pm.Normal.dist(mu=mu[0], sigma=sigma[0]).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[0], sigma=sigma[0]), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
     probs1 = pm.Deterministic("normal1_cdf", probs1, dims="bin1")

     # Study 2
-    probs2 = aet.exp(pm.Normal.dist(mu=mu[1], sigma=sigma[1]).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[1], sigma=sigma[1]), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
     probs2 = pm.Deterministic("normal2_cdf", probs2, dims="bin2")

     # Likelihood
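Since `pm.Bound` was removed in PyMC v4, this hunk swaps the bounded normals for `pm.TruncatedNormal(..., lower=0)`. The distribution that expresses can be sanity-checked outside PyMC with SciPy's `truncnorm`; the `mu`, `sigma` values below are illustrative only:

```python
import numpy as np
from scipy import stats

mu, sigma = 1.0, 2.0
# truncnorm takes its bounds in standard-deviation units relative to loc
a = (0.0 - mu) / sigma
trunc = stats.truncnorm(a, np.inf, loc=mu, scale=sigma)

samples = trunc.rvs(size=10_000, random_state=42)
print(samples.min() >= 0)  # -> True: all mass lies above the truncation point
```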
@@ -641,13 +645,13 @@ with pm.Model(coords=coords) as model5:
     sigma = pm.Gamma("sigma", alpha=2, beta=1, dims="study")

     # Study 1
-    probs1 = aet.exp(pm.Normal.dist(mu=mu[0], sigma=sigma[0]).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[0], sigma=sigma[0]), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
     probs1 = pm.Deterministic("normal1_cdf", probs1, dims="bin1")

     # Study 2
-    probs2 = aet.exp(pm.Normal.dist(mu=mu[1], sigma=sigma[1]).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[1], sigma=sigma[1]), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
     probs2 = pm.Deterministic("normal2_cdf", probs2, dims="bin2")

     # Likelihood
@@ -661,7 +665,7 @@ pm.model_to_graphviz(model5)

 ```{code-cell} ipython3
 with model5:
-    trace5 = pm.sample(tune=2000, target_accept=0.99, return_inferencedata=True)
+    trace5 = pm.sample(tune=2000, target_accept=0.99)
 ```

 We can see that despite our efforts, we still get some divergences. Plotting the samples and highlighting the divergences suggests (from the top left subplot) that our model is suffering from the funnel problem
@@ -676,8 +680,7 @@ az.plot_pair(

 ```{code-cell} ipython3
 with model5:
-    ppc5 = pm.sample_posterior_predictive(trace5)
-    ppc = az.from_pymc3(posterior_predictive=ppc5)
+    ppc = pm.sample_posterior_predictive(trace5)
 ```

 ```{code-cell} ipython3
@@ -745,13 +748,13 @@ with pm.Model(coords=coords) as model5:
     sigma = pm.HalfNormal("sigma", dims='study')

     # Study 1
-    probs1 = aet.exp(pm.Normal.dist(mu=mu[0], sigma=sigma[0]).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[0], sigma=sigma[0]), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
     probs1 = pm.Deterministic("normal1_cdf", probs1, dims='bin1')

     # Study 2
-    probs2 = aet.exp(pm.Normal.dist(mu=mu[1], sigma=sigma[1]).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[1], sigma=sigma[1]), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
     probs2 = pm.Deterministic("normal2_cdf", probs2, dims='bin2')

     # Likelihood
@@ -803,8 +806,8 @@ true_mu, true_beta = 20, 4
 BMI = pm.Gumbel.dist(mu=true_mu, beta=true_beta)

 # Generate two different sets of random samples from the same Gaussian.
-x1 = BMI.random(size=800)
-x2 = BMI.random(size=1200)
+x1 = pm.draw(BMI, 800)
+x2 = pm.draw(BMI, 1200)

 # Calculate bin counts
 c1 = data_to_bincounts(x1, d1)
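The `pm.draw` replacement for `.random()` just draws forward samples; paired with the notebook's binning step, it amounts to the following NumPy-only sketch, where the `d1` cutpoints are hypothetical and `np.histogram` stands in for the notebook's `data_to_bincounts` helper:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mu, true_beta = 20, 4
# forward samples from the Gumbel, mirroring pm.draw(BMI, 800)
x1 = rng.gumbel(loc=true_mu, scale=true_beta, size=800)

d1 = np.array([10, 15, 20, 25, 30])  # illustrative interior cutpoints
# open-ended outer bins, matching the [0, ..., 1] CDF padding in the model
edges = np.concatenate([[-np.inf], d1, [np.inf]])
c1, _ = np.histogram(x1, bins=edges)

print(c1.sum())  # -> 800: every sample falls in exactly one bin
```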
@@ -843,7 +846,7 @@ ax[1, 0].set(xlim=(0, 50), xlabel="BMI", ylabel="observed frequency", title="Sam

 ### Model specification

-This is a variation of Example 3 above. The only changes are: 
+This is a variation of Example 3 above. The only changes are:
 - update the probability distribution to match our target (the Gumbel distribution)
 - ensure we specify priors for our target distribution, appropriate given our domain knowledge.

@@ -852,12 +855,12 @@ with pm.Model() as model6:
     mu = pm.Normal("mu", 20, 5)
     beta = pm.HalfNormal("beta", 10)

-    probs1 = aet.exp(pm.Gumbel.dist(mu=mu, beta=beta).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Gumbel.dist(mu=mu, beta=beta), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
     probs1 = pm.Deterministic("gumbel_cdf1", probs1)

-    probs2 = aet.exp(pm.Gumbel.dist(mu=mu, beta=beta).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Gumbel.dist(mu=mu, beta=beta), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
     probs2 = pm.Deterministic("gumbel_cdf2", probs2)

     pm.Multinomial("counts1", p=probs1, n=c1.sum(), observed=c1.values)
@@ -870,15 +873,14 @@ pm.model_to_graphviz(model6)

 ```{code-cell} ipython3
 with model6:
-    trace6 = pm.sample(return_inferencedata=True)
+    trace6 = pm.sample()
 ```

 ### Posterior predictive checks

 ```{code-cell} ipython3
 with model6:
-    ppc6 = pm.sample_posterior_predictive(trace6)
-    ppc = az.from_pymc3(posterior_predictive=ppc6)
+    ppc = pm.sample_posterior_predictive(trace6)
 ```

 ```{code-cell} ipython3
@@ -935,19 +937,16 @@ We have presented a range of different examples here which makes clear that the

 ## Authors
 * Authored by [Eric Ma](https://github.com/ericmjl) and [Benjamin T. Vincent](https://github.com/drbenvincent) in September, 2021 ([pymc-examples#229](https://github.com/pymc-devs/pymc-examples/pull/229))
+* Updated to run in PyMC v4 by Fernando Irarrazaval in June 2022 ([pymc-examples#366](https://github.com/pymc-devs/pymc-examples/pull/366))

 +++

 ## Watermark

 ```{code-cell} ipython3
 %load_ext watermark
-%watermark -n -u -v -iv -w -p theano,xarray
+%watermark -n -u -v -iv -w -p aeppl,xarray
 ```

 :::{include} ../page_footer.md
 :::
-
-```{code-cell} ipython3
-
-```
