@@ -6,15 +6,15 @@ jupytext:
    format_version: 0.13
    jupytext_version: 1.13.7
kernelspec:
-  display_name: Python 3 (ipykernel)
+  display_name: Python [conda env:pymc_env]
  language: python
-  name: python3
+  name: conda-env-pymc_env-py
---

(awkward_binning)=
# Estimating parameters of a distribution from awkwardly binned data

:::{post} Oct 23, 2021
-:tags: binned data, case study, parameter estimation, pymc3.Bound, pymc3.Deterministic, pymc3.Gamma, pymc3.HalfNormal, pymc3.Model, pymc3.Multinomial, pymc3.Normal
+:tags: binned data, case study, parameter estimation
:category: intermediate
:author: Eric Ma, Benjamin T. Vincent
:::
@@ -70,31 +70,33 @@ In ordinal regression, the cutpoints are treated as latent variables and the par
We are now in a position to sketch out a generative PyMC model:

```python
+import aesara.tensor as at
+
with pm.Model() as model:
    # priors
    mu = pm.Normal("mu")
    sigma = pm.HalfNormal("sigma")
    # generative process
-    probs = pm.Normal.dist(mu=mu, sigma=sigma).cdf(cutpoints)
-    probs = theano.tensor.concatenate([[0], probs, [1]])
-    probs = theano.tensor.extra_ops.diff(probs)
+    probs = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), cutpoints))
+    probs = pm.math.concatenate([[0], probs, [1]])
+    probs = at.extra_ops.diff(probs)
    # likelihood
    pm.Multinomial("counts", p=probs, n=sum(counts), observed=counts)
```

The exact way we implement the models below differs only very slightly from this, but let's decompose how this works.
Firstly we define priors over the `mu` and `sigma` parameters of the latent distribution. Then we have 3 lines which calculate the probability that any observed datum falls in a given bin. The first line of this
```python
-probs = pm.Normal.dist(mu=mu, sigma=sigma).cdf(cutpoints)
+probs = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), cutpoints))
```
calculates the cumulative density at each of the cutpoints. The second line
```python
-probs = theano.tensor.concatenate([[0], probs, [1]])
+probs = pm.math.concatenate([[0], probs, [1]])
```
simply concatenates the cumulative density at $-\infty$ (which is zero) and at $\infty$ (which is 1).
The third line
```python
-probs = theano.tensor.extra_ops.diff(probs)
+probs = at.extra_ops.diff(probs)
```
calculates the difference between consecutive cumulative densities to give the actual probability of a datum falling in any given bin.
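As a sanity check, these three steps can be reproduced numerically outside of PyMC. Here is a minimal NumPy/SciPy sketch; the cutpoints and parameters below are made up for illustration and are not taken from the notebook:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical cutpoints and latent parameters, chosen only to illustrate the idea
cutpoints = np.array([-1.0, 0.0, 1.0])
mu, sigma = 0.0, 1.0

# Step 1: cumulative density at each cutpoint
cdf = norm(mu, sigma).cdf(cutpoints)
# Step 2: pad with the CDF at -inf (which is 0) and at +inf (which is 1)
cdf = np.concatenate([[0.0], cdf, [1.0]])
# Step 3: consecutive differences give the probability mass in each bin
probs = np.diff(cdf)
```

Three cutpoints define four bins, and the resulting bin probabilities always sum to 1, which is exactly what the `pm.Multinomial` likelihood requires of `p`.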
@@ -107,13 +109,17 @@ The approach was illustrated with a Gaussian distribution, and below we show a n
```{code-cell} ipython3
:tags: []

+import warnings
+
+import aesara.tensor as at
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
-import pymc3 as pm
+import pymc as pm
import seaborn as sns
-import theano.tensor as aet
+
+warnings.filterwarnings(action="ignore", category=UserWarning)
```

```{code-cell} ipython3
@@ -220,8 +226,8 @@ with pm.Model() as model1:
    sigma = pm.HalfNormal("sigma")
    mu = pm.Normal("mu")

-    probs1 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([[0], probs1, [1]]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([[0], probs1, [1]]))

    pm.Multinomial("counts1", p=probs1, n=c1.sum(), observed=c1.values)
```
@@ -233,7 +239,7 @@ pm.model_to_graphviz(model1)
:tags: []

with model1:
-    trace1 = pm.sample(return_inferencedata=True)
+    trace1 = pm.sample()
```

### Checks on model
@@ -246,8 +252,7 @@ we should be able to generate observations that look close to what we observed.
:tags: []

with model1:
-    ppc1 = pm.sample_posterior_predictive(trace1)
-    ppc = az.from_pymc3(posterior_predictive=ppc1)
+    ppc = pm.sample_posterior_predictive(trace1)
```

We can do this graphically.
@@ -326,16 +331,16 @@ with pm.Model() as model2:
    sigma = pm.HalfNormal("sigma")
    mu = pm.Normal("mu")

-    probs2 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([[0], probs2, [1]]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([[0], probs2, [1]]))
    pm.Multinomial("counts2", p=probs2, n=c2.sum(), observed=c2.values)
```

```{code-cell} ipython3
:tags: []

with model2:
-    trace2 = pm.sample(return_inferencedata=True)
+    trace2 = pm.sample()
```

```{code-cell} ipython3
@@ -352,11 +357,10 @@ Let's run a PPC check to ensure we are generating data that are similar to what
:tags: []

with model2:
-    ppc2 = pm.sample_posterior_predictive(trace2)
-    ppc = az.from_pymc3(posterior_predictive=ppc2)
+    ppc = pm.sample_posterior_predictive(trace2)
```

-Note that `ppc2` is not in xarray format. It is a dictionary where the keys are the parameters and the values are arrays of samples. So the line below looks at the mean bin posterior predictive bin counts, averaged over samples.
+We calculate the mean posterior predictive bin counts, averaged over samples.

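In PyMC v4, `sample_posterior_predictive` returns an `InferenceData` object whose draws sit in the `posterior_predictive` group, so averaging over samples is a reduction over the `chain` and `draw` dimensions. A self-contained NumPy sketch of that reduction, using simulated draws in place of the real `counts2` samples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for ppc.posterior_predictive["counts2"]: an array of multinomial
# draws with dims (chain, draw, bin). The values here are simulated, not
# taken from the model above.
counts = rng.multinomial(100, [0.2, 0.3, 0.5], size=(4, 500))

# Average over the posterior sample dimensions to get expected counts per bin
mean_counts = counts.mean(axis=(0, 1))
```

Because every multinomial draw totals the same number of trials, the averaged bin counts still sum to that trial count (100 here).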
```{code-cell} ipython3
:tags: []
@@ -422,12 +426,12 @@ with pm.Model() as model3:
    sigma = pm.HalfNormal("sigma")
    mu = pm.Normal("mu")

-    probs1 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
    probs1 = pm.Deterministic("normal1_cdf", probs1)

-    probs2 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
    probs2 = pm.Deterministic("normal2_cdf", probs2)

    pm.Multinomial("counts1", p=probs1, n=c1.sum(), observed=c1.values)
@@ -442,7 +446,7 @@ pm.model_to_graphviz(model3)
:tags: []

with model3:
-    trace3 = pm.sample(return_inferencedata=True)
+    trace3 = pm.sample()
```

```{code-cell} ipython3
@@ -453,8 +457,7 @@ az.plot_pair(trace3, var_names=["mu", "sigma"], divergences=True);

```{code-cell} ipython3
with model3:
-    ppc3 = pm.sample_posterior_predictive(trace3)
-    ppc = az.from_pymc3(posterior_predictive=ppc3)
+    ppc = pm.sample_posterior_predictive(trace3)
```

```{code-cell} ipython3
@@ -516,8 +519,8 @@ with pm.Model() as model4:
    sigma = pm.HalfNormal("sigma")
    mu = pm.Normal("mu")
    # study 1
-    probs1 = aet.exp(pm.Normal.dist(mu=mu, sigma=sigma).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu, sigma=sigma), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
    probs1 = pm.Deterministic("normal1_cdf", probs1)
    pm.Multinomial("counts1", p=probs1, n=c1.sum(), observed=c1.values)
    # study 2
@@ -530,15 +533,14 @@ pm.model_to_graphviz(model4)

```{code-cell} ipython3
with model4:
-    trace4 = pm.sample(return_inferencedata=True)
+    trace4 = pm.sample()
```

### Posterior predictive checks

```{code-cell} ipython3
with model4:
-    ppc4 = pm.sample_posterior_predictive(trace4)
-    ppc = az.from_pymc3(posterior_predictive=ppc4)
+    ppc = pm.sample_posterior_predictive(trace4)
```

```{code-cell} ipython3
@@ -556,14 +558,16 @@ ax[0].set_xticklabels([f"bin {n}" for n in range(len(c1))])
ax[0].set_title("Posterior predictive: Study 1")

# Study 2 ----------------------------------------------------------------
-ax[1].hist(ppc4["y"].flatten(), 50, density=True, alpha=0.5)
+ax[1].hist(ppc.posterior_predictive.y.values.flatten(), 50, density=True, alpha=0.5)
ax[1].set(title="Posterior predictive: Study 2", xlabel="$x$", ylabel="density");
```

We can calculate the mean and standard deviation of the posterior predictive distribution for study 2 and see that they are close to our true parameters.

```{code-cell} ipython3
-np.mean(ppc4["y"].flatten()), np.std(ppc4["y"].flatten())
+np.mean(ppc.posterior_predictive.y.values.flatten()), np.std(
+    ppc.posterior_predictive.y.values.flatten()
+)
```

### Recovering parameters
@@ -598,23 +602,23 @@ with pm.Model(coords=coords) as model5:
    mu_pop_mean = pm.Normal("mu_pop_mean", 0.0, 1.0)
    mu_pop_variance = pm.HalfNormal("mu_pop_variance", sigma=1)

-    BoundedNormal = pm.Bound(pm.Normal, lower=0.0)
-    sigma_pop_mean = BoundedNormal("sigma_pop_mean", mu=0, sigma=1)
+    sigma_pop_mean = pm.HalfNormal("sigma_pop_mean", sigma=1)
    sigma_pop_sigma = pm.HalfNormal("sigma_pop_sigma", sigma=1)

    # Study level priors
    mu = pm.Normal("mu", mu=mu_pop_mean, sigma=mu_pop_variance, dims="study")
-    # sigma = pm.HalfCauchy("sigma", beta=sigma_pop_sigma, dims='study')
-    sigma = BoundedNormal("sigma", mu=sigma_pop_mean, sigma=sigma_pop_sigma, dims="study")
+    sigma = pm.TruncatedNormal(
+        "sigma", mu=sigma_pop_mean, sigma=sigma_pop_sigma, lower=0, dims="study"
+    )

    # Study 1
-    probs1 = aet.exp(pm.Normal.dist(mu=mu[0], sigma=sigma[0]).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[0], sigma=sigma[0]), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
    probs1 = pm.Deterministic("normal1_cdf", probs1, dims="bin1")

    # Study 2
-    probs2 = aet.exp(pm.Normal.dist(mu=mu[1], sigma=sigma[1]).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[1], sigma=sigma[1]), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
    probs2 = pm.Deterministic("normal2_cdf", probs2, dims="bin2")

    # Likelihood
@@ -641,13 +645,13 @@ with pm.Model(coords=coords) as model5:
    sigma = pm.Gamma("sigma", alpha=2, beta=1, dims="study")

    # Study 1
-    probs1 = aet.exp(pm.Normal.dist(mu=mu[0], sigma=sigma[0]).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[0], sigma=sigma[0]), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
    probs1 = pm.Deterministic("normal1_cdf", probs1, dims="bin1")

    # Study 2
-    probs2 = aet.exp(pm.Normal.dist(mu=mu[1], sigma=sigma[1]).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[1], sigma=sigma[1]), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
    probs2 = pm.Deterministic("normal2_cdf", probs2, dims="bin2")

    # Likelihood
@@ -661,7 +665,7 @@ pm.model_to_graphviz(model5)

```{code-cell} ipython3
with model5:
-    trace5 = pm.sample(tune=2000, target_accept=0.99, return_inferencedata=True)
+    trace5 = pm.sample(tune=2000, target_accept=0.99)
```

We can see that despite our efforts, we still get some divergences. Plotting the samples and highlighting the divergences suggests (from the top left subplot) that our model is suffering from the funnel problem
@@ -676,8 +680,7 @@ az.plot_pair(

```{code-cell} ipython3
with model5:
-    ppc5 = pm.sample_posterior_predictive(trace5)
-    ppc = az.from_pymc3(posterior_predictive=ppc5)
+    ppc = pm.sample_posterior_predictive(trace5)
```

```{code-cell} ipython3
@@ -745,13 +748,13 @@ with pm.Model(coords=coords) as model5:
    sigma = pm.HalfNormal("sigma", dims='study')

    # Study 1
-    probs1 = aet.exp(pm.Normal.dist(mu=mu[0], sigma=sigma[0]).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[0], sigma=sigma[0]), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
    probs1 = pm.Deterministic("normal1_cdf", probs1, dims='bin1')

    # Study 2
-    probs2 = aet.exp(pm.Normal.dist(mu=mu[1], sigma=sigma[1]).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Normal.dist(mu=mu[1], sigma=sigma[1]), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
    probs2 = pm.Deterministic("normal2_cdf", probs2, dims='bin2')

    # Likelihood
@@ -803,8 +806,8 @@ true_mu, true_beta = 20, 4
BMI = pm.Gumbel.dist(mu=true_mu, beta=true_beta)

# Generate two different sets of random samples from the same Gumbel distribution.
-x1 = BMI.random(size=800)
-x2 = BMI.random(size=1200)
+x1 = pm.draw(BMI, 800)
+x2 = pm.draw(BMI, 1200)

# Calculate bin counts
c1 = data_to_bincounts(x1, d1)
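`data_to_bincounts` is a helper defined earlier in the notebook; its body is not shown in this diff. Conceptually it assigns each sample to one of the bins defined by the cutpoints and tallies. A minimal NumPy stand-in (hypothetical, not the notebook's exact implementation) can sketch the idea:

```python
import numpy as np

def bincounts(samples, cutpoints):
    # np.digitize maps each sample to a bin index 0..len(cutpoints),
    # so len(cutpoints) cutpoints yield len(cutpoints) + 1 bins.
    bin_idx = np.digitize(samples, cutpoints)
    return np.bincount(bin_idx, minlength=len(cutpoints) + 1)

rng = np.random.default_rng(0)
# Simulated Gumbel samples, matching true_mu=20, true_beta=4 from the notebook
x = rng.gumbel(loc=20, scale=4, size=800)
counts = bincounts(x, np.array([12, 16, 20, 24, 28]))
```

Every sample lands in exactly one bin, so the counts always total the sample size (800 here), which is what the multinomial likelihood's `n` argument expects.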
@@ -843,7 +846,7 @@ ax[1, 0].set(xlim=(0, 50), xlabel="BMI", ylabel="observed frequency", title="Sam

### Model specification

-This is a variation of Example 3 above. The only changes are: 
+This is a variation of Example 3 above. The only changes are:
- update the probability distribution to match our target (the Gumbel distribution)
- ensure we specify priors for our target distribution, appropriate given our domain knowledge.
@@ -852,12 +855,12 @@ with pm.Model() as model6:
    mu = pm.Normal("mu", 20, 5)
    beta = pm.HalfNormal("beta", 10)

-    probs1 = aet.exp(pm.Gumbel.dist(mu=mu, beta=beta).logcdf(d1))
-    probs1 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs1, np.array([1])]))
+    probs1 = pm.math.exp(pm.logcdf(pm.Gumbel.dist(mu=mu, beta=beta), d1))
+    probs1 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs1, np.array([1])]))
    probs1 = pm.Deterministic("gumbel_cdf1", probs1)

-    probs2 = aet.exp(pm.Gumbel.dist(mu=mu, beta=beta).logcdf(d2))
-    probs2 = aet.extra_ops.diff(aet.concatenate([np.array([0]), probs2, np.array([1])]))
+    probs2 = pm.math.exp(pm.logcdf(pm.Gumbel.dist(mu=mu, beta=beta), d2))
+    probs2 = at.extra_ops.diff(pm.math.concatenate([np.array([0]), probs2, np.array([1])]))
    probs2 = pm.Deterministic("gumbel_cdf2", probs2)

    pm.Multinomial("counts1", p=probs1, n=c1.sum(), observed=c1.values)
@@ -870,15 +873,14 @@ pm.model_to_graphviz(model6)

```{code-cell} ipython3
with model6:
-    trace6 = pm.sample(return_inferencedata=True)
+    trace6 = pm.sample()
```

### Posterior predictive checks

```{code-cell} ipython3
with model6:
-    ppc6 = pm.sample_posterior_predictive(trace6)
-    ppc = az.from_pymc3(posterior_predictive=ppc6)
+    ppc = pm.sample_posterior_predictive(trace6)
```

```{code-cell} ipython3
@@ -935,19 +937,16 @@ We have presented a range of different examples here which makes clear that the

## Authors
* Authored by [Eric Ma](https://github.com/ericmjl) and [Benjamin T. Vincent](https://github.com/drbenvincent) in September, 2021 ([pymc-examples #229](https://github.com/pymc-devs/pymc-examples/pull/229))
+* Updated to run in PyMC v4 by Fernando Irarrazaval in June 2022 ([pymc-examples #366](https://github.com/pymc-devs/pymc-examples/pull/366))

+++

## Watermark

```{code-cell} ipython3
%load_ext watermark
-%watermark -n -u -v -iv -w -p theano,xarray
+%watermark -n -u -v -iv -w -p aeppl,xarray
```

:::{include} ../page_footer.md
:::
-
-```{code-cell} ipython3
-
-```