
Commit 3cbcb57

Revision pymc-devs#2 ODE Notebook changes based on comments
1 parent 5eb8605 commit 3cbcb57

File tree

2 files changed: +965, -934 lines


examples/ode_models/ODE_Lotka_Volterra_multiple_ways.ipynb (+912, -880)

Large diffs are not rendered by default.

examples/ode_models/ODE_Lotka_Volterra_multiple_ways.myst.md (+53, -54)
@@ -4,7 +4,6 @@ jupytext:
     extension: .md
     format_name: myst
     format_version: 0.13
-    jupytext_version: 1.13.7
 kernelspec:
   display_name: Python 3 (ipykernel)
   language: python
@@ -16,24 +15,23 @@ kernelspec:
 (ODE_Lotka_Volterra_fit_multiple_ways)=
 # ODE Lotka-Volterra With Bayesian Inference in Multiple Ways
 
-:::{post} December 27, 2022
-:tags: ODE, aesara op, gradient-free inference, aesara scan
-:category: intermediate
-:type: how-to
-:author: Adapted by Greg Brunkhorst from multiple pymc3 example notebooks by Sanmitra Ghosh, Demetri Pananos, and the PyMC Team
+:::{post} January 10, 2023
+:tags: ODE, pytensor op, gradient-free inference, pytensor scan
+:category: intermediate, how-to
+:author: [Greg Brunkhorst](https://github.com/gbrunkhorst)
 :::
 
 ```{code-cell} ipython3
-import aesara
-import aesara.tensor as at
 import arviz as az
 import matplotlib.pyplot as plt
 import numpy as np
 import pandas as pd
 import pymc as pm
+import pytensor
+import pytensor.tensor as pt
 
-from aesara.compile.ops import as_op
 from pymc.ode import DifferentialEquation
+from pytensor.compile.ops import as_op
 from scipy.integrate import odeint
 from scipy.optimize import least_squares
 
@@ -57,7 +55,7 @@ The purpose of this notebook is to demonstrate how to perform Bayesian inference
 * Specification
 * Least Squares Solution
 * Gradient-free Bayesian Inference
-    * Wrap `odeint` in an Aesara operator for use in PyMC
+    * Wrapping `odeint` in a Pytensor operator for use in PyMC
     * Bayesian inference using gradient-free methods
         * Slice Sampler
         * DEMetropolisZ Sampler
@@ -66,21 +64,23 @@ The purpose of this notebook is to demonstrate how to perform Bayesian inference
         * Sequential Monte Carlo (SMC) Sampler
 * Bayesian Inference with Gradients
     * `pymc.ode.DifferentialEquation` specification with the NUTs Sampler
-    * Looping in PyMC with `aesara.scan` and the NUTs Sampler
+    * Looping in PyMC with `pytensor.scan` and the NUTs Sampler
 ### Key Conclusions
-* Based on the experiments in this notebook, the most simple and efficient method for performing Bayesian inference on the Lotka-Volterra equations was to specify the ODE system in Scipy, wrap the function as an Aesara op, and use a Differential Evolution Metropolis (DEMetropolis) sampler in PyMC.
+* Based on the experiments in this notebook, the simplest and most efficient method for performing Bayesian inference on the Lotka-Volterra equations was to specify the ODE system in Scipy, wrap the function as a Pytensor op, and use a Differential Evolution Metropolis (DEMetropolis) sampler in PyMC.
 
 +++ {"tags": []}
 
 ## Background
 ### Motivation
-Ordinary differential equation models (ODEs) are used in a variety of science and engineering domains to model the time evolution of physical variables. A natural choice to estimate the values and uncertainty of model parameters given experimental data is Bayesian inference. However, ODEs can be challenging to specify and solve in the Bayesian setting, therefore, this notebook steps through multiple methods for solving an ODE inference problem using PyMC. The Lotka-Volterra model used in this example has often been used for benchmarking Bayesian inference methods (e.g., in [Stan](https://mc-stan.org/users/documentation/case-studies/lotka-volterra-predator-prey.html)).
+Ordinary differential equation models (ODEs) are used in a variety of science and engineering domains to model the time evolution of physical variables. A natural choice to estimate the values and uncertainty of model parameters given experimental data is Bayesian inference. However, ODEs can be challenging to specify and solve in the Bayesian setting; therefore, this notebook steps through multiple methods for solving an ODE inference problem using PyMC. The Lotka-Volterra model used in this example has often been used for benchmarking Bayesian inference methods (e.g., in [Stan](https://mc-stan.org/users/documentation/case-studies/lotka-volterra-predator-prey.html)). See also Richard McElreath's discussion of this model in [Statistical Rethinking](http://xcelab.net/rm/statistical-rethinking/), Chapter 16 of the Second Edition.
+
++++ {"tags": []}
 
 ### Lotka-Volterra Predator-Prey Model
 The Lotka-Volterra model describes the interaction between a predator and prey species. This ODE is given by:
 $$
 \begin{aligned}
-\frac{d x}{dt} &=\alpha x -\beta xy \\
+\frac{d x}{dt} &=\alpha x -\beta xy \\
 \frac{d y}{dt} &=-\gamma y + \delta xy
 \end{aligned}
 $$
@@ -127,7 +127,7 @@ def plot_data(ax, lw=2, title="Hudson's Bay Company Data"):
 ```
 
 ```{code-cell} ipython3
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_data(ax);
 ```
 
@@ -166,6 +166,7 @@ def plot_model(
     lw=3,
     title="Hudson's Bay Company Data and\nExample Model Run",
 ):
+
     ax.plot(time, x_y[:, 1], color="b", alpha=alpha, lw=lw, label="Lynx (Model)")
     ax.plot(time, x_y[:, 0], color="g", alpha=alpha, lw=lw, label="Hare (Model)")
     ax.legend(fontsize=14, loc="center left", bbox_to_anchor=(1, 0.5))
@@ -182,7 +183,7 @@ time = np.arange(1900, 1921, 0.01)
 x_y = odeint(func=rhs, y0=theta[-2:], t=time, args=(theta,))
 
 # plot
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_data(ax, lw=0)
 plot_model(ax, x_y);
 ```
@@ -225,7 +226,7 @@ Plot
 time = np.arange(1900, 1921, 0.01)
 theta = results.x
 x_y = odeint(func=rhs, y0=theta[-2:], t=time, args=(theta,))
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_data(ax, lw=0)
 plot_model(ax, x_y, title="Least Squares Solution");
 ```
@@ -238,17 +239,17 @@ Looks right. If we didn't care about uncertainty, then we would be done. But w
 
 +++
 
-Like other Numpy or Scipy-based functions, the `scipy.integrate.odeint` function cannot be used directly in a PyMC model because PyMC needs to know the variable input and output types to compile. Therefore, we use an Aesara wrapper to give the variable types to PyMC. Then the function can be used in PyMC in conjunction with gradient-free samplers.
+Like other Numpy or Scipy-based functions, the `scipy.integrate.odeint` function cannot be used directly in a PyMC model because PyMC needs to know the variable input and output types to compile. Therefore, we use a Pytensor wrapper to give the variable types to PyMC. Then the function can be used in PyMC in conjunction with gradient-free samplers.
 
 +++
 
-### Convert Python Function to an Aesara Operator using @as_op decorator
-We tell PyMC the input variable types and the output variable types using the `@as_op` decorator. `odeint` returns Numpy arrays, but we tell PyMC that they are Aesara double float tensors for this purpose.
+### Convert Python Function to a Pytensor Operator using @as_op decorator
+We tell PyMC the input variable types and the output variable types using the `@as_op` decorator. `odeint` returns Numpy arrays, but we tell PyMC that they are Pytensor double float tensors for this purpose.
 
 ```{code-cell} ipython3
-# decorator with input and output types as aesara double float tensors
-@as_op(itypes=[at.dvector], otypes=[at.dmatrix])
-def aesara_forward_model_matrix(theta):
+# decorator with input and output types as Pytensor double float tensors
+@as_op(itypes=[pt.dvector], otypes=[pt.dmatrix])
+def pytensor_forward_model_matrix(theta):
     return odeint(func=rhs, y0=theta[-2:], t=data.year, args=(theta,))
 ```
 
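As a quick sanity check (not part of this diff), the wrapped op can be called on a symbolic vector to confirm it returns the declared output type:

```python
# Hedged sketch, not in the commit: the wrapped op consumes a symbolic dvector
# and emits a symbolic dmatrix, so it can sit inside a PyMC/Pytensor graph.
theta_sym = pt.dvector("theta")  # matches itypes=[pt.dvector]
ode_out = pytensor_forward_model_matrix(theta_sym)
print(ode_out.type)  # a float64 matrix type, per otypes=[pt.dmatrix]
```
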
@@ -273,7 +274,9 @@ with pm.Model() as model:
     sigma = pm.TruncatedNormal("sigma", mu=10, sigma=10, lower=0)
 
     # Ode solution function
-    ode_solution = aesara_forward_model_matrix(pm.math.stack(alpha, beta, gamma, delta, xt0, yt0))
+    ode_solution = pytensor_forward_model_matrix(
+        pm.math.stack([alpha, beta, gamma, delta, xt0, yt0])
+    )
 
     # Likelihood
     pm.Normal("Y_obs", mu=ode_solution, sigma=sigma, observed=data[["hare", "lynx"]].values)
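
For context (not part of this diff): because an `@as_op`-wrapped function exposes no gradients, a model like this is sampled with an explicitly chosen gradient-free step method. A hypothetical usage sketch, with illustrative tuning values:

```python
# Illustrative sketch: gradient-free sampling of the model above with
# DEMetropolisZ, one of the samplers compared in this notebook.
with model:
    trace = pm.sample(step=[pm.DEMetropolisZ()], tune=2000, draws=2000, cores=1)
```
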
@@ -326,7 +329,7 @@ Having good gradient free samplers can open up the models that can be fit within
 
 Let's give them a shot.
 
-A few notes on running these inferences. For each sampler, the number of tuning steps and draws have been reduced to run the inference in a reasonable amount of time (on the order of minutes). This is not a sufficient number of draws to get a good inferences, in some cases, but it works for demonstration purposes. In addition, multicore processing was not working for the aesara op function on all machines, so inference is performed on one core.
+A few notes on running these inferences. For each sampler, the number of tuning steps and draws has been reduced to run the inference in a reasonable amount of time (on the order of minutes). In some cases this is not a sufficient number of draws to get a good inference, but it works for demonstration purposes. In addition, multicore processing was not working for the Pytensor op function on all machines, so inference is performed on one core.
 
 +++
 
@@ -355,7 +358,7 @@ plt.suptitle(f"Trace Plot {sampler}");
 ```
 
 ```{code-cell} ipython3
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_inference(ax, trace, title=f"Data and Inference Model Runs\n{sampler} Sampler");
 ```
 
@@ -427,7 +430,7 @@ plt.suptitle(f"Trace Plot {sampler} - Burn-in Removed");
 ```
 
 ```{code-cell} ipython3
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_inference(ax, trace, title=f"Data and Inference Model Runs\n{sampler} Sampler");
 ```
 
@@ -453,7 +456,7 @@ plt.suptitle(f"Trace Plot {sampler}");
 ```
 
 ```{code-cell} ipython3
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_inference(ax, trace, title=f"Data and Inference Model Runs\n{sampler} Sampler");
 ```
 
@@ -487,7 +490,7 @@ plt.suptitle(f"Trace Plot {sampler}");
 ```
 
 ```{code-cell} ipython3
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_inference(ax, trace, title=f"Data and Inference Model Runs\n{sampler} Sampler");
 ```
 
@@ -500,7 +503,7 @@ At this number of samples and tuning scheme, the SMC algorithm results in wider
 
 +++
 
-As outlined in the SMC tutorial on PyMC.io, the SMC sampler is often combined with a `pm.Simulator` function rather than an aesara op. Here is a rewrite of the PyMC - odeint model for the SMC sampler.
+As outlined in the SMC tutorial on PyMC.io, the SMC sampler is often combined with a `pm.Simulator` function rather than a Pytensor op. Here is a rewrite of the PyMC `odeint` model for the SMC sampler.
 
 The simulator function needs to have the correct signature (e.g., accept an rng argument first).
 
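For context, "correct signature" means that `pm.Simulator` calls the forward model with a random generator first, the parameters next, and a `size` keyword last. The simulator cell itself is unchanged by this commit and therefore absent from the diff; a minimal sketch under that assumption, with illustrative names:

```python
# Hedged sketch of a pm.Simulator-compatible forward model.
# `rhs` and `data` are defined earlier in the notebook; the function name
# and parameter order here are illustrative, not from the commit.
def simulator_forward_model(rng, alpha, beta, gamma, delta, xt0, yt0, size=None):
    theta = (alpha, beta, gamma, delta, xt0, yt0)
    return odeint(func=rhs, y0=theta[-2:], t=data.year, args=(theta,))
```
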
@@ -550,7 +553,7 @@ plt.suptitle(f"Trace Plot {sampler}");
 ```
 
 ```{code-cell} ipython3
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_inference(ax, trace, title=f"Data and Inference Model Runs\n{sampler} Sampler");
 ```
 
@@ -596,7 +599,7 @@ plt.suptitle(f"Trace Plot {sampler}");
 ```
 
 ```{code-cell} ipython3
-fig, ax = plt.subplots(figsize=(9, 4))
+fig, ax = plt.subplots(figsize=(7, 4))
 plot_inference(ax, trace, title=f"Data and Inference Model Runs\n{sampler} Sampler");
 ```
 
@@ -621,7 +624,7 @@ The major observation here is that the posterior shape is pretty difficult for a
 
 +++
 
-The PyMC default NUTs sampler can only be used if gradients are supplied to the sampler. In this section, we will solve the system of ODEs within PyMC in two different ways that supply the sampler with gradients. The first is the built-in `pymc.ode.DifferentialEquation` solver, and the second is to forward simulate using `aesara.scan`, which allows looping. Note that there may be other better and faster ways to perform Bayesian inference with ODEs using gradients, such as the [sunode](https://sunode.readthedocs.io/en/latest/index.html) project.
+The PyMC default NUTs sampler can only be used if gradients are supplied to the sampler. In this section, we will solve the system of ODEs within PyMC in two different ways that supply the sampler with gradients. The first is the built-in `pymc.ode.DifferentialEquation` solver, and the second is to forward simulate using `pytensor.scan`, which allows looping. Note that there may be better and faster ways to perform Bayesian inference with ODEs using gradients, such as the [sunode](https://sunode.readthedocs.io/en/latest/index.html) project and [diffrax](https://www.pymc-labs.io/blog-posts/jax-functions-in-pymc-3-quick-examples/), which relies on JAX.
 
 +++
 
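For context, the built-in solver named above is specified roughly as follows, assuming the standard `pymc.ode.DifferentialEquation` API (`func(y, t, p)` plus `times`, `n_states`, and `n_theta`); the actual specification cell is unchanged by this commit:

```python
# Hedged sketch of a pymc.ode.DifferentialEquation specification.
def rhs_pymc(y, t, p):
    dx_dt = p[0] * y[0] - p[1] * y[0] * y[1]   # prey:     alpha*x - beta*x*y
    dy_dt = -p[2] * y[1] + p[3] * y[0] * y[1]  # predator: -gamma*y + delta*x*y
    return [dx_dt, dy_dt]

ode_model = DifferentialEquation(
    func=rhs_pymc,
    times=data.year - data.year.min(),  # solver times, shifted to start at 0
    n_states=2,                         # hare and lynx
    n_theta=4,                          # alpha, beta, gamma, delta
)
```
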
@@ -683,7 +686,7 @@ with pm.Model() as model:
 
 ```{code-cell} ipython3
 sampler = "NUTs PyMC ODE"
-tune = draws = 15
+tune = draws = 25
 with model:
     trace_pymc_ode = pm.sample(tune=tune, draws=draws)
 ```
@@ -708,11 +711,11 @@ Despite a paucity of samples, the NUTs sampler is starting to converge to the co
 
 +++
 
-### Simulate with Aesara Scan
+### Simulate with Pytensor Scan
 
 +++
 
-Finally, we can write the system of ODEs as a forward simulation solver within PyMC. The way to write for-loops in PyMC is with `aesara.scan`. Gradients are then supplied to the sampler via autodifferentiation.
+Finally, we can write the system of ODEs as a forward simulation solver within PyMC. The way to write for-loops in PyMC is with `pytensor.scan`. Gradients are then supplied to the sampler via autodifferentiation.
 
 First, we should test that the time steps are sufficiently small to get a reasonable estimate.
 
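For context (not part of this diff), a minimal standalone sketch of how `pytensor.scan` threads state through a loop via `outputs_info`:

```python
# Minimal pytensor.scan sketch: five doubling steps.
x0 = pt.dscalar("x0")
sequence, updates = pytensor.scan(
    fn=lambda x_prev: 2.0 * x_prev,  # one loop iteration
    outputs_info=[x0],               # initial state fed to the first step
    n_steps=5,
)
double_fn = pytensor.function([x0], sequence)
print(double_fn(1.0))  # [ 2.  4.  8. 16. 32.]
```
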
@@ -722,7 +725,7 @@ First, we should test that the time steps are sufficiently small to get a reason
 
 +++
 
-Create a function that accepts different numbers of time steps for testing. The function also demonstrates how `aesara.scan` is used.
+Create a function that accepts different numbers of time steps for testing. The function also demonstrates how `pytensor.scan` is used.
 
 ```{code-cell} ipython3
 # Lotka-Volterra forward simulation model using scan
@@ -750,9 +753,9 @@ def lv_scan_simulation_model(theta, steps_year=100, years=21):
         y_new = y + (-gamma * y + delta * x * y) * dt
         return x_new, y_new
 
-    # Aesara scan looping function
+    # Pytensor scan looping function
     ## The function argument names are not intuitive in this context!
-    result, updates = aesara.scan(
+    result, updates = pytensor.scan(
        fn=ode_update_function,  # function
        outputs_info=[xt0, yt0],  # initial conditions
        non_sequences=[alpha, beta, gamma, delta],  # parameters
@@ -821,8 +824,8 @@ def lv_scan_inference_model(theta, steps_year=100, years=21):
         y_new = y + (-gamma * y + delta * x * y) * dt
         return x_new, y_new
 
-    # Aesara scan is a looping function
-    result, updates = aesara.scan(
+    # Pytensor scan is a looping function
+    result, updates = pytensor.scan(
        fn=ode_update_function,  # function
        outputs_info=[xt0, yt0],  # initial conditions
        non_sequences=[alpha, beta, gamma, delta],  # parameters
@@ -844,14 +847,14 @@ This is also quite slow, so we will just pull a few samples for demonstration pu
 ```{code-cell} ipython3
 steps_year = 100
 model = lv_scan_inference_model(theta, steps_year=steps_year)
-sampler = "NUTs Aesara Scan"
+sampler = "NUTs Pytensor Scan"
 tune = draws = 50
 with model:
-    trace_aesara_scan = pm.sample(tune=tune, draws=draws)
+    trace_scan = pm.sample(tune=tune, draws=draws)
 ```
 
 ```{code-cell} ipython3
-trace = trace_aesara_scan
+trace = trace_scan
 az.summary(trace)
 ```
 
@@ -894,7 +897,7 @@ inference_results = [
     trace_SMC_e1,
     trace_SMC_e10,
     trace_pymc_ode,
-    trace_aesara_scan,
+    trace_scan,
 ]
 model_names = [
     "Slice Sampler",
@@ -905,7 +908,7 @@ model_names = [
     "SMC e=1",
     "SMC e=10",
     "PyMC ODE NUTs",
-    "Aesara Scan NUTs",
+    "Pytensor Scan NUTs",
 ]
 
 # Loop through variable names
@@ -940,25 +943,21 @@ If we ran the samplers for long enough to get good inferences, we would expect t
 
 ### Key Conclusions
 We performed Bayesian inference on a system of ODEs in 4 main ways:
-* Scipy `odeint` wrapped in an aesara `op` and sampled with non-gradient-based samplers (comparing 5 different samplers).
+* Scipy `odeint` wrapped in a Pytensor `op` and sampled with non-gradient-based samplers (comparing 5 different samplers).
 * Scipy `odeint` wrapped in a `pm.Simulator` function and sampled with a non-likelihood-based sequential Monte Carlo (SMC) sampler.
 * PyMC `ode.DifferentialEquation` sampled with NUTs.
-* Forward simulation using `aesara.scan` and sampled with NUTs.
+* Forward simulation using `pytensor.scan` and sampled with NUTs.
 
-The "winner" for this problem was the Scipy `odeint` solver with a differential evolution (DE) Metropolis sampler. The improved efficiency of the NUTs sampler did not make up for the inefficiency in using the slow ODE solvers with gradients. Sticking with Scipy and DEMetropolis is also the simplest workflow for a scientist with a working numeric model and the desire to perform Bayesian inference. Just wrapping the numeric model in an aesara op and plugging it into a PyMC model can get you a long way!
+The "winner" for this problem was the Scipy `odeint` solver with a differential evolution (DE) Metropolis sampler. The improved efficiency of the NUTs sampler did not make up for the inefficiency of using the slow ODE solvers with gradients. Sticking with Scipy and DEMetropolis is also the simplest workflow for a scientist with a working numeric model and the desire to perform Bayesian inference. Just wrapping the numeric model in a Pytensor op and plugging it into a PyMC model can get you a long way!
 
 +++
 
 ## Authors
-Adapted by Greg Brunkhorst from multiple PyMC.io example notebooks by Sanmitra Ghosh, Demetri Pananos, and the PyMC Team.
+Organized and rewritten by Greg Brunkhorst from multiple PyMC.io example notebooks by Sanmitra Ghosh, Demetri Pananos, and the PyMC Team.
 
 ```{code-cell} ipython3
 %watermark -n -u -v -iv -w
 ```
 
 :::{include} ../page_footer.md
 :::
-
-```{code-cell} ipython3
-
-```
