examples/howto/blackbox_external_likelihood_numpy.myst.md (+33 −33)
@@ -5,9 +5,9 @@ jupytext:
     format_name: myst
     format_version: 0.13
 kernelspec:
-  display_name: Python 3 (ipykernel)
+  display_name: pymc-examples
   language: python
-  name: python3
+  name: pymc-examples
 ---
 
 (blackbox_external_likelihood_numpy)=
@@ -25,8 +25,6 @@ There is a {ref}`related example <wrapping_jax_function>` that discusses how to
 
 ```{code-cell} ipython3
 import arviz as az
-import IPython
-import matplotlib
 import matplotlib.pyplot as plt
 import numpy as np
 import pymc as pm
@@ -45,9 +43,9 @@ az.style.use("arviz-darkgrid")
 ```
 
 ## Introduction
-PyMC is a great tool for doing Bayesian inference and parameter estimation. It has a load of {doc}`in-built probability distributions <pymc:api/distributions>` that you can use to set up priors and likelihood functions for your particular model. You can even create your own {class}`Custom Distribution <pymc.CustomDist>`.
+PyMC is a great tool for doing Bayesian inference and parameter estimation. It has many {doc}`in-built probability distributions <pymc:api/distributions>` that you can use to set up priors and likelihood functions for your particular model. You can even create your own {class}`Custom Distribution <pymc.CustomDist>` with a custom logp defined by PyTensor operations or automatically inferred from the generative graph.
 
-However, this is not necessarily that simple if you have a model function, or probability distribution, that, for example, relies on an external code that you have little/no control over (and may even be, for example, wrapped `C` code rather than Python). This can be problematic when you need to pass parameters as PyMC distributions to these external functions; your external function probably wants you to pass it floating point numbers rather than PyMC distributions!
+Despite all these "batteries included", you may still find yourself dealing with a model function or probability distribution that relies on complex external code that you cannot avoid using. This code is unlikely to work with the kind of abstract PyTensor variables that PyMC uses: {ref}`pymc:pymc_pytensor`.
 
 ```python
 import pymc as pm
@@ -66,7 +64,7 @@ Another issue is that if you want to be able to use the gradient-based step samp
 
 Defining a model/likelihood that PyMC can use and that calls your "black box" function is possible, but it relies on creating a custom PyTensor Op. This is, hopefully, a clear description of how to do this, including one way of writing a gradient function that could be generally applicable.
 
-In the examples below, we create a very simple model and log-likelihood function in numpy.
+In the examples below, we create a very simple linear model and log-likelihood function in numpy.
 
 ```{code-cell} ipython3
 def my_model(m, c, x):
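For orientation, here is a hedged plain-NumPy sketch of the toy linear model and Gaussian log-likelihood the hunk above starts to define (the notebook's exact definitions and constants may differ):

```python
import numpy as np


# Assumed toy model: a straight line y = m*x + c
def my_model(m, c, x):
    return m * x + c


# Assumed log-likelihood: independent Gaussian noise of scale sigma,
# returned elementwise (one log-density per data point)
def my_loglike(m, c, sigma, x, data):
    model = my_model(m, c, x)
    return -0.5 * ((data - model) / sigma) ** 2 - np.log(np.sqrt(2 * np.pi) * sigma)


x = np.linspace(0.0, 9.0, 10)
data = my_model(0.4, 3.0, x)  # noise-free data for a quick sanity check
total = my_loglike(0.4, 3.0, 1.0, x, data).sum()
print(total)
```

With noise-free data the residuals vanish, so the summed log-likelihood reduces to `-n * log(sqrt(2 * pi))`, a handy sanity check before wrapping the function in an Op.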
@@ -105,19 +103,17 @@ with pm.Model():
     trace = pm.sample(1000)
 ```
 
-But, this will give an error like:
+But, this will likely give an error when the black-box function does not accept PyTensor tensor objects as inputs.
 
-```
-ValueError: setting an array element with a sequence.
-```
-
-This is because `m` and `c` are PyTensor tensor-type objects.
+So, what we actually need to do is create a {ref}`PyTensor Op <pytensor:creating_an_op>`. This will be a new class that wraps our log-likelihood function while obeying the PyTensor API contract. We will do this below, initially without defining a {func}`grad` for the Op.
 
-So, what we actually need to do is create a {ref}`PyTensor Op <pytensor:creating_an_op>`. This will be a new class that wraps our log-likelihood function (or just our model function, if that is all that is required) into something that can take in PyTensor tensor objects, but internally can cast them as floating point values that can be passed to our log-likelihood function. We will do this below, initially without defining a {func}`grad` for the Op.
+:::{tip}
+Depending on your application you may only need to wrap a custom log-likelihood or a subset of the whole model (such as a function that computes an infinite series summation using an advanced library like mpmath), which can then be chained with other PyMC distributions and PyTensor operations to define your whole model. There is a trade-off here: usually the more you leave out of a black-box, the more you may benefit from PyTensor rewrites and optimizations. We suggest you always try to define the whole model in PyMC and PyTensor, and only use black-boxes where strictly necessary.
+:::
 
 +++
 
-## PyTensor Op without grad
+## PyTensor Op without gradients
 
 +++
@@ -128,7 +124,7 @@ So, what we actually need to do is create a {ref}`PyTensor Op <pytensor:creating
 
 
 class LogLike(Op):
-    def make_node(self, m, c, sigma, x, data):
+    def make_node(self, m, c, sigma, x, data) -> Apply:
         # Convert inputs to tensor variables
         m = pt.as_tensor(m)
         c = pt.as_tensor(c)
@@ -143,10 +139,10 @@ class LogLike(Op):
         # outputs = [pt.vector()]
         outputs = [data.type()]
 
-        # Apply is an object that combins inputs, outputs and an Op (self)
+        # Apply is an object that combines inputs, outputs and an Op (self)
@@ … @@
 What if we wanted to use NUTS or HMC? If we knew the analytical derivatives of the model/likelihood function then we could add a {func}`grad` to the Op using existing PyTensor operations.
 
 But, what if we don't know the analytical form? If our model/likelihood is implemented in a framework that provides automatic differentiation (just like PyTensor does), it's possible to reuse their functionality. This {ref}`related example <wrapping_jax_function>` shows how to do this when working with JAX functions.
 
-But, if our model/likelihood truly is a "black box" then we can just use the good-old-fashioned [finite difference](https://en.wikipedia.org/wiki/Finite_difference) to find the gradients - this can be slow, especially if there are a large number of variables, or the model takes a long time to evaluate. We use the handy SciPy {func}`~scipy.optimize.approx_fprime` function to achieve this.
+If our model/likelihood truly is a "black box" then we can try to use approximation methods like [finite difference](https://en.wikipedia.org/wiki/Finite_difference) to find the gradients. We illustrate this approach with the handy SciPy {func}`~scipy.optimize.approx_fprime` function.
+
+:::{caution}
+Finite differences are rarely recommended as a way to compute gradients. They can be too slow or unstable for practical uses. We suggest you use them only as a last resort.
+:::
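As a hedged, self-contained illustration of the finite-difference approach (the toy `loglike_sum`, seed, and data below are assumptions, not the notebook's exact definitions):

```python
import numpy as np
from scipy.optimize import approx_fprime


# Assumed toy scalar log-likelihood over theta = (m, c, sigma)
def loglike_sum(theta, x, data):
    m, c, sigma = theta
    model = m * x + c
    return np.sum(-0.5 * ((data - model) / sigma) ** 2 - np.log(np.sqrt(2 * np.pi) * sigma))


x = np.linspace(0.0, 9.0, 10)
rng = np.random.default_rng(716743)
data = 0.4 * x + 3.0 + rng.normal(size=10)  # noisy toy observations

theta = np.array([0.4, 3.0, 1.0])
# Forward-difference approximation of d(loglike)/d(m, c, sigma);
# extra positional args are forwarded to loglike_sum
grad = approx_fprime(theta, loglike_sum, 1e-8, x, data)
print(grad)
```

Note that each call to `approx_fprime` re-evaluates the function once per parameter, which is exactly why finite differences become expensive when the model is slow or has many parameters.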
@@ … @@
 So, now we can just redefine our Op with a `grad()` method, right?
 
-It's not quite so simple! The `grad()` method itself requires that its inputs are PyTensor tensor variables, whereas our `gradients` function above, like our `my_loglike` function, wants a list of floating point values. So, we need to define another Op that calculates the gradients. Below, I define a new version of the `LogLike` Op, called `LogLikeWithGrad` this time, that has a `grad()` method. This is followed by anothor Op called `LogLikeGrad` that, when called with a vector of PyTensor tensor variables, returns another vector of values that are the gradients (i.e., the [Jacobian](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant)) of our log-likelihood function at those values. Note that the `grad()` method itself does not return the gradients directly, but instead returns the [Jacobian](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant)-vector product (you can hopefully just copy what I've done and not worry about what this means too much!).
+It's not quite so simple! The `grad()` method itself requires that its inputs are PyTensor tensor variables, whereas our `gradients` function above, like our `my_loglike` function, wants a list of floating point values. So, we need to define another Op that calculates the gradients. Below, I define a new version of the `LogLike` Op, called `LogLikeWithGrad` this time, that has a `grad()` method. This is followed by another Op called `LogLikeGrad` that, when called with a vector of PyTensor tensor variables, returns another vector of values that are the gradients (i.e., the [Jacobian](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant)) of our log-likelihood function at those values. Note that the `grad()` method itself does not return the gradients directly, but instead returns the [Jacobian-vector product](https://en.wikipedia.org/wiki/Pushforward_(differential)).
 
 ```{code-cell} ipython3
 # define a pytensor Op for our likelihood function
 
 
 class LogLikeWithGrad(Op):
-    def make_node(self, m, c, sigma, x, data):
+    def make_node(self, m, c, sigma, x, data) -> Apply:
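The Jacobian-vector product contract can be checked numerically. Below is a hedged sketch (toy `my_loglike` and data assumed, not the notebook's exact code) showing that contracting the output gradient `g` with the full Jacobian gives the same result as differentiating the `g`-weighted scalar directly, which is what `grad()` is asked to return:

```python
import numpy as np
from scipy.optimize import approx_fprime


# Assumed toy vector-valued log-likelihood (one entry per data point)
def my_loglike(m, c, sigma, x, data):
    model = m * x + c
    return -0.5 * ((data - model) / sigma) ** 2 - np.log(np.sqrt(2 * np.pi) * sigma)


x = np.linspace(0.0, 9.0, 10)
data = 0.4 * x + 3.0 + 0.1 * np.sin(x)  # deterministic pseudo-noise
theta = np.array([0.45, 2.9, 1.0])
g = np.ones_like(data)  # output gradient ("cotangent") supplied by PyTensor


# What grad() must return: the contraction of g with the Jacobian,
# obtained here by differentiating the scalar g @ my_loglike(...) directly
def contracted(p):
    return g @ my_loglike(p[0], p[1], p[2], x, data)


jvp = approx_fprime(theta, contracted, 1e-8)

# Same quantity via the full (10 x 3) Jacobian, row i = gradient of output i
J = np.stack(
    [
        approx_fprime(theta, lambda p, i=i: my_loglike(p[0], p[1], p[2], x, data)[i], 1e-8)
        for i in range(data.size)
    ]
)
print(np.allclose(jvp, g @ J, atol=1e-4))
```

Returning the contraction directly avoids ever materializing the full Jacobian, which is the point of the `grad()` design.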