 _referenced in docs/source/notebooks/table_of_contents_tutorials.js

 =================================
-Advanced usage of Theano in PyMC3
+Advanced usage of Aesara in PyMC3
 =================================

 Using shared variables
 ======================

-Shared variables allow us to use values in theano functions that are
+Shared variables allow us to use values in aesara functions that are
 not considered an input to the function, but can still be changed
 later. They are very similar to global variables in many ways::

-    a = tt.scalar('a')
+    a = aet.scalar('a')
     # Create a new shared variable with initial value of 0.1
-    b = theano.shared(0.1)
-    func = theano.function([a], a * b)
+    b = aesara.shared(0.1)
+    func = aesara.function([a], a * b)

     assert func(2.) == 0.2

     b.set_value(10.)
@@ -34,7 +34,7 @@ be time consuming if the number of datasets is large)::
     true_mu = [np.random.randn() for _ in range(10)]
     observed_data = [mu + np.random.randn(20) for mu in true_mu]

-    data = theano.shared(observed_data[0])
+    data = aesara.shared(observed_data[0])
     with pm.Model() as model:
         mu = pm.Normal('mu', 0, 10)
         pm.Normal('y', mu=mu, sigma=1, observed=data)
@@ -55,7 +55,7 @@ variable for our observations::
     x = np.random.randn(100)
     y = x > 0

-    x_shared = theano.shared(x)
+    x_shared = aesara.shared(x)

     with pm.Model() as model:
         coeff = pm.Normal('x', mu=0, sigma=1)
@@ -74,10 +74,10 @@ not possible to change the shape of a shared variable if that would
 also change the shape of one of the variables.


-Writing custom Theano Ops
+Writing custom Aesara Ops
 =========================

-While Theano includes a wide range of operations, there are cases where
+While Aesara includes a wide range of operations, there are cases where
 it makes sense to write your own. But before doing this it is a good
 idea to think hard if it is actually necessary. Especially if you want
 to use algorithms that need gradient information — this includes NUTS and
@@ -87,22 +87,22 @@ debugging skills for the gradients.

 Good reasons for defining a custom Op might be the following:

-- You require an operation that is not available in Theano and can't
-  be build up out of existing Theano operations. This could for example
+- You require an operation that is not available in Aesara and can't
+  be built up out of existing Aesara operations. This could for example
   include models where you need to solve differential equations or
   integrals, or find a root or minimum of a function that depends
   on your parameters.
 - You want to connect your PyMC3 model to some existing external code.
 - After carefully considering different parametrizations and a lot
   of profiling your model is still too slow, but you know of a faster
-  way to compute the gradient than what theano is doing. This faster
+  way to compute the gradient than what aesara is doing. This faster
   way might be anything from clever maths to using more hardware.
   There is nothing stopping anyone from using a cluster via MPI in
   a custom node, if a part of the gradient computation is slow enough
   and sufficiently parallelizable to make the cost worth it.
   We would definitely like to hear about any such examples.

-Theano has extensive `documentation, <http://deeplearning.net/software/theano/extending/index.html>`_
+Aesara has extensive `documentation <https://aesara.readthedocs.io/en/latest/extending/index.html>`_
 about how to write new Ops.

@@ -158,7 +158,7 @@ We can now use `scipy.optimize.newton` to find the root::
     def mu_from_theta(theta):
         return optimize.newton(func, 1, fprime=jac, args=(theta,))

-We could wrap `mu_from_theta` with `theano.compile.ops.as_op` and use gradient-free
+We could wrap `mu_from_theta` with `aesara.compile.ops.as_op` and use gradient-free
 methods like Metropolis, but to get NUTS and ADVI working, we also
 need to define the derivative of `mu_from_theta`. We can find this
 derivative using the implicit function theorem, or equivalently we
@@ -181,16 +181,16 @@ We get
     \frac{d}{d\theta}\mu(\theta)
         = - \frac{\mu(\theta)^2}{1 + \theta\mu(\theta) + e^{-\theta\mu(\theta)}}

-Now, we use this to define a theano op, that also computes the gradient::
+Now, we use this to define an aesara op that also computes the gradient::

-    import theano
-    import theano.tensor as tt
-    import theano.tests.unittest_tools
-    from theano.graph.op import Op
+    import aesara
+    import aesara.tensor as aet
+    import aesara.tests.unittest_tools
+    from aesara.graph.op import Op

     class MuFromTheta(Op):
-        itypes = [tt.dscalar]
-        otypes = [tt.dscalar]
+        itypes = [aet.dscalar]
+        otypes = [aet.dscalar]

         def perform(self, node, inputs, outputs):
             theta, = inputs
@@ -201,23 +201,23 @@ Now, we use this to define a theano op, that also computes the gradient::
             theta, = inputs
             mu = self(theta)
             thetamu = theta * mu
-            return [- g[0] * mu ** 2 / (1 + thetamu + tt.exp(-thetamu))]
+            return [- g[0] * mu ** 2 / (1 + thetamu + aet.exp(-thetamu))]

 If you value your sanity, always check that the gradient is ok::

-    theano.tests.unittest_tools.verify_grad(MuFromTheta(), [np.array(0.2)])
-    theano.tests.unittest_tools.verify_grad(MuFromTheta(), [np.array(1e-5)])
-    theano.tests.unittest_tools.verify_grad(MuFromTheta(), [np.array(1e5)])
+    aesara.gradient.verify_grad(MuFromTheta(), [np.array(0.2)])
+    aesara.gradient.verify_grad(MuFromTheta(), [np.array(1e-5)])
+    aesara.gradient.verify_grad(MuFromTheta(), [np.array(1e5)])

 We can now define our model using this new op::

     import pymc3 as pm

-    tt_mu_from_theta = MuFromTheta()
+    aet_mu_from_theta = MuFromTheta()

     with pm.Model() as model:
         theta = pm.HalfNormal('theta', sigma=1)
-        mu = pm.Deterministic('mu', tt_mu_from_theta(theta))
+        mu = pm.Deterministic('mu', aet_mu_from_theta(theta))
         pm.Normal('y', mu=mu, sigma=0.1, observed=[0.2, 0.21, 0.3])

     trace = pm.sample()
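The implicit-function-theorem gradient used above can also be sanity-checked without Aesara, by comparing it against a central finite difference using only NumPy and SciPy. This sketch assumes the root-finding problem from the surrounding document, `mu + mu * exp(theta * mu) = 1`, which is the residual consistent with the gradient formula shown:

```python
import numpy as np
from scipy import optimize

def func(mu, theta):
    # Residual whose root defines mu(theta): mu + mu*exp(theta*mu) - 1 = 0
    return mu + mu * np.exp(theta * mu) - 1

def jac(mu, theta):
    # d(func)/d(mu), used by Newton's method
    return 1 + np.exp(theta * mu) * (1 + theta * mu)

def mu_from_theta(theta):
    return optimize.newton(func, 1, fprime=jac, args=(theta,))

def dmu_dtheta(theta):
    # Analytic gradient from the implicit function theorem,
    # same formula as in the Op's grad method
    mu = mu_from_theta(theta)
    return -mu**2 / (1 + theta * mu + np.exp(-theta * mu))

theta0 = 0.5
eps = 1e-6
fd = (mu_from_theta(theta0 + eps) - mu_from_theta(theta0 - eps)) / (2 * eps)
assert np.isclose(fd, dmu_dtheta(theta0), rtol=1e-4)
```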