@@ -5,14 +5,18 @@ jupytext:
    format_name: myst
    format_version: 0.13
kernelspec:
-   display_name: 'Python 3.8.2 64-bit (''pymc3'': conda)'
+   display_name: pie
  language: python
-   name: python38264bitpymc3conda8b8223a2f9874eff9bd3e12d36ed2ca2
+   name: python3
---

- +++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}
-
- # Analysis of An $AR(1)$ Model in PyMC3
+ (AR)=
+ # Analysis of An $AR(1)$ Model in PyMC
+ :::{post} Jan 7, 2023
+ :tags: time series, autoregressive
+ :category: intermediate
+ :author: Ed Herbst, Chris Fonnesbeck
+ :::

```{code-cell} ipython3
---
@@ -23,7 +27,7 @@ slideshow:
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
- import pymc3 as pm
+ import pymc as pm
```

```{code-cell} ipython3
@@ -35,14 +39,14 @@ az.style.use("arviz-darkgrid")

+++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}

- Consider the following AR(1) process, initialized in the
+ Consider the following AR(2) process, initialized in the
infinite past:
$$
- y_t = \theta y_{t-1} + \epsilon_t,
+ y_t = \rho_0 + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \epsilon_t,
$$
- where $\epsilon_t \overset{iid}{\sim} {\cal N}(0,1)$. Suppose you'd like to learn about $\theta$ from a a sample of observations $Y^T = \{ y_0, y_1, \ldots, y_T \}$.
+ where $\epsilon_t \overset{iid}{\sim} {\cal N}(0,1)$. Suppose you'd like to learn about $\rho$ from a sample of observations $Y^T = \{ y_0, y_1, \ldots, y_T \}$.

- First, let's generate some synthetic sample data. We simulate the 'infinite past' by generating 10,000 samples from an AR(1) process and then discarding the first 5,000:
+ First, let's generate some synthetic sample data. We simulate the 'infinite past' by generating 10,000 samples from an AR(2) process and then discarding the first 5,000:

```{code-cell} ipython3
---
@@ -51,17 +55,18 @@ slideshow:
  slide_type: '-'
---
T = 10000
- y = np.zeros((T,))

# true stationarity:
- true_theta = 0.95
+ true_rho = 0.7, -0.3
# true standard deviation of the innovation:
true_sigma = 2.0
# true process mean:
- true_center = 0.0
+ true_center = -1.0

- for t in range(1, T):
-     y[t] = true_theta * y[t - 1] + np.random.normal(loc=true_center, scale=true_sigma)
+ y = np.random.normal(loc=true_center, scale=true_sigma, size=T)
+ y[1] += true_rho[0] * y[0]
+ for t in range(2, T):
+     y[t] += true_rho[0] * y[t - 1] + true_rho[1] * y[t - 2]

y = y[-5000:]
plt.plot(y, alpha=0.8)
@@ -71,7 +76,9 @@ plt.ylabel("$y$");

+++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}

- This generative process is quite straight forward to implement in PyMC3:
+ Let's start by trying to fit the wrong model! Assume that we do not know the generative model, and simply fit an AR(1) model.
+
+ This generative process is quite straightforward to implement in PyMC. Since we wish to include an intercept term in the AR process, we must set `constant=True`; otherwise PyMC will assume that we want an AR(2) process when `rho` is of size 2. Also, by default a $N(0, 100)$ distribution is used as the prior for the initial value. We can override this by passing a distribution (not a full random variable) to the `init_dist` argument.
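
As a quick illustration of that rule (a standalone sketch with placeholder coefficients, not part of the model fit below), the same length-2 `rho` is read differently depending on `constant`:

```{code-cell} ipython3
# Illustration only: standalone AR distribution objects with placeholder values.
# With constant=True, rho = (intercept, rho_1) defines an AR(1) with a constant term.
ar1_with_const = pm.AR.dist(
    rho=[-1.0, 0.7], sigma=2.0, constant=True, init_dist=pm.Normal.dist(0, 10), steps=50
)
# With constant=False, the same-sized rho = (rho_1, rho_2) defines a zero-mean AR(2).
ar2_no_const = pm.AR.dist(
    rho=[0.7, -0.3], sigma=2.0, constant=False, init_dist=pm.Normal.dist(0, 10), steps=50
)
```
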
```{code-cell} ipython3
---
@@ -81,41 +88,39 @@ slideshow:
---
with pm.Model() as ar1:
    # assumes 95% of prob mass is between -2 and 2
-     theta = pm.Normal("theta", 0.0, 1.0)
+     rho = pm.Normal("rho", mu=0.0, sigma=1.0, shape=2)
    # precision of the innovation term
-     tau = pm.Exponential("tau", 0.5)
-     # process mean
-     center = pm.Normal("center", mu=0.0, sigma=1.0)
+     tau = pm.Exponential("tau", lam=0.5)

-     likelihood = pm.AR1("y", k=theta, tau_e=tau, observed=y - center)
+     likelihood = pm.AR(
+         "y", rho=rho, tau=tau, constant=True, init_dist=pm.Normal.dist(0, 10), observed=y
+     )

-     trace = pm.sample(1000, tune=2000, init="advi+adapt_diag", random_seed=RANDOM_SEED)
-     idata = az.from_pymc3(trace)
+     idata = pm.sample(1000, tune=2000, random_seed=RANDOM_SEED)
```

- We can see that even though the sample data did not start at zero, the true center of zero is rightly inferred by the model, as you can see in the trace plot below. Likewise, the model captured the true values of the autocorrelation parameter $\theta$ and the innovation term $\epsilon_t$ (`tau` in the model) -- 0.95 and 1, respectively.
+ We can see that even though we assumed the wrong model, the parameter estimates are actually not that far from the true values.
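
For a quick numerical check (a minimal sketch using ArviZ's standard posterior summary, nothing model-specific), we can compare the posterior means against the values used in the simulation:

```{code-cell} ipython3
# Compare the posterior means of "rho" with (true_center, true_rho[0])
# and of "tau" with true_sigma**-2.
az.summary(idata, var_names=["rho", "tau"])
```
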

```{code-cell} ipython3
az.plot_trace(
    idata,
    lines=[
-         ("theta", {}, true_theta),
+         ("rho", {}, [true_center, true_rho[0]]),
        ("tau", {}, true_sigma**-2),
-         ("center", {}, true_center),
    ],
);
```

+++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}

## Extension to AR(p)
- We can instead estimate an AR(2) model using PyMC3.
+ Now let's fit the correct underlying model, an AR(2):

$$
- y_t = \theta_1 y_{t-1} + \theta_2 y_{t-2} + \epsilon_t.
+ y_t = \rho_0 + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \epsilon_t.
$$

- The `AR` distribution infers the order of the process thanks to the size the of `rho` argmument passed to `AR`.
+ The `AR` distribution infers the order of the process from the size of the `rho` argument passed to `AR` (the constant term counts toward this size when `constant=True`).

We will also use the standard deviation of the innovations (rather than the precision) to parameterize the distribution.
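
For reference, the two parameterizations are related by the usual convention, so the `true_sigma**-2` reference value used for `tau` above is simply the implied precision:

$$
\tau = \frac{1}{\sigma^2}.
$$
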
@@ -126,61 +131,85 @@ slideshow:
  slide_type: '-'
---
with pm.Model() as ar2:
-     theta = pm.Normal("theta", 0.0, 1.0, shape=2)
+     rho = pm.Normal("rho", 0.0, 1.0, shape=3)
    sigma = pm.HalfNormal("sigma", 3)
-     likelihood = pm.AR("y", theta, sigma=sigma, observed=y)
+     likelihood = pm.AR(
+         "y", rho=rho, sigma=sigma, constant=True, init_dist=pm.Normal.dist(0, 10), observed=y
+     )

-     trace = pm.sample(
+     idata = pm.sample(
        1000,
        tune=2000,
        random_seed=RANDOM_SEED,
    )
-     idata = az.from_pymc3(trace)
```

+ The posterior plots show that we have correctly inferred the generative model parameters.
+

```{code-cell} ipython3
az.plot_trace(
    idata,
    lines=[
-         ("theta", {"theta_dim_0": 0}, true_theta),
-         ("theta", {"theta_dim_0": 1}, 0.0),
+         ("rho", {}, (true_center,) + true_rho),
        ("sigma", {}, true_sigma),
    ],
);
```

+++ {"ein.tags": ["worksheet-0"], "slideshow": {"slide_type": "-"}}

- You can also pass the set of AR parameters as a list.
+ You can also pass the set of AR parameters as a list, which is useful when they are not identically distributed.

```{code-cell} ipython3
---
ein.tags: [worksheet-0]
slideshow:
  slide_type: '-'
---
+ import pytensor.tensor as pt
+
with pm.Model() as ar2_bis:
-     beta0 = pm.Normal("theta0", mu=0.0, sigma=1.0)
-     beta1 = pm.Uniform("theta1", -1, 1)
+     rho0 = pm.Normal("rho0", mu=0.0, sigma=5.0)
+     rho1 = pm.Uniform("rho1", -1, 1)
+     rho2 = pm.Uniform("rho2", -1, 1)
    sigma = pm.HalfNormal("sigma", 3)
-     likelhood = pm.AR("y", [beta0, beta1], sigma=sigma, observed=y)
+     likelihood = pm.AR(
+         "y",
+         rho=pt.stack([rho0, rho1, rho2]),
+         sigma=sigma,
+         constant=True,
+         init_dist=pm.Normal.dist(0, 10),
+         observed=y,
+     )

-     trace = pm.sample(
+     idata = pm.sample(
        1000,
        tune=2000,
+         target_accept=0.9,
        random_seed=RANDOM_SEED,
    )
-     idata = az.from_pymc3(trace)
```

```{code-cell} ipython3
az.plot_trace(
    idata,
-     lines=[("theta0", {}, true_theta), ("theta1", {}, 0.0), ("sigma", {}, true_sigma)],
+     lines=[
+         ("rho0", {}, true_center),
+         ("rho1", {}, true_rho[0]),
+         ("rho2", {}, true_rho[1]),
+         ("sigma", {}, true_sigma),
+     ],
);
```

+ ## Authors
+ * Authored by Ed Herbst in August, 2016 ([pymc#1546](https://github.com/pymc-devs/pymc/pull/2285))
+ * Updated by Chris Fonnesbeck in January, 2023 ([pymc-examples#493](https://github.com/pymc-devs/pymc-examples/pull/494))
+

```{code-cell} ipython3
%load_ext watermark
- %watermark -n -u -v -iv -w
+ %watermark -n -u -v -iv -w -p pytensor,aeppl,xarray
```
+
+ :::{include} ../page_footer.md
+ :::