Commit d1788f0

[Hierarchical BVAR pymc-devs#286] fixed issue with hierarchical model fit. Fixed some typos too.
Signed-off-by: Nathaniel <[email protected]>
1 parent a5979d4 commit d1788f0

2 files changed (+6568 −6532 lines)


examples/time_series/bayesian_var_model.ipynb (+6,539 −6,515)
Large diffs are not rendered by default.

myst_nbs/time_series/bayesian_var_model.myst.md (+29 −17)
@@ -21,6 +21,8 @@ kernelspec:
 :::
 
 ```{code-cell} ipython3
+import pickle
+
 import aesara as at
 import arviz as az
 import matplotlib.dates as mdates
@@ -41,15 +43,15 @@ az.style.use("arviz-darkgrid")
 %config InlineBackend.figure_format = 'retina'
 ```
 
-# Bayesian V(ector)A(uto)R(egression) Models
+## V(ector)A(uto)R(egression) Models
 
-In this notebook we will outline an application of the Bayesian Vector Autoregressive Modelling. We will draw on the work in the PYMC Labs blogpost. This will be a three part series. In the first we want to show how to fit Bayesian VAR models in PYMC. In the second we will show how to extract extra insight from the fitted model with Impulse Response analysis and make forecasts from the fitted VAR model. In the third and final post we will show in some more the benefits of using hierarchical priors with Bayesian VAR models. Specifically we'll outline how and why there are actually a range of carefully formulated industry standard priors which work with Bayesian VAR modelling.
+In this notebook we will outline an application of the Bayesian Vector Autoregressive Modelling. We will draw on the work in the PYMC Labs blogpost. This will be a three part series. In the first we want to show how to fit Bayesian VAR models in PYMC. In the second we will show how to extract extra insight from the fitted model with Impulse Response analysis and make forecasts from the fitted VAR model. In the third and final post we will show in some more detail the benefits of using hierarchical priors with Bayesian VAR models. Specifically, we'll outline how and why there are actually a range of carefully formulated industry standard priors which work with Bayesian VAR modelling.
 
-In this post we will (i) demonstrate the basic pattern on a simple VAR model on fake data and show how the model recovers the true data generating parameters and (ii) we will show an example applied to macro-economic data and compare the results to those achieved on the same data with statsmodels MLE fits and (iii) show an example of estimating a hierarchical bayesian VAR model over two countries.
+In this post we will (i) demonstrate the basic pattern on a simple VAR model on fake data and show how the model recovers the true data generating parameters and (ii) we will show an example applied to macro-economic data and compare the results to those achieved on the same data with statsmodels MLE fits and (iii) show an example of estimating a hierarchical bayesian VAR model over a number of countries.
 
 ## Autoregressive Models in General
 
-The idea of a simple autoregressive model is to capture the manner in which past observations of the timeseries are predictive of the current observation. So in traditional if we model this as a linear phenomena we get simple autoregressive models where the current value is predicted by a weighted linear combination of the past values and an error term.
+The idea of a simple autoregressive model is to capture the manner in which past observations of the timeseries are predictive of the current observation. So in traditional fashion, if we model this as a linear phenomena we get simple autoregressive models where the current value is predicted by a weighted linear combination of the past values and an error term.
 
 $$ y_t = \alpha + \beta_{y0} \cdot y_{t-1} + \beta_{y1} \cdot y_{t-2} ... + \epsilon $$

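The AR equation in the context lines above is easy to make concrete. As a quick illustration (not part of the commit; the coefficients are invented and chosen to satisfy the AR(2) stationarity conditions), simulating such a series takes a few lines:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative AR(2): y_t = alpha + b1 * y_{t-1} + b2 * y_{t-2} + noise
alpha, b1, b2 = 8.0, 0.7, -0.3  # invented values satisfying the AR(2) stationarity conditions
y = np.zeros(200)
for t in range(2, len(y)):
    y[t] = alpha + b1 * y[t - 1] + b2 * y[t - 2] + rng.normal(0, 1.0)
```
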
@@ -64,7 +66,7 @@ where the As are coefficient matrices to be combined with the past values of eac
 $$ \begin{bmatrix} gdp \\ inv \\ con \end{bmatrix}_{T} = \nu + A_{1}\begin{bmatrix} gdp \\ inv \\ con \end{bmatrix}_{T-1} +
 A_{2}\begin{bmatrix} gdp \\ inv \\ con \end{bmatrix}_{T-2} ... A_{p}\begin{bmatrix} gdp \\ inv \\ con \end{bmatrix}_{T-p} + \mathbf{e}_{t} $$
 
-This structure is compact representation using matrix notation. The thing we are trying to estimate when we fit a VAR model is the A matrix that determines the manner of the linear combination that best fits our timeseries. Such timeseries models can have an auto-regressive or a moving average representation, and the details matter for some of the implication of a VAR model fit.
+This structure is compact representation using matrix notation. The thing we are trying to estimate when we fit a VAR model is the A matrices that determine the nature of the linear combination that best fits our timeseries data. Such timeseries models can have an auto-regressive or a moving average representation, and the details matter for some of the implication of a VAR model fit.
 
 We'll see in the next notebook of the series how the moving-average representation of a VAR lends itself to the interpretation of the covariance structure in our model as representing a kind of impulse-response relationship between the component timeseries.

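To unpack the matrix notation: each A is a k x k matrix for k component series, and a draw at time T is a sum of matrix-vector products plus noise. A sketch with invented coefficient matrices (`nu`, `A1`, `A2` and the `var_step` helper are illustrative stand-ins, not the notebook's variables):

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented coefficient matrices for a 3-variable (gdp, inv, con) VAR(2)
nu = np.array([0.1, 0.05, 0.02])  # intercept vector
A1 = np.array([[0.4, 0.1, 0.0], [0.2, 0.3, 0.1], [0.1, 0.0, 0.3]])
A2 = np.array([[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])


def var_step(y_lag1, y_lag2):
    """One draw of y_T = nu + A1 @ y_{T-1} + A2 @ y_{T-2} + e_T."""
    return nu + A1 @ y_lag1 + A2 @ y_lag2 + rng.normal(0, 0.1, size=3)


ys = [np.zeros(3), np.zeros(3)]
for _ in range(100):
    ys.append(var_step(ys[-1], ys[-2]))
```
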
@@ -75,7 +77,7 @@ The matrix notation is convenient to suggest the broad patterns of the model, bu
 $$ gdp_{t} = \beta_{gdp1} \cdot gdp_{t-1} + \beta_{gdp2} \cdot gdp_{t-2} + \beta_{cons1} \cdot cons_{t-1} + \beta_{cons2} \cdot cons_{t-2} + \epsilon_{gdp}$$
 $$ cons_{t} = \beta_{cons1} \cdot cons_{t-1} + \beta_{cons2} \cdot cons_{t-2} + \beta_{gdp1} \cdot gdp_{t-1} + \beta_{gdp2} \cdot gdp_{t-2} + \epsilon_{cons}$$
 
-In this manner we can see that if we can estimate the $\beta$ terms we have an estimate for the bi-directional effects of each variable on the other. This is a useful feature of the modelling. In what follows i should stress that i'm not an economist and I'm aiming to show only the functionality of these models not give you a decisive opinion about the economic relationships determining Irish GDP figures.
+In this way we can see that if we can estimate the $\beta$ terms we have an estimate for the bi-directional effects of each variable on the other. This is a useful feature of the modelling. In what follows i should stress that i'm not an economist and I'm aiming to show only the functionality of these models not give you a decisive opinion about the economic relationships determining Irish GDP figures.
 
 ### Creating some Fake Data

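The notebook benchmarks its Bayesian fits against statsmodels MLE estimates, and that baseline is worth seeing alongside the equations above. A minimal sketch (the random dataframe is merely a stand-in for the notebook's fake data):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Stand-in two-series data in place of the "Creating some Fake Data" section
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(200, 2)), columns=["x", "y"])

# Maximum-likelihood VAR(2) fit as a frequentist baseline
result = VAR(df).fit(2)
print(result.params)  # constants plus lag-1 and lag-2 coefficients per equation
```
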
@@ -310,18 +312,18 @@ def shade_background(ppc, ax, idx, palette="cividis"):
 
 
 def plot_ppc(idata, df, group="posterior_predictive"):
-    fig, axs = plt.subplots(2, 1, figsize=(20, 10))
+    fig, axs = plt.subplots(2, 1, figsize=(20, 15))
     df = pd.DataFrame(idata_fake_data["observed_data"]["obs"].data, columns=["x", "y"])
     axs = axs.flatten()
     ppc = az.extract_dataset(idata, group=group, num_samples=100)["obs"]
     # Minus the lagged terms and the constant
-    shade_background(ppc, axs, 0)
+    shade_background(ppc, axs, 0, "viridis")
     axs[0].plot(np.arange(ppc.shape[0]), ppc[:, 0, :].mean(axis=1), color="cyan", label="Mean")
-    axs[0].plot(df["x"], "o", color="black", markersize=4, label="Observed")
+    axs[0].plot(df["x"], "o", color="black", markersize=6, label="Observed")
     axs[0].set_title("VAR Series 1")
     axs[0].legend()
-    shade_background(ppc, axs, 1)
-    axs[1].plot(df["y"], "o", color="black", markersize=4, label="Observed")
+    shade_background(ppc, axs, 1, "viridis")
+    axs[1].plot(df["y"], "o", color="black", markersize=6, label="Observed")
     axs[1].plot(np.arange(ppc.shape[0]), ppc[:, 1, :].mean(axis=1), color="cyan", label="Mean")
     axs[1].set_title("VAR Series 2")
     axs[1].legend()
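
The hunk header shows `shade_background`'s signature (`ppc, ax, idx, palette="cividis"`), but its definition sits outside this diff. For orientation, here is a plausible reconstruction, an assumption rather than the notebook's actual code, shading nested percentile bands from the posterior predictive draws:

```python
import matplotlib.pyplot as plt
import numpy as np


def shade_background(ppc, ax, idx, palette="cividis"):
    """Shade nested credible bands for series `idx`; `ppc` has shape (time, series, sample)."""
    cmap = plt.get_cmap(palette)
    percs = np.linspace(51, 99, 20)
    for i, p in enumerate(percs):
        upper = np.percentile(ppc[:, idx, :], p, axis=1)
        lower = np.percentile(ppc[:, idx, :], 100 - p, axis=1)
        # Wider intervals are drawn in colors further along the palette
        ax[idx].fill_between(
            np.arange(ppc.shape[0]),
            lower,
            upper,
            color=cmap(0.5 + i / (2 * len(percs))),
            alpha=0.1,
        )
```
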
@@ -399,11 +401,11 @@ def plot_ppc_macro(idata, df, group="posterior_predictive"):
 
     shade_background(ppc, axs, 0)
     axs[0].plot(np.arange(ppc.shape[0]), ppc[:, 0, :].mean(axis=1), color="cyan", label="Mean")
-    axs[0].plot(df["dl_gdp"], "o", color="black", markersize=4, label="Observed")
+    axs[0].plot(df["dl_gdp"], "o", color="black", markersize=6, label="Observed")
     axs[0].set_title("Differenced and Logged GDP")
     axs[0].legend()
     shade_background(ppc, axs, 1)
-    axs[1].plot(df["dl_cons"], "o", color="black", markersize=4, label="Observed")
+    axs[1].plot(df["dl_cons"], "o", color="black", markersize=6, label="Observed")
     axs[1].plot(np.arange(ppc.shape[0]), ppc[:, 1, :].mean(axis=1), color="cyan", label="Mean")
     axs[1].set_title("Differenced and Logged Consumption")
     axs[1].legend()
@@ -491,9 +493,11 @@ def make_hierarchical_model(n_lags, n_eqs, df, group_field, mv_norm=True, prior_
             noise_chol, _, _ = pm.LKJCholeskyCov(
                 f"noise_chol_{grp}", eta=10, n=n, sd_dist=pm.Exponential.dist(1)
             )
-            omega = pm.Deterministic(f"omega_{grp}", rho * omega_global + 1 - rho * noise_chol)
+            omega = pm.Deterministic(
+                f"omega_{grp}", rho * omega_global + (1 - rho) * noise_chol
+            )
             obs = pm.MvNormal(
-                f"obs_{grp}", mu=mean, chol=omega_global, observed=df_grp.values[n_lags:]
+                f"obs_{grp}", mu=mean, chol=omega, observed=df_grp.values[n_lags:]
             )
         else:
             sigma = pm.HalfNormal(f"noise_{grp}", sigma=1.0, dims=["equations"])
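
This hunk carries the substance of the commit. Operator precedence meant `rho * omega_global + 1 - rho * noise_chol` evaluated as `(rho * omega_global) + 1 - (rho * noise_chol)`, adding 1 to every entry instead of forming a convex combination of the global and group-level Cholesky factors, and the likelihood then bypassed the pooled term entirely by passing `chol=omega_global`. A quick numpy check of the precedence difference (stand-in matrices, purely illustrative):

```python
import numpy as np

rho = 0.3
omega_global = np.array([[1.0, 0.0], [0.5, 1.0]])  # stand-in global Cholesky factor
noise_chol = np.array([[2.0, 0.0], [0.2, 2.0]])    # stand-in group-level factor

buggy = rho * omega_global + 1 - rho * noise_chol    # adds 1 to every entry
fixed = rho * omega_global + (1 - rho) * noise_chol  # intended partial pooling

print(buggy)  # no longer lower-triangular: the upper-right entry becomes 1.0
print(fixed)  # stays lower-triangular, a weighted blend of the two factors
```
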
@@ -602,7 +606,15 @@ for ax, country in zip(axs, countries):
 plt.suptitle("Posterior Predictive Checks on Hierarchical VAR", fontsize=20);
 ```
 
-Here we can see that the model appears to have recovered reasonable enough posterior predictions for the observed data.
+Here we can see that the model appears to have recovered reasonable enough posterior predictions for the observed data. We'll save these model fits to be used in the next post in the series.
+
+```{code-cell} ipython3
+with open("../data/hierarchical_var_fit.pickle", "wb") as handle:
+    pickle.dump(idata_full_test, handle, protocol=pickle.HIGHEST_PROTOCOL)
+
+with open("../data/ireland_var_fit.pickle", "wb") as handle:
+    pickle.dump(idata_ireland, handle, protocol=pickle.HIGHEST_PROTOCOL)
+```
 
 ## Conclusion
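
The code-cell added above writes the fits to disk; the follow-up post can read them back the same way. A sketch mirroring the paths written above:

```python
import pickle

with open("../data/hierarchical_var_fit.pickle", "rb") as handle:
    idata_full_test = pickle.load(handle)

with open("../data/ireland_var_fit.pickle", "rb") as handle:
    idata_ireland = pickle.load(handle)
```

ArviZ's `idata.to_netcdf()` / `az.from_netcdf()` would be a more portable alternative to pickle for `InferenceData` objects.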

@@ -611,7 +623,7 @@ VAR modelling is a rich an interesting area of research within economics and the
 +++
 
 ## Authors
-* Adapted from the PYMC labs [Blog post](https://www.pymc-labs.io/blog-posts/bayesian-vector-autoregression/) and Jim Savage's discussion [here](https://rpubs.com/jimsavage/hierarchical_var) by [Nathaniel Forde](https://nathanielf.github.io/) in October 2022, 2021 ([pymc-examples#456](https://github.com/pymc-devs/pymc-examples/pull/456))
+* Adapted from the PYMC labs [Blog post](https://www.pymc-labs.io/blog-posts/bayesian-vector-autoregression/) and Jim Savage's discussion [here](https://rpubs.com/jimsavage/hierarchical_var) by [Nathaniel Forde](https://nathanielf.github.io/) in November 2022 ([pymc-examples#456](https://github.com/pymc-devs/pymc-examples/pull/456))
 
 ```{code-cell} ipython3
 %load_ext watermark