@@ -270,17 +270,17 @@ We start with a bivariate normal distribution pinned down by
$$
\mu=\left[\begin{array}{c}
- 0 \\
- 0
+ .5 \\
+ 1.0
\end{array}\right],\quad\Sigma=\left[\begin{array}{cc}
1 & .5\\
- .5 & 2
+ .5 & 1
\end{array}\right]
$$

```{code-cell} python3
- μ = np.array([0., 0.])
- Σ = np.array([[1., .5], [.5, 2.]])
+ μ = np.array([.5, 1.])
+ Σ = np.array([[1., .5], [.5, 1.]])

# construction of the multivariate normal instance
multi_normal = MultivariateNormal(μ, Σ)
@@ -291,17 +291,139 @@ k = 1 # choose partition
# partition and compute regression coefficients
multi_normal.partition(k)
- multi_normal.βs[0]
+ multi_normal.βs[0], multi_normal.βs[1]
```
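As a quick cross-check, the same slopes can be read directly off $\Sigma$: the population regression slope of $z_1$ on $z_2$ is $\operatorname{Cov}(z_1, z_2)/\operatorname{Var}(z_2)$, and vice versa. The sketch below assumes, as the $k=1$ partition suggests, that `βs[0]` and `βs[1]` hold the coefficients for regressing $z_1$ on $z_2$ and $z_2$ on $z_1$, respectively.

```{code-cell} python3
# slopes implied directly by the covariance matrix
b_z1_on_z2 = Σ[0, 1] / Σ[1, 1]   # Cov(z1, z2) / Var(z2)
b_z2_on_z1 = Σ[1, 0] / Σ[0, 0]   # Cov(z1, z2) / Var(z1)
b_z1_on_z2, b_z2_on_z1
```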
+ Let's illustrate the fact that you _can regress anything on anything else_.

- To illustrate the idea that you _can regress anything on anything else_, let's first compute the mean and variance of the distribution of $z_2$
+ We have computed everything we need to construct two regression lines, one of $z_2$ on $z_1$ and the other of $z_1$ on $z_2$.
+
+ We'll represent these regressions as
+
+ $$
+ z_1 = a_1 + b_1 z_2 + \epsilon_1
+ $$
+
+ and
+
+ $$
+ z_2 = a_2 + b_2 z_1 + \epsilon_2
+ $$
+
+ where we have the population least squares orthogonality conditions
+
+ $$
+ E \epsilon_1 z_2 = 0
+ $$
+
+ and
+
+ $$
+ E \epsilon_2 z_1 = 0
+ $$
+
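Together with the zero-mean property of each population residual ($E \epsilon_1 = E \epsilon_2 = 0$, i.e., orthogonality to the constant term), these conditions pin down the intercepts and slopes. For the first regression, for example,

$$
a_1 = E z_1 - b_1 E z_2,
\qquad
0 = E \epsilon_1 z_2 = \operatorname{Cov}(z_1, z_2) - b_1 \operatorname{Var}(z_2)
\ \Longrightarrow\
b_1 = \frac{\operatorname{Cov}(z_1, z_2)}{\operatorname{Var}(z_2)}
$$

and symmetrically for the second regression; the intercepts computed in the next cell are exactly $E z_1 - b_1 E z_2$ and $E z_2 - b_2 E z_1$.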
+ Let's compute $a_1, a_2, b_1, b_2$.
+
+ ```{code-cell} python3
+ beta = multi_normal.βs
+
+ # intercept and slope of the regression of z1 on z2
+ a1 = μ[0] - beta[0]*μ[1]
+ b1 = beta[0]
+
+ # intercept and slope of the regression of z2 on z1
+ a2 = μ[1] - beta[1]*μ[0]
+ b2 = beta[1]
+ ```
+
+ Let's print out the intercepts and slopes.
+
+ For the regression of $z_1$ on $z_2$ we have
+
+ ```{code-cell} python3
+ print("a1 = ", a1)
+ print("b1 = ", b1)
+ ```
+
+ For the regression of $z_2$ on $z_1$ we have
+
+ ```{code-cell} python3
+ print("a2 = ", a2)
+ print("b2 = ", b2)
+ ```
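To see the orthogonality conditions at work numerically, here is a small Monte Carlo check; the generator seed and the sample size are arbitrary choices, and `np.squeeze` is used only to guard against the coefficients being stored as $1 \times 1$ arrays.

```{code-cell} python3
# draw from N(μ, Σ) and check the two orthogonality conditions in the sample
rng = np.random.default_rng(1234)
draws = rng.multivariate_normal(μ, Σ, size=100_000)
z1_draws, z2_draws = draws[:, 0], draws[:, 1]

ε1 = z1_draws - (np.squeeze(a1) + np.squeeze(b1) * z2_draws)
ε2 = z2_draws - (np.squeeze(a2) + np.squeeze(b2) * z1_draws)

# both sample averages should be close to zero
np.mean(ε1 * z2_draws), np.mean(ε2 * z1_draws)
```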
+
+ Now let's plot the two regression lines and stare at them.
+
+ ```{code-cell} python3
+ z2 = np.linspace(-4, 4, 100)
+
+ # the coefficients may be stored as 1 x 1 arrays; reduce them to scalars
+ a1 = np.squeeze(a1)
+ b1 = np.squeeze(b1)
+
+ a2 = np.squeeze(a2)
+ b2 = np.squeeze(b2)
+
+ # regression of z1 on z2
+ z1 = b1*z2 + a1
+
+ # regression of z2 on z1, solved for z1 so that both lines
+ # can be drawn in the same (z2, z1) axes
+ z1h = z2/b2 - a2/b2
+
+ # draw the axes through the origin
+ fig = plt.figure(figsize=(12, 12))
+ ax = fig.add_subplot(1, 1, 1)
+ ax.set(xlim=(-4, 4), ylim=(-4, 4))
+ ax.spines['left'].set_position('center')
+ ax.spines['bottom'].set_position('zero')
+ ax.spines['right'].set_color('none')
+ ax.spines['top'].set_color('none')
+ ax.xaxis.set_ticks_position('bottom')
+ ax.yaxis.set_ticks_position('left')
+
+ plt.ylabel('$z_1$', loc='top')
+ plt.xlabel('$z_2$', loc='right')
+ plt.title('two regressions')
+ plt.plot(z2, z1, 'r', label="$z_1$ on $z_2$")
+ plt.plot(z2, z1h, 'b', label="$z_2$ on $z_1$")
+ plt.legend()
+ plt.show()
+ ```
+
+ The red line is the expectation of $z_1$ conditional on $z_2$.
+
+ The intercept and slope of the red line are
+
+ ```{code-cell} python3
+ print("a1 = ", a1)
+ print("b1 = ", b1)
+ ```
+
+ The blue line is the expectation of $z_2$ conditional on $z_1$.
+
+ Because the blue line is drawn with $z_2$ on the horizontal axis, its intercept and slope as plotted are
+
+ ```{code-cell} python3
+ print("-a2/b2 = ", - a2/b2)
+ print("1/b2 = ", 1/b2)
+ ```
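A useful way to read the picture: each population regression line passes through the point of means, so the two lines intersect at $(E z_2, E z_1) = (\mu_2, \mu_1)$. A quick check:

```{code-cell} python3
# both lines evaluated at z2 = E z2 should return E z1 = μ[0]
print("red line at E z2:  ", a1 + b1 * μ[1])
print("blue line at E z2: ", (μ[1] - a2) / b2)
print("E z1:              ", μ[0])
```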
+
+ We can use these regression lines or our code to compute conditional expectations.
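For example, the conditional expectation of $z_1$ can be read off the red line at an arbitrarily chosen value of $z_2$; the value $z_2 = 3$ below is purely for illustration.

```{code-cell} python3
# conditional expectation of z1 given z2 = 3, read off the red regression line
z2_value = 3.
print("E[z1 | z2 = 3] =", a1 + b1 * z2_value)
```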
+
+ Let's compute the mean and variance of the distribution of $z_2$
conditional on $z_1=5$.

After that we'll reverse what is on the left and right sides of the regression.

+
```{code-cell} python3
# compute the cond. dist. of z1
ind = 1