
Commit 7cacd00

Tom's July 31 edits of McCall lecture
1 parent 5c4bf48 commit 7cacd00


lectures/mccall_model.md

Lines changed: 75 additions & 65 deletions
@@ -45,15 +45,15 @@ tags: [hide-output]

The McCall search model {cite}`McCall1970` helped transform economists' way of thinking about labor markets.

-To clarify vague notions such as "involuntary" unemployment, McCall modeled the decision problem of unemployed agents directly, in terms of factors such as
+To clarify notions such as "involuntary" unemployment, McCall modeled the decision problem of an unemployed worker in terms of factors including

* current and likely future wages
* impatience
* unemployment compensation

-To solve the decision problem he used dynamic programming.
+To solve the decision problem McCall used dynamic programming.

-Here we set up McCall's model and adopt the same solution method.
+Here we set up McCall's model and use dynamic programming to analyze it.

As we'll see, McCall's model is not only interesting in its own right but also an excellent vehicle for learning dynamic programming.

@@ -75,28 +75,28 @@ from quantecon.distributions import BetaBinomial
```{index} single: Models; McCall
```

-An unemployed agent receives in each period a job offer at wage $w_t$.
+At the beginning of each period, a worker who was unemployed last period receives one job offer to work, this period and in all subsequent periods, at a wage $w_t$.

-The wage offer is a nonnegative function of some underlying state:
+The wage offer $w_t$ is a nonnegative function of some underlying state $s_t$:

$$
w_t = w(s_t) \quad \text{where } \; s_t \in \mathbb{S}
$$

Here you should think of state process $\{s_t\}$ as some underlying, unspecified
-random factor that impacts on wages.
+random factor that determines wages.

(Introducing an exogenous stochastic state process is a standard way for
economists to inject randomness into their models.)

In this lecture, we adopt the following simple environment:

-* $\{s_t\}$ is IID, with $q(s)$ being the probability of observing state $s$ in $\mathbb{S}$ at each point in time, and
+* $\{s_t\}$ is IID, with $q(s)$ being the probability of observing state $s$ in $\mathbb{S}$ at each point in time,
* the agent observes $s_t$ at the start of $t$ and hence knows
$w_t = w(s_t)$,
* the set $\mathbb S$ is finite.

-(In later lectures, we will relax all of these assumptions.)
+(In later lectures, we shall relax some of these assumptions.)

At time $t$, our agent has two choices:

@@ -112,15 +112,16 @@ $$
The constant $\beta$ lies in $(0, 1)$ and is called a **discount factor**.

-The smaller is $\beta$, the more the agent discounts future utility relative to current utility.
+The smaller is $\beta$, the more the agent discounts future utilities relative to current utility.

The variable $y_t$ is income, equal to

* his/her wage $w_t$ when employed
* unemployment compensation $c$ when unemployed

-The agent is assumed to know that $\{s_t\}$ is IID with common
-distribution $q$ and can use this when computing expectations.
+The worker knows that $\{s_t\}$ is IID with common
+distribution $q$ and uses this knowledge when he or she computes mathematical expectations of various random variables that are functions of
+$s_t$.

### A Trade-Off

@@ -133,40 +134,48 @@ To decide optimally in the face of this trade-off, we use dynamic programming.
Dynamic programming can be thought of as a two-step procedure that

-1. first assigns values to "states" and
+1. first assigns values to "states"
1. then deduces optimal actions given those values

We'll go through these steps in turn.

### The Value Function

-In order to optimally trade-off current and future rewards, we need to think about two things:
+To trade off current and future rewards optimally, we think about two things:

-1. the current payoffs we get from different choices
-1. the different states that those choices will lead to in next period (in this case, either employment or unemployment)
+1. current payoffs that arise from making alternative choices
+1. different states that those choices take us to next period (in this case, either employment or unemployment)

-To weigh these two aspects of the decision problem, we need to assign *values*
+To assess these two aspects of the decision problem, we assign expected discounted *values*
to states.

-To this end, let $v^*(s)$ be the total lifetime *value* accruing to an
-unemployed worker who enters the current period unemployed when the state is
-$s \in \mathbb{S}$.
+This leads us to construct an instance of the celebrated **value function** of dynamic programming.

-In particular, the agent has wage offer $w(s)$ in hand.
+Definitions of value functions typically begin with the word "let".

-More precisely, $v^*(s)$ denotes the value of the objective function
-{eq}`objective` when an agent in this situation makes *optimal* decisions now
-and at all future points in time.
+Thus,

-Of course $v^*(s)$ is not trivial to calculate because we don't yet know
+Let $v^*(s)$ be the optimal value of the problem when $s \in \mathbb{S}$ for a previously unemployed worker who, starting this period, has just received an offer to work forever at wage $w(s)$ and who is yet to decide whether to accept or reject the offer.
+
+Thus, the function $v^*(s)$ is the maximum value of objective
+{eq}`objective` for a previously unemployed worker who has offer $w(s)$ in hand and has yet to choose whether to accept it.
+
+Notice that $v^*(s)$ is part of the **solution** of the problem, so it isn't obvious that it is a good idea to start working on the problem by focusing on $v^*(s)$.
+
+There is a chicken and egg problem: we don't know how to compute $v^*(s)$ because we don't yet know
what decisions are optimal and what aren't!

-But think of $v^*$ as a function that assigns to each possible state
-$s$ the maximal lifetime value that can be obtained with that offer in
+But it turns out to be a really good idea to ask what properties the optimal value function $v^*(s)$ must have in order for it
+to qualify as an optimal value function.
+
+Think of $v^*$ as a function that assigns to each possible state
+$s$ the maximal expected discounted income stream that can be obtained with that offer in
hand.

-A crucial observation is that this function $v^*$ must satisfy the
-recursion
+A crucial observation is that the optimal value function $v^*$ must satisfy the functional equation

```{math}
:label: odu_pv
@@ -180,7 +189,10 @@ v^*(s)
for every possible $s$ in $\mathbb S$.

-This important equation is a version of the **Bellman equation**, which is
+Notice how the function $v^*(s)$ appears on both the right and left sides of equation {eq}`odu_pv` -- that is why it is called
+a **functional equation**, i.e., an equation that restricts a **function**.
+
+This important equation is a version of a **Bellman equation**, an equation that is
ubiquitous in economic dynamics and other fields involving planning over time.

The intuition behind it is as follows:
@@ -191,7 +203,7 @@ $$
\frac{w(s)}{1 - \beta} = w(s) + \beta w(s) + \beta^2 w(s) + \cdots
$$

-* the second term inside the max operation is the **continuation value**, which is the lifetime payoff from rejecting the current offer and then behaving optimally in all subsequent periods
+* the second term inside the max operation is the lifetime payoff from rejecting the current offer and then behaving optimally in all subsequent periods

If we optimize and pick the best of these two options, we obtain maximal lifetime value from today, given current state $s$.

@@ -202,13 +214,12 @@ But this is precisely $v^*(s)$, which is the l.h.s. of {eq}`odu_pv`.
Suppose for now that we are able to solve {eq}`odu_pv` for the unknown
function $v^*$.

-Once we have this function in hand we can behave optimally (i.e., make the
-right choice between accept and reject).
+Once we have this function in hand we can figure out how to behave optimally (i.e., how to choose whether to accept or reject $w(s)$).

All we have to do is select the maximal choice on the r.h.s. of {eq}`odu_pv`.

-The optimal action is best thought of as a **policy**, which is, in general, a map from
-states to actions.
+The optimal action in state $s$ can be thought of as a part of a **policy** that maps a
+state into an action.

Given *any* $s$, we can read off the corresponding best choice (accept or
reject) by picking the max on the r.h.s. of {eq}`odu_pv`.
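
As a concrete illustration of this read-off, here is a minimal sketch in plain NumPy; the helper name `policy_from_v` and its signature are ours, and it assumes a solved approximation `v` of $v^*$ together with arrays `w`, `q` and parameters `c`, `β`:

```python
import numpy as np

# Hypothetical read-off of the optimal policy from a solved value function v:
# accept in state i exactly when the "accept" branch attains the max in `odu_pv`
def policy_from_v(v, w, q, c, β):
    accept_value = w / (1 - β)             # value of accepting each offer w(i)
    continuation = c + β * np.sum(v * q)   # value of rejecting (a single scalar)
    return accept_value >= continuation    # boolean array: True means accept
```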
@@ -241,13 +252,13 @@ where
\bar w := (1 - \beta) \left\{ c + \beta \sum_{s'} v^*(s') q (s') \right\}
```

-Here $\bar w$ (called the *reservation wage*) is a constant depending on $\beta, c$ and the wage distribution.
+Here $\bar w$ (called the *reservation wage*) is a constant that depends on $\beta, c$, and the wage probability distribution induced by $q(s)$ and $w(s)$.

-The agent should accept if and only if the current wage offer exceeds the reservation wage.
+The agent should accept offer $w(s)$ if and only if it exceeds the reservation wage.

In view of {eq}`reswage`, we can compute this reservation wage if we can compute the value function.

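In code, {eq}`reswage` is a one-liner; a sketch, assuming `v` approximates $v^*$:

```python
import numpy as np

# Equation `reswage` as a one-line NumPy computation (a sketch, not the
# lecture's own function, which appears further below)
def reservation_wage(v, q, c, β):
    return (1 - β) * (c + β * np.sum(v * q))
```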
-## Computing the Optimal Policy: Take 1
+## Computing an Optimal Policy: Take 1

To put the above ideas into action, we need to compute the value function at
each possible state $s \in \mathbb S$.
@@ -273,7 +284,7 @@ v^*(i)
### The Algorithm

-To compute this vector, we use successive approximations:
+To compute the vector $v^*(i), i = 1, \ldots, n$, we use successive approximations:

Step 1: pick an arbitrary initial guess $v \in \mathbb R^n$.

@@ -291,17 +302,17 @@ v'(i)
\text{for } i = 1, \ldots, n
```

-Step 3: calculate a measure of the deviation between $v$ and $v'$, such as $\max_i |v(i)- v'(i)|$.
+Step 3: calculate a measure of a discrepancy between $v$ and $v'$, such as $\max_i |v(i)- v'(i)|$.

Step 4: if the deviation is larger than some fixed tolerance, set $v = v'$ and go to step 2, else continue.

Step 5: return $v$.

-For small tolerance, the returned function $v$ is a close approximation to the value function $v^*$.
+For a small tolerance, the returned function $v$ is a close approximation to the value function $v^*$.

The theory below elaborates on this point.
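
For readers who want Steps 1-5 end to end, here is a minimal plain-NumPy sketch; the lecture's own implementation further below uses Numba, and the parameter defaults here are assumptions:

```python
import numpy as np

# A sketch of successive approximations on the Bellman equation `odu_pv`
def successive_approximations(w, q, c=25, β=0.99, tol=1e-6, max_iter=2000):
    v = w / (1 - β)                          # Step 1: an initial guess
    for _ in range(max_iter):
        accept = w / (1 - β)                 # value of accepting each w(i)
        reject = c + β * np.sum(v * q)       # scalar continuation value
        v_new = np.maximum(accept, reject)   # Step 2: the Bellman update
        if np.max(np.abs(v_new - v)) < tol:  # Steps 3-4: check the deviation
            break
        v = v_new
    return v_new                             # Step 5
```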

-### The Fixed Point Theory
+### Fixed Point Theory

What's the mathematics behind these ideas?

@@ -320,7 +331,7 @@ itself via
\text{for } i = 1, \ldots, n
```

-(A new vector $Tv$ is obtained from given vector $v$ by evaluating
+(A new vector $Tv$ is obtained from a given vector $v$ by evaluating
the r.h.s. at each $i$.)

The element $v_k$ in the sequence $\{v_k\}$ of successive
@@ -332,20 +343,20 @@ approximations corresponds to $T^k v$.
One can show that the conditions of the [Banach fixed point theorem](https://en.wikipedia.org/wiki/Banach_fixed-point_theorem) are
satisfied by $T$ on $\mathbb R^n$.

-One implication is that $T$ has a unique fixed point in $\mathbb R^n$.
+One implication is that $T$ has a unique fixed point $\bar v \in \mathbb R^n$.

-* That is, a unique vector $\bar v$ such that $T \bar v = \bar v$.
+* The fixed point is a unique vector $\bar v$ that satisfies $T \bar v = \bar v$.

Moreover, it's immediate from the definition of $T$ that this fixed
point is $v^*$.

A second implication of the Banach contraction mapping theorem is that
-$\{ T^k v \}$ converges to the fixed point $v^*$ regardless of
-$v$.
+$\{ T^k v \}$ converges to the fixed point $v^*$ regardless of the initial
+$v \in \mathbb R^n$.

### Implementation

-Our default for $q$, the distribution of the state process, will be
+Our default for the probability distribution $q$ of the state process is a
[Beta-binomial](https://en.wikipedia.org/wiki/Beta-binomial_distribution).

```{code-cell} python3
@@ -360,7 +371,7 @@ w_min, w_max = 10, 60
w_default = np.linspace(w_min, w_max, n+1)
```

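The line constructing `q_default` is unchanged by this commit and hence elided from the diff; a plausible construction using the `BetaBinomial` class imported at the top of the lecture (the shape parameters `a, b` here are illustrative assumptions):

```python
from quantecon.distributions import BetaBinomial

n, a, b = 50, 200, 100                    # illustrative parameter values
q_default = BetaBinomial(n, a, b).pdf()   # q_default[i] = probability of state i
```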
-Here's a plot of the probabilities of different wage outcomes:
+Here's a plot of probabilities of different wage outcomes:

```{code-cell} python3
fig, ax = plt.subplots()
@@ -375,7 +386,7 @@ We are going to use Numba to accelerate our code.
* See, in particular, the discussion of `@jitclass` in [our lecture on Numba](https://python-programming.quantecon.org/numba.html).

-The following helps Numba by providing some type
+The following helps Numba by providing some information about types

```{code-cell} python3
mccall_data = [
@@ -386,9 +397,8 @@ mccall_data = [
]
```

-Here's a class that stores the data and computes the values of state-action pairs,
-i.e. the value in the maximum bracket on the right hand side of the Bellman equation {eq}`odu_pv2p`,
-given the current state and an arbitrary feasible action.
+Here's a class that stores the data and computes values of state-action pairs,
+i.e., values associated with pairs consisting of the current state and alternative feasible actions that occur inside the maximum bracket on the right hand side of the Bellman equation {eq}`odu_pv2p`.

Default parameter values are embedded in the class.

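The class body itself is unchanged by this commit, so the diff elides it; as a reader aid, here is a rough sketch of how such a `@jitclass`-decorated class might look (the default parameter values and the method body are assumptions consistent with the surrounding code, not text from the diff):

```python
import numpy as np
from numba.experimental import jitclass

@jitclass(mccall_data)   # field types supplied by mccall_data above
class McCallModel:
    def __init__(self, c=25, β=0.99, w=w_default, q=q_default):
        self.c, self.β = c, β    # unemployment compensation, discount factor
        self.w, self.q = w, q    # wage values w(i) and probabilities q(i)

    def state_action_values(self, i, v):
        # Values of accepting and rejecting offer w(i), given value array v
        accept = self.w[i] / (1 - self.β)
        reject = self.c + self.β * np.sum(v * self.q)
        return np.array([accept, reject])
```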
@@ -418,7 +428,7 @@ class McCallModel:
Based on these defaults, let's try plotting the first few approximate value functions
in the sequence $\{ T^k v \}$.

-We will start from guess $v$ given by $v(i) = w(i) / (1 - β)$, which is the value of accepting at every given wage.
+We will start from guess $v$ given by $v(i) = w(i) / (1 - β)$, which is the value of accepting $w(i)$.

Here's a function to implement this:

@@ -445,7 +455,7 @@ def plot_value_function_seq(mcm, ax, num_plots=6):
ax.legend(loc='lower right')
```

-Now let's create an instance of `McCallModel` and call the function:
+Now let's create an instance of `McCallModel` and watch iterations $T^k v$ converge from below:

```{code-cell} python3
mcm = McCallModel()
@@ -457,9 +467,9 @@ plot_value_function_seq(mcm, ax)
plt.show()
```

-You can see that convergence is occuring: successive iterates are getting closer together.
+You can see that convergence is occurring: successive iterates are getting closer together.

-Here's a more serious iteration effort to compute the limit, which continues until measured deviation between successive iterates is below tol.
+Here's a more serious iteration effort to compute the limit, which continues until a discrepancy between successive iterates is below `tol`.

Once we obtain a good approximation to the limit, we will use it to calculate
the reservation wage.
@@ -497,15 +507,15 @@ def compute_reservation_wage(mcm,
return (1 - β) * (c + β * np.sum(v * q))
```

-The next line computes the reservation wage at the default parameters
+The next line computes the reservation wage at default parameters

```{code-cell} python3
compute_reservation_wage(mcm)
```

### Comparative Statics

-Now we know how to compute the reservation wage, let's see how it varies with
+Now that we know how to compute the reservation wage, let's see how it varies with
parameters.

In particular, let's look at what happens when we change $\beta$ and
@@ -547,12 +557,12 @@ As expected, the reservation wage increases both with patience and with
unemployment compensation.

(mm_op2)=
-## Computing the Optimal Policy: Take 2
+## Computing an Optimal Policy: Take 2

-The approach to dynamic programming just described is very standard and
+The approach to dynamic programming just described is standard and
broadly applicable.

-For this particular problem, there's also an easier way, which circumvents the
+But for our McCall search model there's also an easier way that circumvents the
need to compute the value function.

Let $h$ denote the continuation value:
@@ -611,9 +621,9 @@ Step 3: calculate the deviation $|h - h'|$.
Step 4: if the deviation is larger than some fixed tolerance, set $h = h'$ and go to step 2, else return $h$.

-Once again, one can use the Banach contraction mapping theorem to show that this process always converges.
+One can again use the Banach contraction mapping theorem to show that this process always converges.

-The big difference here, however, is that we're iterating on a single number, rather than an $n$-vector.
+The big difference here, however, is that we're iterating on a scalar $h$, rather than an $n$-vector, $v(i), i = 1, \ldots, n$.

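Since the update rule itself is elided from this diff, here is a minimal sketch of the scalar iteration under assumed defaults; it iterates $h' = c + \beta \sum_{s'} \max \left\{ \frac{w(s')}{1-\beta}, \, h \right\} q(s')$:

```python
import numpy as np

# A sketch of iterating on the scalar h (parameter defaults are assumptions)
def iterate_on_h(w, q, c=25, β=0.99, tol=1e-6, max_iter=2000):
    h = c / (1 - β)                      # an arbitrary initial scalar guess
    for _ in range(max_iter):
        h_new = c + β * np.sum(np.maximum(w / (1 - β), h) * q)
        if abs(h_new - h) < tol:         # Steps 3-4: stop at a small deviation
            break
        h = h_new
    return h_new                         # reservation wage is (1 - β) * h
```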
Here's an implementation:

@@ -658,8 +668,8 @@ $c$ takes the following values
> `c_vals = np.linspace(10, 40, 25)`

-That is, start the agent off as unemployed, compute their reservation wage
-given the parameters, and then simulate to see how long it takes to accept.
+That is, start a worker off as unemployed, compute a reservation wage
+given the parameters, and then simulate to see how long it takes the worker to accept.

Repeat a large number of times and take the average.
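
One hedged way the simulation might be organized (the helper name and defaults are illustrative assumptions, not the posted solution; pass in `w_default`, `q_default`, and a reservation wage computed as above):

```python
import numpy as np

# Average number of periods until an offer at or above w_bar is drawn from q;
# assumes some offer satisfies w(i) >= w_bar, else the loop never terminates
def mean_stopping_time(w_bar, w, q, num_reps=10_000, seed=1234):
    rng = np.random.default_rng(seed)
    total = 0
    for _ in range(num_reps):
        t = 1
        while w[rng.choice(len(w), p=q)] < w_bar:   # draw offers until accept
            t += 1
        total += t
    return total / num_reps
```

Because offers are IID, the stopping time is geometric, so the simulated mean should be close to $1/p$, where $p = \sum_i q(i) \mathbf{1}\{w(i) \geq \bar w\}$.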
