lectures/mccall_model.md (75 additions, 65 deletions)
@@ -45,15 +45,15 @@ tags: [hide-output]
The McCall search model {cite}`McCall1970` helped transform economists' way of thinking about labor markets.

- To clarify vague notions such as "involuntary" unemployment, McCall modeled the decision problem of unemployed agents directly, in terms of factors such as
+ To clarify notions such as "involuntary" unemployment, McCall modeled the decision problem of an unemployed worker in terms of factors including

* current and likely future wages
* impatience
* unemployment compensation

- To solve the decision problem he used dynamic programming.
+ To solve the decision problem McCall used dynamic programming.

- Here we set up McCall's model and adopt the same solution method.
+ Here we set up McCall's model and use dynamic programming to analyze it.

As we'll see, McCall's model is not only interesting in its own right but also an excellent vehicle for learning dynamic programming.
@@ -75,28 +75,28 @@ from quantecon.distributions import BetaBinomial
```{index} single: Models; McCall
```

- An unemployed agent receives in each period a job offer at wage $w_t$.
+ At the beginning of each period, a worker who was unemployed last period receives one job offer to work this period and all subsequent periods at a wage $w_t$.

- The wage offer is a nonnegative function of some underlying state:
+ The wage offer $w_t$ is a nonnegative function of some underlying state $s_t$:

Here you should think of the state process $\{s_t\}$ as some underlying, unspecified
- random factor that impacts on wages.
+ random factor that determines wages.

(Introducing an exogenous stochastic state process is a standard way for
economists to inject randomness into their models.)

In this lecture, we adopt the following simple environment:

- * $\{s_t\}$ is IID, with $q(s)$ being the probability of observing state $s$ in $\mathbb{S}$ at each point in time, and
+ * $\{s_t\}$ is IID, with $q(s)$ being the probability of observing state $s$ in $\mathbb{S}$ at each point in time,
* the agent observes $s_t$ at the start of $t$ and hence knows
  $w_t = w(s_t)$,
* the set $\mathbb S$ is finite.

- (In later lectures, we will relax all of these assumptions.)
+ (In later lectures, we shall relax some of these assumptions.)
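As an aside, the environment just described is easy to mock up in code. The sketch below discretizes the state space and draws IID states using the `BetaBinomial` distribution imported above; the grid size, wage bounds, and distribution parameters are illustrative assumptions, not the lecture's defaults.

```python
import numpy as np
from quantecon.distributions import BetaBinomial

n = 50                                   # illustrative size of the state grid
w = np.linspace(10, 60, n + 1)           # wage function w(s) on states s = 0, ..., n
q = BetaBinomial(n, 200, 100).pdf()      # q(s): probability of each state s

# draw a few IID states s_t and the implied wage offers w_t = w(s_t)
rng = np.random.default_rng(0)
s_draws = rng.choice(n + 1, size=5, p=q)
print(w[s_draws])
```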
At time $t$, our agent has two choices:
@@ -112,15 +112,16 @@ $$
The constant $\beta$ lies in $(0, 1)$ and is called a **discount factor**.

- The smaller is $\beta$, the more the agent discounts future utility relative to current utility.
+ The smaller is $\beta$, the more the agent discounts future utilities relative to current utility.

The variable $y_t$ is income, equal to

* his/her wage $w_t$ when employed
* unemployment compensation $c$ when unemployed

- The agent is assumed to know that $\{s_t\}$ is IID with common
- distribution $q$ and can use this when computing expectations.
+ The worker knows that $\{s_t\}$ is IID with common
+ distribution $q$ and uses this knowledge when he or she computes mathematical expectations of various random variables that are functions of
+ $s_t$.
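For example, under the IID assumption the expectation of any function of $s_t$ is a probability-weighted sum. A one-line sketch, reusing the illustrative `w` and `q` arrays above:

```python
# E[w(s_t)] = sum over s of w(s) q(s): the mean wage offer under q
mean_offer = np.sum(w * q)
print(mean_offer)
```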
### A Trade-Off
@@ -133,40 +134,48 @@ To decide optimally in the face of this trade-off, we use dynamic programming.
Dynamic programming can be thought of as a two-step procedure that

- 1. first assigns values to "states" and
+ 1. first assigns values to "states"

1. then deduces optimal actions given those values

We'll go through these steps in turn.

### The Value Function

- In order to optimally trade-off current and future rewards, we need to think about two things:
+ To trade off current and future rewards optimally, we think about two things:

- 1.the current payoffs we get from different choices
- 1.the different states that those choices will lead to in next period (in this case, either employment or unemployment)
+ 1. current payoffs that arise from making alternative choices
+ 1. different states that those choices take us to next period (in this case, either employment or unemployment)

- To weigh these two aspects of the decision problem, we need to assign *values*
+ To assess these two aspects of the decision problem, we assign expected discounted *values*

to states.
- To this end, let $v^*(s)$ be the total lifetime *value* accruing to an
- unemployed worker who enters the current period unemployed when the state is
- $s \in \mathbb{S}$.
+ This leads us to construct an instance of the celebrated **value function** of dynamic programming.

- In particular, the agent has wage offer $w(s)$ in hand.
+ Definitions of value functions typically begin with the word "let".

- More precisely, $v^*(s)$ denotes the value of the objective function
- {eq}`objective` when an agent in this situation makes *optimal* decisions now
- and at all future points in time.
+ Thus,

- Of course $v^*(s)$ is not trivial to calculate because we don't yet know
+ Let $v^*(s)$ be the optimal value of the problem when $s \in \mathbb{S}$ for a previously unemployed worker who, starting this period, has just received an offer to work forever at wage $w(s)$ and who is yet to decide whether to accept or reject the offer.
+
+ Thus, the function $v^*(s)$ is the maximum value of objective
+ {eq}`objective` for a previously unemployed worker who has offer $w(s)$ in hand and has yet to choose whether to accept it.
+
+ Notice that $v^*(s)$ is part of the **solution** of the problem, so it isn't obvious that it is a good idea to start working on the problem by focusing on $v^*(s)$.
+
+ There is a chicken and egg problem: we don't know how to compute $v^*(s)$ because we don't yet know

what decisions are optimal and what aren't!
- But think of $v^*$ as a function that assigns to each possible state
- $s$ the maximal lifetime value that can be obtained with that offer in
+ But it turns out to be a really good idea to ask what properties the optimal value function $v^*(s)$ must have in order
+ to qualify as an optimal value function.
+
+ Think of $v^*$ as a function that assigns to each possible state
+ $s$ the maximal expected discounted income stream that can be obtained with that offer in

hand.

- A crucial observation is that this function $v^*$ must satisfy the
- recursion
+ A crucial observation is that the optimal value function $v^*$ must satisfy the functional equation

```{math}
:label: odu_pv
@@ -180,7 +189,10 @@ v^*(s)
for every possible $s$ in $\mathbb S$.

- This important equation is a version of the **Bellman equation**, which is
+ Notice how the function $v^*(s)$ appears on both the right and left sides of equation {eq}`odu_pv` -- that is why it is called
+ a **functional equation**, i.e., an equation that restricts a **function**.
+
+ This important equation is a version of a **Bellman equation**, an equation that is

ubiquitous in economic dynamics and other fields involving planning over time.
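To make this functional equation concrete, here is a sketch of its right-hand side as code, mapping a candidate value array `v` into a new one; `w` and `q` are the toy arrays from the earlier sketch, and the values of `c` and `β` set below are hypothetical, not the lecture's defaults.

```python
def bellman_rhs(v, w, q, c, β):
    """Right-hand side of v*(s) = max{w(s)/(1-β), c + β Σ_{s'} v*(s') q(s')}."""
    accept = w / (1 - β)              # value of accepting the offer w(s) forever
    reject = c + β * np.sum(v * q)    # value of rejecting and continuing optimally
    return np.maximum(accept, reject)

c, β = 25.0, 0.99                     # illustrative compensation and discount factor
```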
- * the second term inside the max operation is the **continuation value**, which is the lifetime payoff from rejecting the current offer and then behaving optimally in all subsequent periods
+ * the second term inside the max operation is the lifetime payoff from rejecting the current offer and then behaving optimally in all subsequent periods

If we optimize and pick the best of these two options, we obtain maximal lifetime value from today, given current state $s$.
@@ -202,13 +214,12 @@ But this is precisely $v^*(s)$, which is the l.h.s. of {eq}`odu_pv`.
Suppose for now that we are able to solve {eq}`odu_pv` for the unknown
function $v^*$.

- Once we have this function in hand we can behave optimally (i.e., make the
- right choice between accept and reject).
+ Once we have this function in hand we can figure out how to behave optimally (i.e., choose whether to accept or reject $w(s)$).

All we have to do is select the maximal choice on the r.h.s. of {eq}`odu_pv`.

- The optimal action is best thought of as a **policy**, which is, in general, a map from
- states to actions.
+ The optimal action in state $s$ can be thought of as a part of a **policy** that maps a
+ state into an action.

Given *any* $s$, we can read off the corresponding best choice (accept or
reject) by picking the max on the r.h.s. of {eq}`odu_pv`.
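In code, reading the policy off a solved value function might look like the following sketch, with `1` standing for accept and `0` for reject:

```python
def policy(v_star, w, q, c, β):
    """Map each state s to an action by comparing the two sides of the max."""
    accept_value = w / (1 - β)
    continuation = c + β * np.sum(v_star * q)   # constant across states s
    return (accept_value >= continuation).astype(int)
```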
@@ -241,13 +252,13 @@ where
\bar w := (1 - \beta) \left\{ c + \beta \sum_{s'} v^*(s') q (s') \right\}
```

- Here $\bar w$ (called the *reservation wage*) is a constant depending on $\beta, c$and the wage distribution.
+ Here $\bar w$ (called the *reservation wage*) is a constant that depends on $\beta, c$, and the wage probability distribution induced by $q(s)$ and $w(s)$.

- The agent should accept if and only if the current wage offer exceeds the reservation wage.
+ The agent should accept offer $w(s)$ if and only if it exceeds the reservation wage.
259
In view of {eq}`reswage`, we can compute this reservation wage if we can compute the value function.
249
260
250
-
## Computing the Optimal Policy: Take 1
261
+
## Computing an Optimal Policy: Take 1
251
262
252
263
To put the above ideas into action, we need to compute the value function at
253
264
each possible state $s \in \mathbb S$.
@@ -273,7 +284,7 @@ v^*(i)
### The Algorithm

- To compute this vector, we use successive approximations:
+ To compute the vector $v^*(i), i = 1, \ldots, n$, we use successive approximations:
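Such a successive-approximation loop might be sketched as follows, iterating the `bellman_rhs` map from the earlier sketch until successive iterates are close; the tolerance and iteration cap are arbitrary illustrative choices.

```python
def solve_value_function(w, q, c, β, tol=1e-6, max_iter=2000):
    """Iterate v_{k+1} = T v_k until the sup-norm change falls below tol."""
    v = w / (1 - β)                      # initial guess: accept everywhere
    for _ in range(max_iter):
        v_new = bellman_rhs(v, w, q, c, β)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v

v_star = solve_value_function(w, q, c, β)
print(reservation_wage(v_star, q, c, β))
```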
- Here's a plot of the probabilities of different wage outcomes:
+ Here's a plot of probabilities of different wage outcomes:

```{code-cell} python3
fig, ax = plt.subplots()
@@ -375,7 +386,7 @@ We are going to use Numba to accelerate our code.
* See, in particular, the discussion of `@jitclass` in [our lecture on Numba](https://python-programming.quantecon.org/numba.html).

- The following helps Numba by providing some type
+ The following helps Numba by providing some information about types

```{code-cell} python3
mccall_data = [
@@ -386,9 +397,8 @@ mccall_data = [
]
```
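The diff elides the entries of `mccall_data`, but a `@jitclass` type specification for this model would plausibly look like the sketch below; the field names and types here are assumptions, not read from the diff.

```python
from numba import float64

# hypothetical type spec for the class attributes
mccall_data_sketch = [
    ('c', float64),        # unemployment compensation
    ('β', float64),        # discount factor
    ('w', float64[:]),     # array of wages w(s), one per state
    ('q', float64[:]),     # array of probabilities q(s), one per state
]
```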
- Here's a class that stores the data and computes the values of state-action pairs,
- i.e. the value in the maximum bracket on the right hand side of the Bellman equation {eq}`odu_pv2p`,
- given the current state and an arbitrary feasible action.
+ Here's a class that stores the data and computes values of state-action pairs,
+ i.e., values associated with pairs consisting of the current state and alternative feasible actions that occur inside the maximum bracket on the right hand side of the Bellman equation {eq}`odu_pv2p`.

Default parameter values are embedded in the class.
@@ -418,7 +428,7 @@ class McCallModel:
Based on these defaults, let's try plotting the first few approximate value functions
in the sequence $\{ T^k v \}$.

- We will start from guess $v$ given by $v(i) = w(i) / (1 - β)$, which is the value of accepting at every given wage.
+ We will start from guess $v$ given by $v(i) = w(i) / (1 - β)$, which is the value of accepting $w(i)$.
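A sketch of that plotting exercise, using the toy `w`, `q`, `c`, `β`, and `bellman_rhs` from the earlier sketches; the number of iterates shown is arbitrary.

```python
import matplotlib.pyplot as plt

v = w / (1 - β)                     # initial guess: value of accepting w(i)
fig, ax = plt.subplots()
for k in range(6):                  # plot the first few iterates T^k v
    ax.plot(w, v, label=f"$T^{k} v$")
    v = bellman_rhs(v, w, q, c, β)
ax.set_xlabel("wage")
ax.set_ylabel("value")
ax.legend()
plt.show()
```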