Commit 08885c1

add example, more links, more references to papers
1 parent 243b61a commit 08885c1

3 files changed

+53
-4
lines changed

examples/causal_inference/difference_in_differences.ipynb

+21-2
Original file line numberDiff line numberDiff line change
@@ -52,17 +52,27 @@
5252
"\n",
5353
"This notebook provides a brief overview of the difference in differences approach to causal inference, and shows a working example of how to conduct this type of analysis under the Bayesian framework, using PyMC. While the notebook provides a high level overview of the approach, I recommend consulting two excellent textbooks on causal inference. Both [The Effect](https://theeffectbook.net/) {cite:p}`huntington2021effect` and [Causal Inference: The Mixtape](https://mixtape.scunning.com) {cite:p}`cunningham2021causal` have chapters devoted to difference in differences.\n",
5454
"\n",
55-
"Difference in differences would be a good approach to take for causal inference if:\n",
55+
"[Difference in differences](https://en.wikipedia.org/wiki/Difference_in_differences) would be a good approach to take for causal inference if:\n",
5656
"* you want to know the causal impact of a treatment/intervention\n",
5757
"* you have pre and post treatment measures\n",
5858
"* you have both a treatment and a control group\n",
59-
"* the treatment was _not_ allocated by randomisation.\n",
59+
"* the treatment was _not_ allocated by randomisation, that is, you are in a [quasi-experimental](https://en.wikipedia.org/wiki/Quasi-experiment) setting.\n",
6060
"\n",
6161
"Otherwise there are likely better suited approaches you could use.\n",
6262
"\n",
6363
"Note that our desire to estimate the causal impact of a treatment involves [counterfactual thinking](https://en.wikipedia.org/wiki/Counterfactual_thinking). This is because we are asking \"What would the post-treatment outcome of the treatment group be _if_ treatment had not been administered?\" but we can never observe this."
6464
]
6565
},
66+
{
67+
"cell_type": "markdown",
68+
"id": "6ec005f3-c443-4243-a4f5-c86252367fe8",
69+
"metadata": {},
70+
"source": [
71+
"### Example\n",
72+
"\n",
73+
"A classic example is the study by {cite:t}`card1993minimum`, which examined the effects of increasing the minimum wage upon employment in the fast food sector. This is a quasi-experimental setting because the intervention (increase in minimum wages) was not applied to different geographical units (e.g. states) randomly. The intervention was applied to New Jersey in April 1992. If they had measured pre and post intervention employment rates in New Jersey only, then they would have failed to control for omitted variables changing over time (e.g. seasonal effects) which could provide alternative causal explanations for changes in employment rates. But selecting a control state (Pennsylvania) allows one to treat the change in employment in Pennsylvania as an estimate of the counterfactual: what _would have happened if_ New Jersey had not received the intervention?"
74+
]
75+
},
6676
{
6777
"cell_type": "markdown",
6878
"id": "54f5c8aa-2a4d-4b77-ba64-a0e9df729103",
@@ -1144,6 +1154,15 @@
11441154
"So there we have it, we have a full posterior distribution over our estimated causal impact using the difference in differences approach."
11451155
]
11461156
},
1157+
{
1158+
"cell_type": "markdown",
1159+
"id": "bf284262-ef3f-4cc1-af07-f20bb3c69ce3",
1160+
"metadata": {},
1161+
"source": [
1162+
"## Summary\n",
1163+
"Of course, when using the difference in differences approach for real applications, a lot more due diligence is needed. Readers are encouraged to check out the textbooks listed above in the introduction as well as a useful review paper {cite:p}`wing2018designing` which covers the important contextual issues in more detail. Additionally, {cite:t}`bertrand2004much` take a skeptical look at the approach and propose solutions to some of the problems they highlight."
1164+
]
1165+
},
11471166
{
11481167
"cell_type": "markdown",
11491168
"id": "b3b2ee6b-2581-4ee5-a305-b9712dd49f09",

examples/references.bib

+19
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,16 @@ @book{berry1996statistics
4747
year = {1996},
4848
publisher = {Duxbury Press}
4949
}
50+
@article{bertrand2004much,
51+
title = {How much should we trust differences-in-differences estimates?},
52+
author = {Bertrand, Marianne and Duflo, Esther and Mullainathan, Sendhil},
53+
journal = {The Quarterly Journal of Economics},
54+
volume = {119},
55+
number = {1},
56+
pages = {249--275},
57+
year = {2004},
58+
publisher = {MIT Press}
59+
}
5060
@book{breen1996regression,
5161
title = {Regression models: Censored, sample selected, or truncated data},
5262
author = {Breen, Richard and others},
@@ -542,6 +552,15 @@ @book{wilkinson2005grammar
542552
issn = {1431-8784},
543553
isbn = {978-0-387-24544-7}
544554
}
555+
@article{wing2018designing,
556+
title = {Designing difference in difference studies: best practices for public health policy research},
557+
author = {Wing, Coady and Simon, Kosali and Bello-Gomez, Ricardo A},
558+
journal = {Annual Review of Public Health},
559+
volume = {39},
560+
number = {1},
561+
pages = {453--469},
562+
year = {2018}
563+
}
545564
@article{Yao_2018,
546565
doi = {10.1214/17-ba1091},
547566
url = {https://doi.org/10.1214\%2F17-ba1091},

myst_nbs/causal_inference/difference_in_differences.myst.md

+13-2
Original file line numberDiff line numberDiff line change
@@ -40,18 +40,24 @@ az.style.use("arviz-darkgrid")
4040

4141
This notebook provides a brief overview of the difference in differences approach to causal inference, and shows a working example of how to conduct this type of analysis under the Bayesian framework, using PyMC. While the notebook provides a high level overview of the approach, I recommend consulting two excellent textbooks on causal inference. Both [The Effect](https://theeffectbook.net/) {cite:p}`huntington2021effect` and [Causal Inference: The Mixtape](https://mixtape.scunning.com) {cite:p}`cunningham2021causal` have chapters devoted to difference in differences.
4242

43-
Difference in differences would be a good approach to take for causal inference if:
43+
[Difference in differences](https://en.wikipedia.org/wiki/Difference_in_differences) would be a good approach to take for causal inference if:
4444
* you want to know the causal impact of a treatment/intervention
4545
* you have pre and post treatment measures
4646
* you have both a treatment and a control group
47-
* the treatment was _not_ allocated by randomisation.
47+
* the treatment was _not_ allocated by randomisation, that is, you are in a [quasi-experimental](https://en.wikipedia.org/wiki/Quasi-experiment) setting.
4848

4949
Otherwise there are likely better suited approaches you could use.
5050

5151
Note that our desire to estimate the causal impact of a treatment involves [counterfactual thinking](https://en.wikipedia.org/wiki/Counterfactual_thinking). This is because we are asking "What would the post-treatment outcome of the treatment group be _if_ treatment had not been administered?" but we can never observe this.
5252

5353
+++
5454

55+
### Example
56+
57+
A classic example is the study by {cite:t}`card1993minimum`, which examined the effects of increasing the minimum wage upon employment in the fast food sector. This is a quasi-experimental setting because the intervention (increase in minimum wages) was not applied to different geographical units (e.g. states) randomly. The intervention was applied to New Jersey in April 1992. If they had measured pre and post intervention employment rates in New Jersey only, then they would have failed to control for omitted variables changing over time (e.g. seasonal effects) which could provide alternative causal explanations for changes in employment rates. But selecting a control state (Pennsylvania) allows one to treat the change in employment in Pennsylvania as an estimate of the counterfactual: what _would have happened if_ New Jersey had not received the intervention?
58+
59+
+++
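As a rough numerical sketch of the logic in the example above, the snippet below computes a simple two-group, two-period difference in differences by hand. The numbers are invented purely for illustration and are not taken from the Card and Krueger study, and `diff_in_diff` is a hypothetical helper rather than anything provided by PyMC. It assumes that, without the intervention, the treated group would have changed by the same amount as the control group.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Point estimate of the causal impact for a 2x2 design.

    Assumes the treated group would have followed the same change as the
    control group had the intervention not happened (parallel trends).
    """
    # Change over time in the control group, used as the shared trend
    control_change = control_post - control_pre
    # Counterfactual: where the treated group would be without treatment
    counterfactual_post = treated_pre + control_change
    # Causal impact: observed post-treatment outcome minus counterfactual
    return treated_post - counterfactual_post


# Hypothetical outcome values (invented for illustration only)
effect = diff_in_diff(
    treated_pre=20.0,
    treated_post=21.0,
    control_pre=23.0,
    control_post=21.0,
)
print(effect)
```

With these made-up values the control group falls by 2 while the treated group rises by 1, so the estimate is 3: a before-and-after comparison of the treated group alone would have reported only 1.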
60+
5561
### Causal DAG
5662

5763
The causal DAG for difference in differences is given below. It says:
@@ -423,6 +429,11 @@ So there we have it, we have a full posterior distribution over our estimated ca
423429

424430
+++
425431

432+
## Summary
433+
Of course, when using the difference in differences approach for real applications, a lot more due diligence is needed. Readers are encouraged to check out the textbooks listed above in the introduction as well as a useful review paper {cite:p}`wing2018designing` which covers the important contextual issues in more detail. Additionally, {cite:t}`bertrand2004much` take a skeptical look at the approach and propose solutions to some of the problems they highlight.
434+
435+
+++
436+
426437
## References
427438

428439
:::{bibliography}
