@@ -72,28 +72,27 @@ processed using
 ``` r
 library(epipredict)
 case_death_rate_subset
+#> An `epi_df` object, 20,496 x 4 with metadata:
+#> * geo_type  = state
+#> * time_type = day
+#> * as_of     = 2022-05-31 12:08:25.791826
+#>
+#> # A tibble: 20,496 × 4
+#>    geo_value time_value case_rate death_rate
+#>  * <chr>     <date>         <dbl>      <dbl>
+#>  1 ak        2020-12-31      35.9      0.158
+#>  2 al        2020-12-31      65.1      0.438
+#>  3 ar        2020-12-31      66.0      1.27
+#>  4 as        2020-12-31       0        0
+#>  5 az        2020-12-31      76.8      1.10
+#>  6 ca        2020-12-31      96.0      0.751
+#>  7 co        2020-12-31      35.8      0.649
+#>  8 ct        2020-12-31      52.1      0.819
+#>  9 dc        2020-12-31      31.0      0.601
+#> 10 de        2020-12-31      65.2      0.807
+#> # ℹ 20,486 more rows
 ```
 
-#> An `epi_df` object, 20,496 x 4 with metadata:
-#> * geo_type  = state
-#> * time_type = day
-#> * as_of     = 2022-05-31 12:08:25.791826
-#>
-#> # A tibble: 20,496 × 4
-#>    geo_value time_value case_rate death_rate
-#>  * <chr>     <date>         <dbl>      <dbl>
-#>  1 ak        2020-12-31      35.9      0.158
-#>  2 al        2020-12-31      65.1      0.438
-#>  3 ar        2020-12-31      66.0      1.27
-#>  4 as        2020-12-31       0        0
-#>  5 az        2020-12-31      76.8      1.10
-#>  6 ca        2020-12-31      96.0      0.751
-#>  7 co        2020-12-31      35.8      0.649
-#>  8 ct        2020-12-31      52.1      0.819
-#>  9 dc        2020-12-31      31.0      0.601
-#> 10 de        2020-12-31      65.2      0.807
-#> # ℹ 20,486 more rows
-
 To create and train a simple auto-regressive forecaster to predict the
 death rate two weeks into the future using past (lagged) deaths and
 cases, we could use the following function.
@@ -109,40 +108,24 @@ two_week_ahead <- arx_forecaster(
   )
 )
 two_week_ahead
+#> ══ A basic forecaster of type ARX Forecaster ═══════════════════════════════
+#>
+#> This forecaster was fit on 2023-12-23 09:12:46.
+#>
+#> Training data was an <epi_df> with:
+#> • Geography: state,
+#> • Time type: day,
+#> • Using data up-to-date as of: 2022-05-31 12:08:25.
+#>
+#> ── Predictions ─────────────────────────────────────────────────────────────
+#>
+#> A total of 56 predictions are available for
+#> • 56 unique geographic regions,
+#> • At forecast date: 2021-12-31,
+#> • For target date: 2022-01-14.
+#>
 ```
 
-#> ══ A basic forecaster of type ARX Forecaster ═══════════════════════════════════
-
-#>
-
-#> This forecaster was fit on 2023-12-23 08:50:59.
-
-#>
-
-#> Training data was an <epi_df> with:
-
-#> • Geography: state,
-
-#> • Time type: day,
-
-#> • Using data up-to-date as of: 2022-05-31 12:08:25.
-
-#>
-
-#> ── Predictions ─────────────────────────────────────────────────────────────────
-
-#>
-
-#> A total of 56 predictions are available for
-
-#> • 56 unique geographic regions,
-
-#> • At forecast date: 2021-12-31,
-
-#> • For target date: 2022-01-14.
-
-#>
-
 In this case, we have used a number of different lags for the case rate,
 while only using 3 weekly lags for the death rate (as predictors). The
 result is both a fitted model object which could be used any time in the
@@ -152,98 +135,69 @@ last available time value in the data.
 
 ``` r
 two_week_ahead$epi_workflow
+#>
+#> ══ Epi Workflow [trained] ══════════════════════════════════════════════════
+#> Preprocessor: Recipe
+#> Model: linear_reg()
+#> Postprocessor: Frosting
+#>
+#> ── Preprocessor ────────────────────────────────────────────────────────────
+#>
+#> 6 Recipe steps.
+#> 1. step_epi_lag()
+#> 2. step_epi_lag()
+#> 3. step_epi_ahead()
+#> 4. step_naomit()
+#> 5. step_naomit()
+#> 6. step_training_window()
+#>
+#> ── Model ───────────────────────────────────────────────────────────────────
+#>
+#> Call:
+#> stats::lm(formula = ..y ~ ., data = data)
+#>
+#> Coefficients:
+#>       (Intercept)    lag_0_case_rate    lag_1_case_rate    lag_2_case_rate
+#>        -0.0073358          0.0030365          0.0012467          0.0009536
+#>   lag_3_case_rate    lag_7_case_rate   lag_14_case_rate   lag_0_death_rate
+#>         0.0011425          0.0012481          0.0003041          0.1351769
+#>  lag_7_death_rate  lag_14_death_rate
+#>         0.1471127          0.1062473
+#>
+#> ── Postprocessor ───────────────────────────────────────────────────────────
+#>
+#> 5 Frosting layers.
+#> 1. layer_predict()
+#> 2. layer_residual_quantiles()
+#> 3. layer_add_forecast_date()
+#> 4. layer_add_target_date()
+#> 5. layer_threshold()
+#>
 ```
 
-#>
-
-#> ══ Epi Workflow [trained] ══════════════════════════════════════════════════════
-
-#> Preprocessor: Recipe
-
-#> Model: linear_reg()
-
-#> Postprocessor: Frosting
-
-#>
-
-#> ── Preprocessor ────────────────────────────────────────────────────────────────
-
-#>
-
-#> 6 Recipe steps.
-
-#> 1. step_epi_lag()
-
-#> 2. step_epi_lag()
-
-#> 3. step_epi_ahead()
-
-#> 4. step_naomit()
-
-#> 5. step_naomit()
-
-#> 6. step_training_window()
-
-#>
-
-#> ── Model ───────────────────────────────────────────────────────────────────────
-
-#>
-#> Call:
-#> stats::lm(formula = ..y ~ ., data = data)
-#>
-#> Coefficients:
-#>       (Intercept)    lag_0_case_rate    lag_1_case_rate    lag_2_case_rate
-#>        -0.0073358          0.0030365          0.0012467          0.0009536
-#>   lag_3_case_rate    lag_7_case_rate   lag_14_case_rate   lag_0_death_rate
-#>         0.0011425          0.0012481          0.0003041          0.1351769
-#>  lag_7_death_rate  lag_14_death_rate
-#>         0.1471127          0.1062473
-
-#>
-
-#> ── Postprocessor ───────────────────────────────────────────────────────────────
-
-#>
-
-#> 5 Frosting layers.
-
-#> 1. layer_predict()
-
-#> 2. layer_residual_quantiles()
-
-#> 3. layer_add_forecast_date()
-
-#> 4. layer_add_target_date()
-
-#> 5. layer_threshold()
-
-#>
-
 The fitted model here involved preprocessing the data to appropriately
 generate lagged predictors, estimating a linear model with `stats::lm()`
 and then postprocessing the results to be meaningful for epidemiological
 tasks. We can also examine the predictions.
 
 ``` r
 two_week_ahead$predictions
+#> # A tibble: 56 × 5
+#>    geo_value .pred        .pred_distn forecast_date target_date
+#>    <chr>     <dbl>             <dist> <date>        <date>
+#>  1 ak        0.449 quantiles(0.45)[2] 2021-12-31    2022-01-14
+#>  2 al        0.574 quantiles(0.57)[2] 2021-12-31    2022-01-14
+#>  3 ar        0.673 quantiles(0.67)[2] 2021-12-31    2022-01-14
+#>  4 as        0     quantiles(0.12)[2] 2021-12-31    2022-01-14
+#>  5 az        0.679 quantiles(0.68)[2] 2021-12-31    2022-01-14
+#>  6 ca        0.575 quantiles(0.57)[2] 2021-12-31    2022-01-14
+#>  7 co        0.862 quantiles(0.86)[2] 2021-12-31    2022-01-14
+#>  8 ct        1.07  quantiles(1.07)[2] 2021-12-31    2022-01-14
+#>  9 dc        2.12  quantiles(2.12)[2] 2021-12-31    2022-01-14
+#> 10 de        1.09  quantiles(1.09)[2] 2021-12-31    2022-01-14
+#> # ℹ 46 more rows
 ```
 
-#> # A tibble: 56 × 5
-#>    geo_value .pred        .pred_distn forecast_date target_date
-#>    <chr>     <dbl>             <dist> <date>        <date>
-#>  1 ak        0.449 quantiles(0.45)[2] 2021-12-31    2022-01-14
-#>  2 al        0.574 quantiles(0.57)[2] 2021-12-31    2022-01-14
-#>  3 ar        0.673 quantiles(0.67)[2] 2021-12-31    2022-01-14
-#>  4 as        0     quantiles(0.12)[2] 2021-12-31    2022-01-14
-#>  5 az        0.679 quantiles(0.68)[2] 2021-12-31    2022-01-14
-#>  6 ca        0.575 quantiles(0.57)[2] 2021-12-31    2022-01-14
-#>  7 co        0.862 quantiles(0.86)[2] 2021-12-31    2022-01-14
-#>  8 ct        1.07  quantiles(1.07)[2] 2021-12-31    2022-01-14
-#>  9 dc        2.12  quantiles(2.12)[2] 2021-12-31    2022-01-14
-#> 10 de        1.09  quantiles(1.09)[2] 2021-12-31    2022-01-14
-#> # ℹ 46 more rows
-
 The results above show a distributional forecast produced using data
 through the end of 2021 for the 14th of January 2022. A prediction for
 the death rate per 100K inhabitants is available for every state
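The README text in this diff describes three stages: building lagged predictors, fitting a linear model with `stats::lm()`, and postprocessing the result. The sketch below mimics those stages in base R for a single made-up series, as a rough illustration only. It is not the package's implementation: the helpers `lag_by()` and `lead_by()`, the simulated data, and the 5% and 95% quantile levels are all assumptions introduced here. The lag sets and the 14-day horizon are read off the coefficient names and dates printed above.

``` r
# Toy daily series for one hypothetical region (simulated, not package data).
set.seed(1)
n <- 120
case_rate <- 50 + cumsum(rnorm(n))
death_rate <- 0.01 * case_rate + rnorm(n, sd = 0.05)

# Hypothetical helper: element t holds the value observed on day t - k.
lag_by <- function(x, k) c(rep(NA_real_, k), head(x, length(x) - k))

# Hypothetical helper: element t holds the value observed on day t + k.
lead_by <- function(x, k) c(tail(x, length(x) - k), rep(NA_real_, k))

ahead <- 14                       # two weeks ahead
case_lags <- c(0, 1, 2, 3, 7, 14) # matches lag_*_case_rate above
death_lags <- c(0, 7, 14)         # matches lag_*_death_rate above

# Outcome shifted forward, predictors shifted back.
design <- data.frame(y = lead_by(death_rate, ahead))
for (k in case_lags) {
  design[[paste0("lag_", k, "_case_rate")]] <- lag_by(case_rate, k)
}
for (k in death_lags) {
  design[[paste0("lag_", k, "_death_rate")]] <- lag_by(death_rate, k)
}

# Drop rows with a missing predictor or outcome, then fit the linear model.
train <- na.omit(design)
fit <- lm(y ~ ., data = train)

# Forecast from the last day: predictors are complete, outcome is unobserved.
latest <- design[n, setdiff(names(design), "y"), drop = FALSE]
point <- unname(predict(fit, newdata = latest))

# Crude interval: shift the point forecast by residual quantiles
# (levels assumed here), then clamp at zero since rates cannot be negative.
interval <- pmax(point + quantile(residuals(fit), c(0.05, 0.95)), 0)
```

Shifting the outcome forward rather than the predictors further back keeps every row aligned to a single forecast date, which is why the last row can be used for prediction even though its outcome is missing.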