1 | | -# Existing tools |
2 | | - |
3 | | -- [batchtools](https://mllg.github.io/batchtools/) |
4 | | - - probably worth comparing the save output format with that generated by hubutils |
5 | | - - definitely seems like it will save a lot of headache on exploration; probably not as useful for actual live forecasting |
6 | | - - their "algorithm" should I think correspond to a forecaster |
7 | | - - any `problem`s we add that modify the data should do so by returning the modified version in instance, rather than as access functions. |
8 | | -- [hubutils](https://infectious-disease-modeling-hubs.github.io/hubUtils/index.html) |
9 | | - - sort of a different direction focused more on aggregating results from several places. I think the output format is something I should target; file format of parquet |
10 | | -- [epiforecasts](https://github.com/epiforecasts) |
11 | | - - another group, they have a scoring utils package |
12 | | - - [scoringutils](https://epiforecasts.io/scoringutils/) |
13 | | - - it does not. For quantile models, they expect ‘true_value’, ‘prediction’, ‘quantile’ |
14 | | - - hubverse expects 'output_type' 'output_type_id' and 'value' |
15 | | - - easy enough to map between them though |
16 | | - |
17 | | -# Things I definitely need: |
18 | | - |
19 | | -- a way to produce forecasts |
20 | | - this should also be easily used in production |
21 | | -- a way to score forecasts |
22 | | - - I currently have one that only does WIS; I think switching to scoringutils wouldn't take much time at all |
23 | | -- a way to compare scores |
24 | | - |
25 | | -currently, I'm producing forecasts and evaluating at the same time. Actually, no I'm not. I'm first doing an `epix_slide` to produce forecasts, and then |
26 | | - |
27 | | -- parallel over forecasterXahead definitions: |
28 | | - - for each (forecaster,ahead): |
29 | | - - generate forecast |
30 | | - - evaluate forecast |
31 | | - - save |
32 | | - |
33 | | -# Kinds of forecasters |
34 | | -## Basic |
| 1 | +# Exploration Tooling |
| 2 | + |
| 3 | +This repo is meant to be a place to explore different forecasting methods and tools for doing so. |
| 4 | +The goal is to unify COVID forecasting and flu forecasting in one repo. |
| 5 | +The repo is structured as a [targets](https://docs.ropensci.org/targets/) project, which means that it is easy to run things in parallel and to cache results. |
| 6 | +The repo is also structured as an R package, which means that it is easy to share code between different targets. |
| 7 | + |
| 8 | +## Usage |
| 9 | + |
| 10 | +```sh |
| 11 | +# Install renv and R dependencies. |
| 12 | +make install |
| 13 | + |
| 14 | +# Run the pipeline wrapper run.R. |
| 15 | +make run |
| 16 | +``` |
| 17 | + |
| 18 | +## Directory Layout |
| 19 | + |
| 20 | +- `R/`: R package code to be reused |
| 21 | +- `extras/`: plotting and notebook code |
| 22 | +- `covid_hosp_explore/`: a `targets` project for exploring covid hospitalization forecasters |
| 23 | +- `flu_hosp_explore/`: a `targets` project for exploring flu hospitalization forecasters |
| 24 | +- `covid_hosp_prod/`: a `targets` project for predicting covid hospitalizations |
| 25 | +- `flu_hosp_prod/`: a `targets` project for predicting flu hospitalizations |
| 26 | +- `testing/`: a `targets` project for debugging forecasters and doing sanity checks |
| 27 | + |
| 28 | +## Tricky Gotchas |
| 29 | + |
| 30 | +Currently, to run in parallel, you must install the package via `renv::install(".")` rather than only loading it with `devtools::load_all()`, because parallel workers use the version from the last time you ran `renv::install`. |
| 31 | +We therefore recommend developing serially and running exploration in parallel. |
| 32 | + |
| 33 | +## Pipeline Design |
| 34 | + |
| 35 | +See [this diagram](https://excalidraw.com/#room=85f8bfeb397ddf29f110,q8nOcBql7ACvhgCyjXu98g). |
| 36 | +Double diamond objects represent plates (to evoke [plate notation](https://en.wikipedia.org/wiki/Plate_notation), but don't take the comparison too literally), which are used to represent multiple objects of the same type (e.g. different forecasters). |
| 37 | + |
| 38 | +## Notes on Forecaster Types |
| 39 | + |
| 40 | +### Basic |
| 41 | + |
35 | 42 | The basic forecaster takes in an `epi_df`, does some pre-processing, runs an epipredict workflow, and then does some post-processing. |
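As a rough illustration of the three stages described above, here is a minimal base R sketch. Every name in it (`basic_forecaster`, `drop_missing`, `last_value_model`, `clamp_at_zero`) is hypothetical and not taken from this package, a plain data frame stands in for an `epi_df`, and the middle stage is a trivial last-value model standing in for a real epipredict workflow.

```r
# Hypothetical sketch: none of these names come from the package itself.
# Chain the three stages: pre-process, fit and predict, post-process.
basic_forecaster <- function(data, ahead,
                             preprocess = identity,
                             fit_and_predict,
                             postprocess = identity) {
  cleaned <- preprocess(data)
  raw <- fit_and_predict(cleaned, ahead)
  postprocess(raw)
}

# Pre-processing stand-in: drop rows with a missing value.
drop_missing <- function(data) data[!is.na(data$value), ]

# Model stand-in (NOT an epipredict workflow): carry the latest value forward.
last_value_model <- function(data, ahead) {
  latest <- data[data$time_value == max(data$time_value), ]
  data.frame(
    geo_value = latest$geo_value,
    target_date = latest$time_value + ahead,
    value = latest$value
  )
}

# Post-processing stand-in: predictions cannot be negative.
clamp_at_zero <- function(preds) {
  preds$value <- pmax(preds$value, 0)
  preds
}

# Toy data in place of an epi_df.
data <- data.frame(
  geo_value = "ca",
  time_value = as.Date("2023-01-01") + 0:3,
  value = c(5, NA, 3, -1)
)

forecast <- basic_forecaster(
  data,
  ahead = 7,
  preprocess = drop_missing,
  fit_and_predict = last_value_model,
  postprocess = clamp_at_zero
)
```

Keeping each stage as a separate function argument is one possible way to swap pre- or post-processing without touching the model step.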
36 | | -## Ensemble |
37 | | -This kind of forecaster has two components: a list of existing forecasters it depends on, and a function that aggregates those forecasters. |
38 | | -## (to be named) |
39 | | -Any forecaster which requires a pre-trained component. An example is a forecaster with a sophisticated imputation method. Evaluating these has some thorns around training/testing splitting. It may be foldable into the basic variety though. |
40 | | -# later things |
41 | | -- a way to check that a given function is or is not in the right format to be a forecaster |
42 | 43 | |
| 44 | +### Ensemble |
43 | 45 | |
44 | | -# Random notes |
45 | | -Currently, to run in parallel, you need to install the package via `renv::install(".")`. |
46 | | -The parallel workers will continue to use the version as of the last time you ran `renv::install`, while the non-parallel ones won't. This separates development from exploration. |
| 46 | +This kind of forecaster has two components: a list of existing forecasters it depends on, and a function that aggregates those forecasters. |
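The two components named above (a list of forecasters plus an aggregating function) can be sketched in base R as follows. This is only an illustration under stated assumptions: `make_ensemble`, `last_value`, and `mean_value` are hypothetical names, and each component is assumed to be a `function(data, ahead)` that returns a data frame with `geo_value` and `value` columns, which may not match the package's actual forecaster signature.

```r
# Hypothetical sketch: names and signatures are illustrative assumptions.
# Build an ensemble from a list of forecasters and a combining function.
make_ensemble <- function(forecasters, combine = stats::median) {
  function(data, ahead) {
    # Run every component forecaster on the same inputs.
    preds <- lapply(forecasters, function(f) f(data, ahead))
    stacked <- do.call(rbind, preds)
    # Combine the component predictions within each location.
    stats::aggregate(value ~ geo_value, data = stacked, FUN = combine)
  }
}

# Stand-in component 1: last observed value per location (ignores `ahead`).
last_value <- function(data, ahead) {
  by_geo <- split(data$value, data$geo_value)
  data.frame(
    geo_value = names(by_geo),
    value = unname(vapply(by_geo, function(v) v[length(v)], numeric(1)))
  )
}

# Stand-in component 2: historical mean per location (ignores `ahead`).
mean_value <- function(data, ahead) {
  by_geo <- split(data$value, data$geo_value)
  data.frame(
    geo_value = names(by_geo),
    value = unname(vapply(by_geo, mean, numeric(1)))
  )
}

# Toy data for two locations.
data <- data.frame(
  geo_value = rep(c("ca", "ny"), each = 3),
  value = c(1, 2, 6, 10, 20, 30)
)

ensemble <- make_ensemble(list(last_value, mean_value), combine = mean)
result <- ensemble(data, ahead = 7)
```

Because the ensemble returned by `make_ensemble` has the same `function(data, ahead)` shape as its components, it could itself be passed to another ensemble under these assumptions.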
47 | 47 | |
| 48 | +### (to be named) |
48 | 49 | |
49 | | -# Targets projects |
50 | | -- testing: for debugging forecasters and doing sanity checks |
51 | | -- flu_hosp_explore: for exploring flu hospitalization forecasters |
52 | | -- covid_hosp_explore: for exploring covid hospitalization forecasters |
53 | | -- flu_hosp_prod: for predicting flu hospitalizations |
54 | | -- covid_hosp_prod: for predicting flu hospitalizations |
| 50 | +This covers any forecaster that requires a pre-trained component, such as a forecaster with a sophisticated imputation method. Evaluating these raises some thorny issues around training/testing splits. It may be foldable into the basic variety, though. |