Commit 6723691

Ananya Joshi authored and committed
originial flash
1 parent 41916c1 commit 6723691

5 files changed: +224 -266 lines changed

_delphi_utils_python/delphi_utils/flash_eval/README.md

Lines changed: 3 additions & 75 deletions
@@ -21,47 +21,8 @@ Types of outliers/changes FlaSH intends to catch are:
 - Changes in health condition [ex: new variant]
 
 ## Running FlaSH-eval
-In params.json, there are two parameters: flagging_meta which is a dictionary and flagging, which is a list of dictionaries.
-Two key parameters in flagging_meta are:
-1. **remote**: True/False
-Are you running this system so that it's checking against the s3 filesystem or a local one?
 
-True = S3, False = Local
-
-Related Parameters: output_dir, input_dir
-
-2. **flagger_type**: flagger_df/flagger_io
-
-File regeneration is time-consuming. Should you regenerate all files specified or only those that need to be updated. For example, between different runs where only AR parameters are changed, the reference files can stay the same.
-
-flagger_df: Regenerate All, flagger_io: Regenerate only necessary files
-
-
-There are a few different ways to run the flagger.
-
-1. What input data are you using?
-Types: "api", "raw", "ratio"
-- API: In params.json, flagging, set "sig_type" to "api", and the dataframe will be generated.
-- Raw/Ratio: Create a file, flag_data.py, in delphi_* for the indicator of interest that handles different sig_types as expected. See changehc/delphi_changehc/flag_data.py
-- Existing csv: Point to relevant location in 'raw_df' location
-
-- The dataframe should be as follows:
-
-**Columns**: State Abbreviations [ak, ny, tx ...] & Lag Type. Total of 51 columns
-
-**Index**: Dates
-
-So a sample dataframe would look like this:
-
-| - | ak | ny | tx | lags |
-|------------|-----|-----|-----|------|
-| 2021-12-03 | 100 | 123 | 45 | 1 |
-| 2021-12-04 | 30 | 20 | 78 | 1 |
-| 2021-12-03 | 300 | 323 | 90 | 2 |
-| 2021-12-04 | 90 | 40 | 100 | 2 |
-
-
-To run the flagging system, follow similar instructions as the validator readme copied below:
+First, run the indicator so that there are files for FlaSH to check.
 
 You can excecute the Python module contained in this
 directory from the main directory of the indicator of interest.
@@ -85,7 +46,6 @@ the flagging system , as follows:
 
 ```
 env/bin/python -m delphi_INDICATORNAME
-env/bin/python create_df_process.py #this is up to you!
 env/bin/python -m delphi_utils.flagging
 ```
 

@@ -97,38 +57,12 @@ deactivate
 rm -r env
 ```
 
-You have a lot of flexibility for new functionality of the flagging module.
-
 ### Customization
 
-All of the user-changable parameters are stored in the `flagging` field of the indicator's `params.json` file. If `params.json` does not already include a `flagging` field, please copy that provided in this module's `params.json.template`.
+All of the user-changable parameters are stored in the `flash` field of the indicator's `params.json` file. If `params.json` does not already include a `flash` field, please copy that provided in this module's `params.json.template`.
 
 Please update the follow settings:
-- flagging_meta
-- "generate_dates": determines dates for parameters in flagging (below) are recreated daily
-- "aws_access_key_id": for remote options,
-- "aws_secret_access_key": for remote options,
-- "n_train": the number of days used for training
-- "ar_lags": the number of days used for the lag
-- "ar_type": what type of autoregressive model do you want to use [TODO]
-- "output_dir": location where files will be saved if using local filesystem
-- "flagger_type": flagger_df to regenerate all files or flagger_io to regenerate just the missing files
-- flagging: a list of dictionaries each with some of these params
-- "df_start_date": start date of dataframe (used to create input df)
-- "df_end_date": end date of dataframe (used to create input df)
-- "resid_start_date": used to create the residual distribution
-- "resid_end_date": used to create the residual distribution
-- "eval_start_date": date range to create flags
-- "eval_end_date": date range to create flags
-- "sig_str": usually the signal name, used to create/save files
-- "sig_fold": the name of the data source for organizational purposes
-- "sig_type": the type of signal (raw, api, ratio) for organizational purposes
-- "remote": are you using the local or S3 filesystem
-- "lags": how many lags do you want to consider. Consider if your signal does have lags and the role of backfill per signal
-- "raw_df": the location of the input dataframe
-- "input_dir": location of relevant files to create the raw df
-
+- signals: a list of which signals for that indicator go through FlaSH.
 
 ## Testing the code
 
@@ -163,12 +97,6 @@ or
 The output will show the number of unit tests that passed and failed, along with the percentage of code covered by the tests. None of the tests should fail and the code lines that are not covered by unit tests should be small and should not include critical sub-routines.
 
 
-## Code tour
-* run.py: sends params.json fields to and runs the validation process
-* generate_reference.py: generates the reference files related to a specific run
-* generate_ar.py: generates the ar files related to a specific run
-* flag_io.py: various functions to figure out which files need to be generated with specific parameters.
-* flag_data.py (local): generates the input dataframe (see application in runner.py)
 ## Adding checks
 
 To add a new validation check. Each check should append a descriptive error message to the `raised` attribute if triggered. All checks should allow the user to override exception raising for a specific file using the `suppressed_errors` setting in `params.json`.
Lines changed: 2 additions & 8 deletions
@@ -1,5 +1,4 @@
-from datetime import date
-import pandas as pd
+"""Constants used in FlaSH."""
 
 #Regions considered under states
 STATES = ['ak', 'al', 'ar', 'as', 'az', 'ca', 'co', 'ct', 'dc', 'de', 'fl', 'ga',
@@ -9,10 +8,5 @@
 'or', 'pa', 'pr', 'ri', 'sc', 'sd', 'tn', 'tx', 'ut', 'va', 'vi', 'vt',
 'wa', 'wi', 'wv', 'wy', 'us']
 
-HOLIDAYS = pd.to_datetime(["1/1/2020","1/20/2020","2/17/2020","5/25/2020","7/3/2020","9/7/2020","10/12/2020","11/11/2020","11/26/2020","12/25/2020",
-"1/1/2021","1/18/2021","2/15/2021","5/31/2021","6/18/2021","7/05/2021","9/06/2021","10/11/2021","11/11/2021","11/25/2021","12/24/2021","12/31/2021",
-"1/17/2022","2/21/2022","5/30/2022","6/20/2022","7/04/2022","9/05/2022","10/10/2022","11/11/2022","11/24/2022","12/26/2022"])
-
-
 #HTML Link for the visualization tool alerts
-HTML_LINK = "<https://ananya-joshi-visapp-vis-523f3g.streamlitapp.com/?params="
+HTML_LINK = "<https://ananya-joshi-visapp-vis-523f3g.streamlitapp.com/?params="
