_delphi_utils_python/delphi_utils/flash_eval/README.md
3 additions & 75 deletions
@@ -21,47 +21,8 @@ Types of outliers/changes FlaSH intends to catch are:
 - Changes in health condition [ex: new variant]
 
 ## Running FlaSH-eval
-In params.json, there are two parameters: flagging_meta, which is a dictionary, and flagging, which is a list of dictionaries.
-Two key parameters in flagging_meta are:
-1. **remote**: True/False
-Are you running this system so that it checks against the S3 filesystem or a local one?
 
-True = S3, False = Local
-
-Related parameters: output_dir, input_dir
-
-2. **flagger_type**: flagger_df/flagger_io
-
-File regeneration is time-consuming. Should you regenerate all of the specified files, or only those that need to be updated? For example, between runs where only the AR parameters change, the reference files can stay the same.
-
-flagger_df: regenerate all files; flagger_io: regenerate only the necessary files
-
-
-There are a few different ways to run the flagger.
-
-1. What input data are you using?
-Types: "api", "raw", "ratio"
-- API: In params.json, flagging, set "sig_type" to "api", and the dataframe will be generated.
-- Raw/Ratio: Create a file, flag_data.py, in delphi_* for the indicator of interest that handles the different sig_types as expected. See changehc/delphi_changehc/flag_data.py
-- Existing csv: Point to the relevant location in 'raw_df'
-
-The dataframe should be as follows:
-
-**Columns**: state abbreviations [ak, ny, tx, ...] & lag type, for a total of 51 columns
-
-**Index**: dates
-
-So a sample dataframe would look like this:
-
-| -          | ak  | ny  | tx  | lags |
-|------------|-----|-----|-----|------|
-| 2021-12-03 | 100 | 123 | 45  | 1    |
-| 2021-12-04 | 30  | 20  | 78  | 1    |
-| 2021-12-03 | 300 | 323 | 90  | 2    |
-| 2021-12-04 | 90  | 40  | 100 | 2    |
-
-
-To run the flagging system, follow instructions similar to those in the validator README, copied below:
+First, run the indicator so that there are files for FlaSH to check.
 
 You can execute the Python module contained in this
 directory from the main directory of the indicator of interest.
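For reference, the lag-indexed dataframe format described in the removed text above can be sketched in plain Python. This is a stdlib-only illustration (the real pipeline builds a pandas DataFrame); the values and the helper name `series_for_lag` are invented:

```python
from datetime import date

# One record per (date, lag) pair, with one value column per state
# abbreviation, mirroring the sample table in the removed README text.
rows = [
    {"date": date(2021, 12, 3), "ak": 100, "ny": 123, "tx": 45, "lags": 1},
    {"date": date(2021, 12, 4), "ak": 30, "ny": 20, "tx": 78, "lags": 1},
    {"date": date(2021, 12, 3), "ak": 300, "ny": 323, "tx": 90, "lags": 2},
    {"date": date(2021, 12, 4), "ak": 90, "ny": 40, "tx": 100, "lags": 2},
]

def series_for_lag(rows, state, lag):
    """Return the (date, value) series for one state at a given lag."""
    return [(r["date"], r[state]) for r in rows if r["lags"] == lag]
```

Because the same date appears once per lag, selecting a single lag recovers an ordinary per-state time series.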
@@ -85,7 +46,6 @@ the flagging system, as follows:
 
 ```
 env/bin/python -m delphi_INDICATORNAME
-env/bin/python create_df_process.py # this is up to you!
 env/bin/python -m delphi_utils.flagging
 ```
 
@@ -97,38 +57,12 @@ deactivate
 rm -r env
 ```
 
-You have a lot of flexibility for new functionality of the flagging module.
-
 ### Customization
 
-All of the user-changeable parameters are stored in the `flagging` field of the indicator's `params.json` file. If `params.json` does not already include a `flagging` field, please copy the one provided in this module's `params.json.template`.
+All of the user-changeable parameters are stored in the `flash` field of the indicator's `params.json` file. If `params.json` does not already include a `flash` field, please copy the one provided in this module's `params.json.template`.
 
 Please update the following settings:
-- flagging_meta
-  - "generate_dates": determines whether the dates for the parameters in flagging (below) are recreated daily
-  - "aws_access_key_id": for remote options
-  - "aws_secret_access_key": for remote options
-  - "n_train": the number of days used for training
-  - "ar_lags": the number of days used for the lag
-  - "ar_type": what type of autoregressive model to use [TODO]
-  - "output_dir": location where files will be saved if using the local filesystem
-  - "flagger_type": flagger_df to regenerate all files, or flagger_io to regenerate just the missing files
-- flagging: a list of dictionaries, each with some of these params
-  - "df_start_date": start date of the dataframe (used to create the input df)
-  - "df_end_date": end date of the dataframe (used to create the input df)
-  - "resid_start_date": start date used to create the residual distribution
-  - "resid_end_date": end date used to create the residual distribution
-  - "eval_start_date": start of the date range in which to create flags
-  - "eval_end_date": end of the date range in which to create flags
-  - "sig_str": usually the signal name, used to create/save files
-  - "sig_fold": the name of the data source, for organizational purposes
-  - "sig_type": the type of signal (raw, api, ratio), for organizational purposes
-  - "remote": whether you are using the local or the S3 filesystem
-  - "lags": how many lags to consider; consider whether your signal has lags and the role of backfill per signal
-  - "raw_df": the location of the input dataframe
-  - "input_dir": location of the relevant files used to create the raw df
-
+- signals: a list of which signals for that indicator go through FlaSH.
 
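With this change, the configuration shrinks to a single list. A minimal sketch of reading the new `flash` field from `params.json` — the signal names here are invented for illustration, not real Delphi signal identifiers:

```python
import json

# Hypothetical params.json content after this change: the "flash" field
# only carries the list of signals FlaSH should check.
params_text = '{"flash": {"signals": ["confirmed_incidence_num", "smoothed_cli"]}}'

flash_signals = json.loads(params_text)["flash"]["signals"]
for sig in flash_signals:
    print("FlaSH will check:", sig)
```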
## Testing the code
@@ -163,12 +97,6 @@ or
 The output will show the number of unit tests that passed and failed, along with the percentage of code covered by the tests. None of the tests should fail, and the code lines not covered by unit tests should be few and should not include critical sub-routines.
 
 
-## Code tour
-* run.py: sends params.json fields to, and runs, the validation process
-* generate_reference.py: generates the reference files related to a specific run
-* generate_ar.py: generates the AR files related to a specific run
-* flag_io.py: various functions to figure out which files need to be generated with specific parameters
-* flag_data.py (local): generates the input dataframe (see its application in runner.py)
 ## Adding checks
 
 To add a new validation check: each check should append a descriptive error message to the `raised` attribute if triggered. All checks should allow the user to override exception raising for a specific file using the `suppressed_errors` setting in `params.json`.
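The check pattern above can be sketched as follows. This is a hypothetical illustration: only `raised` and `suppressed_errors` come from the text; the class name, method name, and file names are invented:

```python
# A check appends a descriptive message to `raised` unless the
# (check_name, filename) pair appears in the user's suppressed_errors.
class FlashChecks:
    def __init__(self, suppressed_errors=()):
        self.raised = []
        self.suppressed_errors = set(suppressed_errors)

    def check_no_negatives(self, filename, values):
        """Flag files that contain negative counts."""
        key = ("check_no_negatives", filename)
        if key in self.suppressed_errors:
            return  # the user overrode this check for this file
        if any(v < 0 for v in values):
            self.raised.append(f"check_no_negatives: negative values in {filename}")

checks = FlashChecks(suppressed_errors=[("check_no_negatives", "ok.csv")])
checks.check_no_negatives("bad.csv", [3, -1, 2])  # triggers, appends a message
checks.check_no_negatives("ok.csv", [5, -2])      # suppressed, nothing appended
```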