Revised: Basic FlaSH Implementation #1751
Merged
18 commits
350cf64
local commit, save
Ananya-Joshi 17236a7
functional code
Ananya-Joshi 897fb43
checkpoint commit
Ananya-Joshi ecdbe58
originial flash
d9ecd8f
basic testing and linting complete
2697edc
added in tests
d555acd
drop lambda wrapper in pd.apply
nmdefries f82b6a4
v3 FlaSH: simplified I/O structure & clearer outlier buckets
e340c1d
double quotes in the params files
c6072ec
fixed test so no overwrites, added aws code
b80e71c
minor changes resulting from string list elements in json files
f78bbfa
Squash Commits from PR testing
c02bbcc
don't evaluate days that are 0 if they're not updated daily
5d18239
additional features for saved FlaSH file
2e23698
typo
f3b022a
lint fix
f5d15e8
changed out-of-range handling and params for run when flash not part …
b5c9c93
channged additive factor
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,102 @@ | ||
# FlaSH System | ||
|
||
THIS README IS IN PROGRESS | ||
|
||
FlaSH is a real-time point-outlier detection system. We add the daily evaluation step to this indicators package (retraining is done offline). | ||
|
||
FlaSH produces a list of data points that are unusual or surprising so that stakeholders are aware of points that warrant further inspection. | ||
|
||
The guiding principles for the system are: | ||
- Flag relevant data points as soon as possible (ideally in an online setting) | ||
- Be aware of the false positive/false negative rates | ||
- Reduce cognitive load on data evaluators | ||
|
||
Types of outliers/changes FlaSH intends to catch are: | ||
- Out-of-range points | ||
- Large spikes | ||
- Points that are interesting for a particular weekday | ||
- Points that are interesting with respect to a particular stream's history | ||
- Points that are interesting with respect to all other streams | ||
- Change in data reporting schedule | ||
- Changes in health condition [ex: new variant] | ||
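As an illustration of the first two outlier types only, here is a minimal sketch of labelling out-of-range points and large spikes in a single stream. It is not FlaSH's actual method: the function name `flag_points`, the bounds, and the median-absolute-deviation rule with `spike_factor` are all assumptions chosen for the example.

```python
import pandas as pd


def flag_points(series, lower, upper, spike_factor=3.0):
    """Label each point 'out_of_range', 'spike', or 'ok' (illustrative only)."""
    # Robust center and spread of the stream (hypothetical spike rule).
    center = series.median()
    deviation = (series - center).abs()
    mad = deviation.median()

    labels = pd.Series("ok", index=series.index)
    # A spike is a point far from the median relative to the typical deviation.
    labels[deviation > spike_factor * mad] = "spike"
    # Out-of-range takes priority over the spike label.
    labels[(series < lower) | (series > upper)] = "out_of_range"
    return labels


values = pd.Series(
    [10, 11, 10, 12, 50, 11, -3],
    index=pd.date_range("2022-01-01", periods=7),
)
labels = flag_points(values, lower=0, upper=100)
print(labels.tolist())
```

Note that this rule degenerates when the median absolute deviation is zero (for example, a constant stream with a single changed value), so a real system would need a floor on the spread or a different baseline.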
|
||
## Running FlaSH-eval | ||
|
||
First, run the indicator so that there are files for FlaSH to check. | ||
|
||
You can execute the Python module contained in this | ||
directory from the main directory of the indicator of interest. | ||
|
||
The safest way to do this is to create a virtual environment | ||
and install the common Delphi utilities (`delphi_utils`, which includes FlaSH), | ||
followed by the indicator and its dependencies. | ||
|
||
To do this, navigate to the main directory of the indicator of interest and run the following code: | ||
|
||
``` | ||
python -m venv env | ||
source env/bin/activate | ||
pip install ../_delphi_utils_python/. | ||
pip install . | ||
``` | ||
|
||
To execute the module, first run the indicator to generate data files and then run | ||
the FlaSH system, as follows: | ||
|
||
``` | ||
env/bin/python -m delphi_INDICATORNAME | ||
env/bin/python -m delphi_utils.flash_eval | ||
|
||
``` | ||
|
||
Once you are finished with the code, you can deactivate the virtual environment | ||
and (optionally) remove the environment itself. | ||
|
||
``` | ||
deactivate | ||
rm -r env | ||
``` | ||
|
||
### Customization | ||
|
||
All user-changeable parameters are stored in the `flash` field of the indicator's `params.json` file. If `params.json` does not already include a `flash` field, copy the one provided in this module's `params.json.template`. | ||
|
||
Please update the following settings: | ||
- signals: a list of which signals for that indicator go through FlaSH. | ||
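A minimal sketch of reading the `signals` setting from a params dictionary. Only the `flash` field and its `signals` key come from this README; the signal names and the helper `get_flash_signals` are hypothetical, and the real template may contain additional keys.

```python
import json

# Hypothetical params.json content; only `flash.signals` is described in the README.
params_text = """
{
  "flash": {
    "signals": ["example_signal_a", "example_signal_b"]
  }
}
"""


def get_flash_signals(params):
    """Return the signals configured for FlaSH, or an empty list if none are set."""
    return list(params.get("flash", {}).get("signals", []))


params = json.loads(params_text)
signals = get_flash_signals(params)
print(signals)
```

Returning an empty list when the `flash` field is absent lets a caller skip FlaSH for indicators that have not been configured for it.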
|
||
## Testing the code | ||
|
||
To test the code, please create a new virtual environment in the main module directory using the following procedure, similar to above: | ||
|
||
``` | ||
make install | ||
``` | ||
|
||
To do a static test of the code style, it is recommended to run **pylint** on | ||
the module. To do this, run the following from the main module directory: | ||
|
||
``` | ||
make lint | ||
``` | ||
|
||
The most aggressive checks are turned off; only relatively important issues | ||
should be raised and they should be manually checked (or better, fixed). | ||
|
||
Unit tests are also included in the module. To execute these, run the following command from this directory: | ||
|
||
``` | ||
make test | ||
``` | ||
|
||
or | ||
|
||
``` | ||
(cd tests && ../env/bin/pytest test_file.py --cov=delphi_utils --cov-report=term-missing) | ||
``` | ||
|
||
The output will show the number of unit tests that passed and failed, along with the percentage of code covered by the tests. None of the tests should fail, and the number of code lines not covered by unit tests should be small and should not include critical subroutines. | ||
|
||
|
||
## Adding checks | ||
|
||
When adding a new validation check, make sure it appends a descriptive error message to the `raised` attribute if triggered. All checks should allow the user to override exception raising for a specific file using the `suppressed_errors` setting in `params.json`. |
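The pattern described above might look roughly like the sketch below. Only the names `raised` and `suppressed_errors` come from this README; the `Report` class, the `(check_name, filename)` pair format, and the example check are assumptions made for illustration.

```python
class Report:
    """Hypothetical error collector; not the module's actual class."""

    def __init__(self, suppressed_errors=None):
        # Assumed format: [check_name, filename] pairs as they would appear in JSON.
        self.suppressed_errors = {tuple(pair) for pair in (suppressed_errors or [])}
        self.raised = []

    def add_error(self, check_name, filename, message):
        """Record an error unless the user suppressed it for this file."""
        if (check_name, filename) in self.suppressed_errors:
            return
        self.raised.append(f"{check_name}: {filename}: {message}")


def check_non_negative(values, filename, report):
    """Example check: append a descriptive message if any value is negative."""
    negatives = [v for v in values if v < 0]
    if negatives:
        report.add_error(
            "check_non_negative", filename,
            f"{len(negatives)} negative value(s) found",
        )


report = Report(suppressed_errors=[["check_non_negative", "20220102_state_sig.csv"]])
check_non_negative([1, -2, 3], "20220101_state_sig.csv", report)
check_non_negative([-5], "20220102_state_sig.csv", report)  # suppressed for this file
print(report.raised)
```

Routing every check through one `add_error` method keeps the suppression logic in a single place, so individual checks do not need to know about `suppressed_errors`.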
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
## Code Review (Python) | ||
|
||
A code review of this module should include a careful look at the code and the | ||
output. To assist in the process, but certainly not in place of it, please | ||
check the following items. | ||
|
||
**Documentation** | ||
|
||
- [] the README.md file template is filled out and currently accurate; it is | ||
possible to load and test the code using only the instructions given | ||
- [] minimal docstrings (one line describing what the function does) are | ||
included for all functions; full docstrings describing the inputs and expected | ||
outputs should be given for non-trivial functions | ||
|
||
**Structure** | ||
|
||
- [] code should use 4 spaces for indentation; other style decisions are | ||
flexible, but be consistent within a module | ||
- [] any required metadata files are checked into the repository and placed | ||
within the directory `static` | ||
- [] any intermediate files that are created and stored by the module should | ||
be placed in the directory `cache` | ||
- [] final expected output files to be uploaded to the API are placed in the | ||
`receiving` directory; output files should not be committed to the repository | ||
- [] all options and API keys are passed through the file `params.json` | ||
- [] template parameter file (`params.json.template`) is checked into the | ||
code; no personal (i.e., usernames) or private (i.e., API keys) information is | ||
included in this template file | ||
|
||
**Testing** | ||
|
||
- [] module can be installed in a new virtual environment | ||
- [] pylint with the default `.pylint` settings run over the module produces | ||
minimal warnings; warnings that do exist have been confirmed as false positives | ||
- [] reasonably high level of unit test coverage covering all of the main logic | ||
of the code (e.g., missing coverage for raised errors that do not currently seem | ||
possible to reach are okay; missing coverage for options that will be needed are | ||
not) | ||
- [] all unit tests run without errors |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# -*- coding: utf-8 -*- | ||
"""Module for flagging interesting or unusual data points. | ||
|
||
This file defines the functions that are made public by the module. As the | ||
module is intended to be executed through the main method, these are primarily | ||
for testing. | ||
""" | ||
from __future__ import absolute_import | ||
from .constants import HTML_LINK |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
"""Call the function run_module when executed. | ||
|
||
This file indicates that calling the module (`python -m MODULE_NAME`) will | ||
call the function `run_module` found within the run.py file. There should be | ||
no need to change this template. | ||
""" | ||
|
||
from delphi_utils import read_params | ||
from .run import run_module # pragma: no cover | ||
|
||
run_module(read_params()) # pragma: no cover |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
"""Constants used in FlaSH.""" | ||
|
||
# Regions considered under states | ||
STATES = ['ak', 'al', 'ar', 'as', 'az', 'ca', 'co', 'ct', 'dc', 'de', 'fl', 'ga', | ||
'gu', 'hi', 'ia', 'id', 'il', 'in', 'ks', 'ky', 'la', | ||
'ma', 'md', 'me', 'mi', 'mn', 'mo', 'mp', 'ms', 'mt', 'nc', | ||
'nd', 'ne', 'nh', 'nj', 'nm', 'nv', 'ny', 'oh', 'ok', | ||
'or', 'pa', 'pr', 'ri', 'sc', 'sd', 'tn', 'tx', 'ut', 'va', 'vi', 'vt', | ||
'wa', 'wi', 'wv', 'wy'] | ||
|
||
# HTML link for the visualization tool alerts | ||
HTML_LINK = "<https://ananya-joshi-visapp-vis-523f3g.streamlitapp.com/?params=" | ||
If the production version of this downloads files from somewhere, but the local/testing version runs off of pre-existing files, sounds like this tool will need two different modes.
Hi Nat,
Thank you for this - I'll fix the linting bugs & we can talk about some of these high & low level comments in our meeting. FlaSH is still just a prototype, the major difference between this version and the prior version being removing some unnecessary features for the MVP. Chat soon!