Commit 82feea8

update plans
1 parent 61b0b99 commit 82feea8

1 file changed, +6 -5 lines changed

validator/PLANS.md

Lines changed: 6 additions & 5 deletions
@@ -44,17 +44,18 @@

 ### Larger issues

-* Check if [errors raised from validating all signals](https://docs.google.com/spreadsheets/d/1_aRBDrNeaI-3ZwuvkRNSZuZ2wfHJk6Bxj35Ol_XZ9yQ/edit#gid=1226266834) are correct, not false positives, not overly verbose or repetitive
+* Improve errors and error reports
+* Check if [errors raised from validating all signals](https://docs.google.com/spreadsheets/d/1_aRBDrNeaI-3ZwuvkRNSZuZ2wfHJk6Bxj35Ol_XZ9yQ/edit#gid=1226266834) are correct, not false positives, not overly verbose or repetitive
+* Easier suppression of many errors at once
+* Maybe store errors as dict of dicts. Keys could be check strings (e.g. "check_bad_se"), then next layer geo type, etc
+* Nicer formatting for error “report”.
+* E.g. if a single type of error is raised for many different datasets, summarize all error messages into a single message? But it still has to be clear how to suppress each individually
 * Check for erratic data sources that wrongly report all zeroes
 * E.g. the error with the Wisconsin data for the 10/26 forecasts
 * Wary of a purely static check for this
 * Are there any geo regions where this might cause false positives? E.g. small counties or MSAs, certain signals (deaths, since it's << cases)
 * This test is partially captured by checking avgs in source vs reference data, unless erroneous zeroes continue for more than a week
 * Also partially captured by outlier checking. If zeroes aren't outliers, then it's hard to say that they're erroneous at all.
-* Easier suppression of many errors at once
-* Maybe store errors as dict of dicts. Keys could be check strings (e.g. "check_bad_se"), then next layer geo type, etc
-* Nicer formatting for error “report”.
-* E.g. if a single type of error is raised for many different datasets, summarize all error messages into a single message? But it still has to be clear how to suppress each individually
 * Use known erroneous/anomalous days of source data to tune static thresholds and test behavior
 * If can't get data from API, do we want to use substitute data for the comparative checks instead?
 * E.g. most recent successful API pull -- might end up being a couple weeks older
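
The "dict of dicts" idea in the plan could look roughly like the sketch below. It is only an illustration of the data structure, not the validator's actual implementation: the `ErrorStore` class, its method names, and the check name `check_missing_dates` are hypothetical, and only `check_bad_se` comes from the plan itself. Errors are keyed first by check string and then by geo type, so a whole check (or one geo type within a check) can be suppressed in a single call, while the summary still names every (check, geo type) pair individually.

```python
from collections import defaultdict


class ErrorStore:
    """Hypothetical nested store: check name -> geo type -> [(signal, message)]."""

    def __init__(self):
        self._errors = defaultdict(lambda: defaultdict(list))

    def add(self, check, geo_type, signal, message):
        """Record one error under its check string and geo type."""
        self._errors[check][geo_type].append((signal, message))

    def suppress(self, check, geo_type=None):
        """Drop all errors for a check, or only those for one geo type."""
        if geo_type is None:
            self._errors.pop(check, None)
        elif check in self._errors:
            self._errors[check].pop(geo_type, None)
            if not self._errors[check]:
                # No geo types left for this check, so remove the empty entry.
                del self._errors[check]

    def count(self):
        """Total number of stored (unsuppressed) errors."""
        return sum(
            len(entries)
            for by_geo in self._errors.values()
            for entries in by_geo.values()
        )

    def summary(self):
        """One line per (check, geo type), listing the affected signals."""
        lines = []
        for check in sorted(self._errors):
            for geo_type in sorted(self._errors[check]):
                entries = self._errors[check][geo_type]
                signals = sorted({signal for signal, _ in entries})
                lines.append(
                    f"{check} [{geo_type}]: {len(entries)} error(s) "
                    f"in {', '.join(signals)}"
                )
        return lines


store = ErrorStore()
store.add("check_bad_se", "county", "deaths", "se is negative")
store.add("check_bad_se", "county", "cases", "se is negative")
store.add("check_bad_se", "state", "cases", "se is missing")
store.add("check_missing_dates", "state", "cases", "no file for date")

# Suppress every county-level check_bad_se error at once.
store.suppress("check_bad_se", "county")
for line in store.summary():
    print(line)
```

Because each summary line carries both the check string and the geo type, a reader of the report can see exactly which pair to pass to `suppress`, which is the concern raised in the last "Nicer formatting" bullet.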

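
For the all-zeroes item, one non-static option is to flag a region only when its recent values are all zero and its reference window was clearly non-zero. The sketch below assumes that approach. The function name, the input shape (plain dicts of geo id to value lists), and the `min_reference_mean` threshold are all hypothetical and would need tuning against known bad days such as the Wisconsin case the plan mentions.

```python
from statistics import mean


def flag_suspicious_zeroes(recent, reference, min_reference_mean=5.0):
    """Return sorted geo ids whose recent values are all zero although
    the mean of their reference values is at least min_reference_mean.

    recent, reference: dicts mapping geo id -> list of numeric values.
    min_reference_mean is an assumed, untuned threshold.
    """
    flagged = []
    for geo_id, values in recent.items():
        ref_values = reference.get(geo_id, [])
        if not values or not ref_values:
            continue  # nothing to compare against
        all_zero = all(v == 0 for v in values)
        if all_zero and mean(ref_values) >= min_reference_mean:
            flagged.append(geo_id)
    return sorted(flagged)


recent = {"big_state": [0, 0, 0], "small_county": [0, 0], "normal": [3, 0, 4]}
reference = {
    "big_state": [900, 1100, 1000],
    "small_county": [0, 1, 0],
    "normal": [2, 3, 4],
}
print(flag_suspicious_zeroes(recent, reference))
```

Regions whose reference mean is already near zero (small counties, or low-count signals such as deaths) are skipped rather than flagged, which addresses the false-positive concern at the cost of missing genuine outages there.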