* Recognized file name format
* Recognized geographical type (county, state, etc)
* Recognized geo id format (e.g. state is two lowercase letters); see the regex sketch after this list
* Geo id has been seen before in historical data
* Missing geo type + signal + date combos based on the geo type + signal combos Covidcast metadata says should be available
* Missing ‘val’ values
* Negative ‘val’ values
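
A check like this can be table-driven: one regex per recognized geo type. A minimal sketch, assuming covidcast's usual geo codings (the table and function name are illustrative, not the validator's actual code):

```python
import re

# Hypothetical per-geo-type id patterns; the real validator's rules may differ.
GEO_ID_PATTERNS = {
    "state": re.compile(r"^[a-z]{2}$"),   # two lowercase letters, e.g. "pa"
    "county": re.compile(r"^\d{5}$"),     # 5-digit FIPS code, e.g. "42003"
    "msa": re.compile(r"^\d{5}$"),        # 5-digit CBSA code
    "nation": re.compile(r"^us$"),
}

def check_geo_id_format(geo_type, geo_ids):
    """Return the geo ids that don't match the expected format for geo_type."""
    pattern = GEO_ID_PATTERNS.get(geo_type)
    if pattern is None:
        return list(geo_ids)  # unrecognized geo type: flag everything
    return [gid for gid in geo_ids if not pattern.match(gid)]
```
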
* Most recent date seen in source data is not older than most recent date seen in reference data
* Similar number of obs per day as recent API data (static threshold)
* Similar average value as API data (static threshold); see the sketch after this list
* Outliers in cases and deaths signals using [this method](https://github.com/cmu-delphi/covidcast-forecast/tree/dev/corrections/data_corrections)
* Source data for specified date range is empty
* API data for specified date range is empty
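
The two static-threshold comparisons might look like the following. A minimal sketch, assuming pandas frames with `time_value` and `val` columns for the source and API (reference) data; the 50% and 20% tolerances are placeholders to tune:

```python
import pandas as pd

def compare_to_reference(source_df: pd.DataFrame, api_df: pd.DataFrame,
                         count_tol: float = 0.5, mean_tol: float = 0.2):
    """Flag coarse disagreements between source data and recent API data.

    Returns a list of human-readable problems; an empty list means the
    static-threshold checks passed.
    """
    problems = []
    # Similar number of observations per day (static threshold).
    src_per_day = source_df.groupby("time_value").size().mean()
    api_per_day = api_df.groupby("time_value").size().mean()
    if abs(src_per_day - api_per_day) > count_tol * api_per_day:
        problems.append(
            f"obs/day differs: source {src_per_day:.0f} vs API {api_per_day:.0f}")
    # Similar average value (static threshold).
    src_mean, api_mean = source_df["val"].mean(), api_df["val"].mean()
    if abs(src_mean - api_mean) > mean_tol * abs(api_mean):
        problems.append(
            f"mean val differs: source {src_mean:.2f} vs API {api_mean:.2f}")
    return problems
```
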

### Larger issues

* Set up validator to use Sir-complains-a-lot alerting functionality on a signal-by-signal basis (it should send alert output as a Slack message and "@" a designated person), as a stop-gap before the logging server is ready; see the webhook sketch after this list
  * This is [how Sir-CAL works](https://github.com/benjaminysmith/covidcast-indicators/blob/main/sir_complainsalot/delphi_sir_complainsalot/run.py)
  * [Example output](https://delphi-org.slack.com/archives/C01E81A3YKF/p1605793508000100)
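
A stop-gap hookup could be as simple as a Slack incoming-webhook call per signal. A minimal sketch (the webhook URL and member ID are placeholders that would come from the validator's params; `<@...>` is Slack's mention syntax):

```python
import requests

# Placeholder values: a real incoming-webhook URL and the Slack member ID
# of the person to "@" would come from the validator's configuration.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"
ON_CALL_MEMBER_ID = "U0000000000"

def send_validation_alert(signal: str, failures: list):
    """Post a per-signal summary of validation failures to Slack."""
    lines = "\n".join(f"• {f}" for f in failures)
    text = f"<@{ON_CALL_MEMBER_ID}> validator failures for `{signal}`:\n{lines}"
    resp = requests.post(SLACK_WEBHOOK_URL, json={"text": text}, timeout=10)
    resp.raise_for_status()
```
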
* Improve errors and error report; see the error-store sketch after this list
  * Check if [errors raised from validating all signals](https://docs.google.com/spreadsheets/d/1_aRBDrNeaI-3ZwuvkRNSZuZ2wfHJk6Bxj35Ol_XZ9yQ/edit#gid=1226266834) are correct, not false positives, and not overly verbose or repetitive
  * Easier suppression of many errors at once
    * Maybe store errors as a dict of dicts. Keys could be check strings (e.g. "check_bad_se"), then the next layer geo type, etc
  * Nicer formatting for the error “report”
    * Potentially define a `__str__()` method in the ValidationError class
    * E.g. if a single type of error is raised for many different datasets, summarize all error messages into a single message? But it still has to be clear how to suppress each individually
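
A sketch of both ideas together: a `ValidationError` that formats itself via `__str__`, plus a nested dict keyed by check name and then geo type, which makes bulk suppression a lookup rather than string matching. Class and field names here are illustrative, not the validator's current API:

```python
from collections import defaultdict

class ValidationError(Exception):
    """One validation failure, identified by (check_name, geo_type, signal)."""

    def __init__(self, check_name, geo_type, signal, message):
        self.check_name = check_name
        self.geo_type = geo_type
        self.signal = signal
        self.message = message
        super().__init__(message)

    def __str__(self):
        # Used when building the error report.
        return f"[{self.check_name}] {self.geo_type}/{self.signal}: {self.message}"

# Nested store: errors[check_name][geo_type] -> list of ValidationError.
errors = defaultdict(lambda: defaultdict(list))

def record(err: ValidationError, suppressed: set):
    """Keep err unless its (check_name, geo_type) pair is suppressed."""
    if (err.check_name, err.geo_type) not in suppressed:
        errors[err.check_name][err.geo_type].append(err)
```
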
* Check for erratic data sources that wrongly report all zeroes; see the zero-run sketch after this list
  * E.g. the error with the Wisconsin data for the 10/26 forecasts
  * Wary of a purely static check for this
    * Are there any geo regions where this might cause false positives? E.g. small counties or MSAs, or certain signals (deaths, since it's << cases)
  * This test is partially captured by checking avgs in source vs reference data, unless erroneous zeroes continue for more than a week
  * Also partially captured by outlier checking, depending on the `size_cut` setting. If zeroes aren't outliers, then it's hard to say that they're erroneous at all.
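
One non-static angle: compare the length of the trailing run of zeroes against the geo's own history, so small regions that legitimately report zeroes don't alarm. A minimal sketch (the function name, the 7-day run length, and the 50% sparsity cutoff are all illustrative):

```python
import pandas as pd

def suspicious_zero_run(series: pd.Series, max_run: int = 7) -> bool:
    """Flag a geo's time series if it ends in an unusually long run of zeroes.

    `series` is a date-indexed series of values for one geo region/signal.
    Legitimately sparse regions (mostly zero historically) are not flagged.
    """
    values = series.sort_index()
    # Length of the trailing all-zero run.
    trailing = 0
    for v in reversed(values.tolist()):
        if v == 0:
            trailing += 1
        else:
            break
    # If the region is mostly zero anyway, a zero run is unremarkable.
    if (values == 0).mean() > 0.5:
        return False
    return trailing >= max_run
```
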
* Use known erroneous/anomalous days of source data to tune static thresholds and test behavior
* If we can't get data from the API, do we want to use substitute data for the comparative checks instead?
  * Currently, any API fetch problem means the comparative checks are skipped entirely.
  * E.g. the most recent successful API pull -- though it might end up being a couple of weeks old
* Improve performance and reduce runtime (no particular goal, just avoid being painfully slow!); see the profiling sketch after this list
  * Profiling (iterate)
  * Check if saving intermediate files will improve efficiency (currently a bottleneck in the "individual file checks" section. Parallelize?)
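
For the profiling item, Python's built-in cProfile is probably enough to find the hot spots. A minimal sketch of wrapping one validator run (the `delphi_validator.run.run_module` entry point is an assumption; substitute the actual one):

```python
import cProfile
import pstats

# Hypothetical entry point; substitute the validator's actual run function.
from delphi_validator.run import run_module

profiler = cProfile.Profile()
profiler.enable()
run_module()
profiler.disable()

# Print the 20 functions with the most cumulative time.
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(20)
```
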
* Raise errors when one p-value (per geo region, e.g.) is significant OR when a bunch of p-values for that same type of test (different geo regions, e.g.) are "close" to significant
* Correct p-values for multiple testing; see the correction sketch after this list
  * Bonferroni would be easy but is sensitive to the choice of "family" of tests; Benjamini-Hochberg is a bit more involved but is less sensitive to the choice of "family"; [comparison of the two](https://delphi-org.slack.com/archives/D01A9KNTPKL/p1603294915000500)
  * Use prophet package? Would require 2-3 months of API data. (See the Prophet sketch at the end.)
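
Both corrections are one call in statsmodels. A minimal sketch with toy per-geo p-values for a single type of test:

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# One p-value per geo region for a single type of test (toy values).
geo_ids = ["pa", "ny", "tx", "ca"]
pvals = np.array([0.008, 0.04, 0.03, 0.20])

# Benjamini-Hochberg controls the false discovery rate at alpha and is less
# sensitive than Bonferroni to how the "family" of tests is chosen.
reject_bh, pvals_bh, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
# Bonferroni for comparison: effectively tests each p against alpha / n.
reject_bonf, _, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")

for gid, p, r_bh, r_b in zip(geo_ids, pvals, reject_bh, reject_bonf):
    print(f"{gid}: p={p:.3f} BH-reject={r_bh} Bonferroni-reject={r_b}")
```
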
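
If the Prophet route were pursued, the check might fit a model per geo/signal on the API history and flag days falling outside the model's uncertainty interval. A minimal sketch, assuming a frame of daily values (older releases import as `fbprophet`; the 99% interval is a placeholder to tune):

```python
import pandas as pd
from prophet import Prophet  # pip install prophet

def prophet_outlier_days(history: pd.DataFrame) -> pd.DataFrame:
    """Flag days whose observed value falls outside Prophet's uncertainty interval.

    `history` needs Prophet's expected columns: `ds` (date) and `y` (value),
    covering the 2-3 months of API data mentioned above.
    """
    model = Prophet(interval_width=0.99, weekly_seasonality=True)
    model.fit(history)
    # Predict over the training dates themselves and compare to observations.
    forecast = model.predict(history[["ds"]])
    merged = history.merge(forecast[["ds", "yhat_lower", "yhat_upper"]], on="ds")
    outliers = merged[(merged["y"] < merged["yhat_lower"]) |
                      (merged["y"] > merged["yhat_upper"])]
    return outliers[["ds", "y", "yhat_lower", "yhat_upper"]]
```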