Commit fae9e1d

Merge pull request #154 from cmu-delphi/quidel_covidtest
pipeline for Quidel covidtest
2 parents 53aeb39 + 36ea688 commit fae9e1d

30 files changed (+38687, -0 lines changed)

quidel_covidtest/.pylintrc

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
[DESIGN]

min-public-methods=1


[MESSAGES CONTROL]

disable=R0801, C0330, E1101, E0611, C0114, C0116, C0103, R0913, R0914, W0702

quidel_covidtest/DETAILS.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# Quidel COVID Test

### Background
Starting May 9, 2020, we began receiving Quidel COVID Test data; because of limited data volume, we report it starting from May 26, 2020. The data contain a number of features for every test, including localization at the 5-digit ZIP code level, a TestDate and StorageDate, patient age, and several identifiers that uniquely identify the device on which the test was performed (SofiaSerNum), the individual test (FluTestNum), and the result (ResultID). Multiple tests are stored on each device. The present Quidel COVID Test sensor concerns the positive rate in the test results.

### Signal names
- raw_pct_positive: estimates of the percentage of positive tests among all tests
- smoothed_pct_positive: same as raw_pct_positive, but with the estimates formed by pooling together the last 7 days of data

### Estimating percent positive test proportion
Let n be the total number of COVID tests taken over a given time period in a given location (a test result can be negative, positive, or invalid). Let x be the number of tests with positive results in this location over the same time period. We are interested in estimating the percentage of positive tests, which is defined as:
```
p = 100 * x / n
```
We estimate p across 3 temporal-spatial aggregation schemes:
- daily, at the MSA (metropolitan statistical area) level;
- daily, at the HRR (hospital referral region) level;
- daily, at the state level.
We are able to make these aggregations accurately because each test is reported with its 5-digit ZIP code. We do not report estimates for individual counties, as a county typically has too few tests to make the estimated value statistically meaningful.

**MSA and HRR levels**: In a given MSA or HRR, suppose N COVID tests are taken in a certain time period, and X of them have positive results. If N >= 50, we simply use:
```
p = 100 * X / N
```
If N < 50, we borrow 50 - N pseudo-samples from the home state to shrink the estimate toward the state's mean, which gives:
```
p = 100 * [ N/50 * X/N + (50 - N)/50 * Xs/Ns ]
```
where Ns and Xs are the number of COVID tests and the number of positive COVID tests, respectively, taken in the home state in the same time period.
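
The MSA/HRR rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the module's implementation: the function names `pct_positive` and `shrunk_pct_positive` are hypothetical, and the handling of a location with zero tests (falling back entirely to the state rate) is an assumption that follows from the formula, since N/50 * X/N reduces to X/50.

```python
MIN_OBS = 50  # minimum sample size used in the text above


def pct_positive(x, n):
    """Raw percentage of positive tests: p = 100 * x / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 100.0 * x / n


def shrunk_pct_positive(x, n, x_state, n_state, min_obs=MIN_OBS):
    """Percent positive for one MSA or HRR (hypothetical helper).

    If n >= min_obs the raw estimate is used. Otherwise min_obs - n
    pseudo-samples are borrowed from the home state, which shrinks the
    estimate toward the state's positive rate x_state / n_state.
    """
    if n >= min_obs:
        return pct_positive(x, n)
    if n_state <= 0:
        raise ValueError("n_state must be positive to borrow from the state")
    # N/min_obs * X/N simplifies to X/min_obs, which also avoids dividing
    # by zero when the location has no tests at all (an assumption here).
    local_part = x / min_obs
    state_part = (min_obs - n) / min_obs * (x_state / n_state)
    return 100.0 * (local_part + state_part)
```

For example, a location with 5 positives out of 20 tests in a state with 100 positives out of 1000 tests gets 100 * (5/50 + 30/50 * 0.1), which is about 16 percent, rather than the raw 25 percent.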

**State level**: states with sample sizes smaller than a certain threshold are discarded (the threshold is temporarily set to 50). For the remaining states, whose sample sizes are large enough,
```
p = 100 * X / N
```
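
A minimal sketch of the state-level rule, assuming the counts are available as a plain dictionary; the function name `state_pct_positive` and the input layout are hypothetical and chosen only for illustration.

```python
def state_pct_positive(counts, min_obs=50):
    """Map state -> percent positive, dropping states below the threshold.

    counts: dict mapping a state code to a tuple
            (positive tests X, total tests N); this layout is an assumption.
    """
    return {
        state: 100.0 * x / n
        for state, (x, n) in counts.items()
        if n >= min_obs  # states with too few tests are discarded
    }
```

Called with `{"pa": (10, 200), "wy": (3, 20)}`, the sketch keeps only `"pa"`, because `"wy"` has fewer than 50 tests.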
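
The estimated standard error is:
```
se = 100 * sqrt{ (p/100) * (1 - p/100) / N }
```
where we assume that, at each time point, the number of positive tests follows a binomial distribution. Because p is expressed as a percentage, it is converted to a proportion inside the square root and the result is scaled back by 100.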
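
A minimal sketch of the binomial standard error for a percentage, assuming p is on the 0 to 100 scale as defined earlier, so it is converted to a proportion q = p/100 before applying sqrt(q * (1 - q) / N) and then scaled back by 100. The function name `pct_positive_se` is hypothetical.

```python
import math


def pct_positive_se(p, n):
    """Binomial standard error of a percentage p (0-100) based on n tests."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= p <= 100:
        raise ValueError("p must be a percentage between 0 and 100")
    q = p / 100.0  # convert the percentage to a proportion
    return 100.0 * math.sqrt(q * (1.0 - q) / n)
```

With p = 50 and N = 100, this gives a standard error of about 5 percentage points.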


### Temporal Pooling
Additionally, we consider smoothed estimates formed by pooling data over time. That is, daily, for each location, we first pool all data available in that location over the last 7 days, and we then recompute everything described above. Pooling the data in this way makes estimates available in more geographic areas, as many areas report very few tests per day but have enough data to report when 7 days are considered.
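
The pooling step can be sketched as a trailing-window sum of counts, after which the estimate and standard error are recomputed from the pooled totals. This is an illustration only: the function name `pool_counts` and the dictionary layout are hypothetical, and it assumes that "the last 7 days" includes the reporting day itself.

```python
from datetime import date, timedelta


def pool_counts(daily_counts, day, window=7):
    """Sum (positives, totals) over the `window` days ending on `day`.

    daily_counts: dict mapping a date to (positive tests, total tests)
                  for one location; days with no entry contribute nothing.
    """
    x_pooled = 0
    n_pooled = 0
    for offset in range(window):
        x, n = daily_counts.get(day - timedelta(days=offset), (0, 0))
        x_pooled += x
        n_pooled += n
    return x_pooled, n_pooled
```

A location reporting 10 tests per day would fall below a threshold of 50 on any single day, but its pooled total of 70 tests over 7 days is enough to report.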

quidel_covidtest/README.md

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
# Quidel COVID Test Indicators

## Running the Indicator

The indicator is run by directly executing the Python module contained in this
directory. The safest way to do this is to create a virtual environment,
install the common DELPHI tools, and then install the module and its
dependencies. To do this, run the following code from this directory:

```
python -m venv env
source env/bin/activate
pip install ../_delphi_utils_python/.
pip install .
```

All of the user-changeable parameters are stored in `params.json`. A template is
included as `params.json.template`. At a minimum, you will need to include a
password for the datadrop email account and the email address of the data sender.
Note that setting `export_end_date` to an empty string will export data through
today (GMT) minus 5 days. Setting `pull_end_date` to an empty string will pull data
through today (GMT).
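
The empty-string defaults described above can be illustrated with a short Python sketch. It is not the module's code: the helper name `resolve_end_dates` is hypothetical, and the `YYYY-MM-DD` date format for non-empty values is an assumption, so check `params.json.template` for the format the module actually expects.

```python
from datetime import datetime, timedelta, timezone

EXPORT_DAY_LAG = 5  # "today (GMT) minus 5 days", per the paragraph above


def resolve_end_dates(params, today=None):
    """Return (pull_end_date, export_end_date) as date objects.

    An empty string selects the default: today (GMT) for pulling and
    today (GMT) minus 5 days for exporting. Non-empty values are parsed
    as YYYY-MM-DD, which is an assumed format.
    """
    if today is None:
        today = datetime.now(timezone.utc).date()

    def parse(value, default):
        return datetime.strptime(value, "%Y-%m-%d").date() if value else default

    pull_end = parse(params.get("pull_end_date", ""), today)
    export_end = parse(
        params.get("export_end_date", ""),
        today - timedelta(days=EXPORT_DAY_LAG),
    )
    return pull_end, export_end
```

Passing `today` explicitly makes the defaults easy to check in a unit test without depending on the current date.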

To execute the module and produce the output datasets (by default, in
`receiving`), run the following:

```
env/bin/python -m delphi_quidel_covidtest
```

Once you are finished with the code, you can deactivate the virtual environment
and (optionally) remove the environment itself.

```
deactivate
rm -r env
```

## Testing the code

To do a static test of the code style, it is recommended to run **pylint** on
the module. To do this, run the following from the main module directory:

```
env/bin/pylint delphi_quidel_covidtest
```

The most aggressive checks are turned off; only relatively important issues
should be raised and they should be manually checked (or better, fixed).

Unit tests are also included in the module. To execute these, run the following
command from this directory:

```
(cd tests && ../env/bin/pytest --cov=delphi_quidel_covidtest --cov-report=term-missing)
```

The output will show the number of unit tests that passed and failed, along
with the percentage of code covered by the tests. None of the tests should
fail, and the number of code lines not covered by unit tests should be small and
should not include critical sub-routines.

quidel_covidtest/REVIEW.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
## Code Review (Python)

A code review of this module should include a careful look at the code and the
output. To assist in the process, but certainly not in place of it, please
check the following items.

**Documentation**

- [ ] the README.md file template is filled out and currently accurate; it is
possible to load and test the code using only the instructions given
- [ ] minimal docstrings (one line describing what the function does) are
included for all functions; full docstrings describing the inputs and expected
outputs should be given for non-trivial functions

**Structure**

- [ ] code should use 4 spaces for indentation; other style decisions are
flexible, but be consistent within a module
- [ ] any required metadata files are checked into the repository and placed
within the directory `static`
- [ ] any intermediate files that are created and stored by the module should
be placed in the directory `cache`
- [ ] final expected output files to be uploaded to the API are placed in the
`receiving` directory; output files should not be committed to the repository
- [ ] all options and API keys are passed through the file `params.json`
- [ ] template parameter file (`params.json.template`) is checked into the
code; no personal (e.g., usernames) or private (e.g., API keys) information is
included in this template file

**Testing**

- [ ] module can be installed in a new virtual environment
- [ ] pylint with the default `.pylintrc` settings run over the module produces
minimal warnings; warnings that do exist have been confirmed as false positives
- [ ] reasonably high level of unit test coverage covering all of the main logic
of the code (e.g., missing coverage for raised errors that do not currently seem
possible to reach is okay; missing coverage for options that will be needed is
not)
- [ ] all unit tests run without errors

quidel_covidtest/cache/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
*.csv
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
# -*- coding: utf-8 -*-
"""Module to pull and clean indicators from the Quidel COVID Test.

This file defines the functions that are made public by the module. As the
module is intended to be executed through the main method, these are primarily
for testing.
"""

from __future__ import absolute_import

from . import geo_maps
from . import data_tools
from . import generate_sensor
from . import export
from . import pull
from . import run
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
# -*- coding: utf-8 -*-
"""Call the function run_module when executed.

This file indicates that calling the module (`python -m MODULE_NAME`) will
call the function `run_module` found within the run.py file. There should be
no need to change this template.
"""

from .run import run_module  # pragma: no cover

run_module()  # pragma: no cover
