|
1 | 1 | ---
|
2 | 2 | title: Quidel
|
3 | | -parent: Inactive Signals |
| 3 | +parent: Data Sources and Signals |
4 | 4 | grand_parent: COVIDcast API
|
5 | 5 | ---
|
6 | 6 |
|
7 | 7 | # Quidel
|
| 8 | +{: .no_toc} |
8 | 9 |
|
9 | 10 | * **Source name:** `quidel`
|
| 11 | + |
| 12 | +## Table of contents |
| 13 | +{: .no_toc .text-delta} |
| 14 | + |
| 15 | +1. TOC |
| 16 | +{:toc} |
| 17 | + |
| 18 | +## COVID-19 Tests |
| 19 | + |
| 20 | +* **First issued:** 27 July 2020 |
| 21 | +* **Number of data revisions since 19 May 2020:** 0 |
| 22 | +* **Date of last change:** Never |
| 23 | +* **Available for:** hrr, msa, state (see [geography coding docs](../covidcast_geography.md)) |
| 24 | + |
| 25 | +This data source is based on COVID-19 antigen tests, provided to us by Quidel, Inc. When |
| 26 | +a patient (whether at a doctor’s office, clinic, or hospital) has COVID-like |
| 27 | +symptoms, doctors may order an antigen test. An antigen test can detect parts of |
| 28 | +the virus that are present during an active infection. This is in contrast with |
| 29 | +antibody tests, which detect parts of the immune system that react to the virus, |
| 30 | +but which persist long after the infection has passed. Quidel began providing us |
| 31 | +with test data starting May 9, 2020, and data volume increased to statistically |
| 32 | +meaningful levels starting May 26, 2020. |
| 33 | + |
| 34 | +| Signal | Description | |
| 35 | +| --- | --- | |
| 36 | +| `covid_ag_raw_pct_positive` | Percentage of antigen tests that were positive for COVID-19, with no smoothing applied. | |
| 37 | +| `covid_ag_smoothed_pct_positive` | Percentage of antigen tests that were positive for COVID-19, smoothed by pooling together the last 7 days of tests. | |
| 38 | + |
| 39 | +### Estimation |
| 40 | + |
| 41 | +The source data from which we derive our estimates contains a number of features |
| 42 | +for every test, including location at the 5-digit ZIP code level, a TestDate and |
| 43 | +StorageDate, patient age, and unique identifiers for the device on which the |
| 44 | +test was performed, the individual test, and the result. Multiple tests are |
| 45 | +stored on each device. |
| 46 | + |
| 47 | +Let $$n$$ be the number of total COVID tests taken over a given time period and a |
| 48 | +given location (the test result can be negative, positive, or invalid). Let $$x$$ be the |
| 49 | +number of tests taken with positive results in this location over the given time |
| 50 | +period. We are interested in estimating the percentage of positive tests which |
| 51 | +is defined as: |
| 52 | + |
| 53 | +$$ |
| 54 | +p = \frac{100 x}{n} |
| 55 | +$$ |
| 56 | + |
| 57 | +We estimate $$p$$ across three spatio-temporal aggregation schemes: |
| 58 | +- daily, at the MSA (metropolitan statistical area) level; |
| 59 | +- daily, at the HRR (hospital referral region) level; |
| 60 | +- daily, at the state level. |
| 61 | + |
| 62 | +**MSA and HRR levels**: In a given MSA or HRR, suppose $$N$$ COVID tests are taken |
| 63 | +in a certain time period and $$X$$ of them have positive |
| 64 | +results. If $$N \geq 50$$, we simply use: |
| 65 | + |
| 66 | +$$ |
| 67 | +p = \frac{100 X}{N} |
| 68 | +$$ |
| 69 | + |
| 70 | +If $$N < 50$$, we borrow $$50 - N$$ pseudo-samples from the home state to shrink the |
| 71 | +estimate toward the state's mean, which gives: |
| 72 | + |
| 73 | +$$ |
| 74 | +p = 100 \left( \frac{N}{50} \frac{X}{N} + \frac{50 - N}{50} \frac{X_s}{N_s} \right) |
| 75 | +$$ |
| 76 | + |
| 77 | +where $$N_s$$ and $$X_s$$ are the number of COVID tests and the number of COVID tests |
| 78 | +with positive results in the home state over the same time period. |
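
The MSA/HRR rule above can be sketched in Python as follows. This is a minimal illustration, not the pipeline's actual code: the function name and arguments are hypothetical, and returning `None` when the home state has no tests is an assumption, since the text does not spell out that case here.

```python
MIN_SAMPLES = 50  # threshold stated in the text


def pct_positive_with_shrinkage(x, n, x_state, n_state, min_samples=MIN_SAMPLES):
    """Percentage of positive tests for one MSA or HRR.

    x, n             -- positive tests and total tests in the MSA/HRR
    x_state, n_state -- positive tests and total tests in the home state
    """
    if n >= min_samples:
        # Enough local data: plain percentage.
        return 100.0 * x / n
    if n_state <= 0:
        # Assumption: nothing to borrow from, so no estimate is produced.
        return None
    state_rate = x_state / n_state
    # (N/50)(X/N) + ((50 - N)/50)(X_s/N_s) simplifies to
    # (X + (50 - N) * X_s/N_s) / 50, which also avoids dividing by N = 0.
    return 100.0 * (x + (min_samples - n) * state_rate) / min_samples
```

For example, with 5 positives out of 20 local tests and a state rate of 10%, the 30 borrowed pseudo-samples pull the raw 25% down to roughly 16%.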
| 79 | + |
| 80 | +**State level**: States with fewer than 50 tests are discarded. For the |
| 81 | +remaining states, |
| 82 | + |
| 83 | +$$ |
| 84 | +p = \frac{100 X}{N} |
| 85 | +$$ |
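
A minimal Python sketch of the state-level rule, assuming a hypothetical helper that returns `None` for a discarded state:

```python
def state_pct_positive(x, n, min_samples=50):
    """Percentage of positive tests for a state, or None if it has too few tests."""
    if n < min_samples:
        # States below the threshold are discarded rather than shrunk.
        return None
    return 100.0 * x / n
```

Unlike the MSA and HRR case, no samples are borrowed here: a state either has enough tests or is omitted.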
| 86 | + |
| 87 | +#### Standard Error |
| 88 | + |
| 89 | +We assume the estimates for each time point follow a binomial distribution. Because |
| 90 | +$$p$$ is expressed as a percentage, the estimated standard error is then: |
| 91 | + |
| 92 | +$$ |
| 93 | +\text{se} = 100 \sqrt{ \frac{\frac{p}{100}\left(1-\frac{p}{100}\right)}{N} } |
| 94 | +$$ |
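
The standard error can be sketched in Python as below. The sketch assumes that $$p$$ is on the 0 to 100 percentage scale defined earlier, so it converts to a proportion before applying the binomial variance and reports the result in percentage points. The function name is hypothetical.

```python
import math


def binomial_se_pct(p_pct, n):
    """Binomial standard error, in percentage points, of a percentage estimate.

    p_pct -- estimated percentage of positive tests (0 to 100)
    n     -- number of tests behind the estimate
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = p_pct / 100.0  # convert percentage to proportion
    return 100.0 * math.sqrt(p * (1.0 - p) / n)
```

With 100 tests and 50% positive, this gives a standard error of about 5 percentage points.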
| 95 | + |
| 96 | +#### Smoothing |
| 97 | + |
| 98 | +Smoothed estimates are formed by pooling data over time. That is, daily, for |
| 99 | +each location, we first pool all data available in that location over the last 7 |
| 100 | +days, and we then recompute everything described in the last two |
| 101 | +subsections. Pooling in this way makes estimates available in more geographic |
| 102 | +areas, as many areas report very few tests per day, but have enough data to |
| 103 | +report when 7 days are considered. |
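
The pooling step can be sketched as follows. The data layout (a dict from date to counts) and the function name are hypothetical, and the sketch assumes that "the last 7 days" means the target day plus the six days before it.

```python
from datetime import date, timedelta


def pooled_counts(daily, day, window=7):
    """Sum positives and tests over the `window` days ending on `day`.

    daily -- dict mapping a date to a (positives, tests) tuple;
             days with no reported tests may simply be absent
    """
    x_total = 0
    n_total = 0
    for offset in range(window):
        x, n = daily.get(day - timedelta(days=offset), (0, 0))
        x_total += x
        n_total += n
    return x_total, n_total
```

The pooled counts then take the place of the daily counts in the estimate and standard error calculations, which is why a location with only a handful of tests per day can still reach the 50-test threshold.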
| 104 | + |
| 105 | +### Limitations |
| 106 | + |
| 107 | +This data source is based on data provided to us by a lab testing company. They can report on a portion of United States COVID-19 Antigen tests, but not all of them, and so this source only represents those tests known to them. Their coverage may vary across the United States. |
| 108 | + |
| 109 | +### Missingness |
| 110 | + |
| 111 | +When fewer than 50 tests are reported in a state on a specific day, no data is |
| 112 | +reported for that area on that day; an API query for all reported states on that |
| 113 | +day will not include it. |
| 114 | + |
| 115 | +When fewer than 50 tests are reported in an HRR or MSA on a specific day, and |
| 116 | +not enough samples can be filled in from the parent state, no data is reported |
| 117 | +for that area on that day; an API query for all reported geographic areas on |
| 118 | +that day will not include it. |
| 119 | + |
| 120 | +### Lag and Backfill |
| 121 | + |
| 122 | +Because testing centers may report their data to Quidel several days after the tests |
| 123 | +occur, these signals are typically available with 5-6 days of lag. This |
| 124 | +means that estimates for a specific day first become available 5-6 days |
| 125 | +later. |
| 126 | + |
| 127 | +The amount of lag in reporting can vary, and not all tests are reported with the |
| 128 | +same lag. After we first report estimates for a specific date, further data may |
| 129 | +arrive about tests that occurred on that date, sometimes six weeks later or |
| 130 | +more. When this happens, we issue new estimates for those dates. This means that |
| 131 | +a reported estimate for, say, June 10th may first be available in the API on |
| 132 | +June 14th and subsequently revised on June 16th. |
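
To illustrate how revisions interact, the sketch below keeps only the most recently issued estimate for each date. The record layout is hypothetical, not the API's response format, and the values in the usage example are made up for illustration.

```python
def latest_estimates(records):
    """Keep the most recently issued value for each reference date.

    records -- iterable of (reference_date, issue_date, value) tuples,
               with dates as ISO strings so they compare chronologically
    """
    latest = {}
    for ref_date, issue_date, value in records:
        current = latest.get(ref_date)
        if current is None or issue_date > current[0]:
            latest[ref_date] = (issue_date, value)
    return {ref_date: value for ref_date, (_, value) in latest.items()}


# Usage with made-up values: June 10th is first issued, then revised.
records = [
    ("2020-06-10", "2020-06-14", 5.0),
    ("2020-06-10", "2020-06-16", 5.4),
    ("2020-06-11", "2020-06-15", 4.8),
]
current = latest_estimates(records)
```

Here `current` holds the revised value for June 10th and the only available value for June 11th.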
| 133 | + |
| 134 | + |
| 135 | +## Flu Tests |
| 136 | + |
| 137 | +* **First issued:** 20 April 2020 |
| 138 | +* **Last issued:** 19 May 2020 |
10 | 139 | * **Number of data revisions since 19 May 2020:** 0
|
11 | 140 | * **Date of last change:** Never
|
12 | 141 | * **Available for:** msa, state (see [geography coding docs](../covidcast_geography.md))
|