
Commit e7cd101

Merge pull request #169 from krivard/docs/v1.7
Docs/v1.7
2 parents e172ced + e6a5a82 commit e7cd101

3 files changed

+151
-22
lines changed

docs/api/covidcast-signals/quidel.md

Lines changed: 130 additions & 1 deletion
@@ -1,12 +1,141 @@
11
---
22
title: Quidel
3-
parent: Inactive Signals
3+
parent: Data Sources and Signals
44
grand_parent: COVIDcast API
55
---
66

77
# Quidel
8+
{: .no_toc}
89

910
* **Source name:** `quidel`
11+
12+
## Table of contents
13+
{: .no_toc .text-delta}
14+
15+
1. TOC
16+
{:toc}
17+
18+
## COVID-19 Tests
19+
20+
* **First issued:** 27 July 2020
21+
* **Number of data revisions since 19 May 2020:** 0
22+
* **Date of last change:** Never
23+
* **Available for:** hrr, msa, state (see [geography coding docs](../covidcast_geography.md))
24+
25+
This data source is based on COVID-19 antigen tests, provided to us by Quidel, Inc. When
26+
a patient (whether at a doctor’s office, clinic, or hospital) has COVID-like
27+
symptoms, doctors may order an antigen test. An antigen test can detect parts of
28+
the virus that are present during an active infection. This is in contrast with
29+
antibody tests, which detect parts of the immune system that react to the virus,
30+
but which persist long after the infection has passed. Quidel began providing us
31+
with test data starting May 9, 2020, and data volume increased to statistically
32+
meaningful levels starting May 26, 2020.
33+
34+
| Signal | Description |
35+
| --- | --- |
36+
| `covid_ag_raw_pct_positive` | Percentage of antigen tests that were positive for COVID-19, with no smoothing applied. |
37+
| `covid_ag_smoothed_pct_positive` | Percentage of antigen tests that were positive for COVID-19, smoothed by pooling together the last 7 days of tests. |
38+
39+
### Estimation
40+
41+
The source data from which we derive our estimates contains a number of features
42+
for every test, including location at the 5-digit ZIP Code level, a TestDate and
43+
StorageDate, patient age, and unique identifiers for the device on which the
44+
test was performed, the individual test, and the result. Multiple tests are
45+
stored on each device.
46+
47+
Let $$n$$ be the total number of COVID tests taken over a given time period in a
48+
given location (the test result can be negative, positive, or invalid). Let $$x$$ be the
49+
number of tests taken with positive results in this location over the given time
50+
period. We are interested in estimating the percentage of positive tests which
51+
is defined as:
52+
53+
$$
54+
p = \frac{100 x}{n}
55+
$$
56+
57+
We estimate p across 3 temporal-spatial aggregation schemes:
58+
- daily, at the MSA (metropolitan statistical area) level;
59+
- daily, at the HRR (hospital referral region) level;
60+
- daily, at the state level.
61+
62+
**MSA and HRR levels**: In a given MSA or HRR, suppose $$N$$ COVID tests are taken
63+
in a certain time period, and let $$X$$ be the number of those tests with positive
64+
results. If $$N \geq 50$$, we simply use:
65+
66+
$$
67+
p = \frac{100 X}{N}
68+
$$
69+
70+
If $$N < 50$$, we borrow $$50 - N$$ pseudo-samples from its home state to shrink the
71+
estimate to the state's mean, which means:
72+
73+
$$
74+
p = 100 \left( \frac{N}{50} \frac{X}{N} + \frac{50 - N}{50} \frac{X_s}{N_s} \right)
75+
$$
76+
77+
where $$N_s, X_s$$ are the number of COVID tests and the number of COVID tests
78+
with positive results, respectively, taken in the home state over the same time period.
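
The MSA/HRR rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and argument names are hypothetical, and the shrinkage term is written in the algebraically equivalent form $$100 (X + (50 - N) X_s / N_s) / 50$$, which avoids dividing by $$N$$ when no local tests were reported.

```python
MIN_OBS = 50  # minimum sample size named in the text


def pct_positive_shrunk(x: int, n: int, x_state: int, n_state: int,
                        min_obs: int = MIN_OBS) -> float:
    """Percent positive for one MSA or HRR (hypothetical helper).

    x, n             positive tests and total tests in the MSA/HRR
    x_state, n_state positive tests and total tests in the home state
    """
    if n >= min_obs:
        # Enough local tests: plain percentage.
        return 100.0 * x / n
    if n_state <= 0:
        raise ValueError("home state has no tests to borrow from")
    # Borrow (min_obs - n) pseudo-samples at the state's positivity rate.
    borrowed_positives = (min_obs - n) * x_state / n_state
    return 100.0 * (x + borrowed_positives) / min_obs


# 25 local tests with 5 positives; home state at 10% positivity.
print(pct_positive_shrunk(5, 25, 100, 1000))
```

With half of the 50 samples coming from the location (20% positive) and half from the state (10% positive), the result sits midway between the two rates.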
79+
80+
**State level**: States with fewer than 50 tests are discarded. For the
81+
rest of the states with sufficient samples,
82+
83+
$$
84+
p = \frac{100 X}{N}
85+
$$
86+
87+
#### Standard Error
88+
89+
We assume the estimates for each time point follow a binomial distribution. The
90+
estimated standard error then is:
91+
92+
$$
93+
\text{se} = 100 \sqrt{ \frac{\frac{p}{100}\left(1-\frac{p}{100}\right)}{N} }
94+
$$
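
A minimal sketch of the binomial standard error, assuming $$p$$ is on the 0-100 percentage scale defined earlier, so it is converted to a proportion inside the square root and the result is reported in percentage points. The function name is illustrative only.

```python
import math


def binomial_se_pct(p_pct: float, n: float) -> float:
    """Binomial standard error, in percentage points (hypothetical helper).

    p_pct is the estimated percentage of positive tests (0 to 100);
    n is the number of tests behind the estimate.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    prop = p_pct / 100.0  # rescale the percentage to a proportion
    return 100.0 * math.sqrt(prop * (1.0 - prop) / n)


# An estimate of 50% from 100 tests.
print(binomial_se_pct(50.0, 100))
```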
95+
96+
#### Smoothing
97+
98+
Smoothed estimates are formed by pooling data over time. That is, daily, for
99+
each location, we first pool all data available in that location over the last 7
100+
days, and we then recompute everything described in the last two
101+
subsections. Pooling in this way makes estimates available in more geographic
102+
areas, as many areas report very few tests per day, but have enough data to
103+
report when 7 days are considered.
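
The pooling step can be illustrated as follows. This sketch assumes that "the last 7 days" means the target day plus the six days before it, and that days with no report contribute zero tests; neither detail is stated in the text. The data layout (a dict keyed by date) and the function name are hypothetical.

```python
from datetime import date, timedelta


def pool_last_7_days(daily_counts, day, window=7):
    """Sum (positives, tests) over the `window` days ending on `day`, inclusive.

    daily_counts maps a date to a (positives, tests) tuple for one location.
    Missing days are treated as (0, 0).
    """
    x_total = 0
    n_total = 0
    for offset in range(window):
        x, n = daily_counts.get(day - timedelta(days=offset), (0, 0))
        x_total += x
        n_total += n
    return x_total, n_total


# Ten days of made-up counts: 1 positive out of 10 tests each day.
counts = {date(2020, 6, 1) + timedelta(days=i): (1, 10) for i in range(10)}
print(pool_last_7_days(counts, date(2020, 6, 10)))
```

The pooled totals then take the place of the single-day counts when the estimate and standard error are recomputed. In this example no single day reaches 50 tests, but the 7-day pool does.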
104+
105+
### Limitations
106+
107+
This data source is based on data provided to us by a lab testing company. The company can report on only a portion of United States COVID-19 antigen tests, so this source represents only the tests known to it. Its coverage may vary across the United States.
108+
109+
### Missingness
110+
111+
When fewer than 50 tests are reported in a state on a specific day, no data is
112+
reported for that area on that day; an API query for all reported states on that
113+
day will not include it.
114+
115+
When fewer than 50 tests are reported in an HRR or MSA on a specific day, and
116+
not enough samples can be filled in from the parent state, no data is reported
117+
for that area on that day; an API query for all reported geographic areas on
118+
that day will not include it.
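
As a small illustration of the state-level rule, the sketch below returns `None` when the threshold is not met, mirroring the fact that no row is reported. The function name is hypothetical. The HRR/MSA case is not modeled here, because the text does not spell out how many samples the parent state must be able to supply.

```python
def state_pct_positive(x, n, min_obs=50):
    """Percent positive for a state, or None when too few tests were reported."""
    if n < min_obs:
        return None  # no estimate is reported for this state on this day
    return 100.0 * x / n


print(state_pct_positive(10, 49))
print(state_pct_positive(10, 50))
```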
119+
120+
### Lag and Backfill
121+
122+
Because testing centers may report test results to Quidel several days after the tests
123+
occur, these signals are typically available with 5-6 days of lag. This
124+
means that estimates for a specific day first become available 5-6 days
125+
later.
126+
127+
The amount of lag in reporting can vary, and not all tests are reported with the
128+
same lag. After we first report estimates for a specific date, further data may
129+
arrive about tests that occurred on that date, sometimes six weeks later or
130+
more. When this happens, we issue new estimates for those dates. This means that
131+
a reported estimate for, say, June 10th may first be available in the API on
132+
June 14th and subsequently revised on June 16th.
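
To make the effect of backfill concrete, here is a sketch of keeping only the most recently issued estimate for each date. The record layout `(time_value, issue, value)` and the function name are assumptions made for this illustration, and the values are invented; this does not show how the API itself stores or returns revisions.

```python
from datetime import date


def latest_estimates(records):
    """For each reference date, keep the value from the most recent issue.

    records is an iterable of (time_value, issue, value) tuples, where
    time_value is the date the tests occurred and issue is the date the
    estimate was published.
    """
    latest = {}
    for time_value, issue, value in records:
        if time_value not in latest or issue > latest[time_value][0]:
            latest[time_value] = (issue, value)
    return {t: value for t, (_issue, value) in latest.items()}


# Invented values: an estimate for June 10th issued on June 14th,
# then revised on June 16th.
records = [
    (date(2020, 6, 10), date(2020, 6, 14), 4.2),
    (date(2020, 6, 10), date(2020, 6, 16), 4.5),
]
print(latest_estimates(records))
```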
133+
134+
135+
## Flu Tests
136+
137+
* **First issued:** 20 April 2020
138+
* **Last issued:** 19 May 2020
10139
* **Number of data revisions since 19 May 2020:** 0
11140
* **Date of last change:** Never
12141
* **Available for:** msa, state (see [geography coding docs](../covidcast_geography.md))

docs/api/covidcast_signals.md

Lines changed: 15 additions & 14 deletions
@@ -21,20 +21,21 @@ data in this API are listed in the [API changelog](covidcast_changelog.md).
2121
The following signals are currently displayed on [the public COVIDcast
2222
map](https://covidcast.cmu.edu/):
2323

24-
| Name | Source | Signal |
25-
| --- | --- | --- |
26-
| Doctor's Visits | [`doctor-visits`](covidcast-signals/doctor-visits.md) | `smoothed_adj_cli` |
27-
| Hospital Admissions | [`hospital-admissions`](covidcast-signals/hospital-admissions.md) | `smoothed_adj_covid19` |
28-
| Symptoms (Facebook) | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_cli` |
29-
| Symptoms in Community (Facebook) | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_hh_cmnty_cli` |
30-
| Away from Home 6hr+ (SafeGraph) | [`safegraph`](covidcast-signals/safegraph.md) | `full_time_work_prop` |
31-
| Away from Home 3-6hr (SafeGraph) | [`safegraph`](covidcast-signals/safegraph.md) | `part_time_work_prop` |
32-
| Search Trends (Google) | [`ght`](covidcast-signals/ght.md) | `smoothed_search` |
33-
| Combined | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `nmf_day_doc_fbc_fbs_ght` |
34-
| Cases | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `confirmed_7dav_incidence_num` |
35-
| Cases per 100,000 People | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `confirmed_7dav_incidence_prop` |
36-
| Deaths | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `deaths_7dav_incidence_num` |
37-
| Deaths per 100,000 People | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `deaths_7dav_incidence_prop` |
24+
| Kind | Name | Source | Signal |
25+
| ---- | ---- | ------ | ------ |
26+
| Public Behavior | Away from Home 6hr+ (SafeGraph) | [`safegraph`](covidcast-signals/safegraph.md) | `full_time_work_prop` |
27+
| Public Behavior | Away from Home 3-6hr (SafeGraph) | [`safegraph`](covidcast-signals/safegraph.md) | `part_time_work_prop` |
28+
| Public Behavior | Search Trends (Google) | [`ght`](covidcast-signals/ght.md) | `smoothed_search` |
29+
| Early Indicators | Symptoms (Facebook) | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_cli` |
30+
| Early Indicators | Symptoms in Community (Facebook) | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_hh_cmnty_cli` |
31+
| Early Indicators | Doctor's Visits | [`doctor-visits`](covidcast-signals/doctor-visits.md) | `smoothed_adj_cli` |
32+
| Early Indicators | Combined | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `nmf_day_doc_fbc_fbs_ght` |
33+
| Late Indicators | Test Positivity Rate | [`quidel`](covidcast-signals/quidel.md) | `covid_ag_smoothed_pct_positive` |
34+
| Late Indicators | Cases | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `confirmed_7dav_incidence_num` |
35+
| Late Indicators | Cases per 100,000 People | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `confirmed_7dav_incidence_prop` |
36+
| Late Indicators | Deaths | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `deaths_7dav_incidence_num` |
37+
| Late Indicators | Deaths per 100,000 People | [`indicator-combination`](covidcast-signals/indicator-combination.md) | `deaths_7dav_incidence_prop` |
38+
| Late Indicators | Hospital Admissions | [`hospital-admissions`](covidcast-signals/hospital-admissions.md) | `smoothed_adj_covid19` |
3839

3940
## All Available Sources and Signals
4041

docs/symptom-survey/survey-files.md

Lines changed: 6 additions & 7 deletions
@@ -27,19 +27,18 @@ where the data is hosted.
2727

2828
## Naming Conventions
2929

30-
All dates in filenames are of the form `YYYY_mm_dd`.
31-
3230
Cumulative files:
3331

34-
cvid_responses_{from}_-_{to}.csv.gz
32+
{YYYY_mm}.tar
3533

3634
Incremental files:
3735

38-
cvid_responses_{for}_recordedby_{recorded}.csv
36+
cvid_responses_{for}_recordedby_{recorded}.csv.gz
3937

40-
`from`, `to`, and `for` refer to the day the survey response was started, in the
41-
Pacific time zone (UTC - 7). `recorded` refers to the day survey data was
42-
retrieved; see the [lag policy](#lag-policy) for more details.
38+
Dates in incremental filenames are of the form `YYYY_mm_dd`. `for` refers to the
39+
day the survey response was started, in the Pacific time zone (UTC -
40+
7). `recorded` refers to the day survey data was retrieved; see the [lag
41+
policy](#lag-policy) for more details.
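
The incremental naming convention can be parsed with a short regular expression. This is a sketch that assumes the `.csv.gz` form of the filename and the `YYYY_mm_dd` date format described above; the function and group names are hypothetical.

```python
import re
from datetime import date, datetime
from typing import Optional, Tuple

# Matches cvid_responses_{for}_recordedby_{recorded}.csv.gz
INCREMENTAL_RE = re.compile(
    r"^cvid_responses_(?P<for_day>\d{4}_\d{2}_\d{2})"
    r"_recordedby_(?P<recorded_day>\d{4}_\d{2}_\d{2})\.csv\.gz$"
)


def parse_incremental_name(filename: str) -> Optional[Tuple[date, date]]:
    """Return (for_day, recorded_day), or None if the name does not match."""
    match = INCREMENTAL_RE.match(filename)
    if match is None:
        return None
    fmt = "%Y_%m_%d"
    for_day = datetime.strptime(match.group("for_day"), fmt).date()
    recorded_day = datetime.strptime(match.group("recorded_day"), fmt).date()
    return for_day, recorded_day


print(parse_incremental_name(
    "cvid_responses_2020_06_10_recordedby_2020_06_14.csv.gz"
))
```

Parsing both dates makes it straightforward to group files by `recorded` day and keep only the most recent set.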
4342

4443
Every day, we write response files for *all* days of data, with today's
4544
`recorded` date. You need only load the most recent set of `recorded` files to
