Commit d9a45bb

Merge branch 'dev' into krivard/delete_csvs
2 parents b88f364 + 92c73be commit d9a45bb

15 files changed, +261 -88 lines changed

.bumpversion.cfg

+1 -1
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 0.3.9
+current_version = 0.3.10
 commit = False
 tag = False

docs/api/covidcast-signals/quidel.md

+43 -37
@@ -20,7 +20,7 @@ grand_parent: COVIDcast Epidata API
 * **Earliest issue available:** July 29, 2020
 * **Number of data revisions since May 19, 2020:** 1
 * **Date of last change:** October 22, 2020
-* **Available for:** hrr, msa, state (see [geography coding docs](../covidcast_geography.md))
+* **Available for:** county, hrr, msa, state, HHS, nation (see [geography coding docs](../covidcast_geography.md))
 * **Time type:** day (see [date format docs](../covidcast_times.md))
 * **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)

@@ -68,60 +68,66 @@ $$
 p = \frac{100 x}{n}
 $$

-We estimate p across 3 temporal-spatial aggregation schemes:
+We estimate p across 6 temporal-spatial aggregation schemes:
+- daily, at the county level;
 - daily, at the MSA (metropolitan statistical area) level;
 - daily, at the HRR (hospital referral region) level;
-- daily, at the state level.
+- daily, at the state level;
+- daily, at the HHS level;
+- daily, at the US national level.

-**MSA and HRR levels**: In a given MSA or HRR, suppose $$N$$ COVID tests are taken
-in a certain time period, $$X$$ is the number of tests taken with positive
-results.
+#### Standard Error

-For raw signals:
-- if $$N \geq 50$$, we simply use:
+We assume the estimates for each time point follow a binomial distribution. The
+estimated standard error then is:

 $$
-p = \frac{100 X}{N}
+\text{se} = 100 \sqrt{ \frac{\frac{p}{100}(1- \frac{p}{100})}{N} }
 $$

-For smoothed signals, before taking the temporal pooling average,
-- if $$N \geq 50$$, we also use:
+#### Smoothing
+
+We add two kinds of smoothing to the smoothed signals:
+
+##### Temporal Smoothing
+Smoothed estimates are formed by pooling data over time. That is, daily, for
+each location, we first pool all data available in that location over the last 7
+days, and we then recompute everything described in the two subsections above.
+
+Pooling in this way makes estimates available in more geographic areas, as many areas
+report very few tests per day, but have enough data to report when 7 days are considered.
+
+##### Geographical Smoothing
+
+**County, MSA and HRR levels**: In a given County, MSA or HRR, suppose $$N$$ COVID tests
+are taken in a certain time period, $$X$$ is the number of tests taken with positive
+results.
+
+
+For smoothed signals, after taking the temporal pooling,
+- if $$N \geq 50$$, we still use:
 $$
 p = \frac{100 X}{N}
 $$
-- if $$25 \leq N < 50$$, we lend $$50 - N$$ fake samples from its home state to shrink the
+- if $$25 \leq N < 50$$, we lend $$50 - N$$ fake samples from its parent state to shrink the
 estimate to the state's mean, which means:
 $$
 p = 100 \left( \frac{N}{50} \frac{X}{N} + \frac{50 - N}{50} \frac{X_s}{N_s} \right)
 $$
 where $$N_s, X_s$$ are the number of COVID tests and the number of COVID tests
-taken with positive results taken in its home state in the same time period.
+taken with positive results taken in its parent state in the same time period.
+A parent state is defined as the state with the largest proportion of the population
+in this county/MSA/HRR.

-**State level**: the states with fewer than 50 tests are discarded. For the
-rest of the states with sufficient samples,
+Counties with sample sizes smaller than 50 are merged into megacounties for
+the raw signals; counties with sample sizes smaller than 25 are merged into megacounties for
+the smoothed signals.

+**State level, HHS level, National level**: locations with fewer than 50 tests are discarded. For the remaining locations,
 $$
 p = \frac{100 X}{N}
 $$

-#### Standard Error
-
-We assume the estimates for each time point follow a binomial distribution. The
-estimated standard error then is:
-
-$$
-\text{se} = 100 \sqrt{ \frac{\frac{p}{100}(1- \frac{p}{100})}{N} }
-$$
-
-#### Smoothing
-
-Smoothed estimates are formed by pooling data over time. That is, daily, for
-each location, we first pool all data available in that location over the last 7
-days, and we then recompute everything described in the last two
-subsections. Pooling in this way makes estimates available in more geographic
-areas, as many areas report very few tests per day, but have enough data to
-report when 7 days are considered.
-
 ### Lag and Backfill

 Because testing centers may report their data to Quidel several days after they
@@ -142,13 +148,13 @@ This data source is based on data provided to us by a lab testing company. They

 ### Missingness

-When fewer than 50 tests are reported in a state on a specific day, no data is
+When fewer than 50 tests are reported in a state/a HHS region/US on a specific day, no data is
 reported for that area on that day; an API query for all reported states on that
 day will not include it.

-When fewer than 50 tests are reported in an HRR or MSA on a specific day, and
-not enough samples can be filled in from the parent state, no data is reported
-for that area on that day; an API query for all reported geographic areas on
+When fewer than 50 tests are reported in a county, HRR or MSA on a specific day, and
+not enough samples can be filled in from the parent state for smoothed signals specifically,
+no data is reported for that area on that day; an API query for all reported geographic areas on
 that day will not include it.

 ## Flu Tests
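
Reader's note on the quidel.md changes above: the documented rules (7-day pooling, percent positive, binomial standard error, and borrowing from the parent state when 25 <= N < 50) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code in covidcast-indicators. All function and constant names are hypothetical, and the doc does not say which sample size feeds the standard error of a shrunken estimate, so the sketch leaves that case out.

```python
import math
from typing import Optional, Sequence, Tuple

MIN_OBS = 50       # sample-size threshold stated in the docs
POOL_MIN_OBS = 25  # below this, the docs describe no borrowing for smoothed signals


def pool_last_7_days(daily: Sequence[Tuple[int, int]]) -> Tuple[int, int]:
    """Sum (positives, tests) over the most recent 7 entries of a daily series."""
    recent = daily[-7:]
    return sum(x for x, _ in recent), sum(n for _, n in recent)


def pct_positive(x: float, n: float) -> float:
    """p = 100 * X / N."""
    return 100.0 * x / n


def standard_error(p: float, n: float) -> float:
    """Binomial standard error on the percentage scale, as in the doc's formula."""
    frac = p / 100.0
    return 100.0 * math.sqrt(frac * (1.0 - frac) / n)


def smoothed_pct_positive(
    x: float, n: float, x_state: float, n_state: float
) -> Optional[float]:
    """Geographically smoothed estimate for a county, MSA, or HRR.

    Returns None when the location would not be reported (N < 25, or no
    parent-state tests to borrow from; the latter check is an assumption).
    """
    if n >= MIN_OBS:
        return pct_positive(x, n)
    if n >= POOL_MIN_OBS and n_state > 0:
        own = (n / MIN_OBS) * (x / n)
        borrowed = ((MIN_OBS - n) / MIN_OBS) * (x_state / n_state)
        return 100.0 * (own + borrowed)
    return None
```

With 30 local tests, the local rate gets weight 30/50 and the parent state's rate gets weight 20/50, which is what "lending 50 - N fake samples" amounts to.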

docs/api/covidcast_signals.md

+20 -27
@@ -8,47 +8,40 @@ has_children: true
 # Delphi's COVID-19 Data Sources and Signals

 Delphi's COVID-19 Surveillance Streams data includes the following data sources.
-Data from these sources is expected to be updated daily. You can use the
-[`covidcast_meta`](covidcast_meta.md) API endpoint to get summary information
+Data from most of these sources is typically updated daily. You can use the
+[`covidcast_meta`](covidcast_meta.md) endpoint to get summary information
 about the ranges of the different attributes for the different data sources.

 The API for retrieving data from these sources is described in the
-[COVIDcast API endpoint documentation](covidcast.md). Changes and corrections to
-data in this API are listed in the [API changelog](covidcast_changelog.md).
+[COVIDcast endpoint documentation](covidcast.md). Changes and corrections to
+data from this endpoint are listed in the [changelog](covidcast_changelog.md).

 To obtain many of these signals and update them daily, Delphi has written
 extensive software to obtain data from various sources, aggregate the data,
 calculate statistical estimates, and format the data to be shared through the
-COVIDcast API. This code is [open source and available on
-GitHub](https://github.com/cmu-delphi/covidcast-indicators), and contributions
-are welcome.
+COVIDcast endpoint of the Delphi Epidata API. This code is
+[open source and available on GitHub](https://github.com/cmu-delphi/covidcast-indicators),
+and contributions are welcome.

-## COVIDcast Map Signals
+## COVIDcast Dashboard Signals

 The following signals are currently displayed on [the public COVIDcast
-map](https://delphi.cmu.edu/covidcast/) and available in its [data export
-tool](https://delphi.cmu.edu/covidcast/export/):
+dashboard](https://delphi.cmu.edu/covidcast/):

 | Kind | Name | Source | Signal |
 | ---- | ---- | ------ | ------ |
-| Public Behavior | At Away Location 6hr+ | [`safegraph`](covidcast-signals/safegraph.md) | `full_time_work_prop_7dav` |
-| Public Behavior | At Away Location 3-6hr | [`safegraph`](covidcast-signals/safegraph.md) | `part_time_work_prop_7dav` |
-| Public Behavior | Bar Visits | [`safegraph`](covidcast-signals/safegraph.md) | `bars_visit_prop` |
-| Public Behavior | Restaurant Visits | [`safegraph`](covidcast-signals/safegraph.md) | `restaurant_visit_prop` |
-| Public Behavior | People Wearing Masks | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_wearing_mask_7d` |
-| Public Behavior | Vaccine Acceptance | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_covid_vaccinated_or_accept` |
-| Public Behavior | COVID Symptom Searches on Google | [`google-symptoms`](covidcast-signals/google-symptoms.md) | `sum_anosmia_ageusia_smoothed_search` |
+| Public Behavior | People Wearing Masks | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_wwearing_mask_7d` |
+| Public Behavior | Vaccine Acceptance | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_wcovid_vaccinated_appointment_or_accept` |
+| Public Behavior | COVID Symptom Searches on Google | [`google-symptoms`](covidcast-signals/google-symptoms.md) | `sum_anosmia_ageusia_smoothed_search` |
+| Early Indicators | COVID-Like Symptoms | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_wcli` |
+| Early Indicators | COVID-Like Symptoms in Community | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_whh_cmnty_cli` |
 | Early Indicators | COVID-Related Doctor Visits | [`doctor-visits`](covidcast-signals/doctor-visits.md) | `smoothed_adj_cli` |
-| Early Indicators | COVID-Like Symptoms | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_cli` |
-| Early Indicators | COVID-Like Symptoms in Community | [`fb-survey`](covidcast-signals/fb-survey.md) | `smoothed_hh_cmnty_cli` |
-| Late Indicators | COVID Antigen Test Positivity (Quidel) | [`quidel`](covidcast-signals/quidel.md) | `covid_ag_smoothed_pct_positive` |
-| Late Indicators | Claims-Based COVID Hospital Admissions | [`hospital-admissions`](covidcast-signals/hospital-admissions.md) | `smoothed_adj_covid19_from_claims` |
-| Late Indicators | Cases | [`jhu-csse`](covidcast-signals/jhu-csse.md) | `confirmed_7dav_incidence_num` |
-| Late Indicators | Cases per 100,000 People | [`jhu-csse`](covidcast-signals/jhu-csse.md) | `confirmed_7dav_incidence_prop` |
-| Late Indicators | Deaths | [`jhu-csse`](covidcast-signals/jhu-csse.md) | `deaths_7dav_incidence_num` |
-| Late Indicators | Deaths per 100,000 People | [`jhu-csse`](covidcast-signals/jhu-csse.md) | `deaths_7dav_incidence_prop` |
+| Cases and Testing| COVID Antigen Test Positivity (Quidel) | [`quidel`](covidcast-signals/quidel.md) | `covid_ag_smoothed_pct_positive` |
+| Cases and Testing| COVID Cases | [`jhu-csse`](covidcast-signals/jhu-csse.md) | `confirmed_7dav_incidence_prop` |
+| Late Indicators | COVID Hospital Admissions | [`hhs`](covidcast-signals/hhs.md) | `confirmed_admissions_covid_1d_prop_7dav` |
+| Late Indicators | Deaths | [`jhu-csse`](covidcast-signals/jhu-csse.md) | `deaths_7dav_incidence_prop` |

 ## All Available Sources and Signals

-Beyond the signals available on the COVIDcast map, numerous other signals are
-available directly through the API:
+Beyond the signals available on the COVIDcast dashboard, numerous other signals are
+available through our [data export tool](https://delphi.cmu.edu/covidcast/export/) or directly through the API:
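
Reader's note: as a rough illustration of requesting one of the dashboard signals from the table in this diff, the sketch below only builds a query URL and sends nothing. The base URL and the query parameter names are assumptions, since neither appears in this diff, and should be checked against the COVIDcast endpoint documentation (`covidcast.md`). The helper name and the example date and state are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL for the COVIDcast endpoint; not stated anywhere in this diff.
BASE_URL = "https://api.delphi.cmu.edu/epidata/covidcast/"


def build_covidcast_url(
    data_source: str,
    signal: str,
    geo_type: str,
    geo_value: str,
    time_values: str,
    time_type: str = "day",
) -> str:
    """Assemble a query URL for one signal; parameter names are assumed."""
    params = {
        "data_source": data_source,
        "signal": signal,
        "time_type": time_type,
        "geo_type": geo_type,
        "time_values": time_values,
        "geo_value": geo_value,
    }
    return BASE_URL + "?" + urlencode(params)


# Source and signal names come from the dashboard table in the diff;
# the state ("pa") and date (20201201) are illustrative only.
url = build_covidcast_url(
    "quidel", "covid_ag_smoothed_pct_positive", "state", "pa", "20201201"
)
```

Building the URL separately from sending it keeps the example free of network access and makes the assumed parameters easy to inspect before use.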
