Commit 4d5d7c2

Merge pull request #295 from cmu-delphi/docs/1.11
Release new signals for 1.11
2 parents 00e19f9 + d5b7844 commit 4d5d7c2

29 files changed

+195
-39
lines changed

docs/api/README.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,5 @@
11
---
2-
title: Epidata API (Other Epidemics)
2+
title: Epidata API (Other Diseases)
33
nav_order: 3
44
has_children: true
55
---

docs/api/afhsb.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: AFHSB
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# AFHSB

docs/api/cdc.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: CDC
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# CDC

docs/api/covid_hosp.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: COVID-19 Reported Patient Impact and Hospital Capacity by State Timeseries
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# COVID-19 Hospitalization
Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
1+
---
2+
title: Google Symptoms
3+
parent: Data Sources and Signals
4+
grand_parent: COVIDcast API
5+
---
6+
7+
# Google Symptoms
8+
{: .no_toc}
9+
10+
* **Source name:** `google-symptoms`
11+
* **First issued:** 30 November 2020
12+
* **Number of data revisions since 19 May 2020:** 0
13+
* **Date of last change:** Never
14+
* **Available for:** county, MSA, HRR, state (see [geography coding docs](../covidcast_geography.md))
15+
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)
16+
17+
This data source is based on the [COVID-19 Search Trends symptoms
18+
dataset](https://github.com/google-research/open-covid-19-data/tree/master/data/exports/search_trends_symptoms_dataset). Using
19+
this search data, we estimate the volume of searches mapped to symptoms related
20+
to COVID-19 such as _anosmia_ (lack of smell) and _ageusia_ (lack of taste). The
21+
resulting daily dataset for each region shows the relative frequency of searches
22+
for each symptom. The signals are measured in arbitrary units that are
23+
normalized for population and scaled by the maximum value of the normalized
24+
popularity within a geographic region across a specific time range. **Thus,
25+
values are NOT comparable across geographic regions**. Larger numbers represent
26+
higher numbers of symptom-related searches.
27+
28+
| Signal | Description |
29+
| --- | --- |
30+
| `anosmia_raw_search` | Google search volume for anosmia-related searches, in arbitrary units that are normalized for population |
31+
| `anosmia_smoothed_search` | Google search volume for anosmia-related searches, in arbitrary units that are normalized for population, smoothed by 7-day average |
32+
| `ageusia_raw_search` | Google search volume for ageusia-related searches, in arbitrary units that are normalized for population |
33+
| `ageusia_smoothed_search` | Google search volume for ageusia-related searches, in arbitrary units that are normalized for population, smoothed by 7-day average |
34+
| `sum_anosmia_ageusia_raw_search` | The sum of Google search volume for anosmia- and ageusia-related searches, in arbitrary units that are normalized for population |
35+
| `sum_anosmia_ageusia_smoothed_search` | The sum of Google search volume for anosmia- and ageusia-related searches, in arbitrary units that are normalized for population, smoothed by 7-day average |
36+
37+
38+
## Table of contents
39+
{: .no_toc .text-delta}
40+
41+
1. TOC
42+
{:toc}
43+
## Estimation
44+
The `sum_anosmia_ageusia_raw_search` signals are simply the raw sum of the
45+
values of `anosmia_raw_search` and `ageusia_raw_search`, but not the union of
46+
anosmia and ageusia related searches. This is because the data volume is
47+
calculated based on search queries. A single search query can be mapped to more
48+
than one symptom. Currently, Google does not provide _intersection/union_
49+
data. Users should be careful when considering such signals.
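
The difference between the sum and the union can be illustrated with a minimal sketch. The query IDs and the query-to-symptom mapping below are hypothetical and are not taken from Google's dataset; they only show that a query mapped to both symptoms is counted twice in the sum but once in the union.

```python
# Hypothetical search queries, each mapped to one or more symptoms.
queries = {
    "q1": {"anosmia"},
    "q2": {"ageusia"},
    "q3": {"anosmia", "ageusia"},  # a single query mapped to both symptoms
}

# Per-symptom volumes: each query counts once for every symptom it maps to.
anosmia_count = sum("anosmia" in symptoms for symptoms in queries.values())
ageusia_count = sum("ageusia" in symptoms for symptoms in queries.values())

# What a "sum" signal adds up: the two per-symptom volumes.
summed = anosmia_count + ageusia_count

# The union counts each query once, however many symptoms it maps to.
union = sum(bool(symptoms & {"anosmia", "ageusia"}) for symptoms in queries.values())
```

Here the sum exceeds the union by exactly the number of queries mapped to both symptoms, which is the quantity that is not available from the published data.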
50+
51+
## Limitation
52+
When daily volume in a region does not meet quality or privacy thresholds, set
53+
by Google, no value will be reported. Since Google uses differential privacy,
54+
there is artificial noise added to the raw datasets to avoid identifying any
55+
individual persons without affecting the quality of results.
56+
57+
The data is normalized by the total number of Search users in certain regions
58+
for a certain time period and is scaled considering the maximum value of the
59+
normalized popularity across the entire published time range for that region
60+
over all symptoms. The values of symptom popularity are **NOT** comparable
61+
across geographic regions. Due to the scaling step, most of the values should be
62+
in the range 0-1. However, since the scaling factor is calculated and stored at
63+
a certain time point, the symptom popularity released after that time point is
64+
likely to exceed the previously observed maximum value, which results in values
65+
larger than 1.
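
As a rough illustration of why values above 1 can appear, the sketch below scales a made-up series by a maximum that was stored at an earlier time point. The numbers and the `scale` helper are assumptions for illustration only and do not reproduce Google's actual procedure.

```python
def scale(normalized, stored_max):
    """Divide normalized popularity by a maximum that was fixed earlier."""
    return [value / stored_max for value in normalized]

history = [0.2, 0.5, 0.4]   # made-up normalized popularity observed so far
stored_max = max(history)   # scaling factor computed and stored at this point
later = [0.3, 0.6]          # made-up values released after the factor was stored

scaled_history = scale(history, stored_max)  # peaks at exactly 1
scaled_later = scale(later, stored_max)      # can exceed 1 if a new peak occurs
```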
66+
67+
68+
## Geographical Aggregation
69+
The state-level and county-level `raw_search` signals for specific symptoms such
70+
as _anosmia_ and _ageusia_ are taken directly from the [COVID-19 Search Trends
71+
symptoms
72+
dataset](https://github.com/google-research/open-covid-19-data/tree/master/data/exports/search_trends_symptoms_dataset)
73+
without changes. We aggregate the county-level data to the MSA and HRR levels
74+
using the population-weighted average. For MSAs/HRRs that include counties that
75+
have no data provided due to quality or privacy issues for a certain day, we
76+
simply assume the values to be 0 during aggregation. The values for MSAs/HRRs
77+
with no counties having non-NaN values will not be reported. Thus, the resulting
78+
MSA/HRR level data does not fully match the _actual_ MSA/HRR level data (which
79+
we are not provided).
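
The aggregation rule described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the `aggregate` helper and the county values are hypothetical, and it assumes that counties with missing data still contribute their population to the weights (which is what treating their values as 0 implies).

```python
import math

def aggregate(values, populations):
    """Population-weighted average over counties; NaN counties count as 0.

    Returns None when no county has a value, meaning nothing is reported.
    """
    if all(math.isnan(v) for v in values):
        return None
    total_population = sum(populations)
    weighted_sum = sum(
        0.0 if math.isnan(v) else v * p
        for v, p in zip(values, populations)
    )
    return weighted_sum / total_population

# Three made-up counties in one MSA; the second has no data for this day.
msa_value = aggregate([0.6, float("nan"), 0.2], [1000, 3000, 1000])

# An MSA in which every county is missing is not reported at all.
unreported = aggregate([float("nan"), float("nan")], [500, 700])
```

In this example the missing county pulls the MSA value down to 0.16, which shows why the aggregated figure may understate the true MSA-level value.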
80+
81+
82+
## Lag and Backfill
83+
Google does not update the search data daily, but has an uncertain update
84+
frequency. The delay can range from 1 day to 10 days or even more. We check for
85+
updates every day and provide the most up-to-date data.

docs/api/covidcast-signals/safegraph.md

Lines changed: 83 additions & 12 deletions
@@ -4,23 +4,35 @@ parent: Data Sources and Signals
44
grand_parent: COVIDcast Epidata API
55
---
66

7-
# SafeGraph Mobility
8-
7+
# SafeGraph
8+
{: .no_toc}
99
* **Source name:** `safegraph`
10+
* **Available for:** county, MSA, HRR, state (see [geography coding docs](../covidcast_geography.md))
11+
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)
12+
13+
This data source uses data reported by [SafeGraph](https://www.safegraph.com/)
14+
using anonymized location data from mobile phones. SafeGraph provides several
15+
different datasets to eligible researchers. We surface signals from two such
16+
datasets.
17+
18+
## Table of contents
19+
{: .no_toc .text-delta}
20+
21+
1. TOC
22+
{:toc}
23+
24+
## SafeGraph Social Distancing Metrics
25+
1026
* **First issued:** 23 June 2020
1127
* **Number of data revisions since 23 June 2020:** 1
1228
* **Date of last change:** 3 November 2020
13-
* **Available for:** county, state (see [geography coding docs](../covidcast_geography.md))
14-
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)
1529

16-
This data source uses data reported by [SafeGraph](https://www.safegraph.com/)
17-
using anonymized location data from mobile phones. SafeGraph provides [social
18-
distancing metrics](https://docs.safegraph.com/docs/social-distancing-metrics)
19-
to eligible researchers who obtain an API key. SafeGraph provides this data for
20-
individual census block groups, using differential privacy to protect the
21-
privacy of individual people in the data.
30+
This data source is based on [social distancing
31+
metrics](https://docs.safegraph.com/docs/social-distancing-metrics). SafeGraph
32+
provides this data for individual census block groups, using differential
33+
privacy to protect individual people's data privacy.
2234

23-
Delphi creates features of the Safegraph data at the census block group level,
35+
Delphi creates features of the SafeGraph data at the census block group level,
2436
then aggregates these features to the county and state levels. The aggregated
2537
data is freely available through the COVIDcast API.
2638

@@ -34,6 +46,10 @@ documentation](https://docs.safegraph.com/docs/social-distancing-metrics).
3446
| `full_time_work_prop` | The fraction of mobile devices that spent more than 6 hours at a location other than their home during the daytime (SafeGraph's `full_time_work_behavior_devices / device_count`) |
3547
| `part_time_work_prop` | The fraction of devices that spent between 3 and 6 hours at a location other than their home during the daytime (SafeGraph's `part_time_work_behavior_devices / device_count`) |
3648
| `median_home_dwell_time` | The median time spent at home for all devices at this location for this time period, in minutes |
49+
| `completely_home_prop_7dav` | Offers a 7-day trailing window average of the `completely_home_prop`. |
50+
| `full_time_work_prop_7dav` | Offers a 7-day trailing window average of the `full_time_work_prop`. |
51+
| `part_time_work_prop_7dav` | Offers a 7-day trailing window average of the `part_time_work_prop`. |
52+
| `median_home_dwell_time_7dav` | Offers a 7-day trailing window average of the `median_home_dwell_time`.|
3753

3854
After computing each metric on the census block group (CBG) level, we aggregate
3955
to the county-level by taking the mean over CBGs in a county to obtain the value
@@ -43,8 +59,63 @@ doing so, we make the simplifying assumption that each CBG contributes an iid
4359
observation to the county-level distribution. `n` also serves as the sample
4460
size. The same method is used for aggregation to states.
4561

46-
## Lag
62+
SafeGraph's signals measure mobility each day, which causes strong day-of-week
63+
effects: weekends have substantially different values than weekdays. Users
64+
interested in long-term trends, rather than mobility on one specific day, may
65+
prefer the `7dav` signals since averaging over the preceding 7 days removes
66+
these day-of-week effects.
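
To see how a trailing average removes a weekly pattern, consider this minimal sketch with a made-up daily series. The `trailing_average` helper is hypothetical; it assumes the window includes the current day and simply uses fewer days at the start of the series, which may differ from how the `7dav` signals are actually computed.

```python
def trailing_average(values, window=7):
    """Average each day with the days before it, up to `window` days in total."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged

# Two weeks of a made-up signal: weekdays at 10, weekends at 4.
daily = [10, 10, 10, 10, 10, 4, 4] * 2
smoothed = trailing_average(daily)
```

Once a full week is available, every window contains five weekdays and two weekend days, so the smoothed series is constant even though the daily series swings between 10 and 4.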
67+
68+
### Lag
4769

4870
SafeGraph provides this data with a three-day lag, meaning estimates for a
4971
specific day are only available three days later. It may take up to an
5072
additional day for SafeGraph's data to be ingested into the COVIDcast API.
73+
74+
75+
## SafeGraph Weekly Patterns
76+
77+
* **First issued:** 30 November 2020
78+
* **Number of data revisions since 23 June 2020:** 0
79+
* **Date of last change:** Never
80+
81+
This data source is based on the [Weekly
82+
Patterns](https://docs.safegraph.com/docs/weekly-patterns) dataset. SafeGraph
83+
provides this data for different points of interest
84+
([POIs](https://docs.safegraph.com/v4.0/docs#section-core-places)) considering
85+
individual census block groups, using differential privacy to protect individual
86+
people's data privacy.
87+
88+
Delphi gathers the number of daily visits to POIs of certain types (bars,
89+
restaurants, etc.) from SafeGraph's Weekly Patterns data at the 5-digit ZIP Code
90+
level, then aggregates and reports these features to the county, MSA, HRR, and
91+
state levels. The aggregated data is freely available through the COVIDcast API.
92+
93+
For precise definitions of the quantities below, consult the [SafeGraph Weekly
94+
Patterns documentation](https://docs.safegraph.com/docs/weekly-patterns).
95+
96+
| Signal | Description |
97+
| --- | --- |
98+
| `bars_visit_num` | The number of daily visits to bar-related POIs in a certain region |
99+
| `bars_visit_prop` | The number of daily visits to bar-related POIs in a certain region, per 100,000 population |
100+
| `restaurants_visit_num` | The number of daily visits to restaurant-related POIs in a certain region |
101+
| `restaurants_visit_prop` | The number of daily visits to restaurant-related POIs in a certain region, per 100,000 population |
102+
103+
SafeGraph delivers the number of daily visits to U.S. POIs, the details of which
104+
are described in the [Places
105+
Manual](https://readme.safegraph.com/docs/places-manual#section-placekey)
106+
dataset. Delphi aggregates the number of visits to certain types of places,
107+
such as bars (places with [NAICS code =
108+
722410](https://www.census.gov/cgi-bin/sssd/naics/naicsrch?input=722410&search=2017+NAICS+Search&search=2017))
109+
and restaurants (places with [NAICS code =
110+
722511](https://www.census.gov/cgi-bin/sssd/naics/naicsrch)). For example,
111+
Adagio Teas is coded as a bar because it serves alcohol, while Napkin Burger is
112+
considered to be a full-service restaurant. More information on NAICS codes is
113+
available from the [US Census Bureau: North American Industry Classification
114+
System](https://www.census.gov/eos/www/naics/index.html).
115+
116+
### Lag
117+
118+
SafeGraph provides newly updated data for the previous week every Wednesday,
119+
meaning estimates for a specific day are only available 3-9 days later. It may
120+
take up to an additional day for SafeGraph's data to be ingested into the
121+
COVIDcast API.
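
The 3-9 day range can be worked out with a short sketch. It assumes that "the previous week" runs Monday through Sunday, which is consistent with the stated range but is not confirmed by the text; the `release_date` helper is hypothetical and ignores the possible extra day of ingestion delay.

```python
from datetime import date, timedelta

def release_date(day):
    """Wednesday after the Monday-Sunday week containing `day` (assumed week definition)."""
    week_end = day + timedelta(days=6 - day.weekday())  # Sunday of that week
    return week_end + timedelta(days=3)                 # the following Wednesday

monday = date(2020, 11, 23)
sunday = date(2020, 11, 29)

# Lag in days between the observation date and its release.
lags = [(release_date(d) - d).days for d in (monday, sunday)]
```

Under this assumption, data for a Monday waits the longest (9 days) and data for a Sunday the shortest (3 days).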

docs/api/delphi.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: Delphi Forecasts
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# Delphi Forecasts

docs/api/dengue_nowcast.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: Dengue Nowcast
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# Delphi's Dengue Nowcast

docs/api/dengue_sensors.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: Dengue Digital Surveillance
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# Dengue Digital Surveillance Sensors

docs/api/ecdc_ili.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: ECDC ILI
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# ECDC ILI

docs/api/flusurv.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: Flusurv
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# FluSurv

docs/api/fluview.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: FluView
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# FluView

docs/api/fluview_clinical.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: FluView Clinical
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# FluView Clinical

docs/api/fluview_meta.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: FluView metadata
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# FluView metadata

docs/api/gft.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: Google Flu Trends
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# Google Flu Trends

docs/api/ght.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: Google Health Trends
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# Google Health Trends

docs/api/kcdc_ili.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: KCDC ILI
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# KCDC ILI

docs/api/meta.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: Metadata
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# API Metadata

docs/api/meta_afhsb.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: AFHSB Metadata
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# AFHSB Metadata

docs/api/meta_norostat.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: NoroSTAT Metadata
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# NoroSTAT Metadata

docs/api/nidss_dengue.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: NIDSS Dengue
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# NIDSS Dengue

docs/api/nidss_flu.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: NIDSS Flu
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# NIDSS Flu

docs/api/norostat.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: NoroSTAT
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# NoroSTAT

docs/api/nowcast.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: ILI Nearby Nowcast
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# ILI Nearby Nowcast

docs/api/paho_dengue.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: PAHO Dengue
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# PAHO Dengue

docs/api/quidel.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: Quidel
3-
parent: Epidata API (Other Epidemics)
3+
parent: Epidata API (Other Diseases)
44
---
55

66
# Quidel
