Release new signals for 1.11 #295

Merged
merged 37 commits into from
Dec 1, 2020
9b67091
documentation update for smoothed safegraph signals
sgsmob Nov 10, 2020
3011bc0
discussion of why we are introducing windows
sgsmob Nov 10, 2020
fcfe3da
update safegraph signals to use the new naming convention
sgsmob Nov 18, 2020
c2cf711
Update docs/api/covidcast-signals/safegraph.md
sgsmob Nov 19, 2020
3cbe9a6
Update docs/api/covidcast-signals/safegraph.md
sgsmob Nov 19, 2020
d780cba
Update docs/api/covidcast-signals/safegraph.md
sgsmob Nov 19, 2020
4f46e1c
Update docs/api/covidcast-signals/safegraph.md
sgsmob Nov 19, 2020
4003942
Update docs/api/covidcast-signals/safegraph.md
sgsmob Nov 19, 2020
3530a2f
add api doc for google-symptoms
Nov 9, 2020
d81d0ee
fix some errors
Nov 9, 2020
c6295e5
Add available geographical levels
jingjtang Nov 9, 2020
66a6108
Update docs/api/covidcast-signals/google-symptoms.md
jingjtang Nov 13, 2020
c45e890
Update docs/api/covidcast-signals/google-symptoms.md
jingjtang Nov 13, 2020
d057c97
Update docs/api/covidcast-signals/google-symptoms.md
jingjtang Nov 13, 2020
b264356
Update docs/api/covidcast-signals/google-symptoms.md
jingjtang Nov 13, 2020
91e9b63
add signal description. Emphasize the incomparability between regions
Nov 16, 2020
6529aaa
add api doc for safegraph-patterns
Nov 9, 2020
902614c
Update docs/api/covidcast-signals/safegraph-patterns.md
jingjtang Nov 13, 2020
5c56c3e
Update docs/api/covidcast-signals/safegraph-patterns.md
jingjtang Nov 13, 2020
e8f4626
Fix some grammatical errors
Nov 16, 2020
6329354
add safegraph-patterns to the old safegraph.md
Nov 17, 2020
1ba107a
remove the safegraph-patterns.md
Nov 17, 2020
2dbce9b
Update docs/api/covidcast-signals/safegraph.md
jingjtang Nov 17, 2020
33bd22e
Update docs/api/covidcast-signals/safegraph.md
jingjtang Nov 17, 2020
8f2b52e
Update docs/api/covidcast-signals/safegraph.md
jingjtang Nov 17, 2020
af0d490
fix some errors
Nov 17, 2020
1592921
add NAICS code
Nov 19, 2020
ccde380
update the markdown format
Nov 19, 2020
49acd0c
Update docs/api/covidcast-signals/safegraph.md
jingjtang Nov 19, 2020
ae86f13
Update docs/api/covidcast-signals/safegraph.md
jingjtang Nov 19, 2020
3d1b2c3
Update docs/api/covidcast-signals/safegraph.md
jingjtang Nov 19, 2020
8182e8d
Update docs/api/covidcast-signals/safegraph.md
jingjtang Nov 20, 2020
a98e9d7
Safegraph docs: correct first issue date, factor out common headers
krivard Nov 30, 2020
f27ab78
Docs: switch Other Epidemics to Other Diseases
krivard Nov 30, 2020
e5c9062
GS docs: Add missing headers
krivard Nov 30, 2020
17e4b64
GS docs: clarify phrasing
krivard Nov 30, 2020
d5b7844
1.11 docs: whitespace changes only
krivard Dec 1, 2020
2 changes: 1 addition & 1 deletion docs/api/README.md
@@ -1,5 +1,5 @@
---
title: Epidata API (Other Epidemics)
title: Epidata API (Other Diseases)
nav_order: 3
has_children: true
---
2 changes: 1 addition & 1 deletion docs/api/afhsb.md
@@ -1,6 +1,6 @@
---
title: AFHSB
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# AFHSB
2 changes: 1 addition & 1 deletion docs/api/cdc.md
@@ -1,6 +1,6 @@
---
title: CDC
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# CDC
2 changes: 1 addition & 1 deletion docs/api/covid_hosp.md
@@ -1,6 +1,6 @@
---
title: COVID-19 Reported Patient Impact and Hospital Capacity by State Timeseries
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# COVID-19 Hospitalization
85 changes: 85 additions & 0 deletions docs/api/covidcast-signals/google-symptoms.md
@@ -0,0 +1,85 @@
---
title: Google Symptoms
parent: Data Sources and Signals
grand_parent: COVIDcast API
---

# Google Symptoms
{: .no_toc}

* **Source name:** `google-symptoms`
* **First issued:** 30 November 2020
* **Number of data revisions since 19 May 2020:** 0
* **Date of last change:** Never
* **Available for:** county, MSA, HRR, state (see [geography coding docs](../covidcast_geography.md))
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)

This data source is based on the [COVID-19 Search Trends symptoms
dataset](https://github.com/google-research/open-covid-19-data/tree/master/data/exports/search_trends_symptoms_dataset). Using
this search data, we estimate the volume of searches mapped to symptoms related
to COVID-19 such as _anosmia_ (loss of smell) and _ageusia_ (loss of taste). The
resulting daily dataset for each region shows the relative frequency of searches
for each symptom. The signals are measured in arbitrary units that are
normalized for population and scaled by the maximum value of the normalized
popularity within a geographic region across a specific time range. **Thus,
values are NOT comparable across geographic regions**. Larger numbers represent
higher numbers of symptom-related searches.

| Signal | Description |
| --- | --- |
| `anosmia_raw_search` | Google search volume for anosmia-related searches, in arbitrary units that are normalized for population |
| `anosmia_smoothed_search` | Google search volume for anosmia-related searches, in arbitrary units that are normalized for population, smoothed by a 7-day average |
| `ageusia_raw_search` | Google search volume for ageusia-related searches, in arbitrary units that are normalized for population |
| `ageusia_smoothed_search` | Google search volume for ageusia-related searches, in arbitrary units that are normalized for population, smoothed by a 7-day average |
| `sum_anosmia_ageusia_raw_search` | The sum of Google search volumes for anosmia- and ageusia-related searches, in arbitrary units that are normalized for population |
| `sum_anosmia_ageusia_smoothed_search` | The sum of Google search volumes for anosmia- and ageusia-related searches, in arbitrary units that are normalized for population, smoothed by a 7-day average |


## Table of contents
{: .no_toc .text-delta}

1. TOC
{:toc}

## Estimation
The `sum_anosmia_ageusia_raw_search` signal is simply the sum of the values of
`anosmia_raw_search` and `ageusia_raw_search`; it is not the volume of the union
of anosmia- and ageusia-related searches. Search volume is calculated per search
query, and a single query can be mapped to more than one symptom, so such
queries are counted twice in the sum. Currently, Google does not provide
_intersection/union_ data. Users should interpret these signals with care.
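
The difference between a sum and a union can be illustrated with a minimal sketch. The query data below is made up, and it uses raw counts rather than the normalized volumes Google actually publishes, so it only shows why the summed signal can overstate the number of distinct searches.

```python
# Hypothetical toy data: each search query is tagged with the set of
# symptoms it maps to. Real data is normalized volume, not raw counts.
queries = [
    {"anosmia"},
    {"ageusia"},
    {"anosmia", "ageusia"},  # a single query mapped to both symptoms
]

anosmia_count = sum("anosmia" in q for q in queries)
ageusia_count = sum("ageusia" in q for q in queries)

# What a "sum" signal adds up: the shared query is counted twice.
summed = anosmia_count + ageusia_count

# What a true union would count: each query once.
union = sum(bool(q & {"anosmia", "ageusia"}) for q in queries)

print(summed, union)
```

Here the sum exceeds the union by exactly the number of queries mapped to both symptoms, which is the quantity that is not available in the published data.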

## Limitation
When the daily volume in a region does not meet the quality or privacy
thresholds set by Google, no value is reported. Because Google uses differential
privacy, artificial noise is added to the raw datasets to avoid identifying any
individual person without affecting the quality of results.

The data is normalized by the total number of Search users in a region for a
given time period, and is then scaled by the maximum value of the normalized
popularity across the entire published time range for that region over all
symptoms. The values of symptom popularity are therefore **NOT** comparable
across geographic regions. Because of the scaling step, most values fall in the
range 0-1. However, the scaling factor is calculated and stored at a fixed point
in time, so symptom popularity released after that point can exceed the
previously observed maximum, which results in values larger than 1.
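
A minimal sketch of why a stored scaling factor can produce values above 1. The numbers and the `scale` helper are hypothetical; Google's actual normalization pipeline is not described in this document.

```python
def scale(normalized_values, stored_max):
    """Divide normalized popularity by a previously stored maximum."""
    return [v / stored_max for v in normalized_values]

# Hypothetical normalized popularity for one region, up to the point
# at which the scaling factor is computed and stored.
history = [2.0, 4.0, 3.0]
stored_max = max(history)

scaled_history = scale(history, stored_max)  # every value is at most 1

# Data released later is scaled by the same stored factor, so a new peak
# above the old maximum maps to a value larger than 1.
scaled_later = scale([5.0], stored_max)

print(scaled_history, scaled_later)
```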


## Geographical Aggregation
The state-level and county-level `raw_search` signals for specific symptoms such
as _anosmia_ and _ageusia_ are taken directly from the [COVID-19 Search Trends
symptoms
dataset](https://github.com/google-research/open-covid-19-data/tree/master/data/exports/search_trends_symptoms_dataset)
without changes. We aggregate the county-level data to the MSA and HRR levels
using a population-weighted average. When an MSA/HRR includes counties that
have no data for a given day because of quality or privacy issues, we treat
those counties' values as 0 during aggregation. MSAs/HRRs in which no county
has a non-NaN value are not reported. As a result, the MSA/HRR-level data does
not fully match the _actual_ MSA/HRR-level data (which is not provided to us).
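
The aggregation rule described above can be sketched as follows. This is an illustration, not the pipeline's code: the function name and inputs are hypothetical, and it assumes the weights are the populations of all member counties, including those with missing data.

```python
import math

def aggregate_region(county_values, county_populations):
    """Population-weighted average over the counties of one MSA/HRR.

    Missing (NaN) county values are treated as 0. Returns None when
    every county is missing, meaning the region is not reported.
    """
    if all(math.isnan(v) for v in county_values):
        return None
    total_population = sum(county_populations)
    weighted_sum = sum(
        0.0 if math.isnan(v) else v * p
        for v, p in zip(county_values, county_populations)
    )
    return weighted_sum / total_population

nan = float("nan")
# One county reports 0.5; the larger county is suppressed for the day.
partial = aggregate_region([0.5, nan], [100, 300])
# No county reports anything, so the region is dropped.
empty = aggregate_region([nan, nan], [100, 300])
print(partial, empty)
```

In the first case the suppressed county pulls the regional value down, which is one reason the aggregated figure can differ from the true MSA/HRR-level value.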


## Lag and Backfill
Google does not update the search data daily; updates arrive on an irregular
schedule. The delay can range from 1 day to 10 days or even more. We check for
updates every day and provide the most up-to-date data.
95 changes: 83 additions & 12 deletions docs/api/covidcast-signals/safegraph.md
@@ -4,23 +4,35 @@ parent: Data Sources and Signals
grand_parent: COVIDcast Epidata API
---

# SafeGraph Mobility

# SafeGraph
{: .no_toc}
* **Source name:** `safegraph`
* **Available for:** county, MSA, HRR, state (see [geography coding docs](../covidcast_geography.md))
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)

This data source uses data reported by [SafeGraph](https://www.safegraph.com/)
using anonymized location data from mobile phones. SafeGraph provides several
different datasets to eligible researchers. We surface signals from two such
datasets.

## Table of contents
{: .no_toc .text-delta}

1. TOC
{:toc}

## SafeGraph Social Distancing Metrics

* **First issued:** 23 June 2020
* **Number of data revisions since 23 June 2020:** 1
* **Date of last change:** 3 November 2020
* **Available for:** county, state (see [geography coding docs](../covidcast_geography.md))
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)

This data source uses data reported by [SafeGraph](https://www.safegraph.com/)
using anonymized location data from mobile phones. SafeGraph provides [social
distancing metrics](https://docs.safegraph.com/docs/social-distancing-metrics)
to eligible researchers who obtain an API key. SafeGraph provides this data for
individual census block groups, using differential privacy to protect the
privacy of individual people in the data.
This data source is based on [social distancing
metrics](https://docs.safegraph.com/docs/social-distancing-metrics). SafeGraph
provides this data for individual census block groups, using differential
privacy to protect the privacy of individual people.

Delphi creates features of the Safegraph data at the census block group level,
Delphi creates features of the SafeGraph data at the census block group level,
then aggregates these features to the county and state levels. The aggregated
data is freely available through the COVIDcast API.

@@ -34,6 +46,10 @@ documentation](https://docs.safegraph.com/docs/social-distancing-metrics).
| `full_time_work_prop` | The fraction of mobile devices that spent more than 6 hours at a location other than their home during the daytime (SafeGraph's `full_time_work_behavior_devices / device_count`) |
| `part_time_work_prop` | The fraction of devices that spent between 3 and 6 hours at a location other than their home during the daytime (SafeGraph's `part_time_work_behavior_devices / device_count`) |
| `median_home_dwell_time` | The median time spent at home for all devices at this location for this time period, in minutes |
| `completely_home_prop_7dav` | The 7-day trailing-window average of `completely_home_prop` |
| `full_time_work_prop_7dav` | The 7-day trailing-window average of `full_time_work_prop` |
| `part_time_work_prop_7dav` | The 7-day trailing-window average of `part_time_work_prop` |
| `median_home_dwell_time_7dav` | The 7-day trailing-window average of `median_home_dwell_time` |

After computing each metric on the census block group (CBG) level, we aggregate
to the county-level by taking the mean over CBGs in a county to obtain the value
@@ -43,8 +59,63 @@ doing so, we make the simplifying assumption that each CBG contributes an iid
observation to the county-level distribution. `n` also serves as the sample
size. The same method is used for aggregation to states.
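
A rough sketch of the county-level aggregation described above. The diff elides part of this paragraph, so the standard-error formula used here (sample standard deviation divided by the square root of `n`) is an assumption that follows from the iid simplification, not something confirmed by the visible text. The function name is hypothetical.

```python
import math
import statistics

def county_summary(cbg_values):
    """Summarize one metric over the CBGs in a county.

    Returns (value, standard error, sample size). The standard error
    formula is an assumption: sample sd / sqrt(n), treating each CBG
    as one iid observation.
    """
    n = len(cbg_values)
    value = statistics.mean(cbg_values)
    if n > 1:
        stderr = statistics.stdev(cbg_values) / math.sqrt(n)
    else:
        stderr = float("nan")  # undefined for a single observation
    return value, stderr, n

# Made-up `completely_home_prop` values for three CBGs in one county.
value, stderr, n = county_summary([0.25, 0.5, 0.75])
print(value, stderr, n)
```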

## Lag
SafeGraph's signals measure mobility each day, which causes strong day-of-week
effects: weekends have substantially different values than weekdays. Users
interested in long-term trends, rather than mobility on one specific day, may
prefer the `7dav` signals since averaging over the preceding 7 days removes
these day-of-week effects.
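
To illustrate how a trailing window removes day-of-week effects, here is a minimal sketch with made-up daily values. It assumes the window covers the current day plus the six days before it and that days without a full window are left empty; the actual edge handling in the pipeline is not stated here.

```python
def trailing_average(values, window=7):
    """Average each day with the (window - 1) days before it.

    Days without a full window get None; this edge handling is an
    assumption made for the sketch.
    """
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed

# Two made-up weeks, Monday first: weekdays differ sharply from weekends.
daily = [10.0] * 5 + [3.0] * 2 + [10.0] * 5 + [3.0] * 2
smoothed = trailing_average(daily)
print(smoothed)
```

Because every 7-day window contains exactly five weekdays and two weekend days, the smoothed series is flat even though the raw series swings between weekday and weekend values.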

### Lag

SafeGraph provides this data with a three-day lag, meaning estimates for a
specific day are only available three days later. It may take up to an
additional day for SafeGraph's data to be ingested into the COVIDcast API.


## SafeGraph Weekly Patterns

* **First issued:** 30 November 2020
* **Number of data revisions since 23 June 2020:** 0
* **Date of last change:** never

This data source is based on the [Weekly
Patterns](https://docs.safegraph.com/docs/weekly-patterns) dataset. SafeGraph
provides this data for different points of interest
([POIs](https://docs.safegraph.com/v4.0/docs#section-core-places)) considering
individual census block groups, using differential privacy to protect the
privacy of individual people.

Delphi gathers the number of daily visits to POIs of certain types (bars,
restaurants, etc.) from SafeGraph's Weekly Patterns data at the 5-digit ZIP Code
level, then aggregates these features to the county, MSA, HRR, and state
levels. The aggregated data is freely available through the COVIDcast API.

For precise definitions of the quantities below, consult the [SafeGraph Weekly
Patterns documentation](https://docs.safegraph.com/docs/weekly-patterns).

| Signal | Description |
| --- | --- |
| `bars_visit_num` | The number of daily visits to bar-related POIs in a certain region |
| `bars_visit_prop` | The number of daily visits to bar-related POIs in a certain region, per 100,000 population |
| `restaurants_visit_num` | The number of daily visits to restaurant-related POIs in a certain region |
| `restaurants_visit_prop` | The number of daily visits to restaurant-related POIs in a certain region, per 100,000 population |

SafeGraph delivers the number of daily visits to U.S. POIs, the details of which
are described in the [Places
Manual](https://readme.safegraph.com/docs/places-manual#section-placekey)
dataset. Delphi aggregates the number of visits to certain types of places,
such as bars (places with [NAICS code =
722410](https://www.census.gov/cgi-bin/sssd/naics/naicsrch?input=722410&search=2017+NAICS+Search&search=2017))
and restaurants (places with [NAICS code =
722511](https://www.census.gov/cgi-bin/sssd/naics/naicsrch)). For example,
Adagio Teas is coded as a bar because it serves alcohol, while Napkin Burger is
considered to be a full-service restaurant. More information on NAICS codes is
available from the [US Census Bureau: North American Industry Classification
System](https://www.census.gov/eos/www/naics/index.html).
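
The relationship between the `_num` and `_prop` signals in the table above can be sketched as a simple rate calculation. The function name and the figures are hypothetical; only the "per 100,000 population" definition comes from the signal descriptions.

```python
def visits_per_100k(visit_num, population):
    """Convert a daily visit count into visits per 100,000 population."""
    # Multiply first so whole-number inputs stay exact before dividing.
    return visit_num * 100_000 / population

# Made-up example: 250 daily bar visits in a region of 50,000 people.
bars_visit_num = 250
bars_visit_prop = visits_per_100k(bars_visit_num, 50_000)
print(bars_visit_prop)
```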

### Lag

SafeGraph provides newly updated data for the previous week every Wednesday,
meaning estimates for a specific day are only available 3-9 days later. It may
take up to an additional day for SafeGraph's data to be ingested into the
COVIDcast API.
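
The 3-9 day range can be worked out with a short sketch. It assumes each weekly file covers Monday through Sunday and is released on the following Wednesday; the text above states only the Wednesday release and the resulting range, so the week boundaries are an assumption. The additional ingestion day is not included.

```python
from datetime import date, timedelta

def release_date(day):
    """First Wednesday release covering `day`.

    Assumes weeks run Monday-Sunday and are released the Wednesday
    after the week ends.
    """
    week_end = day + timedelta(days=6 - day.weekday())  # that week's Sunday
    return week_end + timedelta(days=3)                 # following Wednesday

def lag_days(day):
    return (release_date(day) - day).days

monday = date(2020, 11, 23)
sunday = date(2020, 11, 29)
print(release_date(monday), lag_days(monday), lag_days(sunday))
```

Under these assumptions, data for the Monday of a week waits the longest and data for the Sunday waits the shortest.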
2 changes: 1 addition & 1 deletion docs/api/delphi.md
@@ -1,6 +1,6 @@
---
title: Delphi Forecasts
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# Delphi Forecasts
2 changes: 1 addition & 1 deletion docs/api/dengue_nowcast.md
@@ -1,6 +1,6 @@
---
title: Dengue Nowcast
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# Delphi's Dengue Nowcast
2 changes: 1 addition & 1 deletion docs/api/dengue_sensors.md
@@ -1,6 +1,6 @@
---
title: Dengue Digital Surveillance
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# Dengue Digital Surveillance Sensors
2 changes: 1 addition & 1 deletion docs/api/ecdc_ili.md
@@ -1,6 +1,6 @@
---
title: ECDC ILI
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# ECDC ILI
2 changes: 1 addition & 1 deletion docs/api/flusurv.md
@@ -1,6 +1,6 @@
---
title: Flusurv
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# FluSurv
2 changes: 1 addition & 1 deletion docs/api/fluview.md
@@ -1,6 +1,6 @@
---
title: FluView
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# FluView
2 changes: 1 addition & 1 deletion docs/api/fluview_clinical.md
@@ -1,6 +1,6 @@
---
title: FluView Clinical
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# FluView Clinical
2 changes: 1 addition & 1 deletion docs/api/fluview_meta.md
@@ -1,6 +1,6 @@
---
title: FluView metadata
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# FluView metadata
2 changes: 1 addition & 1 deletion docs/api/gft.md
@@ -1,6 +1,6 @@
---
title: Google Flu Trends
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# Google Flu Trends
2 changes: 1 addition & 1 deletion docs/api/ght.md
@@ -1,6 +1,6 @@
---
title: Google Health Trends
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# Google Health Trends
2 changes: 1 addition & 1 deletion docs/api/kcdc_ili.md
@@ -1,6 +1,6 @@
---
title: KCDC ILI
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# KCDC ILI
2 changes: 1 addition & 1 deletion docs/api/meta.md
@@ -1,6 +1,6 @@
---
title: Metadata
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# API Metadata
2 changes: 1 addition & 1 deletion docs/api/meta_afhsb.md
@@ -1,6 +1,6 @@
---
title: AFHSB Metadata
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# AFHSB Metadata
2 changes: 1 addition & 1 deletion docs/api/meta_norostat.md
@@ -1,6 +1,6 @@
---
title: NoroSTAT Metadata
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# NoroSTAT Metadata
2 changes: 1 addition & 1 deletion docs/api/nidss_dengue.md
@@ -1,6 +1,6 @@
---
title: NIDSS Dengue
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# NIDSS Dengue
2 changes: 1 addition & 1 deletion docs/api/nidss_flu.md
@@ -1,6 +1,6 @@
---
title: NIDSS Flu
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# NIDSS Flu
2 changes: 1 addition & 1 deletion docs/api/norostat.md
@@ -1,6 +1,6 @@
---
title: NoroSTAT
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# NoroSTAT
2 changes: 1 addition & 1 deletion docs/api/nowcast.md
@@ -1,6 +1,6 @@
---
title: ILI Nearby Nowcast
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# ILI Nearby Nowcast
2 changes: 1 addition & 1 deletion docs/api/paho_dengue.md
@@ -1,6 +1,6 @@
---
title: PAHO Dengue
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# PAHO Dengue
2 changes: 1 addition & 1 deletion docs/api/quidel.md
@@ -1,6 +1,6 @@
---
title: Quidel
parent: Epidata API (Other Epidemics)
parent: Epidata API (Other Diseases)
---

# Quidel