Commit d3eb8d5

Merge pull request #932 from cmu-delphi/krivard/deactivate-safegraph
Deactivate SafeGraph Weekly Patterns signals.
2 parents b44473e + dcb36d8 commit d3eb8d5

2 files changed

+67
-92
lines changed

docs/api/covidcast-signals/safegraph-inactive.md

+67-5
@@ -12,9 +12,9 @@ grand_parent: COVIDcast Epidata API
1212
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)
1313

1414
This data source uses data reported by [SafeGraph](https://www.safegraph.com/)
15-
using anonymized location data from mobile phones. SafeGraph provides several
16-
different datasets to eligible researchers. We surface signals from two such
17-
datasets. This dataset is no longer updated after April 19th, 2021.
15+
using anonymized location data from mobile phones. From June 2020 to July 2022,
16+
SafeGraph provided several different datasets to eligible researchers. We
17+
surface signals from two such datasets.
1818

1919
## Table of Contents
2020
{: .no_toc .text-delta}
@@ -28,10 +28,12 @@ datasets. This dataset is no longer updated after April 19th, 2021.
2828
* **Number of data revisions since June 23, 2020:** 1
2929
* **Date of last change:** November 3, 2020
3030

31+
**This dataset has not been updated since April 19th, 2021.**
32+
3133
Data source based on [social distancing
3234
metrics](https://docs.safegraph.com/docs/social-distancing-metrics). SafeGraph
33-
provides this data for individual census block groups, using differential
34-
privacy to protect individual people's data privacy.
35+
provided this data for individual census block groups, using differential
36+
privacy to protect individual people's data privacy.
3537

3638
Delphi creates features of the SafeGraph data at the census block group level,
3739
then aggregates these features to the county and state levels. The aggregated
@@ -72,3 +74,63 @@ SafeGraph provides this data with a three-day lag, meaning estimates for a
7274
specific day are only available three days later. It may take up to an
7375
additional day for SafeGraph's data to be ingested into the COVIDcast API.
7476

77+
## SafeGraph Weekly Patterns
78+
79+
* **Earliest issue available:** November 30, 2020
80+
* **Number of data revisions since June 23, 2020:** 0
81+
* **Date of last change:** never
82+
83+
**This dataset has not been updated since July 15th, 2022.**
84+
85+
Data source based on the [Weekly
86+
Patterns](https://docs.safegraph.com/docs/weekly-patterns) dataset. SafeGraph
87+
provided this data for different points of interest
88+
([POIs](https://docs.safegraph.com/v4.0/docs#section-core-places)) considering
89+
individual census block groups, using differential privacy to protect individual
90+
people's data privacy.
91+
92+
Delphi gathers the number of daily visits to POIs of certain types (bars,
93+
restaurants, etc.) from SafeGraph's Weekly Patterns data at the 5-digit ZIP code
94+
level, then aggregates these features and reports them at the county, MSA, HRR, and
95+
state levels. The aggregated data is freely available through the COVIDcast API.
96+
97+
For precise definitions of the quantities below, consult the [SafeGraph Weekly
98+
Patterns documentation](https://docs.safegraph.com/docs/weekly-patterns).
99+
100+
| Signal | Description |
101+
| --- | --- |
102+
| `bars_visit_num` | The number of daily visits made by those with SafeGraph's apps to bar-related POIs in a certain region <br/> **Earliest date available:** 01/01/2019 |
103+
| `bars_visit_prop` | The number of daily visits made by those with SafeGraph's apps to bar-related POIs in a certain region, per 100,000 population <br/> **Earliest date available:** 01/01/2019 |
104+
| `restaurants_visit_num` | The number of daily visits made by those with SafeGraph's apps to restaurant-related POIs in a certain region <br/> **Earliest date available:** 01/01/2019 |
105+
| `restaurants_visit_prop` | The number of daily visits made by those with SafeGraph's apps to restaurant-related POIs in a certain region, per 100,000 population <br/> **Earliest date available:** 01/01/2019 |
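
The relationship between the `_num` and `_prop` signals in the table above can be sketched as follows. This is a minimal illustration, not Delphi's pipeline code: the function name `visits_per_100k` and the example population figure are assumptions made for this sketch.

```python
def visits_per_100k(visit_count: float, population: float) -> float:
    """Scale a raw daily visit count to visits per 100,000 population.

    Hypothetical helper: mirrors how a `*_visit_prop` value relates to the
    matching `*_visit_num` value, as described in the signal table.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return visit_count * 100_000 / population


# Example: 250 recorded bar visits in a region of 50,000 residents.
example_prop = visits_per_100k(250, 50_000)
```

Here `example_prop` is the rate that would be reported for a region with those illustrative inputs.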
106+
107+
SafeGraph delivered the number of daily visits to U.S. POIs, the details of which
108+
are described in the [Places
109+
Manual](https://readme.safegraph.com/docs/places-manual#section-placekey)
110+
dataset. Delphi aggregates the number of visits to certain types of places,
111+
such as bars (places with [NAICS code =
112+
722410](https://www.census.gov/cgi-bin/sssd/naics/naicsrch?input=722410&search=2017+NAICS+Search&search=2017))
113+
and restaurants (places with [NAICS code =
114+
722511](https://www.census.gov/cgi-bin/sssd/naics/naicsrch)). For example,
115+
Adagio Teas is coded as a bar because it serves alcohol, while Napkin Burger is
116+
considered to be a full-service restaurant. More information on NAICS codes is
117+
available from the [US Census Bureau: North American Industry Classification
118+
System](https://www.census.gov/eos/www/naics/index.html).
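
The NAICS-based grouping described above might look roughly like the sketch below. The record layout (`naics_code`, `zip`, `date`, `visits`) and the function name `aggregate_visits` are hypothetical and chosen for illustration only; SafeGraph's actual file schema and Delphi's actual code may differ.

```python
from collections import defaultdict

# NAICS codes named in the documentation above.
NAICS_TO_CATEGORY = {
    "722410": "bars",
    "722511": "restaurants",
}


def aggregate_visits(records):
    """Sum daily visits per (ZIP code, date, category).

    `records` is an iterable of dicts with the hypothetical keys
    `naics_code`, `zip`, `date`, and `visits`. POIs whose NAICS code is
    not a bar or full-service restaurant code are skipped.
    """
    totals = defaultdict(int)
    for rec in records:
        category = NAICS_TO_CATEGORY.get(rec["naics_code"])
        if category is None:
            continue  # Not a POI type covered by these signals.
        totals[(rec["zip"], rec["date"], category)] += rec["visits"]
    return dict(totals)


sample = [
    {"naics_code": "722410", "zip": "15213", "date": "2021-03-01", "visits": 4},
    {"naics_code": "722410", "zip": "15213", "date": "2021-03-01", "visits": 6},
    {"naics_code": "722511", "zip": "15213", "date": "2021-03-01", "visits": 30},
    {"naics_code": "445110", "zip": "15213", "date": "2021-03-01", "visits": 99},
]
zip_totals = aggregate_visits(sample)
```

Totals at the ZIP code level like these would then be rolled up to larger geographies; that step is omitted here.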
119+
120+
The number of POIs coded as bars is much smaller than the number of POIs coded as restaurants.
121+
SafeGraph's Weekly Patterns data consistently lacks data on bar visits for Alaska, Delaware, Maine, North Dakota, New Hampshire, South Dakota, Vermont, West Virginia, and Wyoming.
122+
For certain dates, bar visits data is also missing for the District of Columbia, Idaho, and Washington. Restaurant visits data is available for all of the states, as well as the District of Columbia and Puerto Rico.
123+
124+
### Lag
125+
126+
SafeGraph provided newly updated data for the previous week every Wednesday,
127+
meaning estimates for a specific day are only available 3 to 9 days later. It may
128+
take up to an additional day for SafeGraph's data to be ingested into the
129+
COVIDcast API.
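
The 3 to 9 day range can be reproduced with a short calculation. The sketch below assumes that "the previous week" runs Monday through Sunday and that data for it appears on the following Wednesday; the source does not state the week boundaries, so this is an assumption that happens to match the stated range. The function name `first_available` is hypothetical, and the possible extra ingestion day is not modeled.

```python
from datetime import date, timedelta


def first_available(day: date) -> date:
    """Return the Wednesday on which data for `day` would first appear.

    Assumes a Monday-Sunday reporting week released the next Wednesday.
    """
    # date.weekday(): Monday == 0 ... Sunday == 6
    week_end = day + timedelta(days=6 - day.weekday())
    return week_end + timedelta(days=3)  # Sunday + 3 days == Wednesday


monday = date(2021, 3, 1)
sunday = date(2021, 3, 7)
lag_monday = (first_available(monday) - monday).days
lag_sunday = (first_available(sunday) - sunday).days
```

Under these assumptions a Monday has the longest lag and a Sunday the shortest.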
130+
131+
## Limitations
132+
133+
SafeGraph's Social Distancing Metrics and Weekly Patterns are based on mobile devices that are members of SafeGraph panels, so they do not necessarily measure the behavior of the general public. The reported values are not absolute counts; they include only visits by panel members in that region. This can result in several biases:
134+
135+
* **Geographic bias.** If some regions have a greater density of SafeGraph panel members as a percentage of the population than other regions, comparisons of metrics between regions may be biased. Regions with more SafeGraph panel members will appear to have more visits counted, even if the rate of visits in the general population is the same.
136+
* **Demographic bias.** SafeGraph panels may not be representative of the local population as a whole. For example, [some research suggests](https://doi.org/10.1145/3442188.3445881) that "older and non-white voters are less likely to be captured by mobility data", so this data will not accurately reflect behavior in those populations. Since population demographics vary across the United States, this can also contribute to geographic biases.
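
To make the geographic bias concrete, the sketch below compares two hypothetical regions with identical populations and identical visiting behavior but different panel coverage. All numbers and the function name `observed_visits_per_100k` are invented for illustration and are not drawn from SafeGraph data.

```python
def observed_visits_per_100k(population: int, panel_members: int,
                             visits_per_member: int) -> float:
    """Visits counted from panel members only, scaled per 100,000 residents."""
    panel_visits = panel_members * visits_per_member
    return panel_visits * 100_000 / population


# Same population and same per-person visit rate in both regions;
# only the number of panel members differs.
region_a = observed_visits_per_100k(200_000, panel_members=20_000, visits_per_member=2)
region_b = observed_visits_per_100k(200_000, panel_members=10_000, visits_per_member=2)
```

Region A appears to have twice the visit rate of region B even though the underlying behavior is the same, which is why cross-region comparisons of these signals should be made with care.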

docs/api/covidcast-signals/safegraph.md

-87
This file was deleted.
