Release covidcast-indicators 0.3.2 #1516

Merged Feb 8, 2022 · 49 commits

Commits
20a46d2
add age groups
Jan 11, 2022
801e2f5
update code for adding megacounties
Jan 14, 2022
15f7690
update unit tests
Jan 14, 2022
010491d
get smoothers out of the complicated loop
Jan 14, 2022
d215d6c
fix a linting
Jan 14, 2022
a120175
fix an error
Jan 14, 2022
b199cad
ignore too-many-branches in pylintrc
Jan 14, 2022
e7870e8
fix a linting error
Jan 14, 2022
07642fa
update signal names, add two super age groups
Jan 14, 2022
1ee31ae
fix a linting error
Jan 14, 2022
4eee961
remove 18-64 age group
Jan 18, 2022
a81050c
add whitespace and add comments
Jan 18, 2022
dc06d9c
add tests for ages 0-17
Jan 18, 2022
0c0f9f5
small udpates for suggested changes
Jan 18, 2022
1707cfb
add state_id for megacounties
Jan 21, 2022
e1226e1
add tests for state_id
Jan 21, 2022
562773d
Add minimal censored counties test and get error?
dshemetov Jan 21, 2022
ef41f6a
add suggested changes
Jan 21, 2022
c81298b
update unit tests based on the current strategy
Jan 23, 2022
7721290
geo_id should be integers in the unit tests
Jan 23, 2022
b7f94c9
udpate geographical pooling
Jan 25, 2022
1fdbe49
update unit tests
Jan 25, 2022
52a4aa6
update unit tests in test_run
Jan 25, 2022
9f3f6c3
delete trailing whitespaces
Jan 25, 2022
49af726
Add a few tests to double check county censoring
dshemetov Jan 26, 2022
3218b1e
Remove faux-breakpoint, update test_data, update test_run
dshemetov Jan 26, 2022
5c6d798
fix the test in test_run
Jan 27, 2022
3e9232e
remove the question in comments
Jan 27, 2022
ab25b35
add tests for values
Jan 27, 2022
55f7ee4
manually added pop size to zipcodes in derive_zip_population_table
rafaelcatoia Feb 2, 2022
7c12efc
minor alteration: ignore index when concatenating
rafaelcatoia Feb 2, 2022
8fbff93
add archiver section to quidel params
Feb 2, 2022
963bb5a
fix params
Feb 2, 2022
784cfd8
Merge pull request #1511 from cmu-delphi/bot/sync-prod-main
krivard Feb 3, 2022
ecc66dd
updetd zip_pop.csv
rafaelcatoia Feb 3, 2022
0a2eeb4
changed readme file
rafaelcatoia Feb 3, 2022
b46ebb8
typo fixex
rafaelcatoia Feb 3, 2022
43e2151
Merge branch 'main' into zips-fips-crosswalk
rafaelcatoia Feb 3, 2022
bd536ce
Update _delphi_utils_python/data_proc/geomap/README.md
rafaelcatoia Feb 4, 2022
efbb07e
added the missing population in a txt file
rafaelcatoia Feb 4, 2022
1523879
changed txt to csv
rafaelcatoia Feb 4, 2022
f473a3c
deleting txt file
rafaelcatoia Feb 4, 2022
5789cda
letting only the csv file
rafaelcatoia Feb 4, 2022
5270383
Merge pull request #1512 from cmu-delphi/zips-fips-crosswalk
krivard Feb 4, 2022
ea31835
Merge pull request #1467 from cmu-delphi/Add_Age_Group_to_QuidelCovid…
krivard Feb 8, 2022
724046b
[quidel] Activate archivediffer in prod params
krivard Feb 8, 2022
881a53b
Merge pull request #1515 from cmu-delphi/krivard/quidel-archivediffer
krivard Feb 8, 2022
c8294f4
chore: bump delphi_utils to 0.3.1
Feb 8, 2022
6019bc4
chore: bump covidcast-indicators to 0.3.2
Feb 8, 2022
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.3.1
current_version = 0.3.2
commit = True
message = chore: bump covidcast-indicators to {new_version}
tag = False
2 changes: 1 addition & 1 deletion _delphi_utils_python/.bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.3.0
current_version = 0.3.1
commit = True
message = chore: bump delphi_utils to {new_version}
tag = False
8 changes: 4 additions & 4 deletions _delphi_utils_python/data_proc/geomap/README.md
@@ -17,11 +17,11 @@ You can see consistency checks and diffs with old sources in ./consistency_check

We support the following geocodes.

- The ZIP code and the FIPS code are the most granular geocodes we support.
- The ZIP code and the FIPS code are the most granular geocodes we support.
- The [ZIP code](https://en.wikipedia.org/wiki/ZIP_Code) is a US postal code used by the USPS and the [FIPS code](https://en.wikipedia.org/wiki/FIPS_county_code) is an identifier for US counties and other associated territories. The ZIP code is a five-digit code (with leading zeros).
- The FIPS code is a five digit code (with leading zeros), where the first two digits are a two-digit state code and the last three are a three-digit county code (see this [US Census Bureau page](https://www.census.gov/library/reference/code-lists/ansi.html) for detailed information).
- The Metropolitan Statistical Area (MSA) code refers to regions around cities (these are sometimes referred to as CBSA codes). More information on these can be found at the [US Census Bureau](https://www.census.gov/programs-surveys/metro-micro/about.html).
- We are reserving 10001-10099 for states codes of the form 100XX where XX is the FIPS code for the state (the current smallest CBSA is 10100). In the case that the CBSA codes change then it should be verified that these are not used.
- We are reserving 10001-10099 for states codes of the form 100XX where XX is the FIPS code for the state (the current smallest CBSA is 10100). In the case that the CBSA codes change then it should be verified that these are not used.
- State codes are a series of equivalent identifiers for US states. They include the state name, the state number (state_id), and the state two-letter abbreviation (state_code). The state number is the state FIPS code. See [here](https://en.wikipedia.org/wiki/List_of_U.S._state_and_territory_abbreviations) for more.
- The Hospital Referral Region (HRR) and the Hospital Service Area (HSA). More information [here](https://www.dartmouthatlas.org/covid-19/hrr-mapping/).
- The JHU signal contains its own geographic identifier, labeled the UID. Documentation is provided at [their repo](https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data#uid-lookup-table-logic). Its FIPS codes depart in some special cases, so we produce manual changes listed below.
Expand All @@ -30,7 +30,7 @@ We support the following geocodes.

The source files are requested from a government URL when `geo_data_proc.py` is run (see the top of said script for the URLs). Below we describe the locations to find updated versions of the source files, if they are ever needed.

- ZIP -> FIPS (county) population tables available from [US Census](https://www.census.gov/geographies/reference-files/time-series/geo/relationship-files.html#par_textimage_674173622). This file contains the population of the intersections between ZIP and FIPS regions, allowing the creation of a population-weighted transform between the two.
- ZIP -> FIPS (county) population tables available from [US Census](https://www.census.gov/geographies/reference-files/time-series/geo/relationship-files.html#par_textimage_674173622). This file contains the population of the intersections between ZIP and FIPS regions, allowing the creation of a population-weighted transform between the two. As of 4 February 2022, this source did not include population information for 24 ZIPs that appear in our indicators. We have added those values manually using information available from the [zipdatamaps website](https://www.zipdatamaps.com).
- ZIP -> HRR -> HSA crosswalk file comes from the 2018 version at the [Dartmouth Atlas Project](https://atlasdata.dartmouth.edu/static/supp_research_data).
- FIPS -> MSA crosswalk file comes from the September 2018 version of the delineation files at the [US Census Bureau](https://www.census.gov/geographies/reference-files/time-series/demo/metro-micro/delineation-files.html).
- State Code -> State ID -> State Name comes from the ANSI standard at the [US Census](https://www.census.gov/library/reference/code-lists/ansi.html#par_textimage_3). The first two digits of a FIPS code should match the state code here.
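The population-weighted ZIP-to-FIPS transform described in the first bullet can be sketched with pandas. Everything below is illustrative — the column names and population figures are hypothetical, not the real crosswalk schema:

```python
import pandas as pd

# Hypothetical ZIP/FIPS intersection populations (not the real crosswalk data).
crosswalk = pd.DataFrame({
    "zip":  ["10001", "10001", "10002"],
    "fips": ["36061", "36047", "36061"],
    "pop":  [15000, 5000, 8000],
})

# Weight = share of each ZIP's population that falls in each county.
crosswalk["weight"] = crosswalk["pop"] / crosswalk.groupby("zip")["pop"].transform("sum")

# Push a ZIP-level signal down to counties using those weights.
signal = pd.DataFrame({"zip": ["10001", "10002"], "value": [100.0, 50.0]})
merged = signal.merge(crosswalk[["zip", "fips", "weight"]], on="zip")
merged["value"] = merged["value"] * merged["weight"]
county = merged.groupby("fips", as_index=False)["value"].sum()
# county now holds 25.0 for 36047 and 125.0 for 36061
```

The intersection-population table is what makes this well defined: each ZIP's weight row sums to 1, so the transform conserves the signal's total.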
@@ -60,6 +60,6 @@ The rest of the crosswalk tables are derived from the mappings above. We provide
- MSA tables from March 2020 [here](https://www.census.gov/geographies/reference-files/time-series/demo/metro-micro/delineation-files.html). This file seems to differ in a few fips codes from the source for the 02_20_uszip file which Jingjing constructed. There are at least 10 additional fips in 03_20_msa that are not in the uszip file, and one of the msa codes seems to be incorrect: 49020 (a google search confirms that it is incorrect in uszip and correct in the census data).
- MSA tables from 2019 [here](https://apps.bea.gov/regional/docs/msalist.cfm)

## Notes
## Notes

- The NAs in the coding are currently zero-filled.
15 changes: 15 additions & 0 deletions _delphi_utils_python/data_proc/geomap/geo_data_proc.py
@@ -32,6 +32,7 @@
FIPS_POPULATION_URL = f"https://www2.census.gov/programs-surveys/popest/datasets/2010-{YEAR}/counties/totals/co-est{YEAR}-alldata.csv"
FIPS_PUERTO_RICO_POPULATION_URL = "https://www2.census.gov/geo/docs/maps-data/data/rel/zcta_county_rel_10.txt?"
STATE_HHS_FILE = "hhs.txt"
ZIP_POP_MISSING_FILE = "zip_pop_filling.csv"

# Out files
FIPS_STATE_OUT_FILENAME = "fips_state_table.csv"
@@ -181,6 +182,7 @@ def create_jhu_uid_fips_crosswalk():
]
)


jhu_df = pd.read_csv(JHU_FIPS_URL, dtype={"UID": str, "FIPS": str}).query("Country_Region == 'US'")
jhu_df = jhu_df.rename(columns={"UID": "jhu_uid", "FIPS": "fips"}).dropna(subset=["fips"])

@@ -336,6 +338,7 @@ def create_hhs_population_table():
state_pop = pd.read_csv(join(OUTPUT_DIR, STATE_POPULATION_OUT_FILENAME), dtype={"state_code": str, "hhs": int}, usecols=["state_code", "pop"])
state_hhs = pd.read_csv(join(OUTPUT_DIR, STATE_HHS_OUT_FILENAME), dtype=str)
hhs_pop = state_pop.merge(state_hhs, on="state_code").groupby("hhs", as_index=False).sum()

hhs_pop.sort_values("hhs").to_csv(join(OUTPUT_DIR, HHS_POPULATION_OUT_FILENAME), index=False)


@@ -363,6 +366,18 @@ def derive_zip_population_table():
df = census_pop.merge(fz_df, on="fips", how="left")
df["pop"] = df["pop"].multiply(df["weight"], axis=0)
df = df.drop(columns=["fips", "weight"]).groupby("zip").sum().dropna().reset_index()

## loading population of some missing ZIPs - #Issue 0648
zip_pop_missing = pd.read_csv(
ZIP_POP_MISSING_FILE, sep=",",
dtype={"zip": str, "pop": np.int32}
)
## checking if each ZIP is still missing, and concatenating if so
for x_zip in zip_pop_missing["zip"]:
# `in` on a bare Series tests the index, so compare against .values
if x_zip not in df["zip"].values:
df = pd.concat([df, zip_pop_missing[zip_pop_missing["zip"] == x_zip]],
ignore_index=True)

df["pop"] = df["pop"].astype(int)
df.sort_values("zip").to_csv(join(OUTPUT_DIR, ZIP_POPULATION_OUT_FILENAME), index=False)
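A pandas subtlety worth noting for loops like the one in `derive_zip_population_table`: `value in series` tests membership against the Series *index*, not its values, so a presence check on the `zip` column has to go through `.values` (or `Series.isin`). A minimal sketch with made-up rows:

```python
import pandas as pd

df = pd.DataFrame({"zip": ["99566", "99573"], "pop": [183, 1115]})

# `in` on a bare Series checks the index labels (0 and 1 here),
# so this is False even though "99566" is present in the column:
naive = "99566" in df["zip"]

# Checking the underlying values finds it:
correct = "99566" in df["zip"].values
```

With an index-based check, every "missing" ZIP would be concatenated whether or not it is already present, which can silently produce duplicate rows.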

25 changes: 25 additions & 0 deletions _delphi_utils_python/data_proc/geomap/zip_pop_filling.csv
@@ -0,0 +1,25 @@
zip,pop
57756,1126
57764,1923
57770,5271
57772,2048
57794,644
99554,677
99563,938
99566,192
99573,1115
99574,2348
99581,762
99585,417
99586,605
99604,1093
99620,577
99632,813
99650,568
99657,329
99658,616
99662,480
99666,189
99677,88
99686,4005
99693,248
2 changes: 1 addition & 1 deletion _delphi_utils_python/delphi_utils/__init__.py
@@ -15,4 +15,4 @@
from .nancodes import Nans
from .weekday import Weekday

__version__ = "0.3.0"
__version__ = "0.3.1"
24 changes: 24 additions & 0 deletions _delphi_utils_python/delphi_utils/data/2019/zip_pop.csv
@@ -19549,15 +19549,19 @@ zip,pop
57752,317
57754,4067
57755,119
57756,1126
57758,219
57759,585
57760,1395
57761,1360
57762,547
57763,266
57764,1923
57766,224
57767,167
57769,3915
57770,5271
57772,2048
57773,145
57775,251
57776,15
@@ -19572,6 +19576,7 @@ zip,pop
57791,213
57792,143
57793,2061
57794,644
57799,707
58001,54
58002,27
@@ -32756,29 +32761,37 @@ zip,pop
99551,677
99552,373
99553,1092
99554,677
99555,224
99556,2659
99557,784
99558,79
99559,8248
99561,451
99563,938
99564,88
99565,76
99566,183
99566,192
99567,9090
99568,295
99569,64
99571,170
99572,313
99573,1115
99573,1064
99574,2348
99574,2242
99575,113
99576,2640
99577,25433
99578,319
99579,106
99580,116
99581,762
99583,37
99585,417
99586,605
99586,577
99587,2220
99588,958
@@ -32787,6 +32800,7 @@ zip,pop
99591,107
99602,166
99603,10427
99604,1093
99605,223
99606,457
99607,233
@@ -32797,6 +32811,7 @@ zip,pop
99613,372
99614,690
99615,12347
99620,577
99621,781
99622,346
99624,86
@@ -32806,6 +32821,7 @@ zip,pop
99628,449
99630,206
99631,241
99632,813
99633,456
99634,382
99636,517
@@ -32820,18 +32836,23 @@ zip,pop
99647,42
99648,110
99649,78
99650,568
99651,65
99652,4506
99653,155
99654,63494
99655,723
99656,27
99657,329
99658,616
99659,422
99660,503
99661,1040
99662,480
99663,438
99664,5226
99665,77
99666,189
99667,79
99668,92
99669,15038
Expand All @@ -32840,6 +32861,7 @@ zip,pop
99672,3925
99674,1757
99676,1881
99677,88
99677,84
99678,903
99679,403
@@ -32850,11 +32872,13 @@ zip,pop
99684,725
99685,4438
99686,3824
99686,4005
99688,3358
99689,579
99690,302
99691,88
99692,159
99693,248
99693,236
99694,1840
99695,6
2 changes: 1 addition & 1 deletion _delphi_utils_python/setup.py
@@ -26,7 +26,7 @@

setup(
name="delphi_utils",
version="0.3.0",
version="0.3.1",
description="Shared Utility Functions for Indicators",
long_description=long_description,
long_description_content_type="text/markdown",
9 changes: 9 additions & 0 deletions ansible/templates/quidel_covidtest-params-prod.json.j2
@@ -48,6 +48,15 @@
]
}
},
"archive": {
"aws_credentials": {
"aws_access_key_id": "{{ delphi_aws_access_key_id }}",
"aws_secret_access_key": "{{ delphi_aws_secret_access_key }}"
},
"bucket_name": "delphi-covidcast-indicator-output",
"cache_dir": "./archivediffer_cache",
"indicator_prefix": "quidel"
},
"delivery": {
"delivery_dir": "/common/covidcast/receiving/quidel"
}
1 change: 1 addition & 0 deletions quidel_covidtest/.pylintrc
@@ -4,6 +4,7 @@
disable=logging-format-interpolation,
too-many-locals,
too-many-arguments,
too-many-branches,
# Allow pytest functions to be part of a class.
no-self-use,
# Allow pytest classes to have one test.
12 changes: 11 additions & 1 deletion quidel_covidtest/delphi_quidel_covidtest/constants.py
@@ -3,7 +3,7 @@
MIN_OBS = 50 # minimum number of observations in order to compute a proportion.
POOL_DAYS = 7 # number of days in the past (including today) to pool over
END_FROM_TODAY_MINUS = 5 # report data until - X days
# Signal names
# Signal Types
SMOOTHED_POSITIVE = "covid_ag_smoothed_pct_positive"
RAW_POSITIVE = "covid_ag_raw_pct_positive"
SMOOTHED_TEST_PER_DEVICE = "covid_ag_smoothed_test_per_device"
@@ -22,6 +22,7 @@
HRR,
]

# state should be last one
NONPARENT_GEO_RESOLUTIONS = [
HHS,
NATION,
@@ -39,3 +39,12 @@
# SMOOTHED_TEST_PER_DEVICE: (True, True),
# RAW_TEST_PER_DEVICE: (True, False)
}
AGE_GROUPS = [
"total",
"age_0_4",
"age_5_17",
"age_18_49",
"age_50_64",
"age_65plus",
"age_0_17",
]
20 changes: 11 additions & 9 deletions quidel_covidtest/delphi_quidel_covidtest/data_tools.py
@@ -67,15 +67,14 @@ def _slide_window_sum(arr, k):
sarr = np.convolve(temp, np.ones(k, dtype=int), 'valid')
return sarr
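`_slide_window_sum` gets its windowed sums from `np.convolve` with a length-`k` ones kernel; here is a standalone sketch of just that step (the original also builds a padded array `temp` first, which this sketch omits):

```python
import numpy as np

def window_sums(arr, k):
    """Sum over every length-k window of arr via convolution with a ones kernel."""
    arr = np.asarray(arr, dtype=float)
    # 'valid' mode keeps only full windows, giving len(arr) - k + 1 sums.
    return np.convolve(arr, np.ones(k, dtype=int), "valid")

sums = window_sums([1, 2, 3, 4, 5], 3)  # windows: 1+2+3, 2+3+4, 3+4+5
```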


def _geographical_pooling(tpooled_tests, tpooled_ptests, min_obs):
"""
Determine how many samples from the parent geography must be borrowed.
If there are no samples available in the parent, the borrow_prop is 0. If
the parent does not have enough samples, we return a borrow_prop of 1, and
the fact that the pooled samples are insufficient are handled in the
statistic fitting step.
If there are no samples available in the parent, the borrow_prop is 0.
If the parent does not have enough samples, we return a borrow_prop of 1.
No more samples are borrowed from the parent than we currently have.
Args:
tpooled_tests: np.ndarray[float]
@@ -93,10 +92,12 @@ def _geographical_pooling(tpooled_tests, tpooled_ptests, min_obs):
"""
if (np.any(np.isnan(tpooled_tests)) or np.any(np.isnan(tpooled_ptests))):
raise ValueError('[parent] tests should be non-negative '
'with no np.nan')
'with no np.nan')
# STEP 1: "TOP UP" USING PARENT LOCATION
# Number of observations we need to borrow to "top up"
# Can't borrow more than total no. observations.
borrow_tests = np.maximum(min_obs - tpooled_tests, 0)
borrow_tests = np.minimum(borrow_tests, tpooled_tests)
# There are many cases (a, b > 0):
# Case 1: a / b => no problem
# Case 2: a / 0 => np.inf => borrow_prop becomes 1
@@ -108,13 +109,14 @@ def _geographical_pooling(tpooled_tests, tpooled_ptests, min_obs):
with np.errstate(divide='ignore', invalid='ignore'):
borrow_prop = borrow_tests / tpooled_ptests
# If there's nothing to borrow, then ya can't borrow
borrow_prop[np.isnan(borrow_prop)] = 0
# Can't borrow more than total no. observations.
borrow_prop[(np.isnan(borrow_prop))
| (tpooled_tests == 0)
| (tpooled_ptests == 0)] = 0
# Can't borrow more than total no. observations in the parent state
# Relies on the fact that np.inf > 1
borrow_prop[borrow_prop > 1] = 1
return borrow_prop
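The borrowing arithmetic above can be traced on a small hypothetical example — one location that already meets `min_obs`, one that must borrow, and one whose parent has no samples:

```python
import numpy as np

min_obs = 50
tpooled_tests = np.array([60.0, 10.0, 30.0])   # child sample counts
tpooled_ptests = np.array([100.0, 20.0, 0.0])  # parent sample counts

# Shortfall relative to min_obs, capped at the child's own count.
borrow_tests = np.maximum(min_obs - tpooled_tests, 0)
borrow_tests = np.minimum(borrow_tests, tpooled_tests)

with np.errstate(divide="ignore", invalid="ignore"):
    borrow_prop = borrow_tests / tpooled_ptests

# Zero out cases where either side has no samples, then cap at 1.
borrow_prop[(np.isnan(borrow_prop))
            | (tpooled_tests == 0)
            | (tpooled_ptests == 0)] = 0
borrow_prop[borrow_prop > 1] = 1
# borrow_prop is now [0.0, 0.5, 0.0]
```

The middle location is 40 tests short but can only borrow up to its own 10, and its parent has 20, so it borrows half of the parent's pool; the third location's parent is empty, so its proportion collapses to 0.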


def raw_positive_prop(positives, tests, min_obs):
"""
Calculate the proportion of positive tests without any temporal smoothing.