Release nchs_mortality to production #605


Merged
merged 95 commits into from
Dec 7, 2020
95 commits
7f087c3
Get CHC to pass pydocstyle
chinandrew Nov 20, 2020
a5223a1
Add to makefile
chinandrew Nov 20, 2020
e816a30
Get utils to pass pydocstyle
chinandrew Nov 20, 2020
0374860
Get quidel_covidtest to pass pydocstyle
chinandrew Nov 20, 2020
44eba98
Add pydocstyle to makefile
chinandrew Nov 20, 2020
44b4754
Correct makefile
chinandrew Nov 21, 2020
ac86ad8
Get quidel to pass pydocstyle
chinandrew Nov 21, 2020
5d9929d
Check for duplicate rows
JedGrabman Nov 23, 2020
2545dce
Update plans.md to remove duplicate rows task
JedGrabman Nov 23, 2020
cddfb38
move parameter checking to be separate from other validation
sgsmob Nov 24, 2020
9794653
remove vestigial checks
sgsmob Nov 24, 2020
e0a29b8
refactor of validate into a report
sgsmob Nov 24, 2020
3938f85
documentation of report
sgsmob Nov 24, 2020
51840c1
plug report logic into the running
sgsmob Nov 24, 2020
08b8446
adding pylintrc to validator package
sgsmob Nov 24, 2020
c7d1872
pylint compliance
sgsmob Nov 24, 2020
76c043f
add pylintrc
sgsmob Nov 24, 2020
0ae148e
Test Fix and Documentation Update
JedGrabman Nov 25, 2020
a9ba2ba
Add:
korlaxxalrok Nov 25, 2020
02692f9
refactor of validator for lint compliance
sgsmob Nov 30, 2020
cc34365
change order of super().__init__() call to preserve error function me…
sgsmob Nov 30, 2020
f9e9900
trailing newline to pylintrc
sgsmob Nov 30, 2020
feafbed
add TODO for last lint error
sgsmob Nov 30, 2020
727fe7a
Merge branch 'main' of github.com:cmu-delphi/covidcast-indicators int…
sgsmob Nov 30, 2020
f6653aa
get validation pylint compliant
sgsmob Dec 1, 2020
49bc1b2
Merge branch 'main' of github.com:cmu-delphi/covidcast-indicators int…
sgsmob Dec 1, 2020
ec9cc20
Merge branch 'validator' into validation_report
sgsmob Dec 1, 2020
48b75a8
fix bug with missing argument to refactored function
sgsmob Dec 1, 2020
331dd15
tests for utils
sgsmob Dec 1, 2020
29c2f12
constants capitalization
sgsmob Dec 1, 2020
71779b2
Use production ingestion directory
korlaxxalrok Dec 1, 2020
1750b13
tests for datafetching
sgsmob Dec 1, 2020
2cb2553
Add constants for cli
rumackaaron Dec 1, 2020
5e22295
Load files for cli
rumackaaron Dec 1, 2020
595e1dd
Adjust pipeline
rumackaaron Dec 1, 2020
604d11b
Merge branch 'main' of https://github.com/cmu-delphi/covidcast-indica…
rumackaaron Dec 1, 2020
c2c89fd
Merge branch 'validator' into validation_report
sgsmob Dec 1, 2020
d0db106
Fix linting
rumackaaron Dec 1, 2020
b7f213f
Update tests
rumackaaron Dec 1, 2020
f00586e
Modularized run
rumackaaron Dec 1, 2020
41e1584
params refactor
rumackaaron Dec 1, 2020
3290130
update backfill docstring
chinandrew Dec 1, 2020
90b285d
Merge pull request #586 from cmu-delphi/release-nchs_mortality
korlaxxalrok Dec 2, 2020
9d3e0fb
only iterate over geo,signal combos available on covidcast
sgsmob Dec 2, 2020
5abec9b
Merge pull request #587 from cmu-delphi/chng-dv
krivard Dec 2, 2020
22e2d59
Merge branch 'main' into chc-docs
chinandrew Dec 2, 2020
68f304a
Fix conflicts and new lints
chinandrew Dec 2, 2020
6fc6340
remove trailing whitespace
chinandrew Dec 2, 2020
db299e3
Update CHNG production params to include CLI inputs
krivard Dec 2, 2020
f091bdf
Merge pull request #569 from cmu-delphi/chc-docs
krivard Dec 2, 2020
a808f4f
Merge pull request #592 from cmu-delphi/chng/update-production-params
krivard Dec 2, 2020
d2e5a29
Merge pull request #572 from cmu-delphi/quidelpydocs
krivard Dec 2, 2020
8a734ff
Merge pull request #571 from cmu-delphi/quidel-pydocs
krivard Dec 2, 2020
6ec6824
Update _delphi_utils_python/delphi_utils/archive.py
chinandrew Dec 2, 2020
59a2945
Update _delphi_utils_python/delphi_utils/archive.py
chinandrew Dec 2, 2020
07dfdd2
Update _delphi_utils_python/delphi_utils/archive.py
chinandrew Dec 2, 2020
f1f927f
Update _delphi_utils_python/delphi_utils/archive.py
chinandrew Dec 2, 2020
6d588b5
Timezone issue described in #593
rumackaaron Dec 2, 2020
a51ee4d
Merge pull request #570 from cmu-delphi/utils-docs
krivard Dec 2, 2020
914cc1f
SirCAL default config: Add new signals, missing signals, and update m…
krivard Dec 2, 2020
a74b033
Update params template
rumackaaron Dec 3, 2020
1ed7184
Change Ansible params also
rumackaaron Dec 3, 2020
43320d8
Merge pull request #576 from JedGrabman/dev-duplicate-rows
krivard Dec 3, 2020
4382ab5
change super().__init__() call to get the correct error message printed
sgsmob Dec 3, 2020
484d3aa
move check increment to outside the loop:
sgsmob Dec 3, 2020
c372974
Merge pull request #597 from cmu-delphi/chng-params
krivard Dec 3, 2020
deacc35
Add diff bucket to params
korlaxxalrok Dec 3, 2020
352209e
Merge pull request #598 from cmu-delphi/update-nchs_mortality
korlaxxalrok Dec 3, 2020
09215a4
Add staging api proxy to Ansible inventory
korlaxxalrok Dec 3, 2020
5aac8af
Add new Ansible and template
korlaxxalrok Dec 3, 2020
ef74971
Add step to Jenkinsfile, update file name
korlaxxalrok Dec 3, 2020
787c28a
Add specific remote user for this playbook
korlaxxalrok Dec 3, 2020
16b5532
Merge pull request #596 from cmu-delphi/sircal/add-new-signals
krivard Dec 3, 2020
3beda49
Merge pull request #595 from cmu-delphi/chng-time
krivard Dec 3, 2020
7fb09b6
Merge pull request #578 from sgsmob/validator
krivard Dec 3, 2020
e01ac36
Update ansible/templates/staging-api-match-list.j2
korlaxxalrok Dec 3, 2020
659e19a
Update signal list
korlaxxalrok Dec 3, 2020
3f1571e
Accept changes
korlaxxalrok Dec 3, 2020
df6e13b
Merge pull request #599 from cmu-delphi/add-ansible-to-set-api-match-…
korlaxxalrok Dec 3, 2020
ff005e8
Update Jenkinsfile, add shell wrapper for Ansible playbook
korlaxxalrok Dec 3, 2020
7296aa8
Merge branch 'main' of github.com:cmu-delphi/covidcast-indicators int…
sgsmob Dec 4, 2020
7ef8f0a
Merge branch 'main' of github.com:cmu-delphi/covidcast-indicators int…
sgsmob Dec 4, 2020
723b75c
Merge pull request #602 from cmu-delphi/fix-staging-api-deploy
krivard Dec 4, 2020
a2c7709
Fix daily archive differ removing None files
eujing Dec 4, 2020
65d7d53
Fix file permissions
korlaxxalrok Dec 4, 2020
4409fd8
change name of unsuppressed errors
sgsmob Dec 4, 2020
fabb50b
change polarity of if/else to be positive
sgsmob Dec 4, 2020
9e3b957
Merge pull request #604 from cmu-delphi/fix-staging-api-deploy
krivard Dec 4, 2020
82fe76d
Merge pull request #588 from sgsmob/validator_combos
krivard Dec 4, 2020
e1546d9
Merge pull request #603 from cmu-delphi/fix-nchs_mortality
krivard Dec 4, 2020
be1750a
Merge branch 'main' of github.com:cmu-delphi/covidcast-indicators int…
sgsmob Dec 5, 2020
27c3a79
add_raised_warning documentation
sgsmob Dec 5, 2020
da71d08
duplicate rows to use report
sgsmob Dec 5, 2020
a8ca165
tests for reports
sgsmob Dec 5, 2020
acf2aef
Merge pull request #589 from sgsmob/validation_report
krivard Dec 7, 2020
1 change: 1 addition & 0 deletions Jenkinsfile
@@ -45,6 +45,7 @@ pipeline {
}
parallel deploy_staging
}
sh "jenkins/deploy-staging-api-match-list.sh"
}
}
stage('Deploy production') {
3 changes: 2 additions & 1 deletion _delphi_utils_python/Makefile
@@ -12,7 +12,8 @@ install: venv

lint:
. env/bin/activate; \
pylint $(dir)
pylint $(dir); \
pydocstyle $(dir)

test:
. env/bin/activate ;\
3 changes: 1 addition & 2 deletions _delphi_utils_python/delphi_utils/__init__.py
@@ -1,6 +1,5 @@
# -*- coding: utf-8 -*-
"""Common Utility Functions to Support DELPHI Indicators
"""
"""Common Utility Functions to Support DELPHI Indicators."""

from __future__ import absolute_import
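
Most docstring changes in this PR follow two pydocstyle conventions: the summary line ends with a period (D400), and a blank line separates the summary from any further description (D205). The sketch below illustrates those two rules on a docstring taken from this diff. `summary_line_issues` is a hypothetical helper written for illustration; it is not pydocstyle's implementation and does not check imperative mood.

```python
def summary_line_issues(docstring):
    """Return a list of style problems found in a docstring.

    Simplified, hypothetical check covering only two conventions.
    """
    lines = docstring.strip().splitlines()
    if not lines:
        return ["empty docstring"]
    issues = []
    # Rule 1: the summary (first) line should end with a period.
    if not lines[0].strip().endswith("."):
        issues.append("summary should end with a period")
    # Rule 2: if more text follows, the second line should be blank.
    if len(lines) > 1 and lines[1].strip():
        issues.append("blank line required after summary")
    return issues


# Docstring text before and after the change to S3ArchiveDiffer in this PR.
old_style = "AWS S3 backend for archving\nArchives CSV files into a S3 bucket."
new_style = "AWS S3 backend for archiving.\n\nArchives CSV files into a S3 bucket."

print(summary_line_issues(old_style))
print(summary_line_issues(new_style))
```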

51 changes: 31 additions & 20 deletions _delphi_utils_python/delphi_utils/archive.py
@@ -1,5 +1,6 @@
"""
Utilities for diffing and archiving covidcast export CSVs.

Aims to simplify the creation of issues for new and backfilled value for indicators.
Also handles archiving of export CSVs to some backend (git, S3 etc.) before replacing them.

@@ -52,6 +53,7 @@ def diff_export_csv(
) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]:
"""
Find differences in exported covidcast CSVs, using geo_id as the index.

Treats NA == NA as True.

Parameters
@@ -68,7 +70,6 @@ def diff_export_csv(
changed_df is the pd.DataFrame of common rows from after_csv with changed values.
added_df is the pd.DataFrame of added rows from after_csv.
"""

export_csv_dtypes = {"geo_id": str, "val": float,
"se": float, "sample_size": float}

@@ -99,7 +100,7 @@ def run_module(archive_type: str,
cache_dir: str,
export_dir: str,
**kwargs):
"""Builds and runs an ArchiveDiffer.
"""Build and run an ArchiveDiffer.

Parameters
----------
@@ -132,13 +133,11 @@ def run_module(archive_type: str,


class ArchiveDiffer:
"""
Base class for performing diffing and archiving of exported covidcast CSVs
"""
"""Base class for performing diffing and archiving of exported covidcast CSVs."""

def __init__(self, cache_dir: str, export_dir: str):
"""
Initialize an ArchiveDiffer
Initialize an ArchiveDiffer.

Parameters
----------
@@ -157,15 +156,17 @@ def __init__(self, cache_dir: str, export_dir: str):

def update_cache(self):
"""
For making sure cache_dir is updated correctly from a backend.
Make sure cache_dir is updated correctly from a backend.

To be implemented by specific archiving backends.
Should set self._cache_updated = True after verifying cache is updated.
"""
raise NotImplementedError

def diff_exports(self) -> Tuple[Files, FileDiffMap, Files]:
"""
Finds diffs across and within CSV files, from cache_dir to export_dir.
Find diffs across and within CSV files, from cache_dir to export_dir.

Should be called after update_cache() succeeds. Only works on *.csv files,
ignores every other file.

@@ -223,7 +224,8 @@ def diff_exports(self) -> Tuple[Files, FileDiffMap, Files]:

def archive_exports(self, exported_files: Files) -> Tuple[Files, Files]:
"""
Handles actual archiving of files, depending on specific backend.
Handle actual archiving of files, depending on specific backend.

To be implemented by specific archiving backends.

Parameters
@@ -241,6 +243,8 @@ def archive_exports(self, exported_files: Files) -> Tuple[Files, Files]:

def filter_exports(self, common_diffs: FileDiffMap):
"""
Filter export directory to only contain relevant files.

Filters down the export_dir to only contain:
1) New files, 2) Changed files, filtered-down to the ADDED and CHANGED rows only.
Should be called after archive_exports() so we archive the raw exports before
@@ -269,7 +273,7 @@ def filter_exports(self, common_diffs: FileDiffMap):
replace(diff_file, exported_file)

def run(self):
"""Runs the differ and archives the changed and new files."""
"""Run the differ and archive the changed and new files."""
self.update_cache()

# Diff exports, and make incremental versions
@@ -293,7 +297,8 @@ def run(self):

class S3ArchiveDiffer(ArchiveDiffer):
"""
AWS S3 backend for archving
AWS S3 backend for archiving.

Archives CSV files into a S3 bucket, with keys "{indicator_prefix}/{csv_file_name}".
Ideally, versioning should be enabled in this bucket to track versions of each CSV file.
"""
@@ -306,6 +311,7 @@ def __init__(
):
"""
Initialize a S3ArchiveDiffer.

See this link for possible aws_credentials kwargs:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session

@@ -330,9 +336,7 @@ def __init__(
self.indicator_prefix = indicator_prefix

def update_cache(self):
"""
For making sure cache_dir is updated with all latest files from the S3 bucket.
"""
"""Make sure cache_dir is updated with all latest files from the S3 bucket."""
# List all indicator-related objects from S3
archive_objects = self.bucket.objects.filter(
Prefix=self.indicator_prefix).all()
@@ -358,7 +362,7 @@ def archive_exports(self, # pylint: disable=arguments-differ
update_s3: bool = True
) -> Tuple[Files, Files]:
"""
Handles actual archiving of files to the S3 bucket.
Handle actual archiving of files to the S3 bucket.

Parameters
----------
@@ -398,7 +402,8 @@ def archive_exports(self, # pylint: disable=arguments-differ

class GitArchiveDiffer(ArchiveDiffer):
"""
Local git repo backend for archiving
Local git repo backend for archiving.

Archives CSV files into a local git repo as commits.
Assumes that a git repository is already set up.
"""
@@ -446,7 +451,8 @@ def __init__(

def get_branch(self, branch_name: Optional[str] = None) -> Head:
"""
Retrieves a Head object representing a branch of specified name.
Retrieve a Head object representing a branch of specified name.

Creates the branch from the current active branch if does not exist yet.

Parameters
@@ -469,6 +475,8 @@ def get_branch(self, branch_name: Optional[str] = None) -> Head:
@contextmanager
def archiving_branch(self):
"""
Context manager for checking out a branch.

Useful for checking out self.branch within a context, then switching back
to original branch when finished.
"""
@@ -482,8 +490,9 @@ def archiving_branch(self):

def update_cache(self):
"""
Check if cache_dir is clean: has everything nicely committed if override_dirty=False.

Since we are using a local git repo, assumes there is nothing to update from.
Checks if cache_dir is clean: has everything nice committed if override_dirty=False
"""
# Make sure cache directory is clean: has everything nicely committed
if not self.override_dirty:
@@ -495,14 +504,16 @@ def update_cache(self):

def diff_exports(self) -> Tuple[Files, FileDiffMap, Files]:
"""
Same as base class diff_exports, but in context of specified branch
Find diffs across and within CSV files, from cache_dir to export_dir.

Same as base class diff_exports, but in context of specified branch.
"""
with self.archiving_branch():
return super().diff_exports()

def archive_exports(self, exported_files: Files) -> Tuple[Files, Files]:
"""
Handles actual archiving of files to the local git repo.
Handle actual archiving of files to the local git repo.

Parameters
----------
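
The `diff_export_csv` docstring in this file describes diffing two exports by `geo_id` while treating NA == NA as True. The sketch below shows that idea with plain dictionaries instead of DataFrames. `diff_rows` and `_same` are hypothetical names, the sample rows are invented, and the deleted/changed/added return order is an assumption; the real function works on pandas objects read from CSV files.

```python
import math


def _same(a, b):
    """Compare two values, treating NaN == NaN as True."""
    both_nan = (
        isinstance(a, float) and isinstance(b, float)
        and math.isnan(a) and math.isnan(b)
    )
    return both_nan or a == b


def diff_rows(before, after):
    """Split rows keyed by geo_id into deleted, changed, and added groups.

    Each row is a (val, se, sample_size) tuple. Changed rows carry the
    values from `after`, mirroring the changed_df description above.
    """
    deleted = {k: v for k, v in before.items() if k not in after}
    added = {k: v for k, v in after.items() if k not in before}
    changed = {
        k: after[k]
        for k in before.keys() & after.keys()
        if not all(_same(x, y) for x, y in zip(before[k], after[k]))
    }
    return deleted, changed, added


# Invented sample data: geo_id -> (val, se, sample_size)
before = {
    "01000": (1.0, float("nan"), 10.0),
    "02000": (2.0, 0.1, 20.0),
    "03000": (3.0, 0.2, 30.0),
}
after = {
    "01000": (1.0, float("nan"), 10.0),  # unchanged, NaN matches NaN
    "02000": (2.5, 0.1, 20.0),           # val changed
    "04000": (4.0, 0.3, 40.0),           # new row
}
deleted, changed, added = diff_rows(before, after)
```

Without the NaN special case, row "01000" would be reported as changed on every run, because NaN never compares equal to itself.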
18 changes: 10 additions & 8 deletions _delphi_utils_python/delphi_utils/geomap.py
@@ -91,8 +91,9 @@ class GeoMapper: # pylint: disable=too-many-public-methods
"""

def __init__(self):
"""Initialize geomapper. Holds loading the crosswalk tables
until a conversion function is first used.
"""Initialize geomapper.

Holds loading the crosswalk tables until a conversion function is first used.

Parameters
---------
@@ -110,7 +111,7 @@ def __init__(self):

# Utility functions
def _load_crosswalk(self, from_code, to_code):
"""Loads the crosswalk from from_code -> to_code."""
"""Load the crosswalk from from_code -> to_code."""
stream = pkg_resources.resource_stream(
__name__, self.crosswalk_filepaths[from_code][to_code]
)
@@ -189,7 +190,7 @@ def _load_crosswalk(self, from_code, to_code):

@staticmethod
def convert_fips_to_mega(data, fips_col="fips", mega_col="megafips"):
"""convert fips string to a megafips string"""
"""Convert fips string to a megafips string."""
data = data.copy()
data[mega_col] = data[fips_col].astype(str).str.zfill(5)
data[mega_col] = data[mega_col].str.slice_replace(start=2, stop=5, repl="000")
@@ -205,7 +206,7 @@ def megacounty_creation(
date_col="date",
mega_col="megafips",
):
"""create megacounty column
"""Create megacounty column.

Parameters
---------
@@ -412,8 +413,9 @@ def replace_geocode(

def add_population_column(self, data, geocode_type, geocode_col=None, dropna=True):
"""
Appends a population column to a dataframe, based on the FIPS or ZIP code. If no
dataframe is provided, the full crosswalk from geocode to population is returned.
Append a population column to a dataframe, based on the FIPS or ZIP code.

If no dataframe is provided, the full crosswalk from geocode to population is returned.

Parameters
---------
@@ -464,7 +466,7 @@ def fips_to_megacounty(
mega_col="megafips",
count_cols=None,
):
"""Convert and aggregate from FIPS to megaFIPS
"""Convert and aggregate from FIPS to megaFIPS.

Parameters
---------
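
For readers unfamiliar with megaFIPS codes, `convert_fips_to_mega` in this file zero-pads the FIPS code to five characters and then replaces characters 2 through 4 with "000". A scalar sketch of the same transformation is shown below. `to_megafips` is a hypothetical helper; the real method operates on a DataFrame column via the pandas `.str` accessor.

```python
def to_megafips(fips):
    """Zero-pad a FIPS code to five characters and zero out the county part."""
    padded = str(fips).zfill(5)
    # Keep the two-character state prefix; replace the county digits with "000".
    return padded[:2] + "000"


print(to_megafips("1001"))   # 01000
print(to_megafips(42003))    # 42000
```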
7 changes: 4 additions & 3 deletions _delphi_utils_python/delphi_utils/signal.py
@@ -2,7 +2,8 @@
import covidcast

def add_prefix(signal_names, wip_signal, prefix="wip_"):
"""Adds prefix to signal if there is a WIP signal
"""Add prefix to signal if there is a WIP signal.

Parameters
----------
signal_names: List[str]
@@ -18,7 +19,6 @@ def add_prefix(signal_names, wip_signal, prefix="wip_"):
List of signal names
wip/non wip signals for further computation
"""

if wip_signal is True:
return [prefix + signal for signal in signal_names]
if isinstance(wip_signal, list):
@@ -37,7 +37,8 @@ def add_prefix(signal_names, wip_signal, prefix="wip_"):


def public_signal(signal):
"""Checks if the signal name is already public using COVIDcast
"""Check if the signal name is already public using COVIDcast.

Parameters
----------
signal : str
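
The visible part of `add_prefix` shows two branches: `wip_signal is True` prefixes every name, and a list is handled separately. The sketch below is one plausible reading of the full behavior and should not be taken as the function's actual body. It assumes that a list prefixes only the listed names and that any other value prefixes names that are not yet public. `add_prefix_sketch` and the `is_public` callback are hypothetical; the callback stands in for the COVIDcast lookup done by `public_signal`.

```python
def add_prefix_sketch(signal_names, wip_signal, is_public, prefix="wip_"):
    """Return signal names, prefixed where they count as work in progress."""
    if wip_signal is True:
        # Shown in the diff: prefix every signal.
        return [prefix + name for name in signal_names]
    if isinstance(wip_signal, list):
        # Assumption: only the listed signals are work in progress.
        listed = set(wip_signal)
        return [prefix + name if name in listed else name
                for name in signal_names]
    # Assumption: otherwise, prefix whatever is not already public.
    return [name if is_public(name) else prefix + name
            for name in signal_names]


names = ["confirmed", "deaths"]
print(add_prefix_sketch(names, True, is_public=lambda name: True))
print(add_prefix_sketch(names, ["deaths"], is_public=lambda name: True))
print(add_prefix_sketch(names, False, is_public=lambda name: name == "confirmed"))
```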
2 changes: 1 addition & 1 deletion _delphi_utils_python/delphi_utils/utils.py
@@ -5,7 +5,7 @@
from shutil import copyfile

def read_params():
"""Reads a file named 'params.json' in the current working directory.
"""Read a file named 'params.json' in the current working directory.

If the file does not exist, it copies the file 'params.json.template' to
'param.json' and then reads the file.
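
The fallback described in the `read_params` docstring (copy the template when `params.json` is missing, then read it) can be sketched as follows. `read_params_sketch` and its two path arguments are hypothetical additions for illustration; per the docstring, the real function takes no arguments and works in the current working directory.

```python
import json
from os.path import exists
from shutil import copyfile


def read_params_sketch(path="params.json", template="params.json.template"):
    """Read a JSON params file, creating it from the template if missing."""
    if not exists(path):
        # First run: seed the params file from the template.
        copyfile(template, path)
    with open(path, "r") as handle:
        return json.load(handle)
```

A side effect worth noting is that the first call leaves a new `params.json` on disk, so later edits to the template are not picked up automatically.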
14 changes: 14 additions & 0 deletions ansible/ansible-deploy-staging-api-proxy-match-list.yaml
@@ -0,0 +1,14 @@
---
- hosts: api_proxy_staging
remote_user: deploy
vars_files:
- vars.yaml
- vault.yaml
tasks:
- name: Set staging api proxy openresty signal match list template.
template:
src: "templates/staging-api-match-list.j2"
dest: "/common/staging-api-match-list"
owner: "deploy"
group: "deploy"
mode: "0777"
3 changes: 3 additions & 0 deletions ansible/inventory
@@ -3,3 +3,6 @@ delphi-master-prod-01.delphi.cmu.edu

[runtime_host_staging]
app-mono-dev-01.delphi.cmu.edu

[api_proxy_staging]
api-staging.delphi.cmu.edu
11 changes: 9 additions & 2 deletions ansible/templates/changehc-params-prod.json.j2
@@ -2,8 +2,14 @@
"static_file_dir": "./static",
"export_dir": "/common/covidcast/receiving/chng",
"cache_dir": "./cache",
"input_denom_file": null,
"input_covid_file": null,
"input_files": {
"denom": null,
"covid": null,
"flu": null,
"mixed": null,
"flu_like": null,
"covid_like": null
},
"start_date": "2020-02-01",
"end_date": null,
"drop_date": null,
@@ -13,6 +19,7 @@
"parallel": false,
"geos": ["state", "msa", "hrr", "county"],
"weekday": [true, false],
"types": ["covid","cli"],
"wip_signal": "",
"aws_credentials": {
"aws_access_key_id": "",
15 changes: 15 additions & 0 deletions ansible/templates/nchs_mortality-params-prod.json.j2
@@ -0,0 +1,15 @@
{
"export_start_date": "2020-02-01",
"static_file_dir": "./static",
"export_dir": "/common/covidcast/receiving/nchs-mortality",
"cache_dir": "./cache",
"daily_export_dir": "./daily_receiving",
"daily_cache_dir": "./daily_cache",
"token": "{{ nchs_mortality_token }}",
"mode":"",
"aws_credentials": {
"aws_access_key_id": "{{ delphi_aws_access_key_id }}",
"aws_secret_access_key": "{{ delphi_aws_secret_access_key }}"
},
"bucket_name": "delphi-covidcast-indicator-output"
}
4 changes: 4 additions & 0 deletions ansible/templates/staging-api-match-list.j2
@@ -0,0 +1,4 @@
data_source=quidel-staging&signal=covid_ag_
data_source=chng
data_source=safegraph
data_source=google-symptoms
1 change: 1 addition & 0 deletions ansible/vars.yaml
@@ -21,3 +21,4 @@ changehc_sftp_host: "{{ vault_changehc_sftp_host }}"
changehc_sftp_port: "{{ vault_changehc_sftp_port }}"
changehc_sftp_user: "{{ vault_changehc_sftp_user }}"
changehc_sftp_password: "{{ vault_changehc_sftp_password }}"
nchs_mortality_token: "{{ vault_nchs_mortality_token }}"