
Commit 7e43d70

Merge branch 'main-nan-testing' into nans_nchs
2 parents 05be4e4 + 2897308 commit 7e43d70

31 files changed: +14332 -600 lines changed

Jenkinsfile

Lines changed: 6 additions & 0 deletions
@@ -19,7 +19,10 @@ pipeline {
     stages {
         stage('Build and Package') {
             when {
+                anyOf {
                 branch "main";
+                    branch "main-nan-testing";
+                }
             }
             steps {
                 script {
@@ -34,7 +37,10 @@ pipeline {
         }
         stage('Deploy staging') {
             when {
+                anyOf {
                 branch "main";
+                    branch "main-nan-testing";
+                }
             }
             steps {
                 script {

_delphi_utils_python/DEVELOP.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+# DELPHI Common Utility Functions (Python)
+
+This directory contains the Python module `delphi_utils`. It includes a number of
+common functions that are useful across multiple indicators.
+
+## Installing the Module
+
+To install the module in your default version of Python, run the
+following from this directory:
+
+```
+pip install .
+```
+
+As described in each of the indicator code directories, you will want to install
+this module within a virtual environment when testing the various code bases.
+
+### Testing the code
+
+To do a static test of the code style, it is recommended to run **pylint** on
+the module. To do this, run the following from the main module directory:
+
+```
+pylint delphi_utils
+```
+
+The most aggressive checks are turned off; only relatively important issues
+should be raised and they should be manually checked (or better, fixed).
+
+Unit tests are also included in the module. These should be run by first
+installing the module into a virtual environment:
+
+```
+python -m venv env
+source env/bin/activate
+pip install .
+```
+
+And then running the unit tests with:
+
+```
+(cd tests && ../env/bin/pytest --cov=delphi_utils --cov-report=term-missing)
+```
+
+The output will show the number of unit tests that passed and failed, along
+with the percentage of code covered by the tests. None of the tests should
+fail and the code lines that are not covered by unit tests should be small and
+should not include critical sub-routines.
+
+When you are finished, the virtual environment can be deactivated and
+(optionally) removed.
+
+```
+deactivate
+rm -r env
+```

_delphi_utils_python/LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright (c) 2021 The Delphi Group at Carnegie Mellon University
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

_delphi_utils_python/README.md

Lines changed: 21 additions & 56 deletions
@@ -1,56 +1,21 @@
-# DELPHI Common Utility Functions (Python)
-
-This director contains the Python module `delphi_utils`. It includes a number of
-common functions that are useful across multiple indicators.
-
-## Installing the Module
-
-To install the module in your default version of Python, run the
-following from this directory:
-
-```
-pip install .
-```
-
-As described in each of the indicator code directories, you will want to install
-this module within a virtual environment when testing the various code bases.
-
-### Testing the code
-
-To do a static test of the code style, it is recommended to run **pylint** on
-the module. To do this, run the following from the main module directory:
-
-```
-pylint delphi_utils
-```
-
-The most aggressive checks are turned off; only relatively important issues
-should be raised and they should be manually checked (or better, fixed).
-
-Unit tests are also included in the module. These should be run by first
-installing the module into a virtual environment:
-
-```
-python -m venv env
-source env/bin/activate
-pip install .
-```
-
-And then running the unit tests with:
-
-```
-(cd tests && ../env/bin/pytest --cov=delphi_utils --cov-report=term-missing)
-```
-
-The output will show the number of unit tests that passed and failed, along
-with the percentage of code covered by the tests. None of the tests should
-fail and the code lines that are not covered by unit tests should be small and
-should not include critical sub-routines.
-
-When you are finished, the virtual environment can be deactivated and
-(optionally) removed.
-
-```
-deactivate
-rm -r env
-```
+# Delphi Python Utilities
+
+This package provides various utilities used by the [Delphi group](https://delphi.cmu.edu/) at [Carnegie Mellon
+University](https://www.cmu.edu) for its data pipelines and analyses.
+
+Submodules:
+- `archive`: Diffing and archiving CSV files.
+- `export`: DataFrame to CSV export.
+- `geomap`: Mappings between geographic resolutions.
+- `logger`: Structured JSON logger.
+- `nancodes`: Enum constants encoding not-a-number cases.
+- `runner`: Orchestrator for running an indicator pipeline.
+- `signal`: Indicator (signal) naming.
+- `slack_notifier`: Slack notification integration.
+- `smooth`: Data smoothing functions.
+- `utils`: JSON parameter interactions.
+- `validator`: Data sanity checks and anomaly detection.
+
+
+Source code can be found here:
+[https://github.com/cmu-delphi/covidcast-indicators/](https://github.com/cmu-delphi/covidcast-indicators/)

_delphi_utils_python/delphi_utils/nancodes.py

Lines changed: 1 addition & 7 deletions
@@ -1,10 +1,4 @@
-"""Provides unified not-a-number codes for the indicators.
-
-Currently requires a manual sync between the covidcast-indicators
-and the delphi-epidata repo.
-* in covidcast-indicators: _delphi_utils_python/delphi_utils
-* in delphi-epidata: src/acquisition/covidcast
-"""
+"""Unified not-a-number codes for CMU Delphi codebase."""
 
 from enum import IntEnum
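The `nancodes` hunk only touches the module docstring, so the enum members themselves are not visible in this diff. As a rough illustration of the pattern the docstring describes (not-a-number codes defined on top of `IntEnum`), here is a minimal sketch. The class name `MissingCode`, its members, their values, and the helper `code_for` are all hypothetical and are not taken from `delphi_utils`.

```python
from enum import IntEnum


class MissingCode(IntEnum):
    """Hypothetical missingness codes; names and values are illustrative only."""

    NOT_MISSING = 0
    CENSORED = 1
    OTHER = 2


def code_for(value, censored=False):
    """Return the code explaining why `value` is, or is not, missing."""
    if value is not None:
        return MissingCode.NOT_MISSING
    return MissingCode.CENSORED if censored else MissingCode.OTHER
```

Because `IntEnum` members compare equal to plain integers, codes like these can be written to an integer CSV column and compared after reading back without extra conversion.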

_delphi_utils_python/delphi_utils/validator/scripts/unique_geoids.R

Lines changed: 0 additions & 18 deletions
This file was deleted.

_delphi_utils_python/setup.py

Lines changed: 6 additions & 0 deletions
@@ -1,11 +1,15 @@
 from setuptools import setup
 from setuptools import find_packages
 
+with open("README.md", "r") as f:
+    long_description = f.read()
+
 required = [
     "boto3",
     "covidcast",
     "freezegun",
     "gitpython",
+    "mock",
     "moto",
     "numpy",
     "pandas>=1.1.0",
@@ -22,6 +26,8 @@
     name="delphi_utils",
     version="0.1.0",
     description="Shared Utility Functions for Indicators",
+    long_description=long_description,
+    long_description_content_type="text/markdown",
     author="",
     author_email="",
     url="https://github.com/cmu-delphi/",
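The `setup.py` hunk reads `README.md` relative to the current working directory and relies on the platform default encoding. One possible variation, shown here only as a sketch and not as what the commit does, resolves the file against an explicit directory and reads it as UTF-8. The helper name `read_long_description` is hypothetical.

```python
from pathlib import Path


def read_long_description(setup_dir, filename="README.md"):
    """Read the README located in `setup_dir` as UTF-8 text."""
    return (Path(setup_dir) / filename).read_text(encoding="utf-8")
```

In a `setup.py`, this could be called as `read_long_description(Path(__file__).parent)` so that the build does not depend on the directory it is launched from.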

facebook/delphiFacebook/NAMESPACE

Lines changed: 1 addition & 0 deletions
@@ -63,6 +63,7 @@ importFrom(dplyr,across)
 importFrom(dplyr,all_of)
 importFrom(dplyr,anti_join)
 importFrom(dplyr,arrange)
+importFrom(dplyr,bind_cols)
 importFrom(dplyr,bind_rows)
 importFrom(dplyr,case_when)
 importFrom(dplyr,coalesce)

facebook/delphiFacebook/R/contingency_aggregate.R

Lines changed: 3 additions & 7 deletions
@@ -264,15 +264,10 @@ summarize_aggs <- function(df, crosswalk_data, aggregations, geo_level, params)
   }
 
   ## Find all unique groups and associated frequencies, saved in column `Freq`.
-  # Keep rows with missing values initially so that we get the correct column
-  # names. Explicitly drop groups with missing values in second step.
   unique_groups_counts <- as.data.frame(
     table(df[, group_vars, with=FALSE], exclude=NULL, dnn=group_vars),
     stringsAsFactors=FALSE
   )
-  unique_groups_counts <- unique_groups_counts[
-    complete.cases(unique_groups_counts[, group_vars]),
-  ]
 
   # Drop groups with less than threshold sample size.
   unique_groups_counts <- filter(unique_groups_counts, Freq >= params$num_filter)
@@ -327,9 +322,10 @@ summarize_aggs <- function(df, crosswalk_data, aggregations, geo_level, params)
     aggregation <- aggregations$id[row]
     group_vars <- aggregations$group_by[[row]]
     post_fn <- aggregations$post_fn[[row]]
-
+
+    # Keep only aggregations where the main value, `val`, is present.
     dfs_out[[aggregation]] <- dfs_out[[aggregation]][
-      rowSums(is.na(dfs_out[[aggregation]][, c("val", "sample_size", group_vars)])) == 0,
+      rowSums(is.na(dfs_out[[aggregation]][, c("val", "sample_size")])) == 0,
     ]
 
     dfs_out[[aggregation]] <- apply_privacy_censoring(dfs_out[[aggregation]], params)
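The two hunks above change how missing grouping values are treated: groups whose keys are missing are no longer dropped after counting, and output rows are filtered only on `val` and `sample_size`. The following is a minimal Python sketch of that logic, written as an analogy to the R code rather than a port of it. The function names, the row representation (lists of dicts), and the use of `None` for missing values are assumptions made for illustration.

```python
from collections import Counter


def count_groups(rows, group_vars, min_count):
    """Count rows per group, treating None as a valid group key,
    then keep only groups with at least `min_count` rows."""
    counts = Counter(tuple(row[var] for var in group_vars) for row in rows)
    return {group: n for group, n in counts.items() if n >= min_count}


def keep_reportable(aggregates, required=("val", "sample_size")):
    """Keep aggregates whose required fields are present.
    Grouping columns are not checked, so they may be None."""
    return [
        agg for agg in aggregates
        if all(agg.get(field) is not None for field in required)
    ]
```

Under these assumptions, a group keyed by a missing value survives as long as it meets the sample-size threshold and has a computed `val`, which mirrors the removal of `complete.cases` and of `group_vars` from the `is.na` check.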

facebook/delphiFacebook/R/contingency_calculate.R

Lines changed: 1 addition & 2 deletions
@@ -13,12 +13,11 @@
 #' be a non-integer
 #'
 #' @return a list of named means and other descriptive statistics
-compute_numeric <- function(response, weight, sample_size, total_represented)
+compute_household_binary <- function(response, weight, sample_size, total_represented)
 {
   response_mean <- compute_count_response(response, weight, sample_size)
   response_mean$sample_size <- sample_size
   response_mean$represented <- total_represented
-  response_mean$se <- NA_real_
 
   return(response_mean)
 }
