Commit 5d1be4c

Merge pull request #76 from cmu-delphi/dv-package
Doctor's visits package
2 parents 3b4360c + 60a5202 commit 5d1be4c

27 files changed, 1408 additions and 0 deletions

doctor_visits/.gitignore

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
# You should hard commit a prototype for this file, but we
# want to avoid accidental adding of API tokens and other
# private data parameters
params.json

# Do not commit output files
receiving/*.csv

# Remove macOS files
.DS_Store

# virtual environment
dview/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
coverage.xml
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
.static_storage/
.media/
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

doctor_visits/.pylintrc

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
[DESIGN]

min-public-methods=0


[MESSAGES CONTROL]

disable=R0801, C0200, C0330, E1101, E0611, E1136, C0114, C0116, C0103, R0913, R0914, R0915, W1401, W1202, W1203, W0702

doctor_visits/README.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# Doctor Visits Indicator

## Running the Indicator

The indicator is run by directly executing the Python module contained in this
directory. The safest way to do this is to create a virtual environment,
install the common DELPHI tools, and then install the module and its
dependencies. To do this, run the following code from this directory:

```
python -m venv env
source env/bin/activate
pip install ../_delphi_utils_python/.
pip install .
```

All of the user-changeable parameters are stored in `params.json`. To execute
the module and produce the output datasets (by default, in `receiving`), run
the following:

```
env/bin/python -m delphi_doctor_visits
```

Once you are finished with the code, you can deactivate the virtual environment
and (optionally) remove the environment itself.

```
deactivate
rm -r env
```

## Testing the code

To do a static test of the code style, it is recommended to run **pylint** on
the module. To do this, run the following from the main module directory:

```
env/bin/pylint delphi_doctor_visits
```

The most aggressive checks are turned off; only relatively important issues
should be raised and they should be manually checked (or better, fixed).

Unit tests are also included in the module. To execute these, run the following
command from this directory:

```
(cd tests && ../env/bin/pytest --cov=delphi_doctor_visits --cov-report=term-missing)
```

The output will show the number of unit tests that passed and failed, along
with the percentage of code covered by the tests. None of the tests should
fail, and the number of code lines not covered by unit tests should be small
and should not include critical sub-routines.
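
The README does not show what `params.json` contains. As a minimal sketch, assuming a hypothetical `export_dir` key that names the output directory (the real keys are defined by the parameter template, which is not shown in this diff), the file could be read along these lines:

```python
import json
from pathlib import Path


def load_params(path="params.json"):
    """Read the parameter file and return its contents as a dict."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def export_dir(params, default="receiving"):
    """Return the output directory.

    `export_dir` is a hypothetical key used only for illustration; the
    fallback mirrors the README's default output location, `receiving`.
    """
    return Path(params.get("export_dir", default))
```

Calling `export_dir(load_params())` from the module directory would then resolve to `receiving` whenever the hypothetical key is absent.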

doctor_visits/REVIEW.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
## Code Review (Python)

A code review of this module should include a careful look at the code and the
output. To assist in the process, but certainly not in place of it, please
check the following items.

**Documentation**

- [ ] the README.md file template is filled out and currently accurate; it is
possible to load and test the code using only the instructions given
- [ ] minimal docstrings (one line describing what the function does) are
included for all functions; full docstrings describing the inputs and expected
outputs should be given for non-trivial functions

**Structure**

- [ ] code should use 4 spaces for indentation; other style decisions are
flexible, but be consistent within a module
- [ ] any required metadata files are checked into the repository and placed
within the directory `static`
- [ ] any intermediate files that are created and stored by the module should
be placed in the directory `cache`
- [ ] final expected output files to be uploaded to the API are placed in the
`receiving` directory; output files should not be committed to the repository
- [ ] all options and API keys are passed through the file `params.json`
- [ ] template parameter file (`params.json.template`) is checked into the
code; no personal (e.g., usernames) or private (e.g., API keys) information is
included in this template file

**Testing**

- [ ] module can be installed in a new virtual environment
- [ ] pylint with the default `.pylintrc` settings run over the module produces
minimal warnings; warnings that do exist have been confirmed as false positives
- [ ] reasonably high level of unit test coverage covering all of the main logic
of the code (e.g., missing coverage for raised errors that do not currently seem
possible to reach is okay; missing coverage for options that will be needed is
not)
- [ ] all unit tests run without errors

doctor_visits/cache/.gitignore

Whitespace-only changes.
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
# -*- coding: utf-8 -*-
"""Module to pull and clean indicators from the Doctor's Visits source.

This file defines the functions that are made public by the module. As the
module is intended to be executed through the main method, these are primarily
for testing.
"""

from __future__ import absolute_import

from . import config
from . import direction
from . import geo_maps
from . import run
from . import sensor
from . import smooth
from . import update_sensor
from . import weekday

__version__ = "0.1.0"
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
# -*- coding: utf-8 -*-
"""Call the function run_module when executed.

This file indicates that calling the module (`python -m delphi_doctor_visits`) will
call the function `run_module` found within the run.py file. There should be
no need to change this template.
"""

from .run import run_module  # pragma: no cover

run_module()  # pragma: no cover
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
"""
This file contains configuration variables used to generate the doctor visits signal.

Author: Maria
Created: 2020-04-16
Last modified: 2020-06-17
"""

from datetime import datetime, timedelta


class Config:
    """Static configuration variables.
    """

    # dates
    FIRST_DATA_DATE = datetime(2020, 1, 1)
    DAY_SHIFT = timedelta(days=1)  # shift dates forward for labeling purposes

    # data columns
    CLI_COLS = ["Covid_like", "Flu_like", "Mixed"]
    FLU1_COL = ["Flu1"]
    COUNT_COLS = CLI_COLS + FLU1_COL + ["Denominator"]
    DATE_COL = "ServiceDate"
    GEO_COL = "PatCountyFIPS"
    AGE_COL = "PatAgeGroup"
    HRR_COLS = ["Pat HRR Name", "Pat HRR ID"]
    ID_COLS = [DATE_COL] + [GEO_COL] + [AGE_COL] + HRR_COLS
    FILT_COLS = ID_COLS + COUNT_COLS
    DTYPES = {"ServiceDate": str, "PatCountyFIPS": str,
              "Denominator": int, "Flu1": int,
              "Covid_like": int, "Flu_like": int,
              "Mixed": int, "PatAgeGroup": str,
              "Pat HRR Name": str, "Pat HRR ID": float}

    SMOOTHER_BANDWIDTH = 100  # bandwidth for the linear left Gaussian filter
    MAX_BACKFILL_WINDOW = 7  # maximum number of days used to average a backfill correction
    MIN_CUM_VISITS = 500  # need to observe at least 500 counts before averaging
    RECENT_LENGTH = 7  # number of days to sum over for sparsity threshold
    MIN_RECENT_VISITS = 100  # min numbers of visits needed to include estimate
    MIN_RECENT_OBS = 3  # minimum days needed to produce an estimate for latest time
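
The comments on `DAY_SHIFT`, `RECENT_LENGTH`, and `MIN_RECENT_VISITS` describe intent, but the code that consumes them is not part of this file. The sketch below is one plausible reading of those comments, not the package's actual logic: `label_date` and `has_enough_recent_visits` are hypothetical helpers, and treating the threshold as inclusive (`>=`) is an assumption.

```python
from datetime import datetime, timedelta

# Values copied from Config above.
DAY_SHIFT = timedelta(days=1)
RECENT_LENGTH = 7
MIN_RECENT_VISITS = 100


def label_date(service_date):
    """Shift a service date forward by DAY_SHIFT for labeling (hypothetical helper)."""
    return service_date + DAY_SHIFT


def has_enough_recent_visits(daily_visits):
    """Hypothetical sparsity check.

    Sums the visit counts over the last RECENT_LENGTH days and compares the
    total with MIN_RECENT_VISITS. The inclusive comparison is an assumption.
    """
    return sum(daily_visits[-RECENT_LENGTH:]) >= MIN_RECENT_VISITS
```

Under this reading, a location with 20 visits per day over the last week (140 in total) would keep its estimate, while one with 10 per day (70 in total) would be dropped.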
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
"""
Functions used to calculate direction. (Thanks to Addison Hu)

Author: Maria Jahja
Created: 2020-04-17

"""

import numpy as np


def running_mean(s):
    """Compute running mean."""
    return np.cumsum(s) / np.arange(1, len(s) + 1)


def running_sd(s, mu=None):
    """
    Compute running standard deviation. Running mean can be pre-supplied
    to save on computation.
    """
    if mu is None:
        mu = running_mean(s)
    sqmu = running_mean(s ** 2)
    sd = np.sqrt(sqmu - mu ** 2)
    return sd


def first_difference_direction(s):
    """
    Code taken from Addison Hu. Modified to return directional strings.
    Declares "notable" increases and decreases based on the distribution
    of past first differences.

    Args:
        s: input data

    Returns: Directions in "-1", "0", "+1", or "NA" for first 3 values
    """
    T = s[1:] - s[:-1]
    mu = running_mean(T)
    sd = running_sd(T, mu=mu)
    d = np.full(s.shape, "NA")

    for idx in range(2, len(T)):
        if T[idx] < min(mu[idx - 1] - sd[idx - 1], 0):
            d[idx + 1] = "-1"
        elif T[idx] > max(mu[idx - 1] + sd[idx - 1], 0):
            d[idx + 1] = "+1"
        else:
            d[idx + 1] = "0"

    return d
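
To make the direction rule concrete, the following is a NumPy-free restatement of the three functions above, written as a sketch for tracing the arithmetic by hand rather than as a replacement. It differs from the committed code in two labeled ways: it returns plain lists, and it clamps the variance at zero before taking the square root so that rounding error cannot produce a negative argument.

```python
from itertools import accumulate
from math import sqrt


def running_mean(values):
    """Mean of values[0..i] for every index i."""
    return [total / n for n, total in enumerate(accumulate(values), start=1)]


def running_sd(values):
    """Running population standard deviation via E[x^2] - E[x]^2."""
    mu = running_mean(values)
    sqmu = running_mean([v * v for v in values])
    # Clamp at zero: an addition in this sketch, not in the committed code.
    return [sqrt(max(q - m * m, 0.0)) for q, m in zip(sqmu, mu)]


def first_difference_direction(series):
    """Label each point "-1", "0", or "+1"; the first three stay "NA"."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    mu = running_mean(diffs)
    sd = running_sd(diffs)
    labels = ["NA"] * len(series)
    for idx in range(2, len(diffs)):
        # Compare the newest difference with the mean and spread of the
        # differences seen before it.
        if diffs[idx] < min(mu[idx - 1] - sd[idx - 1], 0):
            labels[idx + 1] = "-1"
        elif diffs[idx] > max(mu[idx - 1] + sd[idx - 1], 0):
            labels[idx + 1] = "+1"
        else:
            labels[idx + 1] = "0"
    return labels


print(first_difference_direction([10, 11, 12, 20, 13, 13]))
# ['NA', 'NA', 'NA', '+1', '-1', '0']
```

For the series `[10, 11, 12, 20, 13, 13]` the first differences are `[1, 1, 8, -7, 0]`. The jump of 8 exceeds the earlier mean plus one standard deviation (1 + 0), so it is labeled `"+1"`; the drop of 7 falls below zero, the binding lower bound at that step, so it is labeled `"-1"`; the final difference of 0 lies inside the band and is labeled `"0"`.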
