Commit f84d0cb

some documentation
1 parent 2263105 commit f84d0cb

2 files changed: 22 additions, 9 deletions

emr_hosp/README.md

Lines changed: 10 additions & 0 deletions
@@ -1,6 +1,9 @@
 # EMR Hospitalizations Indicator
 
 COVID-19 indicator using hospitalizations from electronic medical records (EMR).
+Reads claims data (AGG) and EMR data (CMB) and combines into pandas dataframe.
+Makes appropriate date shifts, adjusts for backfilling, and smooths estimates.
+Writes results to csvs.
 
 
 ## Running the Indicator
@@ -56,3 +59,10 @@ The output will show the number of unit tests that passed and failed, along
 with the percentage of code covered by the tests. None of the tests should
 fail and the code lines that are not covered by unit tests should be small and
 should not include critical sub-routines.
+
+## Code tour
+
+- update_sensor.py: EMRHospSensorUpdator: reads the data, makes transformations,
+- sensor.py: EMRHospSensor: methods for transforming data, including backfill and smoothing
+- load_data.py: methods for loading claims and EHR data
+- geo_maps.py: geo reindexing

emr_hosp/delphi_emr_hosp/update_sensor.py

Lines changed: 12 additions & 9 deletions
@@ -97,23 +97,27 @@ def __init__(self,
 
 
     def shift_dates(self):
-        """shift estimates one day forward to account for a 1 day lag, e.g.
-        we want to produce estimates for the time range May 2 to May 20, inclusive
-        given a drop on May 20, we have data up until May 19.
-        we then train on data from Jan 1 until May 19, storing only the sensors
-        on May 1 to May 19. we then shift the dates forward by 1, giving us sensors
-        on May 2 to May 20. therefore, we will move the startdate back by one day
-        in order to get the proper estimate at May 1
+        """shift estimates forward to account for time lag, compute burnindates, sensordates
         """
-        ## JS: WILL USE DATETIMEINDEX FOR THIS...
+
         drange = lambda s, e: pd.date_range(start=s,periods=(e-s).days,freq='D')
         self.startdate = self.startdate - Config.DAY_SHIFT
         self.burnindate = self.startdate - Config.BURN_IN_PERIOD
         self.fit_dates = drange(Config.FIRST_DATA_DATE, self.dropdate)
         self.burn_in_dates = drange(self.burnindate, self.dropdate)
         self.sensor_dates = drange(self.startdate, self.enddate)
+        return True
 
     def geo_reindex(self,data,staticpath):
+        """Reindex based on geography, include all date, geo pairs
+
+        Args:
+            data: dataframe, the output of loadcombineddata
+            staticpath: path for the static geographic files
+
+        Returns:
+            dataframe
+        """
         # get right geography
         geo = self.geo
         geo_map = GeoMaps(staticpath)
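
As a rough illustration of the date arithmetic in `shift_dates`, the sketch below reproduces the `drange` helper as a standalone function. The constant values (`DAY_SHIFT` of one day, `BURN_IN_PERIOD` of seven days, `FIRST_DATA_DATE` of Jan 1, 2020) and the free-function signature are assumptions made for this example, not values taken from `Config`. Because `periods=(e-s).days`, each range is half-open: it starts at `s` and stops the day before `e`.

```python
from datetime import datetime, timedelta

import pandas as pd

# Hypothetical stand-ins for the Config constants, which this diff does not show.
DAY_SHIFT = timedelta(days=1)
BURN_IN_PERIOD = timedelta(days=7)
FIRST_DATA_DATE = datetime(2020, 1, 1)


def drange(start, end):
    """Daily dates from start up to, but not including, end."""
    return pd.date_range(start=start, periods=(end - start).days, freq="D")


def shift_dates(startdate, enddate, dropdate):
    """Return (fit_dates, burn_in_dates, sensor_dates) after moving startdate back."""
    startdate = startdate - DAY_SHIFT
    burnindate = startdate - BURN_IN_PERIOD
    fit_dates = drange(FIRST_DATA_DATE, dropdate)
    burn_in_dates = drange(burnindate, dropdate)
    sensor_dates = drange(startdate, enddate)
    return fit_dates, burn_in_dates, sensor_dates


fit_dates, burn_in_dates, sensor_dates = shift_dates(
    startdate=datetime(2020, 5, 2),
    enddate=datetime(2020, 5, 20),
    dropdate=datetime(2020, 5, 20),
)
print(sensor_dates[0].date(), sensor_dates[-1].date())
```

Under these assumed values, a start date of May 2 and an end date of May 20 give sensor dates running from May 1 through May 19, which agrees with the worked example in the removed docstring.
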
@@ -207,7 +211,6 @@ def update_sensor(self,
         pool_results = [proc.get() for proc in pool_results]
 
         for res in pool_results:
-            ## un-tested
             geo_id = res["geo_id"]
             res = pd.DataFrame(res)
             sensor_rates[geo_id] = np.array(res.loc[final_sensor_idxs, "rate"])
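
The `geo_reindex` docstring added in this commit says the result should "include all date, geo pairs". One common way to get that in pandas is to reindex onto the full Cartesian product of geographies and dates. The sketch below shows that idea only; the column names (`geo_id`, `date`, `num`), the helper name `reindex_all_pairs`, and filling missing pairs with 0 are assumptions for illustration and may not match what `geo_reindex` actually does.

```python
import pandas as pd


def reindex_all_pairs(df, geo_ids, dates):
    """Reindex df onto every (geo_id, date) pair, filling missing pairs with 0.

    Hypothetical helper: the fill value and column names are assumed.
    """
    full_index = pd.MultiIndex.from_product(
        [geo_ids, dates], names=["geo_id", "date"]
    )
    return df.set_index(["geo_id", "date"]).reindex(full_index, fill_value=0)


dates = pd.date_range("2020-05-01", periods=3, freq="D")
observed = pd.DataFrame(
    {
        "geo_id": ["pa", "pa", "ny"],
        "date": [dates[0], dates[2], dates[1]],
        "num": [5, 7, 2],
    }
)

# Three observed rows become six: two geographies times three dates.
full = reindex_all_pairs(observed, ["ny", "pa"], dates)
print(full)
```

Reindexing this way keeps the observed counts and makes every gap explicit, so later per-geography operations see a complete daily series.
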
