Commit b9c15f4

Readme edits
1 parent 30dd449 commit b9c15f4

4 files changed: 16 additions, 4 deletions

validator/README.md

Lines changed: 10 additions & 0 deletions
@@ -1,5 +1,15 @@
 # Validator

+The validator performs two main tasks:
+1) It runs sanity checks on the daily data generated by the pipeline of a
+specific data source.
+2) It performs a comparative analysis against recent data from the API
+to detect anomalies such as spikes and significant value differences.
+
+The validator validates against daily data that is already written to disk,
+making the execution of the validator independent of the pipeline execution.
+This has the additional advantage that validation can be run against multiple
+days of daily data, giving a better cumulative analysis.


 ## Running the Indicator
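
The comparative analysis described in the README text above can be pictured roughly as follows. This is a minimal sketch, not the validator's actual implementation: the function name `find_spikes`, the z-score rule, and the threshold of 3.0 are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_spikes(recent, reference, z_threshold=3.0):
    """Return indices of recent values far from the reference distribution.

    Hypothetical helper: flags a value when it lies more than z_threshold
    reference standard deviations away from the reference mean.
    """
    ref_mean = mean(reference)
    ref_sd = stdev(reference)  # sample standard deviation; needs >= 2 points
    if ref_sd == 0:
        # A constant reference series: any differing value is an anomaly.
        return [i for i, v in enumerate(recent) if v != ref_mean]
    return [i for i, v in enumerate(recent)
            if abs(v - ref_mean) / ref_sd > z_threshold]


# Made-up numbers standing in for API data and for daily data on disk.
reference_values = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8, 10.1]
recent_values = [10.3, 25.0, 9.9]
spike_indices = find_spikes(recent_values, reference_values)
```

Because the check only needs the two series, it can run at any time after the daily files exist, which matches the README's point that validation is independent of pipeline execution.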

validator/delphi_validator/driver.py

Lines changed: 2 additions & 2 deletions
@@ -3,14 +3,14 @@

 # Defining start date and end date for the last fb-survey pipeline execution
 survey_sdate = "2020-06-13"
-survey_edate = "2020-06-19"
+survey_edate = "2020-06-15"
 dtobj_sdate = datetime.strptime(survey_sdate, '%Y-%m-%d')
 dtobj_edate = datetime.strptime(survey_edate, '%Y-%m-%d')
 print(dtobj_sdate.date())
 print(dtobj_edate.date())


 # Collecting all filenames
-daily_filnames = read_filenames("./data")
+daily_filnames = read_filenames("../data")

 fbsurvey_validation(daily_filnames, dtobj_sdate, dtobj_edate)
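
The driver parses a start and end date and then hands a list of daily filenames to the validation routine. As a rough illustration of how such a date window might be applied to filenames, here is a sketch. The helper `filter_by_date` and the `YYYYMMDD_...` filename pattern are assumptions, since the diff does not show how `read_filenames` or `fbsurvey_validation` treat filenames.

```python
import re
from datetime import datetime

survey_sdate = "2020-06-13"
survey_edate = "2020-06-15"
dtobj_sdate = datetime.strptime(survey_sdate, "%Y-%m-%d")
dtobj_edate = datetime.strptime(survey_edate, "%Y-%m-%d")


def filter_by_date(filenames, sdate, edate):
    """Keep filenames whose leading YYYYMMDD stamp lies in [sdate, edate].

    Hypothetical helper: the naming scheme is assumed, not taken from the repo.
    """
    kept = []
    for name in filenames:
        match = re.match(r"(\d{8})_", name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        file_date = datetime.strptime(match.group(1), "%Y%m%d")
        if sdate <= file_date <= edate:
            kept.append(name)
    return kept


# Made-up filenames for illustration only.
example_files = [
    "20200612_state_raw_cli.csv",
    "20200613_state_raw_cli.csv",
    "20200615_county_smoothed_cli.csv",
    "20200619_state_raw_cli.csv",
    "notes.txt",
]
in_window = filter_by_date(example_files, dtobj_sdate, dtobj_edate)
```

Both endpoints are inclusive here, so moving `survey_edate` from 2020-06-19 to 2020-06-15 would shrink the window from seven days to three under this assumed scheme.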

validator/delphi_validator/fbsurveyvalidation.py

Lines changed: 3 additions & 2 deletions
@@ -147,6 +147,7 @@ def check_avg_val_diffs(recent_df, recent_api_df, smooth_option):
 # dplyr::mutate(mean.stddiff.high = abs(mean.stddiff) > thresholds[["mean.stddiff"]] |
 # variable=="val" & abs(mean.stddiff) > thresholds[["val.mean.stddiff"]],
 # mean.stdabsdiff.high = mean.stdabsdiff > thresholds[["mean.stdabsdiff"]]) %>>%
+# TODO - Check what the purpose of variable=="val" is in the above statement

 switcher = {
     'raw': raw_thresholds,
@@ -242,8 +243,8 @@ def fbsurvey_validation(daily_filenames, sdate, edate, max_check_lookbehind = ti
 if (recent_df["se"].isnull().mean() > 0.5):
     print('Recent se values are >50% NA')

-#if sanity_check_rows_per_day:
-# check_rapid_change(checking_date, recent_df, recent_api_df, date_list, sig, geo)
+if sanity_check_rows_per_day:
+    check_rapid_change(checking_date, recent_df, recent_api_df, date_list, sig, geo)

 if sanity_check_value_diffs:
     check_avg_val_diffs(recent_df, recent_api_df, smooth_option)
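
On the question about `variable=="val"` in the commented dplyr expression: in R, `&` binds more tightly than `|`, so the condition reads as `A | (B & C)`. Under that reading, the general `mean.stddiff` threshold applies to every variable, and the `val.mean.stddiff` threshold applies in addition only to rows where the variable is `val`. A minimal Python rendering of that reading follows. The function name and the threshold numbers are made up for illustration and are not taken from `fbsurveyvalidation.py`.

```python
def is_mean_stddiff_high(variable, mean_stddiff, thresholds):
    """Mirror `abs(x) > t1 | variable == "val" & abs(x) > t2` from the R comment.

    Python's `and` also binds more tightly than `or`; the parentheses below
    only make the grouping explicit.
    """
    return (abs(mean_stddiff) > thresholds["mean.stddiff"]
            or (variable == "val"
                and abs(mean_stddiff) > thresholds["val.mean.stddiff"]))


# Hypothetical thresholds: a stricter limit for "val" than for other variables.
example_thresholds = {"mean.stddiff": 1.5, "val.mean.stddiff": 1.0}

val_flag = is_mean_stddiff_high("val", 1.2, example_thresholds)    # val-only rule
se_flag = is_mean_stddiff_high("se", 1.2, example_thresholds)      # below general limit
large_flag = is_mean_stddiff_high("se", -2.0, example_thresholds)  # general rule
```

With these example numbers, the same standardized difference of 1.2 is flagged for `val` but not for `se`, which is the only effect the extra clause has.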

validator/params.json.template

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 {
+  "data_source": "fb_survey",
   "start_date": "2020-06-13",
   "end_date": "2020-06-19",
   "ref_window_size": 7

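A template like this would typically be copied to `params.json` and read at startup. The sketch below shows one way to load and sanity-check such a file. The `load_params` helper and the closing brace of the JSON text are assumptions, because the hunk shows only the first lines of the template. It also assumes the file is strict JSON, which requires a comma after every member except the last.

```python
import json
from datetime import datetime

# Assumed complete contents; the hunk shows only these four keys.
PARAMS_TEXT = """
{
  "data_source": "fb_survey",
  "start_date": "2020-06-13",
  "end_date": "2020-06-19",
  "ref_window_size": 7
}
"""


def load_params(text):
    """Parse the JSON text and convert the date strings to date objects."""
    params = json.loads(text)
    for key in ("start_date", "end_date"):
        params[key] = datetime.strptime(params[key], "%Y-%m-%d").date()
    if params["start_date"] > params["end_date"]:
        raise ValueError("start_date must not be after end_date")
    return params


params = load_params(PARAMS_TEXT)
# Inclusive number of days covered by the configured window.
span_days = (params["end_date"] - params["start_date"]).days + 1
```

Reading the dates from the parameters file, rather than hard-coding them as `driver.py` does, would keep the two from drifting apart; whether the project intends that is not stated in this commit.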