FB-survey validation with a generic design to include other pipelines #155

Merged
krivard merged 154 commits into main from fb-package-validation
Nov 19, 2020

Contributor

@amartyabasu amartyabasu commented Jul 20, 2020

Closes #59

  • The new validator reads the daily files from disk after pipeline execution. Validations for a given signal type and geo value on a given day are run against a range of previous days' data, which is fetched from the API.
  • The validator runs basic checks on the expected columns, column types, missing_dates, numeric constraints, and null values.
  • It also checks for upward and downward spikes in the data, and for variations in the mean differences of column values grouped by geo_id.
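
For readers skimming the description, the checks listed above could look roughly like the sketch below. This is not the code in this PR: the column names in `REQUIRED_COLUMNS`, the helper names `basic_checks` and `is_spike`, and the 3-standard-deviation threshold are all assumptions made for illustration, and the reference window is passed in as a plain list rather than fetched from the API.

```python
from statistics import mean, stdev

# Hypothetical column set; the real validator's expected columns may differ.
REQUIRED_COLUMNS = {"geo_id", "val", "se", "sample_size"}


def basic_checks(rows, expected_dates, seen_dates):
    """Return a list of human-readable problems for one batch of rows.

    rows: list of dicts, one per CSV row (assumed already parsed).
    expected_dates / seen_dates: iterables of datetime.date objects.
    """
    problems = []
    for i, row in enumerate(rows):
        # Column check: every required column must be present.
        missing = REQUIRED_COLUMNS - set(row)
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
            continue
        # Null check, then a simple numeric constraint (non-negative value).
        if row["val"] is None:
            problems.append(f"row {i}: null val")
        elif row["val"] < 0:
            problems.append(f"row {i}: negative val {row['val']}")
    # Missing-date check: any expected day with no file on disk.
    for d in sorted(set(expected_dates) - set(seen_dates)):
        problems.append(f"missing date {d.isoformat()}")
    return problems


def is_spike(today, reference, threshold=3.0):
    """Flag today's value if it lies more than `threshold` sample standard
    deviations above or below the mean of the reference window."""
    if len(reference) < 2:
        return False  # not enough history to judge
    mu, sigma = mean(reference), stdev(reference)
    if sigma == 0:
        # Flat history: any change at all counts as a spike.
        return today != mu
    return abs(today - mu) / sigma > threshold
```

A z-score against the reference window is only one way to define a spike; it was picked here because it covers both the upward and downward cases with a single comparison.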

Contributor

@krivard krivard left a comment

I haven't run this yet, but it looks like it's not quite ready to go.

@krivard krivard marked this pull request as draft September 8, 2020 15:21
Contributor

krivard commented Sep 8, 2020

This doesn't seem to run, and is full of temp code -- converting to draft to revise

Contributor

krivard commented Sep 8, 2020

@nmdefries

@krivard krivard force-pushed the fb-package-validation branch from 773d279 to cd8541f Compare September 8, 2020 19:15
@nmdefries nmdefries requested a review from krivard October 10, 2020 16:42
@nmdefries nmdefries marked this pull request as ready for review October 10, 2020 16:52
Contributor

@krivard krivard left a comment

Just based on a read-through, not running or testing

@nmdefries nmdefries self-assigned this Oct 13, 2020
@krivard krivard removed the request for review from jsharpna October 15, 2020 20:55
nmdefries and others added 22 commits November 2, 2020 12:18
…unused vars. Simplify API thread methods. Update plans
Code to validate that all geo_id values are valid, by comparing against a list of known values.
- Clarify format vs. value checks
- move files from csv/ -> static/
Renamed directory; the file (unique_geoids.R) is now expected to be run from within the directory instead of from one level up.
Find Unexpected Values for geo_id compared to historical geo_ids seen
JedGrabman and others added 5 commits November 9, 2020 13:38
Automatically determine signals and data sources to use for retrieving geo_values. This adds robustness at the cost of efficiency.
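
The geo_id commits above distinguish a format check from a value check against known geo_ids. A minimal sketch of that split follows, assuming hypothetical regex patterns in `GEO_FORMATS` and a hypothetical helper `check_geo_ids`; the patterns and names are illustrative and are not taken from this PR, and the known values are passed in as a set rather than read from static/ or the API.

```python
import re

# Hypothetical format rules per geo type; the real patterns may differ.
GEO_FORMATS = {
    "county": re.compile(r"\d{5}"),
    "state": re.compile(r"[a-z]{2}"),
}


def check_geo_ids(geo_type, geo_ids, known_ids):
    """Split problem geo_ids into (bad_format, unexpected).

    bad_format: values that do not match the pattern for geo_type.
    unexpected: well-formed values that were never seen historically.
    """
    pattern = GEO_FORMATS[geo_type]
    unique = set(geo_ids)
    # Format check: the whole string must match the pattern.
    bad_format = sorted(g for g in unique if not pattern.fullmatch(g))
    # Value check: only applied to ids that passed the format check.
    well_formed = unique - set(bad_format)
    unexpected = sorted(g for g in well_formed if g not in known_ids)
    return bad_format, unexpected
```

Running the value check only on well-formed ids keeps a malformed id from being reported twice.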
@krivard krivard merged commit 8a20533 into main Nov 19, 2020
@krivard krivard deleted the fb-package-validation branch February 9, 2021 19:16
Development

Successfully merging this pull request may close these issues.

Port the Facebook validation pipeline to be generic and automatable
6 participants