Missing data in GHT #293

Closed
huisaddison opened this issue Sep 30, 2020 · 2 comments
Labels: data quality (Missing data, weird data, broken data); Engineering (Used to filter issues when synching with Asana)

Comments

@huisaddison
Contributor

Actual behavior

Data is missing for many MSAs and states on July 11, 2020 and on August 18-20, 2020.

Expected behavior

Data is there for those days.

Context

GHT should not have "missing data": the usual behavior of the API is to report 0 when searches are below a minimum threshold. This suggests to me that there was a glitch in our pipeline automation. We should try rerunning the pipeline for those days to see if this fixes the issue.

cc @RoniRos who pointed this out
cc @jingjtang who I believe is the current pipeline maintainer
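
Since GHT is expected to report 0 rather than omit a day, any date with no row for a given MSA or state is a candidate for a rerun. A minimal sketch of that check, assuming the dates present for one geo have already been collected into a list (the function name `missing_dates` and the sample dates are hypothetical, not part of the pipeline):

```python
from datetime import date, timedelta


def missing_dates(present, start, end):
    """Return every date in [start, end] that is absent from `present`."""
    present = set(present)
    n_days = (end - start).days + 1
    candidates = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in candidates if d not in present]


# Hypothetical example: dates for which one geo has rows.
observed = [date(2020, 8, 16), date(2020, 8, 17), date(2020, 8, 21)]
gaps = missing_dates(observed, date(2020, 8, 16), date(2020, 8, 21))
print(gaps)
```

Running this per geo over the full reporting window would give the list of days to pass to a rerun.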

@huisaddison added the "data quality" label (Missing data, weird data, broken data) on Sep 30, 2020
@huisaddison
Contributor Author

@jingjtang Following up on your comment in another issue about the pipeline being automated: is there any chance that the GHT API did not report data for a really long time, and our automation didn't "look far back" enough when it finally updated?

I can look into this tomorrow (Th/F are Delphi days) by querying the GHT API directly for those days' data, but if somebody else beats me to it, they should post first.
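
The "look far back" hypothesis can be illustrated with a small sketch. It assumes an automation that, on each run, fetches only a fixed number of preceding days; the function names, the outage dates, and the lookback length below are all invented for illustration and are not taken from the actual pipeline:

```python
from datetime import date, timedelta


def refreshed_dates(run_day, lookback_days):
    """Dates a run on `run_day` would fetch with a fixed lookback window."""
    return {run_day - timedelta(days=i) for i in range(1, lookback_days + 1)}


def dates_never_fetched(outage_start, outage_end, first_run_after, lookback_days):
    """Outage days that the first run after the outage does not reach back to."""
    n_days = (outage_end - outage_start).days + 1
    outage = {outage_start + timedelta(days=i) for i in range(n_days)}
    return sorted(outage - refreshed_dates(first_run_after, lookback_days))


# Hypothetical scenario: a 7-day outage followed by a run with a 4-day lookback.
lost = dates_never_fetched(
    date(2020, 8, 18), date(2020, 8, 24), date(2020, 8, 25), lookback_days=4
)
print(lost)
```

If the outage is longer than the lookback window, the earliest outage days are never requested again, which would leave a permanent gap unless those days are rerun explicitly.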

@jingjtang
Contributor

jingjtang commented Sep 30, 2020

@huisaddison That is possible, since the automation initially uploaded only the latest report to the API. We later added archiving code to solve the backfill problem. @korlaxxalrok should know more about what happened. I will look into the data for those days too.

@nmdefries removed their assignment on Nov 24, 2020
@SumitDELPHI added the "Engineering" label (Used to filter issues when synching with Asana) on Dec 6, 2020
@krivard closed this as completed on Aug 24, 2021
5 participants