Clarify covidcast documentation about diff-based vs confirmation-based issues #765

Open
krivard opened this issue Nov 10, 2021 · 0 comments

From Logan in: cmu-delphi/covidcast-indicators#1362

I soft-expected an issue-query to return only data with changes to entries (other than to issue itself and lag), for two reasons:

  • ?covidcast::covidcast_signal describes issue-queries as "[f]etch[ing] only data that was published or updated ("issued") on these [issues]", which might leave this impression.
  • storage efficiency-wise, this would make sense; query efficiency-wise, I'm not sure but I'd guess that it'd probably help, especially if there is an index over source,signal,geo_value,time_value,issue.

We should be clear in the documentation that the covidcast data history occasionally includes records that confirm (match) a previous value, rather than being restricted to only additions and updates.

If we feel the need to be expansive, here is the full explanation:

Confirmation-type issues are expected under certain circumstances:

  1. Certain sources always use confirmation-type issues (fb-survey, doctor-visits, hospital-admissions)
  2. All issues for all sources prior to 2020-07-16 were confirmation-type, since that's how we generated the initial set of versions when versioning was launched. Indicators were switched to diff-based issues one at a time afterward (with poor bookkeeping as to the timing of each switch)
  3. With the exception of the bigint patches, all data patches applied prior to November 2021 were confirmation-type, since setting up the machinery for a diff-type patch is fairly complicated and we only recently established software to assist with the process
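
To make the distinction concrete, here is a minimal sketch of how a downstream user might tell confirmation-type records apart from additions and updates in an issue history. It is not the covidcast implementation: the `Record` type and the `classify_issues` and `drop_confirmations` helpers are hypothetical names invented for illustration, the sample rows are made up, and the comparison assumes that two records "match" when their `value` fields are exactly equal.

```python
from typing import List, NamedTuple, Optional, Tuple


class Record(NamedTuple):
    """Hypothetical, simplified row of an issue history."""
    geo_value: str
    time_value: str  # ISO date string, so string order matches date order
    issue: str       # ISO date string of the issue (version)
    value: Optional[float]


def classify_issues(records: List[Record]) -> List[Tuple[Record, str]]:
    """Label each record as 'addition', 'update', or 'confirmation'.

    A record is a confirmation when an earlier issue exists for the same
    (geo_value, time_value) and its value is unchanged (exact equality;
    real data might call for a tolerance instead).
    """
    last_value = {}
    labeled = []
    ordered = sorted(records, key=lambda r: (r.geo_value, r.time_value, r.issue))
    for rec in ordered:
        key = (rec.geo_value, rec.time_value)
        if key not in last_value:
            kind = "addition"
        elif last_value[key] == rec.value:
            kind = "confirmation"
        else:
            kind = "update"
        last_value[key] = rec.value
        labeled.append((rec, kind))
    return labeled


def drop_confirmations(records: List[Record]) -> List[Record]:
    """Keep only the records a diff-based history would contain."""
    return [rec for rec, kind in classify_issues(records) if kind != "confirmation"]


# Made-up example: one observation seen in three successive issues.
history = [
    Record("pa", "2020-08-01", "2020-08-02", 10.0),
    Record("pa", "2020-08-01", "2020-08-03", 10.0),  # same value re-issued
    Record("pa", "2020-08-01", "2020-08-04", 12.5),  # value revised
]
kinds = [kind for _, kind in classify_issues(history)]
```

A user who wants the "only additions and updates" view described above could pass the result of an issue query through something like `drop_confirmations`; for sources that always use confirmation-type issues, this would be expected to remove most rows.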