
Covidcast table refactor for data versioning performance and quantile support #161


Closed
krivard opened this issue Jul 23, 2020 · 4 comments
Labels
Engineering Used to filter issues when synching with Asana



krivard commented Jul 23, 2020

Proposed new table design:

  • Invariant parameters: (id, source, signal, time_type, geo_type, geo_value, time_value, most_recent_fk)
  • Versioned indicator values: (id, invariants_fk, value, std, sample_size, direction, ts1, ts2, issue, lag)
  • Versioned forecast values: (id, invariants_fk, value, std, sample_size, direction, ts1, ts2, forecastdate, lag)
  • Forecast parameters: future work, but probably something like (id, invariants_fk, quantile, forecast_value_fk)
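
A minimal sketch of the first two tables in the list above, written against Python's built-in `sqlite3` so it runs anywhere. Only the column names come from the proposal. The table names (`invariants`, `indicator_values`), the column types, the uniqueness constraint, the helper functions, and the reading of `most_recent_fk` as a pointer to the row with the highest `issue` are all assumptions made for illustration. `ts1` and `ts2` are not described in this thread, so they are left as opaque integers.

```python
import sqlite3

# Hypothetical names and types; only the column names come from the proposal.
SCHEMA = """
CREATE TABLE invariants (
    id INTEGER PRIMARY KEY,
    source TEXT NOT NULL,
    signal TEXT NOT NULL,
    time_type TEXT NOT NULL,
    geo_type TEXT NOT NULL,
    geo_value TEXT NOT NULL,
    time_value INTEGER NOT NULL,
    most_recent_fk INTEGER,  -- assumed: id of the newest indicator_values row
    UNIQUE (source, signal, time_type, geo_type, geo_value, time_value)
);
CREATE TABLE indicator_values (
    id INTEGER PRIMARY KEY,
    invariants_fk INTEGER NOT NULL REFERENCES invariants(id),
    value REAL, std REAL, sample_size REAL, direction INTEGER,
    ts1 INTEGER, ts2 INTEGER,  -- meaning not specified in the thread
    issue INTEGER NOT NULL,
    lag INTEGER NOT NULL
);
"""

# WHERE clause matching one invariant row by its six identifying columns.
KEY_MATCH = ("source=? AND signal=? AND time_type=? AND geo_type=? "
             "AND geo_value=? AND time_value=?")


def add_version(conn, key, value, issue, lag):
    """Store one issue of a value; repoint most_recent_fk if it is the newest."""
    conn.execute(
        "INSERT OR IGNORE INTO invariants "
        "(source, signal, time_type, geo_type, geo_value, time_value) "
        "VALUES (?, ?, ?, ?, ?, ?)", key)
    inv_id, current_fk = conn.execute(
        "SELECT id, most_recent_fk FROM invariants WHERE " + KEY_MATCH,
        key).fetchone()
    version_id = conn.execute(
        "INSERT INTO indicator_values (invariants_fk, value, issue, lag) "
        "VALUES (?, ?, ?, ?)", (inv_id, value, issue, lag)).lastrowid
    # Look up the issue currently pointed at (no row if nothing is stored yet).
    newest = conn.execute(
        "SELECT issue FROM indicator_values WHERE id=?",
        (current_fk,)).fetchone()
    if newest is None or issue >= newest[0]:
        conn.execute("UPDATE invariants SET most_recent_fk=? WHERE id=?",
                     (version_id, inv_id))


def latest(conn, key):
    """Return (value, issue) for the most recent issue via most_recent_fk."""
    return conn.execute(
        "SELECT v.value, v.issue FROM invariants i "
        "JOIN indicator_values v ON v.id = i.most_recent_fk "
        "WHERE " + KEY_MATCH.replace("source", "i.source", 1), key).fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    key = ("example-source", "example-signal", "day", "county", "42003", 20200701)
    add_version(conn, key, 1.0, 20200702, 1)
    add_version(conn, key, 1.5, 20200705, 4)
    print(latest(conn, key))
```

Under these assumptions, a "latest issue" query is a single join through `most_recent_fk` rather than a scan over every version, while older issues stay available in `indicator_values`.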

krivard commented Aug 12, 2020

Migration could be complicated. Options:

  • Load all the data from scratch
  • Shift data from another live table (could need tons of memory)

Currently exploring loading/transferring the data in batches.
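
One way a batched transfer could look, sketched with Python's built-in `sqlite3`. This is not the migration that was actually run: the table names `old_table` and `new_table`, the two-column layout, and the function `copy_in_batches` are hypothetical. The idea is to walk the source table in primary-key order and hold only one batch in memory at a time.

```python
import sqlite3


def copy_in_batches(conn, batch_size=1000):
    """Copy rows from old_table to new_table in id order, one batch at a time.

    Assumes both (hypothetical) tables have an integer primary key `id` > 0.
    """
    last_id = 0
    copied = 0
    while True:
        # Keyset pagination: resume after the last id already copied.
        rows = conn.execute(
            "SELECT id, value FROM old_table WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size)).fetchall()
        if not rows:
            break
        conn.executemany(
            "INSERT INTO new_table (id, value) VALUES (?, ?)", rows)
        conn.commit()  # commit per batch so a failure loses at most one batch
        last_id = rows[-1][0]
        copied += len(rows)
    return copied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE old_table (id INTEGER PRIMARY KEY, value REAL)")
    conn.execute("CREATE TABLE new_table (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany("INSERT INTO old_table (id, value) VALUES (?, ?)",
                     [(i, i * 0.5) for i in range(1, 2501)])
    print(copy_in_batches(conn, batch_size=1000))
```

Resuming from `last_id` instead of using `OFFSET` keeps each batch query cheap as the copy progresses, and memory use is bounded by `batch_size` rather than by the table size.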

If all else fails, we could bring it up on a separate VM and switch DNS over once all the data is loaded. We would need to handle the DNS TTL carefully in advance; coordinate with CMU on that.

Meet to discuss details next week.


krivard commented Sep 9, 2020

Performance issues arise!


krivard commented Oct 7, 2020

Working on removing the direction stuff so we can stop contorting ourselves to work around it.


krivard commented Oct 14, 2020

Mystery test gremlins conquered, PR imminent

@SumitDELPHI SumitDELPHI added the Engineering Used to filter issues when synching with Asana label Dec 6, 2020
@krivard krivard closed this as completed Aug 24, 2021