Skip to content

Keep thresholded counties for future calculations #638

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 14, 2020
Merged

Conversation

mariajahja
Copy link
Member

Description

Fix bug where rows in thresholded counties are dropped in megacounty creation. When dropped, the signal backfill is incorrectly computed on 0-padded dates.

Changelog

  • extract megacounties and concatenate to original data in geo_maps::county_to_megacounty
  • update unit test test_geo_map::test_county_to_megacounty

Unit tests pass. Checks on state/msa/hrr match exactly to current behavior. More counties are produced (due to increased denominator from backwards padding).

@krivard Do we need to regenerate past issues of DV, up till when we switched to the geomapper? The data isn't wrong, but it is different, since it would have been produced only on larger counties, without the correct dynamic backwards padding.

Copy link
Contributor

@chinandrew chinandrew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: :shipit: :shipit:

Would leave a short comment explaining why the concat is being done, otherwise looks good.

@krivard
Copy link
Contributor

krivard commented Dec 14, 2020

Do we need to regenerate past issues of DV

hmm. I'd rather not, but I also don't want people using past issues for training to get a nasty surprise. How obnoxious is it to produce them?

@krivard krivard merged commit 0d10717 into dv-package Dec 14, 2020
@krivard krivard deleted the dv-fix-rows branch December 14, 2020 22:08
@mariajahja
Copy link
Member Author

Would take an afternoon to regenerate the files, but then I'd have to hand off to someone else to correctly insert issues into the database. I'm happy to help, but not sure if this is a priority at the moment.

@krivard
Copy link
Contributor

krivard commented Dec 14, 2020

let's hold off then -- if you finish your december okrs then pick it up, otherwise it can wait either until january, or until someone inquires, whichever comes first.

(loading them into the db is my job 🙂 )

@mariajahja
Copy link
Member Author

Sounds good -- will put it on my list so I don't forget. Just a note that the automation has been updated -- so hopefully tomorrow we see better behavior!

@krivard krivard mentioned this pull request Jan 27, 2021
4 tasks
@mariajahja
Copy link
Member Author

DV from 2020-11-09 to 2020-12-13 will be reuploaded as part of a database fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants