Skip to content

First pass of the CDC Vaccination Indicator #1238

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 136 commits into from

Conversation

Ananya-Joshi
Copy link
Contributor

Description

First pass on the CDC Indicator with tests

@Ananya-Joshi Ananya-Joshi linked an issue Sep 8, 2021 that may be closed by this pull request
9 tasks
@Ananya-Joshi Ananya-Joshi marked this pull request as draft September 8, 2021 12:07
@Ananya-Joshi Ananya-Joshi marked this pull request as ready for review September 9, 2021 15:32
@Ananya-Joshi
Copy link
Contributor Author

Missing HI and TX data by county.

@krivard krivard changed the title First pass of the CDC Indicator First pass of the CDC Vaccination Indicator Sep 9, 2021
Copy link
Contributor

@krivard krivard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent start!

@Ananya-Joshi
Copy link
Contributor Author

The changes requested have been made. We should talk about the two open issues still - the unknown states/counties and the megacounty thread.

Co-authored-by: Katie Mazaitis <[email protected]>
# Obtain new_counts
df.sort_values(["fips", "timestamp"], inplace=True)
for to, from_d in DIFFERENCE_MAPPING.items():
df[to] = df[from_d].diff()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, you might like this version of taking diffs, grouped by geos.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the .isna() in that code should probably be replaced with .loc[min_time_value, :] or something, but the gist is there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can keep this method for now, but then later I'll look to fix the method and use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that's alright. A possible refactor for later.

@Ananya-Joshi
Copy link
Contributor Author

@krivard @dshemetov , do we need to do anything else for this indicator?

@dshemetov
Copy link
Contributor

@Ananya-Joshi just the Nancodes annotation in the thread above and it should be good to go 👍

@Ananya-Joshi
Copy link
Contributor Author

Ananya-Joshi commented Sep 29, 2021

@Ananya-Joshi just the Nancodes annotation in the thread above and it should be good to go 👍

As far as I can see on my local, they're changed! Is this not the case for what you can see?

@dshemetov
Copy link
Contributor

Oh, now we're missing the actual "se" and "sample_size" columns. I will make a commit with what I mean.

Copy link
Contributor

@dshemetov dshemetov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@krivard
Copy link
Contributor

krivard commented Oct 6, 2021

This looks like rebase leavings:
image

Indeed, the gitref for "Remove test for old export func" is supposed to be 5fa6314 and not ca09586. @Ananya-Joshi and @dshemetov can you clarify what was done here? I want to make sure when we merge it we won't wind up with duplicate copies of all of those dual-author commits.

@Ananya-Joshi
Copy link
Contributor Author

Hi Katie, I did do a rebase so that the new create_export_csv function would be used. Is there something I can fix or undo?

@krivard
Copy link
Contributor

krivard commented Oct 7, 2021

was it just

$ git rebase main

or something more complicated?

@Ananya-Joshi
Copy link
Contributor Author

It started out that way, but I also needed to manually fix some parts that were conflicting. Should I roll back?

@krivard
Copy link
Contributor

krivard commented Oct 8, 2021

None of the files added in this branch existed before, so the fact that there were conflicts is highly suspicious. I'm also looking at the merge commit you have there -- b9c6e8a -- mixing merges and rebases is probably not helping.

There are multiple options for repairing the commit history, largely because there are multiple acceptable options for where we want to end up when we're done:

  • (A) a branch starting from 56991f9 that only includes commits for cdc_vaccines or merge commits+whatever comes with each merge (merge don't rebase)
  • (B) a branch starting from wherever main is now (currently 5b741ec) that only includes commits for cdc_vaccines (rebase don't merge)

This PR includes 136 commits, but only ~32 of them were for cdc_vaccines:

  • 89410dc^..f67925b
  • maybe 90ea653?
  • d3544d0^..ea6587d

Ways to get from here to one of the options above include:

  • (A1) rebase -i (no other arguments) and drop all the non-cdc_vaccines commits, then git merge main. There's only one or two ranges to drop, but that would include 104 individual lines in whatever editor you use for interactive rebases.
  • (B1) rebase -i (no other arguments) and drop all the non-cdc_vaccines commits, then git rebase --onto main. If you get conflicts, call for help, for the reasons explained above.
  • (B2) checkout a fresh branch from main, then cherry-pick the 32 cdc_vaccines commits into this new branch (details below). There are only 2 or 3 ranges to cherry-pick, so this probably has the lowest opportunity for error so long as there are no surprises. Once you're done, discard the old branch, then rename the new branch to replace it.
  • (B3) rebase version of B2 (details below). This one is better if you run into trouble, since rebase has way better conflict & recovery handling than cherry-pick. Con: I've never used this method myself; found it on the internet. Once you're done, discard the old branch, then rename the new branch to replace it.

Whatever you choose, you'll then need to git push --force to upload it into the PR.

Cherrypick method (B2) details:

git checkout main
git checkout -b indicator_cdc_vaccines_fixed
git cherry-pick 89410dc^..f67925b
# git cherry-pick 90ea653 #(might not be needed)
git cherry-pick d3544d0^..ea6587d

Rebase method (B3) details:

git branch indicator_cdc_vaccines_fixed indicator_cdc_vaccines
git branch signpost/stop f67925b
git rebase --onto main 89410dc^ signpost/stop
# git cherry-pick 90ea653 #(might not be needed)
git rebase --onto signpost/stop d3544d0^ indicator_cdc_vaccines_fixed

@Ananya-Joshi
Copy link
Contributor Author

Thank you Katie for this super detailed comment- I learned a lot and it seems like 2b will be successful (will push shortly or will create a new PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New indicator: CDC vaccine data
8 participants