Dsew vaccination #1495

Merged
merged 40 commits into from
Feb 24, 2022
Conversation

Ananya-Joshi
Contributor

@Ananya-Joshi Ananya-Joshi commented Jan 26, 2022

Booster indicators at the state + level

Of note:

  • Currently only pulling one file; potential problem with listing.
  • Need the receiving directory!
  • Did not add any new tests (did not see the need).

@Ananya-Joshi Ananya-Joshi marked this pull request as ready for review January 26, 2022 22:22
@Ananya-Joshi Ananya-Joshi marked this pull request as draft January 26, 2022 22:30
@Ananya-Joshi
Contributor Author

Converted to draft because safegraph is stalling.

@Ananya-Joshi Ananya-Joshi marked this pull request as ready for review January 28, 2022 18:05
@krivard
Contributor

krivard commented Jan 28, 2022

would you merge in main please?

@krivard krivard requested a review from nmdefries January 31, 2022 19:24
@@ -359,7 +372,7 @@ def fetch_new_reports(params, logger=None):

# download and parse individual reports
datasets = download_and_parse(listing, logger)

print(datasets, datasets.items())
Contributor

Is this a debugging leftover? Usually we try to use the structured logger to avoid mucking up the Kibana dashboards.

Contributor Author

Ah, I'll make it a logger command (but maybe @nmdefries can weigh in). This is because only a PDF instead of an Excel file was uploaded to the CDC site on Jan 25/26th, which returned None for the dataset items. Is it worth keeping in and investigating?

Contributor

If we want to fail in that case with a useful message, we can add an assert. But it would be reasonable to check if a given listing is empty in the for sig, lst in datasets.items(): loop below. If it is empty, continue to the next listing.
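A minimal sketch of the second option, skipping empty listings inside the loop. Only `datasets.items()` and the `sig, lst` loop variables come from the comment above; the helper name `nonempty_datasets`, the placeholder signal keys, and the use of the standard library `logging` module (as a stand-in for the project's structured logger) are assumptions. It also assumes each listing is a list or None.

```python
import logging

def nonempty_datasets(datasets, logger=None):
    """Yield (sig, lst) pairs, skipping signals whose listing is empty or None."""
    logger = logger or logging.getLogger(__name__)
    for sig, lst in datasets.items():
        if not lst:
            # Log instead of printing, then move on to the next listing.
            logger.warning("No reports parsed for signal %s; skipping", sig)
            continue
        yield sig, lst

# Placeholder keys and values, not real signal names or data.
sample = {"signal_a": [], "signal_b": [("2022-01-24", 1.0)], "signal_c": None}
kept = dict(nonempty_datasets(sample))
```

Here `kept` holds only `signal_b`, since the other two listings are empty or None.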

Contributor Author

Since it hasn't happened again, I'm going to just ignore it for now! Your suggestion makes sense if it ever comes up again.

Contributor

@nmdefries nmdefries left a comment

Since the booster field is under a new section heading ("overheader"), we need to include the vaccinations/boosters overheader, e.g. "COVID-19 VACCINATION DATA: LAST WEEK (January 15-21)", in parsing by adding patterns for it to skip_overheader().
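One way such a pattern could look, as a rough sketch. The actual signature and internals of skip_overheader() are not shown in this thread, so the names `VACCINATION_OVERHEADER` and `parse_vaccination_overheader` are hypothetical, and support for ranges that cross a month boundary is an assumption about how the report formats dates.

```python
import re

# Hypothetical pattern for vaccination overheaders that end in a parenthesized
# date or date range, e.g. "(January 15-21)" or "(January 31)".
VACCINATION_OVERHEADER = re.compile(
    r"^COVID-19 VACCINATION DATA: (?P<period>LAST WEEK|CUMULATIVE) "
    r"\((?P<dates>[A-Za-z]+ \d{1,2}(?:-(?:[A-Za-z]+ )?\d{1,2})?)\)$"
)

def parse_vaccination_overheader(text):
    """Return (period, dates) for a matching overheader, else None."""
    m = VACCINATION_OVERHEADER.match(text.strip())
    return (m.group("period"), m.group("dates")) if m else None

example = parse_vaccination_overheader(
    "COVID-19 VACCINATION DATA: LAST WEEK (January 15-21)"
)
```

Anchoring the pattern at both ends keeps it from matching other section headings that merely contain the same prefix.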

@nmdefries
Contributor

nmdefries commented Feb 2, 2022

I was looking at all the vaccination-related fields CPR provides; including them here for future reference.

We want total & new doses administered, + boosters. It's currently under discussion if total should be only initial doses, or include boosters. It's also unclear to me exactly what is included in CPR's "Doses administered".

Overheaders:
• (1) COVID-19 VACCINATION DATA: LAST WEEK (January 25-31)
• (exclude) COVID-19 VACCINATION DATA: % CHANGE FROM PREVIOUS WEEK
• (2) COVID-19 VACCINATION DATA: CUMULATIVE (January 31)
• (exclude) COVID-19 VACCINATION DATA: DEMOGRAPHIC DATA LAST WEEK
• (exclude) COVID-19 VACCINATION DATA: DEMOGRAPHIC DATA % CHANGE FROM PREVIOUS WEEK
• (exclude) COVID-19 VACCINATION DATA: DEMOGRAPHIC DATA CUMULATIVE

Headers:
• (1) Doses administered - last 7 days
• (1) (maybe) Doses administered per 100k population - last 7 days
• (1) Booster doses administered - last 7 days **** this PR
• (2) Doses administered
• (2) (maybe) Doses administered per 100k population
• (2) People with at least 1 dose
• (2) People who are fully vaccinated
• (2) People who have received a booster dose since August 13, 2021
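The two lists above amount to a lookup from overheader group to the headers wanted inside it. A sketch of that selection follows; the names `WANTED`, `overheader_group`, and `is_wanted` are hypothetical, and the "(maybe)" per-100k headers are left out.

```python
# Overheader group -> headers wanted within that group, as listed above.
WANTED = {
    "LAST WEEK": {
        "Doses administered - last 7 days",
        "Booster doses administered - last 7 days",
    },
    "CUMULATIVE": {
        "Doses administered",
        "People with at least 1 dose",
        "People who are fully vaccinated",
        "People who have received a booster dose since August 13, 2021",
    },
}
PREFIX = "COVID-19 VACCINATION DATA: "
EXCLUDED_MARKERS = ("% CHANGE", "DEMOGRAPHIC DATA")

def overheader_group(overheader):
    """Return "LAST WEEK", "CUMULATIVE", or None for excluded overheaders."""
    if not overheader.startswith(PREFIX):
        return None
    rest = overheader[len(PREFIX):]
    if any(marker in rest for marker in EXCLUDED_MARKERS):
        return None
    for group in WANTED:
        if rest.startswith(group):
            return group
    return None

def is_wanted(overheader, header):
    """True if the header should be kept under the given overheader."""
    group = overheader_group(overheader)
    return group is not None and header in WANTED[group]
```

Keying on the overheader matters because similar header text can appear under an excluded section; for example, "Doses administered" is only kept under the CUMULATIVE overheader.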

From the data dictionary in the CPR spreadsheets:

Doses distributed, people with at least one dose, and people who are fully vaccinated include the Pfizer-BioNTech, Moderna, and J&J/Janssen COVID-19 vaccines... People fully vaccinated includes those who have received two doses of the Pfizer-BioNTech or Moderna vaccine and those who have received one dose of the J&J/Janssen vaccine... The count of people who received a booster dose includes anyone who is fully vaccinated and has received another dose of COVID-19 vaccine since August 13, 2021. This includes people who received booster doses and people who received additional doses.

@Ananya-Joshi
Contributor Author

Hi Nat, I just saw your review, so I'll update the indicator shortly with your edits.

@krivard
Contributor

krivard commented Feb 2, 2022

The count of people who received a booster dose includes anyone who is fully vaccinated and has received another dose of COVID-19 vaccine since August 13, 2021. This includes people who received booster doses and people who received additional doses.

Oh nooooo

This is a different definition than expected. Typically we'd consider an immunocompromised person who has received an additional dose as part of their initial series (ie 3 doses total) to be "fully vaccinated" and not "boosted", but this report seems to be tossing them into the "boosted" category so long as that 3rd dose occurred after their arbitrary August 13 cutoff.

I don't think there's anything we can do about that in our code, we'll just have to make it super clear in the documentation that this is how the CPR is counting it. cc @capnrefsmmat

@capnrefsmmat
Contributor

This is a different definition than expected. Typically we'd consider an immunocompromised person who has received an additional dose as part of their initial series (ie 3 doses total) to be "fully vaccinated" and not "boosted", but this report seems to be tossing them into the "boosted" category so long as that 3rd dose occurred after their arbitrary August 13 cutoff.

I don't think there's anything we can do about that in our code, we'll just have to make it super clear in the documentation that this is how the CPR is counting it. cc @capnrefsmmat

Not surprising that they're conflating the categories like this, but yes, that's definitely a problem we should clearly document. (I'd expect many of our users not to know the additional dose/booster dose distinction either.)

@Ananya-Joshi Ananya-Joshi marked this pull request as draft February 3, 2022 13:49
@Ananya-Joshi Ananya-Joshi marked this pull request as ready for review February 3, 2022 21:40
@Ananya-Joshi Ananya-Joshi marked this pull request as ready for review February 7, 2022 14:25
@nmdefries nmdefries self-requested a review February 7, 2022 16:14
Contributor

@nmdefries nmdefries left a comment

Looks good!

Note: when we're ready to publish these in the API, the new signal keys from SIGNALS will need to be added to the exported_signals setting in the Ansible production params template.
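A small check along these lines could catch signals that were defined but never listed for export. This is only a sketch: the layout of the production params template is not shown in this thread, so the flat `exported` list and the placeholder keys below are assumptions, and `missing_exported_signals` is a hypothetical helper.

```python
def missing_exported_signals(signal_keys, exported_signals):
    """Return signal keys that are defined but not listed for export, sorted."""
    return sorted(set(signal_keys) - set(exported_signals))

# Placeholder stand-ins for the SIGNALS keys and the exported_signals setting.
signals = {"signal_a": {}, "signal_b": {}}
exported = ["signal_a"]
missing = missing_exported_signals(signals, exported)
```

With these placeholders, `missing` contains `signal_b`, the key that still needs to be added to the setting.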

@Ananya-Joshi
Contributor Author

Visual and statistical checks. Comment from Slack:

There are some county-level FIPS codes that end in "000". This is because, until 11/15/2021, the state-level values for people_fully_vaccinated were not NA. I have removed them in the signal_prechecks PDF, but I'm not sure how we want to handle them in the future. We also stop having data for some FIPS codes after December 16th.
I also don't know how to rotate the x-values on the legend for choropleth maps, so the values are hard to read, but the maps look okay in signal_prechecks.pdf.
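For reference, filtering out the codes ending in "000" can be done with a simple suffix check. This is a sketch with made-up rows, not the code used for the prechecks; `is_state_level_fips` is a hypothetical helper, and it assumes codes may arrive as integers that have lost their leading zero.

```python
def is_state_level_fips(fips):
    """True if a county-style FIPS code ends in "000" after zero-padding to 5 digits."""
    code = str(fips).zfill(5)
    return code.endswith("000")

# Made-up (fips, value) rows for illustration only.
rows = [("42000", 1.0), ("42003", 2.0), ("06000", 3.0), ("06037", 4.0)]
county_rows = [(f, v) for f, v in rows if not is_state_level_fips(f)]
```

Only the rows for 42003 and 06037 remain in `county_rows`.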

I'm not sure why less than half of the first choropleth map is filled in at the county level; I assume the rest of the values are NA.

signal_prechecks.pdf
correlations_analysis-1.html.zip

@Ananya-Joshi
Contributor Author

Ananya-Joshi commented Feb 15, 2022

Before 03-09-2021, "People with full course administered" was used instead of "fully vaccinated". Should this be a separate signal, or might there be a better way to combine the two explicitly in the code?
[EDIT: Addressed in Slack]
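The resolution reached in Slack is not recorded in this thread. Purely as an illustration of one possible approach, the older header could be mapped onto the current one before signals are built; `HEADER_ALIASES` and `canonical_header` are hypothetical names.

```python
# Older header text -> the header used in later reports (assumed equivalent here).
HEADER_ALIASES = {
    "People with full course administered": "People who are fully vaccinated",
}

def canonical_header(header):
    """Map a known older header to its current name; leave others unchanged."""
    cleaned = header.strip()
    return HEADER_ALIASES.get(cleaned, cleaned)

old = canonical_header("People with full course administered")
```

Whether the two columns are truly comparable is a data question that this mapping does not settle.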

@nmdefries nmdefries self-requested a review February 17, 2022 14:47
Contributor

@nmdefries nmdefries left a comment

🥳

@krivard
Contributor

krivard commented Feb 18, 2022

Great! Holding off on merging until we have the remaining approvals and API documentation ready to go.

@Ananya-Joshi
Contributor Author

Made a PR for the documentation, I think the visual and stat checks should be done - anything else left?

For the PR, looks like I'm running into an error I don't quite understand (because I only edited the markdown).

@nmdefries
Contributor

Added a note about the failure.

@krivard krivard merged commit 2e5cb61 into main Feb 24, 2022
@krivard krivard deleted the dsew_cp_vaccination branch February 24, 2022 21:00
@nmdefries nmdefries mentioned this pull request Apr 26, 2024