-
Notifications
You must be signed in to change notification settings - Fork 67
allow state-level FIPS codes for geo_type "county" #1134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
this change is intended to remedy the empty state visualization on the covidcast dashboard (ie, as seen on https://delphi.cmu.edu/covidcast/?date=20221115®ion=PA) that was noticed today. it should make it so state-level FIPS codes are also allowed as valid values for `geo_type` of "`county'" (ie, all of Pennsylvania is captured in FIPS code 42000 where individual counties can be found in the range of 42001-42999). it looks like "`hhs`" would work identically in place of "`chng-fips`" too, which kind of implies "hhs" and "county" may be redundant geo_types. this is also largely untested
Kudos, SonarCloud Quality Gate passed!
|
@nmdefries @dshemetov i thought this was a simple change from pointing to this file instead of this file -- that the first column is used as the mapping keys, and the newly referenced one includes the "nn000" state codes that the original one doesnt. this is not the case, as there appears to be something fancier done when ingesting those CSVs than what happens here. can you suggest a good way to accept state codes as valid county codes? |
I wrote a little about it here. I suggest switching to fips_state_table.csv for the fips levels instead. Sidenote: the "chng-fips" file seems to be missing some fips codes found in the fips_pop file, probably because the mapping file didn't contain these geos.
|
That is by design. CHNG wants to censor the counts for those fips, and instead report those counts in small groupings of those fips. Each group lies within a single state, and the membership in each group is intended to stay consistent across time. You can get the group codes if you do the opposite set operation:
lol though that 17000 is the exact bug we're trying to squash! So is the fix to change GeoMapper to use fips_state_table.csv, release covidcast-indicators, then just rebuild delphi-epidata so that it grabs the new version of delphi-utils? |
the difference isnt in those csv files. this grabs the first column from each file, sorts them, and then diffs them. the first diff returns nothing, showing there is nothing in the
so i guess the crosswalking is doing something funky |
@krivard Ah right, the censoring! Makes sense! And yup, adjusting GeoMapper would be my suggestion. |
@melange396 huh, you're right. Very weird. So the flow is:
I see, so |
Why weird? We know the fips -> chng-fips mapping includes xx000 fips values We know chng-fips values replace some fips with xxgyy group codes Isn't that consistent with George's findings? or at least, consistent with the shell output he got. it's just that the differences are in the .csv files, but you have to look at all the columns, not just the first column |
Ah sorry that was my initial reaction to George's line diff not showing the same differences I got with |
are you guys at all worried that changing the implementation of GeoMapper will have other unintended consequences? |
so that `delphi_utils.geomap.GeoMapper().get_geo_values('fips')` also includes (for example) "42000" representing all of Pennsylvania, in addition to FIPS codes in the 42001-42999 interval that represent individual counties. see discussion at cmu-delphi/delphi-epidata#1134
here's everywhere we're using get_geo_values()
|
Also searched the whole org for uses of that function and it's only used again in that server geo-validator code in _params.py, so we should be safe 👍 |
this change is intended to remedy the empty state visualization on the covidcast dashboard (ie, as seen on https://delphi.cmu.edu/covidcast/?date=20221115®ion=PA) that was noticed today.
it should make it so state-level FIPS codes are also allowed as valid values for
geo_type
of "county
" (ie, all of Pennsylvania is captured in FIPS code 42000 where individual counties can be found in the range of 42001-42999). it looks like "hhs
" would work identically in place of "chng-fips
" too, which kind of implies "hhs" and "county" may be redundant geo_types as stored in the epimetric tables.this is also largely untested