-
Notifications
You must be signed in to change notification settings - Fork 67
Add covid_hosp_facility lookup integration tests #1030
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add covid_hosp_facility lookup integration tests #1030
Conversation
* mixed-facility-metadata inputs * facility-metadata updates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great stuff, but check my comment @ database.py:181 to make sure im not imagining things
# ...and string of VALUES placeholders | ||
values_str = ','.join( ['%s'] * len(ak_cols) ) | ||
# use aggregate key table alias | ||
ak_table = self.table_name + '_key' | ||
# assemble full SQL statement | ||
ak_insert_sql = f'INSERT INTO `{ak_table}` ({ak_cols_str}) VALUES ({values_str}) ON DUPLICATE KEY UPDATE {ak_updates_str}' | ||
ak_insert_sql = f'INSERT INTO `{ak_table}` ({ak_cols_str}) VALUES ({values_str}) as v ON DUPLICATE KEY UPDATE {ak_updates_str}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ak_insert_sql = f'INSERT INTO `{ak_table}` ({ak_cols_str}) VALUES ({values_str}) as v ON DUPLICATE KEY UPDATE {ak_updates_str}' | |
ak_insert_sql = f'INSERT INTO `{ak_table}` ({ak_cols_str}) VALUES ({values_str}) AS v ON DUPLICATE KEY UPDATE {ak_updates_str}' |
.sort_values(self.KEY_COLS)[ak_cols] | ||
.drop_duplicates()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice, so any later-dated rows for the same PK will go into the INSERT...ON DUPLICATE after the earlier ones, and thus the newest data will be preserved in the ak_table
! it might be worth leaving a note to that effect? also (im beginning to hate this data set) should we apply the dataframe_typemap before this sort, because '2020-10-30'<'2020/09/07'
but not '20201030'<'20200907'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we apply the dataframe_typemap before this sort
Good call. We've never seen dates without delimiters in the source data, but we have seen both slashes and dashes, and
>>> "2/3/4" > "2-3-5"
True
>>> "2/3/4" > "2/3/5"
False
😩
* Switch to executemany * Add new limited_geocode datatype * Fix test prep missing from #1030
Extends #1016
Prerequisites:
Unless it is a documentation hotfix it should be merged against thedev
branchdev
Summary
Integration tests