-
Notifications
You must be signed in to change notification settings - Fork 5
Update check functions for clarity and add tests #89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I'm not going to block, but it's probably worth examining the related implementation of similar checks in @lcbrooks it may be worth putting these kind of potentially shared functions somewhere common. |
@dajmcdon Or should we use some prepackaged solution like |
I took a glance at Also, thanks Dan, I'll check out similar functions in epipredict. |
Still need to give this a look over. Brief partial skim: is #36 actually addressed? Or is the code sample showing the old behavior? |
Ah good catch, I misunderstood that issue. It's not fixed yet. EDIT: Fixed now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a lot of naming questions and polish comments which could be debated/deferred, but already this is a naming improvement and takes care of a few bugs. I think the current logic and naming doesn't match quite well with R's always-vectors approach, naming conventions (helped a lot by your addition of these "single" checker functions), and check function conventions (I think missingness would be handled separately).
thought: I believe [some] related rlang functions combine handle the single/"scalar" case simply by adding on an n
argument to the vector case. Though checkmate
takes a different approach and tries to use different names and functions for [additional] single/scalar checks. If you feel it'd be more maintainable to reduce the number of functions, you might try the rlang approach.
question: I haven't really given a good check over any actual functionality issues or the tests, but the tests seem pretty extensive so they might already have caught everything. Are you confident this behaves well or should I give it a second pass trying to find bugs?
On naming: perhaps "scalar" rather than "single"? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Logan and Dan! Going to
- make some conversions to
checkmate
- rename
single
->scalar
- add
Date
support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome to see these simplifications! A couple remaining important things:
- Make sure we check for single strings / scalar characters appropriately. (Note: I think I've been unclear here... rlang's "string" is a length-1 non-NA character vector, not just a length-1 character vector. We probably want to be forbidding NAs in just about everything, but not a mandatory thing to address right now.)
- Make sure Date support actually works in requests.
* consolidate check functions, update tests * allow date values in time args, handle output formatting * update format_item, format_list, tests * add basic endpoint tests and request_url tests
Awesome, think it's 99% there. issue: hyphenated date strings don't work. (Or are they supposed to?)
A few things that could be refiled into Issues / other PRs: todo (non-blocking): add yourself as an author/contributor in DESCRIPTION & redoc. And Kean Ming as well if he's not already there.
should have something like
or
to indicate you don't have to do both. |
To follow on Logan's previous comments:
|
I can probably make hyphenated date strings work with epirange, but there's currently a server bug that makes them not work. |
Can they just be processed to numeric on the R side? |
So far I have avoided processing character / integer inputs as dates, hoping that we can rely on the API to return validation errors. But I could give date parsing a shot. |
* add timeset help topic * replace links to epirange fields to timeset * add parse_timeset_input function that will parse/validate date-like inputs * thread parse_timeset_input for all epirange-like endpoint arguments * update tests
Added a date validation function. This query works now r$> modified_example = covidcast(
data_source = "jhu-csse",
signals = "confirmed_7dav_incidence_prop",
time_type = "day",
geo_type = "state",
time_values = list(epirange("2020-06-01", "2020-06-03"), epirange("2020-07-01", "20200703")),
geo_values = "ca,fl"
) %>% fetch_tbl()
r$> modified_example
# A tibble: 12 × 15
geo_value signal source geo_type time_type time_value direction issue lag missing_value missing_stderr missing_sample_size value stderr sample_size
<chr> <chr> <chr> <ord> <ord> <date> <dbl> <date> <int> <int> <int> <int> <dbl> <dbl> <dbl>
1 ca confirmed_7dav_incidence_prop jhu-csse state day 2020-06-01 NA 2023-03-10 1012 0 5 5 6.84 NA NA
2 fl confirmed_7dav_incidence_prop jhu-csse state day 2020-06-01 NA 2023-03-03 1005 0 5 5 3.34 NA NA
3 ca confirmed_7dav_incidence_prop jhu-csse state day 2020-06-02 NA 2023-03-10 1011 0 5 5 6.83 NA NA
4 fl confirmed_7dav_incidence_prop jhu-csse state day 2020-06-02 NA 2023-03-03 1004 0 5 5 3.41 NA NA
5 ca confirmed_7dav_incidence_prop jhu-csse state day 2020-06-03 NA 2023-03-10 1010 0 5 5 6.66 NA NA
6 fl confirmed_7dav_incidence_prop jhu-csse state day 2020-06-03 NA 2023-03-03 1003 0 5 5 4.03 NA NA
7 ca confirmed_7dav_incidence_prop jhu-csse state day 2020-07-01 NA 2023-03-10 982 0 5 5 17.1 NA NA
8 fl confirmed_7dav_incidence_prop jhu-csse state day 2020-07-01 NA 2023-03-03 975 0 5 5 32.9 NA NA
9 ca confirmed_7dav_incidence_prop jhu-csse state day 2020-07-02 NA 2023-03-10 981 0 5 5 18.0 NA NA
10 fl confirmed_7dav_incidence_prop jhu-csse state day 2020-07-02 NA 2023-03-03 974 0 5 5 36.2 NA NA
11 ca confirmed_7dav_incidence_prop jhu-csse state day 2020-07-03 NA 2023-03-10 980 0 5 5 17.6 NA NA
12 fl confirmed_7dav_incidence_prop jhu-csse state day 2020-07-03 NA 2023-03-03 973 0 5 5 36.6 NA NA |
since these invalid inputs might be attempts at epiweek inputs, not just dates
`c(epirange1, epirange2)` currently outputs a named, non-EpiRange vector or 4 that we wouldn't treat appropriately. Make this output an error about having names (could be clearer, but this also might catch some other mistakes / false expectations). Since "scalars" are vectors in R, don't use `is_vector` to try to detect what we might think of as vectors. `format_item` is already mostly set up to handle vectors, so just make it appropriately handle Date vectors and send it all non-list vectors.
Pushed some changes that I believe resolve the remaining comments. Please check over & merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't seem to request @dshemetov to review my tweaks since he's the PR author. Just marking approved.
This PR rewrites all the
check_*
functions withcheckmate
and adds tests.Changes:
check_single_int_param, check_int_param
->check_int_like_param
check_single_string_param, check_string_param
->check_character_param
check_date_param
check_single_epirange_param, check_epirange_param
->check_timeset_param
check_timeset_param
to correctly detect a list of epiranges and to correctly error when given a vector of epirangesFixes:
covidcast
'sas_of
should not "require" an epirange, and does not properly handle epiranges #48issue
parsing influview
endpoint #36covid_hosp_facility
example yields empty string #62 (I think this was fixed already, but tagging this in here so we remember to close the issue)epirange
interface does not supporttime_value=*
#68