-
Notifications
You must be signed in to change notification settings - Fork 8
Consider tidyselect in epi_slide()
#18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thanks @brookslogan. That seems like it could/would be great for usability and readability! If you can get a demo working I think it'd be really nice to see. Note: I think what you're suggesting could replace the formula option in the slide function. It'd be much cleaner. However, in certain scenarios, I could imagine that you'd still want to be able to specify the function directly, i.e., when it's a complicated function that takes the whole input data frame (row-subsetted appropriately), like in some forecasting tasks. |
Adding more thoughts to record them, even if just for my own sake. In a helpful side conversation, @earowang suggested something similar, where we refactor things so that there aren't separate functions for
That is, It seems like we could pull this off similarly to how @earowang implemented One thing I'm uncertain of is whether the aforementioned implementation leverages |
slide_by_geo
epi_slide()
Wondering what @earowang thinks about my last comment and the example I gave there? Is that along the lines of what you were thinking? And do you have an opinion on what the path towards implementation would be---would it be good to follow the example you gave in |
Does Can we group currently available functions into two sections: (1) table functions handling data frames and (2) vector functions? Regarding table functions, you may like to name them more generically, e.g. covid_data %>%
slide_by(cases_pct_change = epi_pct_change(cases, n = 14)) |
I don't think (Regarding naming I don't have a clear opinion right now. Maybe consider potential confusion introduced by mixing By Another detail to think about: what should be the output
Run on daily data with no missingness, the output will not include the most recent 3 Now consider also including |
Thanks both! I like the idea of separating out at least conceptually into table functions and vector functions. Re naming: I'm not sure it makes sense to prefix everything by Re mutate: can you look at the code I linked above (in Earo's Re time values: this is a good point. But I think any desired behavior could be accomplished with some combination of setting the following two arguments:
|
Re time values: I think this is more about preventing users from accidentally doing something they don't want rather than being able to do what they want. For example, anything involving |
Re |
Users may want to directly write
slide_by_geo(newcol = f(oldcol), ........)
rather than separately specifying the function and name. We could use rlang/tidyselect stuff to accomplish this.transmute_scattered_ts
andmutate_scattered_ts
from #7 performs thetime_value
wrangling required to apply functions assuming thatvalue
s correspond to increasing, evenly-spaced time values. One can basically just write a transmute statement and add the additional time window info requires. This might potentially help implement this revision. It would be group-by-geo followed bymutate_scattered_ts
andungroup
. The mutation might be tricky asmutate_scattered_ts
works on vectorized functions whileslide_by_geo
works on non-vectorized functions.The text was updated successfully, but these errors were encountered: