Test growth_rate #197

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed · wants to merge 16 commits
5 changes: 3 additions & 2 deletions R/growth_rate.R
@@ -44,7 +44,8 @@
#' * "rel_change": uses (B/A - 1) / h, where B is the average of `y` over the
#' second half of a sliding window of bandwidth h centered at the reference
#' point `x0`, and A the average over the first half. This can be seen as
-#' using a first-difference approximation to the derivative.
+#' using a first-difference approximation to the derivative. This is the
+#' default if `method` is not specified.
#' * "linear_reg": uses the slope from a linear regression of `y` on `x` over a
#' sliding window centered at the reference point `x0`, divided by the fitted
#' value from this linear regression at `x0`.
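
For context, a minimal sketch of how the default method behaves on toy data. This is illustrative only: it assumes the package under review is epiprocess (which provides growth_rate()), and the expected values mirror the simple exponential-growth case used in the new tests below.

# With y = 2^x the proportional change per unit time is constant, so
# "rel_change" (the default) with h = 1 should return 1 at every point
# except the right boundary, where it is NaN.
library(epiprocess)  # assumed package name

x <- 1:20
y <- 2^x

growth_rate(x = x, y = y, h = 1)                         # method defaults to "rel_change"
growth_rate(x = x, y = y, h = 1, method = "linear_reg")  # regression-slope alternative
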
@@ -145,7 +146,7 @@ growth_rate = function(x = seq_along(y), y, x0 = x,

# Remove NAs if we need to
if (na_rm) {
-o = !(is.na(x) & is.na(y))
+o = !is.na(x) & !is.na(y)
Contributor:

question: Is this a bug fix? The new version of this line doesn't mean the same thing as the old version.

suggestion: I'm assuming the new version is desired behavior (keeping only pairs of obs in x and y when both values are available), but the comment "# Remove NAs if we need to" is very ambiguous. Recommend making the goal more explicit.

x = x[o]
y = y[o]
}
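
To make the question above concrete, a small standalone illustration (not part of the diff) of how the two conditions differ:

x <- c(1, NA, 3, 4)
y <- c(10, 20, NA, 40)

# Old condition: drops a pair only when BOTH entries are NA, so the
# half-missing pairs (NA, 20) and (3, NA) are kept.
o_old <- !(is.na(x) & is.na(y))   # TRUE TRUE TRUE TRUE

# New condition: keeps a pair only when BOTH entries are observed, so the
# half-missing pairs are dropped.
o_new <- !is.na(x) & !is.na(y)    # TRUE FALSE FALSE TRUE

x[o_old]  # 1 NA  3  4
x[o_new]  # 1  4
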
The following generated documentation files also changed (their diffs are not rendered by default on GitHub):

12 changes: 8 additions & 4 deletions man/as_epi_archive.Rd
10 changes: 5 additions & 5 deletions man/as_epi_df.Rd
42 changes: 21 additions & 21 deletions man/epi_archive.Rd
12 changes: 8 additions & 4 deletions man/epi_slide.Rd
12 changes: 8 additions & 4 deletions man/epix_as_of.Rd
12 changes: 8 additions & 4 deletions man/epix_slide.Rd
3 changes: 2 additions & 1 deletion man/growth_rate.Rd

56 changes: 56 additions & 0 deletions tests/testthat/test-growth_rate.R
@@ -0,0 +1,56 @@
library(dplyr)

X <- c(1:10,NA,10:20,NA)
Y <- c(2^(1:9),NA,NA,2^(10:21))

methods <- c("rel_change","linear_reg","smooth_spline","trend_filter")

gr <- function(method = "rel_change", h = 3, na_rm = TRUE, ...) {
  growth_rate(x=X,y=Y,method=method,na_rm = na_rm, h = h,...)
}

test_that("Test error throwing",{
  # Error cases
  expect_error(growth_rate(x=1:3,y=1:4),
               "`x` and `y` must have the same length.")
Contributor:

suggestion [non-blocking]: Probably beyond scope here, but it would be nice if the errors and warnings in growth_rate had classes so we could check for that rather than message wording, which is liable to change.

  expect_error(growth_rate(x=1:20,y=1:20,x0=21),
               "`x0` must be a subset of `x`.")
  # Fails only when method = `"trend_filter"`
  expect_error(gr(method = "trend_filter",cv=FALSE,df=1.5),
               "If `cv = FALSE`, then `df` must be an integer.")
})

test_that("Test throwing of warning of duplicates",{
# The warning that is prompted is as follows:
# "`x` contains duplicate values. (If being run on a column in an `epi_df`,
# did you group by relevant key variables?)"
# Note that putting it in the regexp of expect_warning doesn't seem to work
jhu_csse_daily_subset %>%
mutate(cases_gr = growth_rate(x = time_value, y = cases, dup_rm=TRUE)) %>%
expect_warning() %>%
expect_error()
Comment on lines +30 to +31
Contributor:

issue: It's hard to tell what this is doing. If we want to check for both a warning and an error, run this twice with each expect_*. If we're trying to do something else, I don't understand.

})
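
One possible rewrite along the lines of the suggestion above, shown only as a sketch; which expectation to keep depends on what the duplicate-handling behaviour is actually meant to be.

# Hypothetical restructuring: build the pipeline once, then state each
# expectation separately instead of piping into both expect_* calls.
test_that("growth_rate flags duplicate x values", {
  with_dups <- function() {
    jhu_csse_daily_subset %>%
      mutate(cases_gr = growth_rate(x = time_value, y = cases, dup_rm = TRUE))
  }
  # Keep whichever of these matches the intended behaviour; if the call is
  # supposed to both warn and then error, expect_error(expect_warning(...))
  # expresses that in a single expectation.
  expect_warning(with_dups())
  expect_error(with_dups())
})
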

test_that("Simple example of growth rate that produces desired results",{
expect_equal(growth_rate(x=1:20,y=2^(1:20),h=1), c(rep(1,19),NaN))
})

test_that("log_scale works",{
expect_equal(growth_rate(x=1:20,y=exp(1:20),h=5,
method="linear_reg",log_scale = TRUE),
rep(1,20))
})

test_that("Running different methods with NA removal won't fail",{
for (m in methods) expect_false(NA %in% gr(method = m,x0=1:5))
Contributor:

question: x0 doesn't contain any NAs, so are we really testing NA-removal? Do we want

Suggested change:
-  for (m in methods) expect_false(NA %in% gr(method = m,x0=1:5))
+  for (m in methods) expect_false(NA %in% gr(method = m,x0=c(1:5, NA)))

?

})

test_that("na_rm works and is necessary when there are NA's",{
expect_false(NA %in% gr())
expect_equal(length(gr()),20)
expect_equal(gr(na_rm = FALSE),
# 1+NA gives an NA classified as a numeric
rep(1+NA,23))
expect_equal(gr(h=1), c(rep(1,19),NaN))
expect_error(gr(method = "smooth_spline"))
})
@brookslogan (Contributor), Aug 11, 2022:

Could you please add another test for the actual values coming out? E.g., construct a simple example where the growth rate should be obvious, and test that we calculate that obvious value. At least for some of the methods.

Also, please test that each of the methods, log_scale settings, and na_rm settings at least run successfully.
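
A possible shape for that coverage, sketched here only as a suggestion (the inputs and the bare length check are illustrative):

# Hypothetical test: loop over every method / log_scale combination, on a
# clean series and on the NA-containing X/Y defined at the top of the file,
# and only assert that each call runs and returns a vector of the expected length.
test_that("all method, log_scale, and na_rm combinations run", {
  for (m in c("rel_change", "linear_reg", "smooth_spline", "trend_filter")) {
    for (ls in c(TRUE, FALSE)) {
      expect_length(growth_rate(x = 1:20, y = 2^(1:20),
                                method = m, log_scale = ls), 20L)
      expect_length(growth_rate(x = X, y = Y, method = m,
                                log_scale = ls, na_rm = TRUE), 20L)
    }
  }
})
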

@brookslogan (Contributor), Aug 11, 2022:

If you can't construct an example where you think you know what the output should be, maybe throw in some testthat::expect_snapshots? I assume we'll need to take the expect_snapshot approach for some/most/all of the methods; I'm not sure, e.g., that there's a simple example where we would think the trendfilter output is obvious.

Contributor (Author):

expect_snapshot creates a new file for snapshots and does something that doesn't seem inherently helpful. All it does is check for errors, in my experience; however, that doesn't seem to be any better than doing expect_error.

Contributor:

@kenmawer I think the point here is to test that the correct calculations are occurring when arguments are correctly specified, not just that errors happen when the inputs are wrong.

Contributor (Author):

I already added a few tests for calculations. I would suspect that for some of the other methods, it would be too complicated to be able to verify the outputted numbers. It also depends on what you mean by "correct calculations," whether it's merely getting the right number of cells or getting those exact numbers.

@brookslogan (Contributor), Aug 17, 2022:

It's likely too complicated to verify the output numbers [for the complex settings like using trend filtering], which is why I suggested expect_snapshot. When you first run it, it saves the current output to a _snaps directory, which you would then commit (make sure the test is constructed so that this file is very small; we don't want to bloat the repo). Later, when we run the snapshot test again, it will detect the file in the _snaps directory and compare against it. If we inadvertently change the output behavior, this test will catch it.
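
For concreteness, a snapshot test along those lines could look roughly like this (the inputs are illustrative; trend filtering is just the method called out above as hard to verify by hand):

# Hypothetical snapshot test: the first run writes the printed output to
# tests/testthat/_snaps/growth_rate.md; later runs compare against that file
# and fail if the numbers change.
test_that("trend_filter growth rates are stable", {
  expect_snapshot(
    growth_rate(x = 1:20, y = 2^(1:20), method = "trend_filter")
  )
})
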

Contributor:

Looks like, of the requested

"test that each of the methods, log_scale settings, and na_rm settings at least run successfully",

only the method outputs are left -- is that correct?