Skip to content

refactor: clean up error messages #413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 7, 2024
Merged

refactor: clean up error messages #413

merged 2 commits into from
Feb 7, 2024

Conversation

dshemetov
Copy link
Contributor

@dshemetov dshemetov commented Feb 1, 2024

Checklist

Please:

  • Make sure this PR is against "dev", not "main".
  • Request a review from one of the current epiprocess main reviewers:
    brookslogan, nmdefries.
  • Makes sure to bump the version number in DESCRIPTION and NEWS.md.
    Always increment the patch version number (the third number), unless you are
    making a release PR from dev to main, in which case increment the minor
    version number (the second number).
  • Describe changes made in NEWS.md, making sure breaking changes
    (backwards-incompatible changes to the documented interface) are noted.
    Collect the changes under the next release number (e.g. if you are on
    0.7.2, then write your changes under the 0.8 heading).

Change explanations for reviewer

  • Removed Abort and Warn, replaced with direct calls to cli functions
  • Replaced many custom arg validation statements with checkmate equivalents
  • Updated test regexp expectations

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

@dshemetov dshemetov requested a review from dajmcdon February 1, 2024 01:45
@dshemetov dshemetov force-pushed the ds/error-msg branch 2 times, most recently from cc30c3a to de91c22 Compare February 1, 2024 02:01
R/archive.R Outdated
}
assert_data_frame(x)
assert_subset(c("geo_value", "time_value", "version"), names(x))
assert(!anyNaN(x$version))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this last one is wrong (should be anyNA()).

Generally, I'm not sure I like the messages produced by this package. There's of course a balance between offloading this stuff to well-tested functions and getting everything I want, but these seem too low-tech. For example:

library(checkmate)
x <- 1:10
names(x) <- letters[1:10]
assert_subset(c("geo_value", "time_value", "version"), names(x))
#> Error: Assertion on 'c("geo_value", "time_value", "version")' failed: Must be a subset of {'a','b','c','d','e','f','g','h','i','j'}, but has additional elements {'geo_value','time_value','version'}.

assert_data_frame(x)
#> Error: Assertion on 'x' failed: Must be of type 'data.frame', not 'integer'.
version <- c(1:5, NA)
assert(!anyNaN(version))
assert(!anyNA(version))
#> Error: Assertion on 'anyNA(version)' failed: FALSE.

The messages aren't wrapped or formatted as they were before. And the messages aren't nearly as clear. It's worth examining https://style.tidyverse.org/error-messages.html

Copy link
Contributor Author

@dshemetov dshemetov Feb 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be anyNA()

Ah yea, in the idioms of this packages, it's anyMissing(). I can change that.

these seem too low-tech
the messages aren't nearly as clear

Seems like a matter of preference, so it's hard for me to know what you don't like about these. Custom error messages can always be done like this

if (!test_subset(c("geo_value", "time_value", "version"), names(x))) cli_abort("custom message")

Like you said, the off-loading to well tested functions is exactly what I like here. It's wild to me how much of our unit testing is the same "did this input type error correctly throw?" checks over and over. To me that's just downstream of how squirmy type checking can get in R. This package gives me type check tests I can trust, which frees up energy for testing the higher level logic of the package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better then to refactor the arg_is_*() type functions as you did in epipredict to use {checkmate}. That way you mostly get better messaging.

In this particular case, to better mimic what the original messages were saying (though with better formatting), I'd like to see something like:

hardhat::validate_column_names(data.frame(geo_value = 1:5, time_value = 1:5), c("geo_value", "time_value", "version"))
#> Error in `hardhat::validate_column_names()`:
#> ! The following required columns are missing: 'version'.

Created on 2024-02-01 with reprex v2.0.2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated archive.R and other places where I do column checking with assert_subset to something like:

        if (!test_subset(c("geo_value", "time_value", "version"), names(x))) cli_abort(
          "Columns `geo_value`, `time_value`, and `version` must be present in `x`."
        )

Sound good?

Are there other messages that you'd like changed? I'd prefer not to prematurely make wrappers (DRY three times).

class = "epiprocess__version_values_must_not_be_na"
)
}
assert_data_frame(x)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message from this function is similar:

r$> x = 5

r$> checkmate::assert_data_frame(x)
Error: Assertion on 'x' failed: Must be of type 'data.frame', not 'double'.

}
assert_logical(compactify, len = 1, null.ok = TRUE)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message from this function is similar:

r$> compactify = 5

r$> checkmate::assert_logical(compactify, len 
    = 1, null.ok = TRUE)
Error: Assertion on 'compactify' failed: Must be of type 'logical' (or 'NULL'), not 'double'.

}
assert_set_equal(class(max_version), class(self$DT$version))
assert_set_equal(typeof(max_version), typeof(self$DT$version))
assert_scalar(max_version, na.ok = FALSE)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The messaging here is similar:

r$> max_version = c(5, 5)

r$> checkmate::assert_scalar(max_version, na.o
    k = FALSE)
Error: Assertion on 'max_version' failed: Must have length 1.

}
assert_logical(all_versions, len = 1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> all_versions = c(1, 2)

r$> assert_logical(all_versions, len = 1)
Error: Assertion on 'all_versions' failed: Must be of type 'logical', not 'double'.

}
assert_set_equal(class(max_version), class(self$DT$version))
assert_set_equal(typeof(max_version), typeof(self$DT$version))
assert_scalar(max_version, na.ok = FALSE)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The messaging here is similar:

r$> max_version = c(5, 5)

r$> checkmate::assert_scalar(max_version, na.o
    k = FALSE)
Error: Assertion on 'max_version' failed: Must have length 1.

@@ -78,12 +78,11 @@
epi_cor <- function(x, var1, var2, dt1 = 0, dt2 = 0, shift_by = geo_value,
cor_by = geo_value, use = "na.or.complete",
method = c("pearson", "kendall", "spearman")) {
# Check we have an `epi_df` object
if (!inherits(x, "epi_df")) Abort("`x` must be of class `epi_df`.")
assert_class(x, "epi_df")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> assert_class(x, "epi_df")
Error: Assertion on 'x' failed: Must inherit from class 'epi_df', but has class 'numeric'.

if (!is.data.frame(x)) {
Abort("`x` must be a data frame.")
}
assert_data_frame(x)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$>   assert_data_frame(x)
Error: Assertion on 'x' failed: Must be of type 'data.frame', not 'double'.

Abort("`x` must be a data frame.")
}
assert_data_frame(x)
assert_list(additional_metadata)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> assert_list(additional_metadata)
Error: Assertion on 'additional_metadata' failed: Must be of type 'list', not 'double'.

epiprocess__DT_names = names(ungrouped$DT)
)
}
assert_class(ungrouped, "epi_archive")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> ungrouped = 5

r$> assert_class(ungrouped, "epi_archive")
Error: Assertion on 'ungrouped' failed: Must inherit from class 'epi_archive', but has class 'numeric'.

)
}
assert_class(ungrouped, "epi_archive")
assert_character(vars)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> vars = 5

r$> assert_character(vars)
Error: Assertion on 'vars' failed: Must be of type 'character', not 'double'.

}
assert_logical(drop)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> assert_logical(5, len = 1)
Error: Assertion on '5' failed: Must be of type 'logical', not 'double'.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not trying to be an annoying blocker here. So feel free to ignore me. Here's the difference

Previously:
Screenshot 2024-02-06 at 17 13 17

Now:
Screenshot 2024-02-06 at 17 11 51

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you feel like the previous one was clearer? Is it the rlang::last_trace() message that you wish was there?

} else {
assert_numeric(ref_time_values, upper = private$ungrouped$versions_end, min.len = 1L, null.ok = FALSE, any.missing = FALSE)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> ref_time_values = Date()

r$> assert_numeric(ref_time_values, 
    upper = as.Date("2020-03-05"), min.len = 1
    L, null.ok = FALSE, any.missing = FALSE)
Error: Assertion on 'ref_time_values' failed: Must have length >= 1, but has length 0.

r$> ref_time_values = c(as.Date("2020-03-06"),
     NA)

r$> assert_numeric(ref_time_values, 
    min.len = 1L, null.ok = FALSE, any.missing
     = FALSE)
Error: Assertion on 'ref_time_values' failed: Contains missing values (element 2).

if (length(before) != 1L || is.na(before) || before < 0L) {
Abort("`before` must be length-1, non-NA, non-negative.")
}
assert_int(before, lower = 0L, null.ok = FALSE, na.ok = FALSE)
Copy link
Contributor Author

@dshemetov dshemetov Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> before = c(1, 2)

r$> assert_int(before, lower = 0L, null.ok = F
    ALSE, na.ok = FALSE)
Error: Assertion on 'before' failed: Must have length 1.

r$> before = NA

r$> assert_int(before, lower = 0L, null.ok = F
    ALSE, na.ok = FALSE)
Error: Assertion on 'before' failed: May not be NA.

r$> before = -5

r$> assert_int(before, lower = 0L, null.ok = F
    ALSE, na.ok = FALSE)
Error: Assertion on 'before' failed: Element 1 is not >= 0.

}
assert_logical(as_list_col, len = 1L)
assert_logical(all_versions, len = 1L)
assert_character(names_sep, len = 1L, null.ok = TRUE)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See examples for these above (messaging is similar).

if (!(is.atomic(comp_value) || is.data.frame(comp_value))) {
Abort("The slide computation must return an atomic vector or a data frame.")
}
assert(check_atomic(comp_value, any.missing = TRUE), check_data_frame(comp_value), combine = "or", .var.name = vname(comp_value))
Copy link
Contributor Author

@dshemetov dshemetov Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r$> comp_value = list(c(2))

r$>           assert(check_atomic(comp_value, 
    any.missing = TRUE), check_data_frame(comp
    _value), combine = "or", .var.name = vname
    (comp_value))
Error: Assertion on 'comp_value' failed: One of the following must apply:
 * check_atomic(comp_value): Must be of
 * type 'atomic', not 'list'
 * check_data_frame(comp_value): Must be
 * of type 'data.frame', not 'list'.

@@ -77,7 +77,7 @@
#'
#' @export
epix_as_of <- function(x, max_version, min_time_value = -Inf, all_versions = FALSE) {
if (!inherits(x, "epi_archive")) Abort("`x` must be of class `epi_archive`.")
assert_class(x, "epi_archive")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all these class messages, the default messaging is similar, see above.

)
if (anyDuplicated(ref_time_values) != 0L) {
cli_abort("`ref_time_values` must not contain any duplicates; use `unique` if appropriate.")
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See similar section in grouped_epi_archive above.

if (length(before) != 1L || is.na(before) || before < 0L) {
Abort("`before` must be length-1, non-NA, non-negative")
}
assert_int(before, lower = 0, null.ok = FALSE, na.ok = FALSE)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default messages are similar, see above.

@dshemetov dshemetov requested a review from dajmcdon February 7, 2024 01:19
@dshemetov
Copy link
Contributor Author

@dajmcdon I've updated this PR with better messaging in certain cases. I also went through all the instances where I changed to using a checkmate assert and added an example of its error message. I think the remaining ones are all very similar to what we had before. Let me know what you think!

Copy link
Contributor

@dajmcdon dajmcdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added one comment here: #413 (comment).

If you're fine with it then so am I.

@dshemetov
Copy link
Contributor Author

Thanks! I'm happy to polish a little more. Just let me know what you'd prefer to see that in that case.

@dshemetov dshemetov force-pushed the ds/error-msg branch 3 times, most recently from 2d91421 to ea7872b Compare February 7, 2024 03:32
* swap rlang for cli in Abort and Warn
* remove Abort and Warn plugins
* format long error message lines
* replace and remove break_str
* simplify validate_version_bound with checkmate
* use checkmate for validation where possible
* update tests
@dshemetov
Copy link
Contributor Author

Going to merge for now, but happy to polish messages in another PR.

@dshemetov dshemetov merged commit aecb7e5 into dev Feb 7, 2024
@dshemetov dshemetov deleted the ds/error-msg branch February 7, 2024 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clean up error messages and assertions
2 participants