Commit 1b001cd

Move issue from column to metadata
- As per the discussion on #8, from now on, an `epi_signal` object represents a single snapshot of a data set
- Modify `as.epi_signal()` to reflect this change
- Update documentation and vignettes accordingly
1 parent 8547c5d commit 1b001cd

21 files changed: 178 additions and 193 deletions

R/change.R

+1-2
@@ -1,8 +1,7 @@
 #' Compute percentage change of values in `epi_signal` data frame
 #'
 #' Computes the percentage change of the values in an `epi_signal` data frame,
-#' per geographic location. (When multiple issue dates are present, only the
-#' latest issue is considered.) See the [percentage change
+#' per geographic location. See the [percentage change
 #' vignette](https://cmu-delphi.github.io/epitools/articles/pct-change.html) for
 #' examples.
 #'

R/cor.R

+1-7
@@ -1,9 +1,7 @@
 #' Compute correlations between two `epi_signal` data frames
 #'
 #' Computes correlations between two `epi_signal` data frames, allowing for
-#' slicing by geo location, or by time. (When multiple issue dates are present,
-#' only the latest issue from each data frame is used for correlations.) See the
-#' [correlations
+#' slicing by geo location, or by time. See the [correlations
 #' vignette](https://cmu-delphi.github.io/epitools/articles/correlations.html)
 #' for examples.
 #'
@@ -41,10 +39,6 @@ sliced_cor = function(x, y, dt_x = 0, dt_y = 0,
     abort("`y` must be of class `epi_signal`.")
   }

-  # Get the latest issue per value
-  x = latest_issue(x)
-  y = latest_issue(y)
-
   # Which way to slice? Which method?
   by = match.arg(by)
   method = match.arg(method)

R/deriv.R

+1-2
@@ -1,8 +1,7 @@
 #' Estimate derivatives of values in an `epi_signal` data frame
 #'
 #' Estimates derivatives of the values in an `epi_signal` data frame, using a
-#' local (in time) linear regression or a smoothing spline. (When multiple issue
-#' dates are present, only the latest issue is considered.) See the [estimating
+#' local (in time) linear regression or a smoothing spline. See the [estimating
 #' derivatives
 #' vignette](https://cmu-delphi.github.io/epitools/articles/derivatives.html)
 #' for examples.

R/epi_signal.R

+23-31
@@ -6,16 +6,18 @@
 #'
 #' @details An `epi_signal` object is simply a tibble, with (at least) the
 #' following columns (with data types written in tibble notation):
-#' * `value` <dbl>: the value of the signal
+#' * `value` <dbl> or <list>: the value of the signal
 #' * `geo_value` <int> or <str>: the associated geographic value
 #' * `time_value` <date>: the associated time value
-#' * `issue` <date>: the time value at which the given signal value was issued
 #'
 #' An `epi_signal` object also has a tibble `metadata` stored in its attributes,
 #' with (at least) the following columns:
 #' * `name` <str>: the name of the signal
 #' * `geo_type` <str>: the geographic resolution
 #' * `time_type` <str>: the temporal resolution
+#' * `issue` <date>: the time value at which the given data set was issued (this
+#' represents the maximum of issue dates of individual signal values in the
+#' data set)
 #' * `signal_type` <str>: the type of the signal value (optional)
 #' * `signal_unit` <str>: the units associated with the signal value (optional)
 #'
@@ -51,13 +53,12 @@
 #' @param geo_type The geographic resolution. If missing, then it will be
 #' guessed from the geo values present.
 #' @param time_type The temporal resolution. If missing, then it will be guessed
-#' from the time values present.
+#' from the time values present.
+#' @param issue Issue date to use for this data. If missing, then today's date
+#' will be used.
 #' @param signal_type The type of the signal value.
 #' @param signal_unit The units of the signal value.
-#' @param issue Issue date to use for this data, if not present in `x`. If no
-#' issue date is present in `x` and `issue` is missing, then today's date will
-#' be used.
-#' @param metadata List or tibble of additional metadata to attach to the
+#' @param metadata List or tibble of *additional* metadata to attach to the
 #' `epi_signal` object. All objects will have `geo_type`, `time_type`,
 #' `signal_type` (optional), and `signal_unit` (optional) entries included in
 #' their metadata, derived from the above arguments; any entries in the passed
@@ -78,13 +79,13 @@ as.epi_signal.epi_signal = function(x, ...) {

 #' @method as.epi_signal tibble
 #' @describeIn as.epi_signal The input tibble `x` must contain the columns
-#' `value`, `geo_value`, and `time_value`. If an `issue` column is present in
-#' `x`, it will be used as the issue date for each observation; if not, the
-#' `issue` argument will be used. Other columns will be preserved as-is.
+#' `value`, `geo_value`, and `time_value`. All other columns will be preserved
+#' as-is.
 #' @importFrom rlang .data abort
 #' @export
-as.epi_signal.tibble = function(x, name, geo_type, time_type, signal_type,
-                                signal_unit, issue, metadata = list(), ...) {
+as.epi_signal.tibble = function(x, name, geo_type, time_type, issue,
+                                signal_type, signal_unit, metadata = list(),
+                                ...) {
   if (!("value" %in% names(x))) {
     abort(paste(
       "`x` must contain a `value` column",
@@ -111,7 +112,11 @@ as.epi_signal.tibble = function(x, name, geo_type, time_type, signal_type,
       "`name` must be specified.",
       class = "epi_coerce_name")
   }
-
+
+  # If issue is missing, then use today's date
+  if (missing(issue)) issue = Sys.Date()
+
+  # If geo type is missing, then try to guess it
   if (missing(geo_type)) {
     if (is.character(x$geo_value)) {
       # Convert geo values to lowercase
@@ -143,6 +148,7 @@ as.epi_signal.tibble = function(x, name, geo_type, time_type, signal_type,
     else geo_type = "unknown" # TODO should we use NA? Or some other flag?
   }

+  # If time type is missing, then try to guess it
   if (missing(time_type)) {
     # Convert time values to Date format
     x$time_value = as.Date(x$time_value)
@@ -156,6 +162,7 @@ as.epi_signal.tibble = function(x, name, geo_type, time_type, signal_type,

   # Define metadata fields
   metadata$name = name
+  metadata$issue = issue
   metadata$geo_type = geo_type
   metadata$time_type = time_type
   if (!missing(signal_type)) metadata$signal_type = signal_type
@@ -167,34 +174,19 @@ as.epi_signal.tibble = function(x, name, geo_type, time_type, signal_type,
   class(x) = c("epi_signal", class(x))
   attributes(x)$metadata = metadata

-  # Reorder columns: value, geo_value, time_value,
+  # Reorder columns: value, geo_value, time_value
   x = dplyr::relocate(x,
                       .data$value,
                       .data$geo_value,
                       .data$time_value)

-  # If no rows, then quit
-  if (nrow(x) == 0) return(x)
-
-  # Add issue column if we need to
-  if (!("issue" %in% names(x))) {
-    if (missing(issue)) x$issue = Sys.Date()
-    else x$issue = issue
-  }
-
-  # Reorder columns: issue after time_value
-  x = dplyr::relocate(x,
-                      .data$issue,
-                      .after = .data$time_value)
-
   return(x)
 }

 #' @method as.epi_signal data.frame
 #' @describeIn as.epi_signal The input data frame `x` must contain the columns
-#' `value`, `geo_value`, and `time_value`. If an `issue` column is present in
-#' `x`, it will be used as the issue date for each observation; if not, the
-#' `issue` argument will be used. Other columns will be preserved as-is.
+#' `value`, `geo_value`, and `time_value`. All other columns will be preserved
+#' as-is.
 #' @export
 as.epi_signal.data.frame = as.epi_signal.tibble

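To illustrate the effect of the `R/epi_signal.R` change, here is a minimal base-R sketch of the new layout, in which the issue date lives in the `metadata` attribute instead of in a per-row column. The helper `sketch_epi_signal()` is hypothetical and written only for this illustration; it is not the package's `as.epi_signal()`, it skips the `geo_type` and `time_type` guessing, and it uses a plain list and data frame where the package uses tibbles.

```r
# Hypothetical stand-in for as.epi_signal(); NOT the package implementation.
sketch_epi_signal = function(x, name, issue, metadata = list()) {
  required = c("value", "geo_value", "time_value")
  missing_cols = setdiff(required, names(x))
  if (length(missing_cols) > 0) {
    stop("`x` must contain columns: ", paste(missing_cols, collapse = ", "))
  }

  # If issue is missing, fall back to today's date (as the updated docs say)
  if (missing(issue)) issue = Sys.Date()

  # The issue date is a property of the whole snapshot, so it goes in metadata
  metadata$name = name
  metadata$issue = as.Date(issue)

  # Put the three required columns first; all other columns are kept as-is
  x = x[, c(required, setdiff(names(x), required)), drop = FALSE]
  attr(x, "metadata") = metadata
  class(x) = c("epi_signal", class(x))
  x
}

df = data.frame(
  geo_value = c("pa", "pa"),
  time_value = as.Date(c("2020-06-01", "2020-06-02")),
  value = c(1.5, 2.5)
)
sig = sketch_epi_signal(df, name = "example", issue = "2020-06-10")

names(sig)                   # the data columns; no `issue` column among them
attr(sig, "metadata")$issue  # the single issue date for this snapshot
```

Because the issue date is no longer a column, every row in the object implicitly shares one issue date, which is what lets the other functions in this commit drop their `latest_issue()` calls.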
R/slide.R

+1-5
@@ -2,8 +2,7 @@
 #' location
 #'
 #' Slides a given function over the values in an `epi_signal` data frame,
-#' grouped by geo location. (When multiple issue dates are present, only the
-#' latest issue is considered.) See the [slide
+#' grouped by geo location. See the [slide
 #' vignette](https://cmu-delphi.github.io/epitools/articles/slide.html)
 #' for examples.
 #'
@@ -41,9 +40,6 @@ slide_by_geo = function(x, slide_fun, n = 14, col_name = "slide_value",
     abort("`x` must be of class `epi_signal`.")
   }

-  # Get the latest issue per value
-  x = latest_issue(x)
-
   # Which slide_index function?
   col_type = match.arg(col_type)
   slide_index_zzz = switch(col_type,

R/utils.R

+9-7
@@ -38,22 +38,24 @@ quiet = function(x) {

 ##########

-#' Fetch the latest or earliest issue for each observation
+# TODO: fix. This function is no longer in sync with the epi_signal format.
+# Currently not being exported
+
+#' Fetch the latest issue for each observation
 #'
 #' The data returned from `covidcast_signal()` or `covidcast_signals()` can, if
 #' called with the `issues` argument, contain multiple issues for a single
 #' observation in a single location. These functions filter the data frame to
-#' contain only the earliest issue or only the latest issue.
+#' contain only the latest issue.
 #'
 #' @param df A `covidcast_signal` or `covidcast_signal_long` data frame, such as
 #' returned from `covidcast_signal()` or the "long" format of
 #' `aggregate_signals()`.
-#' @return A data frame in the same form, but with only the earliest or latest
-#' issue of every observation. Note that these functions sort the data frame
-#' as part of their filtering, so the output data frame rows may be in a
-#' different order.
+#' @return A data frame in the same form, but with only the latest issue of
+#' every observation. Note that these functions sort the data frame as part of
+#' their filtering, so the output data frame rows may be in a different
+#' order.
 #'
-#' @export
 latest_issue = function(x) {
   if (!inherits(x, "epi_signal")) {
     abort("`x` must be of class `epi_signal`.")

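Since `latest_issue()` is no longer exported or called by the other functions, a caller holding data with several issues per observation would need to reduce it to one snapshot before building an `epi_signal`. The following base-R sketch shows one way that could be done; `latest_snapshot()` and the example data are hypothetical and are not part of the package.

```r
# Hypothetical helper, not part of the package: reduce a multi-issue data
# frame to one snapshot by keeping, for each (geo_value, time_value) pair,
# the row with the latest issue date.
latest_snapshot = function(df) {
  # Sort so that, within each observation, the latest issue comes last
  df = df[order(df$geo_value, df$time_value, df$issue), ]

  # Keep the last row of each (geo_value, time_value) group
  keep = !duplicated(df[, c("geo_value", "time_value")], fromLast = TRUE)
  out = df[keep, setdiff(names(df), "issue"), drop = FALSE]
  rownames(out) = NULL

  # The snapshot-level issue date is the maximum of the individual issue
  # dates, matching the description of the `issue` metadata field
  list(data = out, issue = max(df$issue))
}

raw = data.frame(
  geo_value = c("pa", "pa", "pa"),
  time_value = as.Date(c("2020-06-01", "2020-06-01", "2020-06-02")),
  issue = as.Date(c("2020-06-02", "2020-06-05", "2020-06-03")),
  value = c(1, 1.2, 2)
)
snap = latest_snapshot(raw)

snap$data   # one row per (geo_value, time_value), without an `issue` column
snap$issue  # the single issue date to record in the metadata
```

The two pieces returned here correspond to the `x` and `issue` arguments of `as.epi_signal()` as documented in this commit.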
docs/articles/correlations.html

+2-2
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

docs/articles/derivatives.html

+14-14
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
