
Commit f908581

Merge pull request #502 from UBC-DSCI/response-target-outcome

Replace "target" and "outcome" with "response"

2 parents: 70f8f04 + b8b653d

File tree

3 files changed: +12 −12 lines

3 files changed

+12
-12
lines changed

source/classification1.Rmd

Lines changed: 6 additions & 6 deletions
@@ -749,8 +749,8 @@ knn_spec
 
 In order to fit the model on the breast cancer data, we need to pass the model specification
 and the data set to the `fit` function. We also need to specify what variables to use as predictors
-and what variable to use as the target. Below, the `Class ~ Perimeter + Concavity` argument specifies
-that `Class` is the target variable (the one we want to predict),
+and what variable to use as the response. Below, the `Class ~ Perimeter + Concavity` argument specifies
+that `Class` is the response variable (the one we want to predict),
 and both `Perimeter` and `Concavity` are to be used as the predictors.
 
 ```{r 05-tidymodels-4}
@@ -861,7 +861,7 @@ In the `tidymodels` framework, all data preprocessing happens
 using a `recipe` from [the `recipes` R package](https://recipes.tidymodels.org/) [@recipes].
 Here we will initialize a recipe\index{recipe} \index{tidymodels!recipe|see{recipe}} for
 the `unscaled_cancer` data above, specifying
-that the `Class` variable is the target, and all other variables are predictors:
+that the `Class` variable is the response, and all other variables are predictors:
 
 ```{r 05-scaling-2, results=FALSE, message=FALSE, echo = TRUE}
 uc_recipe <- recipe(Class ~ ., data = unscaled_cancer)
@@ -872,7 +872,7 @@ uc_recipe
 hidden_print_cli(uc_recipe)
 ```
 
-So far, there is not much in the recipe; just a statement about the number of targets
+So far, there is not much in the recipe; just a statement about the number of response variables
 and predictors. Let's add
 scaling (`step_scale`) \index{recipe!step\_scale} and
 centering (`step_center`) \index{recipe!step\_center} steps for
@@ -904,7 +904,7 @@ as well as naming particular columns with the same syntax as the `select` functi
 For example:
 
 - `all_nominal()` and `all_numeric()`: specify all categorical or all numeric variables
-- `all_predictors()` and `all_outcomes()`: specify all predictor or all target variables
+- `all_predictors()` and `all_outcomes()`: specify all predictor or all response variables
 - `Area, Smoothness`: specify both the `Area` and `Smoothness` variable
 - `-Class`: specify everything except the `Class` variable
 
@@ -1324,7 +1324,7 @@ First we will load the data, create a model, and specify a recipe for how the da
 
 ```{r 05-workflow, message = FALSE, warning = FALSE}
 # load the unscaled cancer data
-# and make sure the target Class variable is a factor
+# and make sure the response variable, Class, is a factor
 unscaled_cancer <- read_csv("data/unscaled_wdbc.csv") |>
   mutate(Class = as_factor(Class))
 

source/regression1.Rmd

Lines changed: 2 additions & 2 deletions
@@ -148,7 +148,7 @@ The scientific question guides our initial exploration: the columns in the
 data that we are interested in are `sqft` (house size, in livable square feet)
 and `price` (house sale price, in US dollars (USD)). The first step is to visualize
 the data as a scatter plot where we place the predictor variable
-(house size) on the x-axis, and we place the target/response variable that we
+(house size) on the x-axis, and we place the response variable that we
 want to predict (sale price) on the y-axis.
 \index{ggplot!geom\_point}
 \index{visualization!scatter}
@@ -687,7 +687,7 @@ As the algorithm is the same, we will not cover it again in this chapter.
 We will now demonstrate a multivariable KNN regression \index{K-nearest neighbors!multivariable regression} analysis of the
 Sacramento real estate \index{Sacramento real estate} data using `tidymodels`. This time we will use
 house size (measured in square feet) as well as number of bedrooms as our
-predictors, and continue to use house sale price as our outcome/target variable
+predictors, and continue to use house sale price as our response variable
 that we are trying to predict.
 It is always a good practice to do exploratory data analysis, such as
 visualizing the data, before we start modeling the data. Figure \@ref(fig:07-bedscatter)

source/regression2.Rmd

Lines changed: 4 additions & 4 deletions
@@ -286,7 +286,7 @@ lm_test_results
 
 Our final model's test error as assessed by RMSPE \index{RMSPE}
 is `r format(round(lm_test_results |> filter(.metric == 'rmse') |> pull(.estimate)), big.mark=",", nsmall=0, scientific=FALSE)`.
-Remember that this is in units of the target/response variable, and here that
+Remember that this is in units of the response variable, and here that
 is US Dollars (USD). Does this mean our model is "good" at predicting house
 sale price based off of the predictor of home size? Again, answering this is
 tricky and requires knowledge of how you intend to use the prediction.
@@ -402,13 +402,13 @@ flexible and can be quite wiggly. But there is a major interpretability advantag
 model to a straight line. A
 straight line can be defined by two numbers, the
 vertical intercept and the slope. The intercept tells us what the prediction is when
-all of the predictors are equal to 0; and the slope tells us what unit increase in the target/response
+all of the predictors are equal to 0; and the slope tells us what unit increase in the response
 variable we predict given a unit increase in the predictor
 variable. KNN regression, as simple as it is to implement and understand, has no such
 interpretability from its wiggly line.
 
 There can, however, also be a disadvantage to using a simple linear regression
-model in some cases, particularly when the relationship between the target and
+model in some cases, particularly when the relationship between the response and
 the predictor is not linear, but instead some other shape (e.g., curved or oscillating). In
 these cases the prediction model from a simple linear regression
 will underfit \index{underfitting!regression} (have high bias), meaning that model/predicted values do not
@@ -889,7 +889,7 @@ predictive performance.
 
 So far in this textbook we have used regression only in the context of
 prediction. However, regression can also be seen as a method to understand and
-quantify the effects of individual variables on a response / outcome of interest.
+quantify the effects of individual predictor variables on a response variable of interest.
 In the housing example from this chapter, beyond just using past data
 to predict future sale prices,
 we might also be interested in describing the
