
Commit 71a213c

Merge pull request #174 from UBC-DSCI/response-target-outcome
Replace "target" and "outcome" with "response"
2 parents: f44b8f1 + 3d98444

File tree: 4 files changed (+8 −8 lines)

source/classification1.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -955,7 +955,7 @@ In order to fit the model on the breast cancer data, we need to call `fit` on
 the model object. The `X` argument is used to specify the data for the predictor
 variables, while the `y` argument is used to specify the data for the response variable.
 So below, we set `X=cancer_train[["Perimeter", "Concavity"]]` and
-`y=cancer_train['Class']` to specify that `Class` is the target
+`y=cancer_train['Class']` to specify that `Class` is the response
 variable (the one we want to predict), and both `Perimeter` and `Concavity` are
 to be used as the predictors. Note that the `fit` function might look like it does not
 do much from the outside, but it is actually doing all the heavy lifting to train
````
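The prose in this hunk describes scikit-learn's `fit(X, y)` interface, where `X` holds the predictors and `y` the response variable. As a minimal sketch (not part of the commit; the `cancer_train` data frame here is a tiny made-up stand-in for the book's breast cancer data), the call might look like:

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Tiny hypothetical stand-in for cancer_train (the real data is standardized).
cancer_train = pd.DataFrame({
    "Perimeter": [0.1, 0.5, 0.9, 1.2, -0.3, -0.8],
    "Concavity": [0.2, 0.4, 1.1, 1.0, -0.5, -0.9],
    "Class": ["Benign", "Benign", "Malignant", "Malignant", "Benign", "Benign"],
})

knn = KNeighborsClassifier(n_neighbors=3)
# X selects the predictor columns; y is the response variable (Class).
knn.fit(X=cancer_train[["Perimeter", "Concavity"]], y=cancer_train["Class"])

# fit() looks quiet from the outside, but the trained model can now predict.
prediction = knn.predict(pd.DataFrame({"Perimeter": [1.0], "Concavity": [1.0]}))
```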

source/classification2.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -373,7 +373,7 @@ that the accuracy estimates from the test data are reasonable. First,
 setting `shuffle=True` (which is the default) means the data will be shuffled before splitting,
 which ensures that any ordering present
 in the data does not influence the data that ends up in the training and testing sets.
-Second, by specifying the `stratify` parameter to be the target column of the training set,
+Second, by specifying the `stratify` parameter to be the response variable in the training set,
 it **stratifies** the data by the class label, to ensure that roughly
 the same proportion of each class ends up in both the training and testing sets. For example,
 in our data set, roughly 63% of the
````
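This hunk documents `train_test_split`'s `shuffle` and `stratify` behavior. A hedged sketch of that call (the imbalanced data frame below is hypothetical, built to mimic the roughly 63%/37% class split the chapter mentions):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data with ~63% Benign / ~37% Malignant, as in the chapter.
cancer = pd.DataFrame({
    "Perimeter": range(100),
    "Class": ["Benign"] * 63 + ["Malignant"] * 37,
})

# shuffle=True (the default) removes ordering effects; stratify on the
# response variable keeps the class proportions similar in both splits.
cancer_train, cancer_test = train_test_split(
    cancer, train_size=0.75, stratify=cancer["Class"], random_state=1
)
props = cancer_train["Class"].value_counts(normalize=True)
```

Because of stratification, `props["Benign"]` stays close to 0.63 rather than drifting with an unlucky shuffle.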

source/regression1.md

Lines changed: 2 additions & 2 deletions

````diff
@@ -170,7 +170,7 @@ The scientific question guides our initial exploration: the columns in the
 data that we are interested in are `sqft` (house size, in livable square feet)
 and `price` (house sale price, in US dollars (USD)). The first step is to visualize
 the data as a scatter plot where we place the predictor variable
-(house size) on the x-axis, and we place the target/response variable that we
+(house size) on the x-axis, and we place the response variable that we
 want to predict (sale price) on the y-axis.
 
 > **Note:** Given that the y-axis unit is dollars in {numref}`fig:07-edaRegr`,
@@ -922,7 +922,7 @@ As the algorithm is the same, we will not cover it again in this chapter.
 We will now demonstrate a multivariable KNN regression analysis of the
 Sacramento real estate data using `scikit-learn`. This time we will use
 house size (measured in square feet) as well as number of bedrooms as our
-predictors, and continue to use house sale price as our outcome/target variable
+predictors, and continue to use house sale price as our response variable
 that we are trying to predict.
 It is always a good practice to do exploratory data analysis, such as
 visualizing the data, before we start modeling the data. {numref}`fig:07-bedscatter`
````
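The second hunk describes a multivariable KNN regression with two predictors and house sale price as the response variable. A minimal sketch of that setup (the `sacramento` data frame below is a small made-up stand-in, not the book's real data; in practice the predictors would be standardized first so that `sqft` does not dominate the distance):

```python
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor

# Small hypothetical stand-in for the Sacramento real estate data.
sacramento = pd.DataFrame({
    "sqft":  [800, 1000, 1500, 2000, 2500, 3000],
    "beds":  [2, 2, 3, 3, 4, 5],
    "price": [150_000, 180_000, 250_000, 320_000, 400_000, 500_000],
})

# Two predictors (house size and bedrooms); price is the response variable.
knn = KNeighborsRegressor(n_neighbors=2)
knn.fit(X=sacramento[["sqft", "beds"]], y=sacramento["price"])

# Prediction averages the prices of the 2 nearest houses.
pred = knn.predict(pd.DataFrame({"sqft": [1200], "beds": [2]}))
```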

source/regression2.md

Lines changed: 4 additions & 4 deletions

````diff
@@ -464,7 +464,7 @@ glue("sacr_RMSPE", "{0:,.0f}".format(RMSPE))
 
 Our final model's test error as assessed by RMSPE
 is {glue:text}`sacr_RMSPE`.
-Remember that this is in units of the target/response variable, and here that
+Remember that this is in units of the response variable, and here that
 is US Dollars (USD). Does this mean our model is "good" at predicting house
 sale price based off of the predictor of home size? Again, answering this is
 tricky and requires knowledge of how you intend to use the prediction.
@@ -645,7 +645,7 @@ flexible and can be quite wiggly. But there is a major interpretability advantag
 model to a straight line. A
 straight line can be defined by two numbers, the
 vertical intercept and the slope. The intercept tells us what the prediction is when
-all of the predictors are equal to 0; and the slope tells us what unit increase in the target/response
+all of the predictors are equal to 0; and the slope tells us what unit increase in the response
 variable we predict given a unit increase in the predictor
 variable. KNN regression, as simple as it is to implement and understand, has no such
 interpretability from its wiggly line.
@@ -654,7 +654,7 @@ interpretability from its wiggly line.
 ```
 
 There can, however, also be a disadvantage to using a simple linear regression
-model in some cases, particularly when the relationship between the target and
+model in some cases, particularly when the relationship between the response variable and
 the predictor is not linear, but instead some other shape (e.g., curved or oscillating). In
 these cases the prediction model from a simple linear regression
 will underfit (have high bias), meaning that model/predicted values do not
@@ -1324,7 +1324,7 @@ predictive performance.
 
 So far in this textbook we have used regression only in the context of
 prediction. However, regression can also be seen as a method to understand and
-quantify the effects of individual variables on a response / outcome of interest.
+quantify the effects of individual variables on a response variable of interest.
 In the housing example from this chapter, beyond just using past data
 to predict future sale prices,
 we might also be interested in describing the
````
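One of these hunks explains linear regression's interpretability: the intercept is the prediction when all predictors are 0, and the slope is the predicted change in the response per unit increase in the predictor. A hedged illustration (with made-up data placed exactly on a known line, so the fitted numbers are easy to check; not the book's data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data lying exactly on price = 50_000 + 100 * sqft,
# so the fitted intercept and slope can be read off directly.
sqft = np.array([[800], [1000], [1500], [2000]])
price = 50_000 + 100 * sqft.ravel()

lm = LinearRegression()
lm.fit(X=sqft, y=price)

# intercept_: predicted response when the predictor is 0 (here, USD 50,000);
# coef_[0]: predicted increase in the response per extra square foot (here, 100).
intercept, slope = lm.intercept_, lm.coef_[0]
```

Unlike KNN regression's wiggly prediction line, these two numbers summarize the whole model.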
