Added doctest to decision_tree.py #11143

Merged: 6 commits, Nov 5, 2023
49 changes: 35 additions & 14 deletions machine_learning/decision_tree.py
@@ -18,7 +18,7 @@ def __init__(self, depth=5, min_leaf_size=5):
     def mean_squared_error(self, labels, prediction):
         """
         mean_squared_error:
-        @param labels: a one dimensional numpy array
+        @param labels: a one-dimensional numpy array
         @param prediction: a floating point value
         return value: mean_squared_error calculates the error if prediction is used to
             estimate the labels
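
The body of mean_squared_error is not part of this hunk. As a rough sketch of what the docstring describes, assuming the usual definition (the mean of the squared differences between each label and a single predicted value), a standalone version might look like the following. The free-function form and the type hints are assumptions for illustration, not the code in decision_tree.py.

```python
import numpy as np


def mean_squared_error(labels: np.ndarray, prediction: float) -> float:
    """Mean squared error when one constant `prediction` estimates every label.

    Assumed definition: mean((labels - prediction) ** 2).
    """
    # Broadcasting subtracts the scalar prediction from every label.
    return float(np.mean((labels - prediction) ** 2))


if __name__ == "__main__":
    print(mean_squared_error(np.array([1.0, 2.0, 3.0]), 2.0))
```

A constant prediction equal to the mean of the labels minimizes this quantity, which is consistent with train() storing np.mean(y) as the prediction for a leaf.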
@@ -44,26 +44,47 @@ def mean_squared_error(self, labels, prediction):
     def train(self, x, y):
         """
         train:
-        @param x: a one dimensional numpy array
-        @param y: a one dimensional numpy array.
+        @param x: a one-dimensional numpy array
+        @param y: a one-dimensional numpy array.
         The contents of y are the labels for the corresponding X values

-        train does not have a return value
-        """
-
-        """
-        this section is to check that the inputs conform to our dimensionality
+        train() does not have a return value
+
+        Examples:
+        1. Try to train when x & y are of same length & 1 dimensions (No errors)
+        >>> dt = DecisionTree()
+        >>> dt.train(np.array([10,20,30,40,50]),np.array([0,0,0,1,1]))
+
+        2. Try to train when x is 2 dimensions
+        >>> dt = DecisionTree()
+        >>> dt.train(np.array([[1,2,3,4,5],[1,2,3,4,5]]),np.array([0,0,0,1,1]))
+        Traceback (most recent call last):
+            ...
+        ValueError: Input data set must be one-dimensional
+
+        3. Try to train when x and y are not of the same length
+        >>> dt = DecisionTree()
+        >>> dt.train(np.array([1,2,3,4,5]),np.array([[0,0,0,1,1],[0,0,0,1,1]]))
+        Traceback (most recent call last):
+            ...
+        ValueError: x and y have different lengths
+
+        4. Try to train when x & y are of the same length but different dimensions
+        >>> dt = DecisionTree()
+        >>> dt.train(np.array([1,2,3,4,5]),np.array([[1],[2],[3],[4],[5]]))
+        Traceback (most recent call last):
+            ...
+        ValueError: Data set labels must be one-dimensional
+
+        This section is to check that the inputs conform to our dimensionality
         constraints
         """
         if x.ndim != 1:
-            print("Error: Input data set must be one dimensional")
-            return
+            raise ValueError("Input data set must be one-dimensional")
         if len(x) != len(y):
-            print("Error: X and y have different lengths")
-            return
+            raise ValueError("x and y have different lengths")
         if y.ndim != 1:
-            print("Error: Data set labels must be one dimensional")
-            return
+            raise ValueError("Data set labels must be one-dimensional")

         if len(x) < 2 * self.min_leaf_size:
             self.prediction = np.mean(y)
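
To see the effect of replacing print-and-return with raise ValueError, the three checks can be lifted into a standalone function and exercised with the same inputs as the doctests. This is only a sketch: validate_training_inputs and error_message are hypothetical helper names that do not exist in decision_tree.py, while the conditions and messages are copied from the diff.

```python
import numpy as np


def validate_training_inputs(x: np.ndarray, y: np.ndarray) -> None:
    """Hypothetical helper mirroring the checks at the top of train()."""
    if x.ndim != 1:
        raise ValueError("Input data set must be one-dimensional")
    if len(x) != len(y):
        raise ValueError("x and y have different lengths")
    if y.ndim != 1:
        raise ValueError("Data set labels must be one-dimensional")


def error_message(x: np.ndarray, y: np.ndarray):
    """Return the ValueError message for (x, y), or None if the inputs pass."""
    try:
        validate_training_inputs(x, y)
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    x = np.array([1, 2, 3, 4, 5])
    # Shape (2, 5): len(y) is 2, so the length check fires before the ndim check.
    print(error_message(x, np.array([[0, 0, 0, 1, 1], [0, 0, 0, 1, 1]])))
    # Shape (5, 1): len(y) is 5, so only the label dimensionality check fires.
    print(error_message(x, np.array([[1], [2], [3], [4], [5]])))
```

The order of the checks matters for examples 3 and 4: both pass a two-dimensional y, but len() of a NumPy array is the size of its first axis, so a (2, 5) array is reported as a length mismatch and a (5, 1) array as a dimensionality error. Raising also lets doctest assert on the failure through the Traceback header, which the earlier print-based version could only match as printed output.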