Decision Tree Classifier Example: Beginner-Friendly Explanation
Decision Trees are one of the simplest and most interpretable models in machine learning. They split data step by step based on feature values and decide which class a sample belongs to.
Let’s look at a real code example with scikit-learn and understand what happens.
The Code
from sklearn.tree import DecisionTreeClassifier
# Four training samples, each with two features
X_train = [[1, 10], [2, 20], [3, 30], [4, 40]]
# Labels: the first three samples are class 0, the last is class 1
y_train = [0, 0, 0, 1]
# Limit the tree to at most two levels of splits; fix the seed for reproducibility
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X_train, y_train)
# Predict the class of a new, unseen point
X_test = [[2.5, 25]]
print(clf.predict(X_test))  # [0]
Step 1: Training Data
The training data (X_train) has two features per row:
[1, 10] → class 0
[2, 20] → class 0
[3, 30] → class 0
[4, 40] → class 1
So the first three samples belong to class 0, and the last one belongs to class 1.
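Plain Python lists work for a toy example like this, but on larger datasets you will usually pass NumPy arrays; scikit-learn accepts both. A minimal sketch of the same data as arrays (the NumPy import is an addition, not part of the original code):
import numpy as np
# Same data as above: 4 samples (rows) x 2 features (columns)
X_train = np.array([[1, 10], [2, 20], [3, 30], [4, 40]])
y_train = np.array([0, 0, 0, 1])
print(X_train.shape)  # (4, 2)
print(y_train.shape)  # (4,)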
Step 2: Building the Decision Tree
- max_depth=2 means the tree is allowed at most two levels of splits from the root to a leaf.
- At each split, the tree searches for the feature and threshold that best separate the classes.
- Here a single split is already enough: both features cleanly separate the three class-0 samples from the one class-1 sample, so the tree learns one threshold (for example, first feature <= 3.5) and maps everything below it to class 0. The sketch below shows how to print the learned split.
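Rather than guessing what the tree learned, you can ask scikit-learn to print it with export_text. A small sketch, reusing clf from the code above; the feature names "f0" and "f1" are made-up labels for the two columns:
from sklearn.tree import export_text
# Print the learned structure; "f0" and "f1" are placeholder names for the two columns
print(export_text(clf, feature_names=["f0", "f1"]))
# On this toy data a single threshold already separates the classes, so the
# printout shows one split. It may land on either feature (e.g. f0 <= 3.5),
# since both features separate the classes equally well here.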
Step 3: Making a Prediction
We test with:
X_test = [[2.5, 25]]
This point falls on the class-0 side of the learned split: its feature values sit between the class-0 samples [2, 20] and [3, 30], so it ends up in a leaf that contains only class-0 training samples.
So the decision tree predicts:
[0]
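You can also ask the tree how confident it is. A quick sketch, again reusing clf from above; predict_proba returns the class distribution of the leaf the point lands in:
# Fraction of each class in the leaf that the test point reaches
print(clf.predict_proba([[2.5, 25]]))  # [[1. 0.]] -> every sample in that leaf is class 0
# A point matching the class-1 training sample lands in the other leaf
print(clf.predict([[4, 40]]))          # [1]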
Step 4: The Output
Final result:
[0]
✅ The prediction [0] matches what we expected from the training data.
Key Takeaways for Freshers
- Decision trees split data step by step using feature thresholds.
- The parameter max_depth controls how “deep” the tree can grow, which helps prevent overfitting (see the sketch after this list).
- A prediction is made by following the learned splits from the root until a leaf node is reached.
- In this example, the test point falls on the same side of the split as the class-0 samples, so the output is [0].
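To see max_depth in action, here is a small sketch (reusing X_train and y_train from above) comparing a depth-limited tree with an unrestricted one. On this tiny, cleanly separable dataset both stop after one split, but on noisier real data the unrestricted tree would grow much deeper and risk overfitting:
from sklearn.tree import DecisionTreeClassifier
shallow = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X_train, y_train)
full = DecisionTreeClassifier(max_depth=None, random_state=0).fit(X_train, y_train)
# get_depth() reports how deep each fitted tree actually grew
print(shallow.get_depth(), full.get_depth())  # 1 1 -> one split already separates the classes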
This small example shows how decision trees are intuitive and easy to understand, making them a great starting point in machine learning!