Understanding KNN with a Simple Example
The K-Nearest Neighbors (KNN) algorithm is one of the simplest yet most effective classification methods in machine learning. It classifies a new data point based on the majority label of its K nearest neighbors. Let’s analyze the given code snippet.
The Code
```python
from sklearn.neighbors import KNeighborsClassifier

# Four training samples (two features each) and their labels
X_train = [[1, 50], [2, 60], [3, 70], [4, 80]]
y_train = [0, 0, 1, 1]

# Fit a KNN classifier that votes among the 3 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
# Predict the class of a new test point
X_test = [[2.5, 64]]
print(knn.predict(X_test))
```
Step 1: Training Data
We have 4 training samples:
| Feature (x1, x2) | Label |
|---|---|
| (1, 50) | 0 |
| (2, 60) | 0 |
| (3, 70) | 1 |
| (4, 80) | 1 |
So the dataset is well-balanced:

- Class 0 → first two points
- Class 1 → last two points
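If you want to check this balance in code, here is a minimal sketch (it uses NumPy, which scikit-learn already depends on; the variable names simply mirror the snippet above):

```python
import numpy as np

X_train = np.array([[1, 50], [2, 60], [3, 70], [4, 80]])
y_train = np.array([0, 0, 1, 1])

# Count how many samples fall in each class: [2 2] means perfectly balanced
print(np.bincount(y_train))
```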
Step 2: Test Point
We want to classify the test point (2.5, 64).
Step 3: Compute Distances
We calculate the Euclidean distance from the test point (2.5, 64) to each training point:

- Distance to (1, 50): √((2.5 − 1)² + (64 − 50)²) = √198.25 ≈ 14.08
- Distance to (2, 60): √((2.5 − 2)² + (64 − 60)²) = √16.25 ≈ 4.03
- Distance to (3, 70): √((2.5 − 3)² + (64 − 70)²) = √36.25 ≈ 6.02
- Distance to (4, 80): √((2.5 − 4)² + (64 − 80)²) = √258.25 ≈ 16.07
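These numbers are easy to verify programmatically. A minimal sketch, assuming NumPy is available (the name `x_test` is used here just for illustration):

```python
import numpy as np

X_train = np.array([[1, 50], [2, 60], [3, 70], [4, 80]])
x_test = np.array([2.5, 64])

# Euclidean distance from the test point to every training point
distances = np.linalg.norm(X_train - x_test, axis=1)
print(distances.round(2))  # approximately [14.08  4.03  6.02 16.07]
```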
Step 4: Nearest 3 Neighbors
The 3 nearest neighbors are:
- (2, 60) → Label 0 → Distance ≈ 4.03
- (3, 70) → Label 1 → Distance ≈ 6.02
- (1, 50) → Label 0 → Distance ≈ 14.08
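scikit-learn can report the same neighbors directly through the fitted model's `kneighbors` method. A small sketch, assuming the `knn` object and `X_test` from the snippet above:

```python
# Ask the fitted model for the 3 nearest neighbors of the test point
distances, indices = knn.kneighbors(X_test)
print(indices)    # rows of X_train, closest first: [[1 2 0]] -> (2, 60), (3, 70), (1, 50)
print(distances)  # approximately [[ 4.03  6.02 14.08]]
```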
Step 5: Majority Vote
Among the 3 neighbors:
- Class 0 → 2 votes
- Class 1 → 1 vote

✅ Majority class = 0
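The vote itself can be reproduced with plain Python. A minimal sketch using `collections.Counter`, with the neighbor labels written out by hand for illustration:

```python
from collections import Counter

# Labels of the 3 nearest neighbors, closest first
neighbor_labels = [0, 1, 0]

# most_common(1) returns the label with the most votes
votes = Counter(neighbor_labels)
print(votes.most_common(1)[0][0])  # 0
```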
Final Output
The code will print:

`[0]`
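If you want the vote proportions rather than just the winning label, the fitted classifier's `predict_proba` method reports the fraction of neighbors in each class:

```python
# Fraction of the 3 nearest neighbors in class 0 and class 1
print(knn.predict_proba(X_test))  # approximately [[0.667 0.333]]
```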
Key Takeaways
- KNN assigns labels based on the majority class of the nearest neighbors.
- The choice of k (the number of neighbors) has a big impact on results.
- Here, with k=3, the model predicts class 0 for the test point.
✨ This example shows how KNN works step by step with distances and voting.
As a next step, try changing n_neighbors to 1 or 4 and see how the decision changes.
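Here is a minimal sketch of that comparison (it repeats the data from the snippet above; the loop variable names are just for illustration):

```python
from sklearn.neighbors import KNeighborsClassifier

X_train = [[1, 50], [2, 60], [3, 70], [4, 80]]
y_train = [0, 0, 1, 1]
X_test = [[2.5, 64]]

for k in (1, 3, 4):
    knn_k = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    # k=1: only (2, 60) votes -> label 0
    # k=3: votes are 2-1 for class 0 -> label 0
    # k=4: classes are tied 2-2, so the output depends on tie-breaking
    print(k, knn_k.predict(X_test))
```

On this toy dataset, k=1 and k=3 both give a clear answer of class 0, while k=4 ends in a tie, which is one reason odd values of k are a common choice for binary classification.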