Understanding KNN with a Simple Example

The K-Nearest Neighbors (KNN) algorithm is one of the simplest yet most effective classification methods in machine learning. It classifies a new data point based on the majority label of its K nearest neighbors. Let's walk through the code snippet below step by step.


The Code

from sklearn.neighbors import KNeighborsClassifier

# Four training samples, each with two features, and their class labels
X_train = [[1, 50], [2, 60], [3, 70], [4, 80]]
y_train = [0, 0, 1, 1]

# Classify using a majority vote among the 3 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Predict the class of a new, unseen point
X_test = [[2.5, 64]]
print(knn.predict(X_test))

Step 1: Training Data

We have 4 training samples:

Feature (x1, x2)    Label
(1, 50)             0
(2, 60)             0
(3, 70)             1
(4, 80)             1

So the dataset is well-balanced:

  • Class 0 → First two points

  • Class 1 → Last two points


Step 2: Test Point

We want to classify:

X_{test} = (2.5, 64)

Step 3: Compute Distances

We calculate the Euclidean distance from the test point (x_{1t}, x_{2t}) = (2.5, 64) to each training sample (a short verification snippet follows the list):

d = \sqrt{(x_1 - x_{1t})^2 + (x_2 - x_{2t})^2}
  • Distance to (1, 50):
    \sqrt{(1-2.5)^2 + (50-64)^2} = \sqrt{2.25 + 196} = \sqrt{198.25} \approx 14.08

  • Distance to (2, 60):
    \sqrt{(2-2.5)^2 + (60-64)^2} = \sqrt{0.25 + 16} = \sqrt{16.25} \approx 4.03

  • Distance to (3, 70):
    \sqrt{(3-2.5)^2 + (70-64)^2} = \sqrt{0.25 + 36} = \sqrt{36.25} \approx 6.02

  • Distance to (4, 80):
    \sqrt{(4-2.5)^2 + (80-64)^2} = \sqrt{2.25 + 256} = \sqrt{258.25} \approx 16.07

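To double-check these values, here is a minimal sketch (standard library only) that recomputes the four distances:

import math

X_train = [[1, 50], [2, 60], [3, 70], [4, 80]]
x_test = [2.5, 64]

# Euclidean distance from the test point to each training sample
for point in X_train:
    d = math.sqrt((point[0] - x_test[0]) ** 2 + (point[1] - x_test[1]) ** 2)
    print(point, round(d, 2))  # 14.08, 4.03, 6.02, 16.07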

Step 4: Nearest 3 Neighbors

The 3 nearest neighbors are:

  1. (2, 60) → Label 0 → Distance ≈ 4.03

  2. (3, 70) → Label 1 → Distance ≈ 6.02

  3. (1, 50) → Label 0 → Distance ≈ 14.08

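scikit-learn can report these neighbors directly. Continuing from the fitted knn model above, the kneighbors method returns the distances to, and indices of, the K closest training points:

# Distances and indices of the 3 nearest training points
distances, indices = knn.kneighbors(X_test)
print(distances)  # [[ 4.03...  6.02... 14.08...]]
print(indices)    # [[1 2 0]] -> (2, 60), (3, 70), (1, 50)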

Step 5: Majority Vote

Among the 3 neighbors:

  • Class 0 → 2 votes

  • Class 1 → 1 vote

✅ Majority class = 0

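The same vote can be read off with predict_proba, which (for the default uniform weights) reports each class's share of the K votes:

# Fraction of the 3 neighbors voting for class 0 and class 1
print(knn.predict_proba(X_test))  # [[0.66666667 0.33333333]]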

Final Output

The code will print:

[0]

Key Takeaways

  • KNN assigns labels based on the majority class of the nearest neighbors.

  • The choice of k (number of neighbors) has a big impact on results.

  • Here, with k=3, the model predicts class 0 for the test point.


✨ This example shows how KNN works step by step with distances and voting.


As an exercise, try changing n_neighbors to 1 or 4 and compare the decision; a small sketch of that experiment follows.
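Continuing from the variables defined earlier, this loop refits the model for each value of k. Note that with k=4 all four points vote and the result is a 2-2 tie, so the prediction depends on the library's tie-breaking rule:

# Compare predictions for different values of k
for k in [1, 3, 4]:
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)
    print(k, model.predict(X_test))
# k=1: nearest point is (2, 60) -> class 0
# k=3: 2-1 majority vote        -> class 0
# k=4: 2-2 tie, outcome depends on tie-breaking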
