Linear Regression with and without Intercept: Explained Simply

If you’re new to machine learning, linear regression is one of the easiest models to understand. It tries to draw a straight line (or a plane, when there are more features) that best fits the data.

When we use scikit-learn’s LinearRegression, we can decide whether the model should include an intercept (a constant term added to the equation) or not.


The Example Code

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[12, 13], [23, 24], [24, 25], [35, 36], [36, 37], [37, 38]])

# Create target values y
y = np.dot(X, np.array([4, 5])) + 6

# Case 1: fit a line WITH an intercept term
reg1 = LinearRegression(fit_intercept=True).fit(X, y)
s1 = reg1.score(X, y)  # R^2 score on the training data

# Case 2: force the line through the origin (no intercept allowed)
reg2 = LinearRegression(fit_intercept=False).fit(X, y)
s2 = reg2.score(X, y)

What’s Happening Here?

  • X is our input data (two features per row).

  • y is the output we want to predict. We made it using this formula:

    y = 4 * x_0 + 5 * x_1 + 6

This formula clearly has an intercept = 6.
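To make the formula concrete, here is a quick check of the first row by hand:

```python
import numpy as np

X = np.array([[12, 13], [23, 24], [24, 25], [35, 36], [36, 37], [37, 38]])
y = np.dot(X, np.array([4, 5])) + 6

# First row by hand: 4*12 + 5*13 + 6 = 48 + 65 + 6 = 119
print(y[0])  # 119
```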


Case 1: With Intercept (fit_intercept=True)

Here the model tries to learn both:

  • the slope (coefficients for each feature), and

  • the intercept (the constant).

It finds:

  • Intercept ≈ 6.5

  • Coefficients ≈ [4.5, 4.5]

Notice these are not the original values (intercept 6, coefficients [4, 5]) we used to build y. We’ll see why shortly.

This gives a perfect score:

  • s1 = 1.0
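You can confirm these numbers yourself. (The exact coefficients can vary slightly across scikit-learn/SciPy versions, because the fit here is not unique; the values below are what the default solver returns.)

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[12, 13], [23, 24], [24, 25], [35, 36], [36, 37], [37, 38]])
y = np.dot(X, np.array([4, 5])) + 6

# Fit WITH an intercept term
reg1 = LinearRegression(fit_intercept=True).fit(X, y)
print(reg1.intercept_)   # ~6.5
print(reg1.coef_)        # ~[4.5, 4.5]
print(reg1.score(X, y))  # 1.0
```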


Case 2: Without Intercept (fit_intercept=False)

Here the model is not allowed to add any constant. You might expect this to fail, but surprisingly, it still works perfectly. It finds:

  • Coefficients ≈ [-2, 11]

  • Intercept = 0 (forced)

This also gives:

  • s2 = 1.0
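The same check for the no-intercept model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[12, 13], [23, 24], [24, 25], [35, 36], [36, 37], [37, 38]])
y = np.dot(X, np.array([4, 5])) + 6

# Fit WITHOUT an intercept: the line is forced through the origin
reg2 = LinearRegression(fit_intercept=False).fit(X, y)
print(reg2.coef_)        # ~[-2., 11.]
print(reg2.intercept_)   # 0.0 (forced)
print(reg2.score(X, y))  # 1.0
```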


Why Did Both Work?

Take a closer look at X. Each row has values like:

[12, 13]
[23, 24]
[24, 25]
...

Notice that the second column is always the first column + 1, i.e. x_1 = x_0 + 1.

👉 Substituting x_1 = x_0 + 1 into the true formula gives y = 4 * x_0 + 5 * (x_0 + 1) + 6 = 9 * x_0 + 11. Because x_1 carries a built-in “+1”, any constant can be absorbed into its coefficient: the no-intercept solution -2 * x_0 + 11 * x_1 = -2 * x_0 + 11 * (x_0 + 1) = 9 * x_0 + 11 reproduces y exactly. This special pattern means there are many different equations that fit the data perfectly, which is why both versions (with intercept and without intercept) achieve the same perfect score.
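You can verify that the two fitted models, despite having very different coefficients, make identical predictions on this data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[12, 13], [23, 24], [24, 25], [35, 36], [36, 37], [37, 38]])
y = np.dot(X, np.array([4, 5])) + 6

reg1 = LinearRegression(fit_intercept=True).fit(X, y)
reg2 = LinearRegression(fit_intercept=False).fit(X, y)

# Both models reproduce y exactly, even though their coefficients differ
print(np.allclose(reg1.predict(X), y))  # True
print(np.allclose(reg2.predict(X), y))  # True
```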


The Result

  • s1 = 1.0

  • s2 = 1.0

✅ So the correct answer is: s1 = s2


Key Points to Remember (for Freshers)

  1. The intercept is just a constant added to the equation.

  2. If your data has special patterns, even a model without an intercept can fit perfectly.

  3. Always check your data before drawing conclusions — sometimes different formulas give the same predictions.
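As a quick sanity check (not part of the original example), here is one way to spot such a pattern before modeling. Adding a column of ones to X mimics what fit_intercept=True effectively does, and the rank reveals the redundancy:

```python
import numpy as np

X = np.array([[12, 13], [23, 24], [24, 25], [35, 36], [36, 37], [37, 38]])

# The second column is always the first column + 1
print(np.all(X[:, 1] == X[:, 0] + 1))  # True

# With a constant column appended, the matrix has 3 columns but only rank 2,
# because (column 1) - (column 0) equals the constant column
X_aug = np.hstack([X, np.ones((len(X), 1))])
print(np.linalg.matrix_rank(X_aug))  # 2
```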


This is a simple but powerful example of how linear regression works and why you should always look at both the coefficients and the score to truly understand your model.
