Leave-One-Out Cross-Validation (LOOCV) Explained with Example
Cross-validation is a fundamental machine learning technique for estimating model performance. One of its most extreme forms is Leave-One-Out Cross-Validation (LOOCV).
The Question
For a dataset with 1000 data points and 100 features, how many models will the following code train during execution?
Code:
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, LeaveOneOut

# Illustrative data matching the question's setup: 1000 samples, 100 features
X, y = make_regression(n_samples=1000, n_features=100, random_state=42)

lin_reg = LinearRegression()
loocv = LeaveOneOut()  # one fold per data point
score = cross_val_score(lin_reg, X, y, cv=loocv)
Understanding LOOCV
- LeaveOneOut() creates as many folds as there are data points.
- Each iteration:
  - 1 sample is used as the test set.
  - The remaining N-1 samples are used for training.
- If there are N = 1000 data points, LOOCV runs the model 1000 times (see the sketch below).
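To make this concrete, here is a minimal sketch (assuming the same illustrative data generated above) that counts the folds LeaveOneOut produces and checks the size of each split:

from sklearn.datasets import make_regression
from sklearn.model_selection import LeaveOneOut

# Illustrative data: 1000 samples, 100 features (as in the question)
X, y = make_regression(n_samples=1000, n_features=100, random_state=42)

loocv = LeaveOneOut()
print(loocv.get_n_splits(X))  # 1000 -- one fold per sample

for train_idx, test_idx in loocv.split(X):
    assert len(test_idx) == 1            # exactly 1 test sample
    assert len(train_idx) == len(X) - 1  # the remaining 999 samples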
Calculation
Number of models = number of folds = number of data points = 1000.
✅ So, the code will train 1000 models.
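You can verify the count from cross_val_score itself, since it returns one score per trained model. A quick sketch (note: the default R² score is not defined on a single-sample test fold, so a per-sample metric such as negative mean squared error is substituted here):

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, LeaveOneOut

X, y = make_regression(n_samples=1000, n_features=100, random_state=42)

# R^2 needs at least two test samples, so use a per-sample metric
scores = cross_val_score(LinearRegression(), X, y,
                         cv=LeaveOneOut(),
                         scoring="neg_mean_squared_error")
print(len(scores))  # 1000 -- one score per trained model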
Why not the other answers?
- ❌ 100 → Confuses the number of features with the number of folds; the feature count is irrelevant to LOOCV.
- ❌ 999 → Misinterpretation: each model is trained on 999 samples, but the total number of models is still 1000.
- ✅ 1000 → Correct: one model per data point.
Final Answer
👉 The code will generate 1000 models during execution.
✨ In summary: LOOCV gives a thorough, low-bias performance estimate, but it is computationally expensive because it trains one model per data point. For large datasets, k-fold cross-validation is usually preferred.
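For comparison, a 10-fold setup on the same illustrative data trains only 10 models instead of 1000 (a sketch; 10 folds is a common choice, not something the question specifies):

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=1000, n_features=100, random_state=42)

kfold = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(LinearRegression(), X, y, cv=kfold)
print(len(scores))  # 10 -- one model per fold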