🎯 Understanding RandomizedSearchCV in Scikit-Learn

Hyperparameter tuning is crucial in machine learning for improving model performance. In scikit-learn, two main methods are used:

  • GridSearchCV – exhaustively evaluates every combination in a predefined parameter grid.

  • RandomizedSearchCV – samples a fixed number of parameter settings from specified distributions.

Here, we’ll analyze a code snippet using RandomizedSearchCV with SGDRegressor.


📌 The Code

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import loguniform

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
y = np.array([3, 6, 9, 12, 15])

sgd_regressor = SGDRegressor(random_state=42)  # fixed seed so results are reproducible

param_dist = {
    # 'squared_loss' was renamed to 'squared_error' in scikit-learn 1.0
    'loss': ['squared_error', 'huber', 'epsilon_insensitive'],
    'alpha': loguniform(1e-4, 1e0),     # regularization strength, sampled log-uniformly
    'penalty': ['l1', 'l2', 'elasticnet'],
    'epsilon': loguniform(1e-4, 1e-1),  # only used by the huber / epsilon_insensitive losses
}

random_search = RandomizedSearchCV(
    sgd_regressor,
    param_distributions=param_dist,
    n_iter=10,                          # number of parameter settings to sample
    cv=3,                               # 3-fold cross-validation
    scoring='neg_mean_squared_error',
    random_state=42,                    # makes the sampled settings reproducible
)

random_search.fit(X, y)

🌲 Breaking It Down

1. Number of parameter settings (n_iter)

  • In RandomizedSearchCV, n_iter specifies how many random combinations of parameters will be tried.

  • Here, n_iter=10, so 10 random parameter settings will be evaluated.

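You can verify this after fitting: cv_results_ stores one entry per sampled setting, so its 'params' list has exactly n_iter elements.

# One dict of sampled hyperparameters per iteration
print(len(random_search.cv_results_['params']))  # -> 10
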

2. Actual number of combinations

  • Unlike GridSearchCV, RandomizedSearchCV does not try all possible combinations.

  • Even though the parameter space contains:

    • 3 (loss) × 3 (penalty) × continuous distributions for alpha and epsilon → an effectively infinite space

  • It samples only the 10 combinations requested via n_iter, not 20, and not a full grid (an exhaustive grid is impossible here anyway, because alpha and epsilon are continuous).

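For contrast, here is a minimal sketch of how fast an exhaustive grid grows, using a hypothetical discretization of alpha into just three values (the other continuous parameter, epsilon, is dropped for simplicity):

from sklearn.model_selection import ParameterGrid

# Hypothetical discrete grid: 3 losses x 3 penalties x 3 alphas
grid = {
    'loss': ['squared_error', 'huber', 'epsilon_insensitive'],
    'penalty': ['l1', 'l2', 'elasticnet'],
    'alpha': [1e-4, 1e-2, 1e0],
}
print(len(ParameterGrid(grid)))  # -> 27, versus the 10 fits of RandomizedSearchCV
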

3. Search space for alpha

  • alpha is sampled from a log-uniform distribution between 1e-4 and 1.0.

  • This ensures values are spread multiplicatively (good for hyperparameters spanning orders of magnitude).

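A quick way to see that multiplicative spread is to draw a few samples directly from the distribution (random_state is fixed here only to make the illustration reproducible):

from scipy.stats import loguniform

# Samples land across the whole exponent range -4..0,
# rather than clustering near 1.0 as a plain uniform draw would
print(loguniform(1e-4, 1e0).rvs(size=5, random_state=0))
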

4. Scoring metric

  • The scoring is set to scoring='neg_mean_squared_error'.

  • This is the negative of the Mean Squared Error: scikit-learn's CV scorers follow the convention that higher is better, so error metrics are negated.

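Because of that sign convention, best_score_ comes back negative; flip the sign to recover the usual MSE.

# best_score_ is the mean CV score of the best setting (higher = better)
print(random_search.best_params_)
print(-random_search.best_score_)  # ordinary mean squared error, lower = better
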

🚀 Final Correct Statements

✅ The n_iter parameter controls the number of parameter settings to try.
❌ The actual number of combinations tried is 20. (It is 10, as set by n_iter.)
✅ The hyperparameter search space for alpha follows a log-uniform distribution.
✅ The scoring metric is the negative mean squared error.


💡 Takeaway

RandomizedSearchCV trades exhaustiveness for efficiency: it samples a fixed budget of n_iter settings from the distributions you specify instead of enumerating a grid. That makes it the practical choice when hyperparameters are continuous (like alpha and epsilon here) or when the full grid would be too large to evaluate.