Understanding learning_rate='constant' in SGDRegressor in Python

In machine learning, Stochastic Gradient Descent (SGD) is a popular optimization technique for training linear models. One of its key parameters is the learning rate, which controls how much the model's weights are updated at each step of training.

Let’s break down the following example:

import numpy as np
from sklearn.linear_model import SGDRegressor

# Sample data
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
y = np.array([3, 6, 9, 12, 15])

# SGD Regressor with a constant learning rate.
# Note: eta0 is set to 0.01 here; with these unscaled features, a step
# of 0.1 is too large and makes the weights diverge instead of converging.
model = SGDRegressor(learning_rate='constant', eta0=0.01, max_iter=1000)
model.fit(X, y)
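
After fitting, you can sanity-check the model on the training inputs. Since this tiny dataset is exactly linear (y = x1 + x2), the predictions should land close to y; the exact numbers vary from run to run because SGD shuffles the samples:

print(model.predict(X))               # expect values near [3, 6, 9, 12, 15]
print(model.coef_, model.intercept_)  # learned weights and intercept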

Key Concepts:

  1. Learning Rate (eta0):
    This is the step size used for each weight update during training. It determines how quickly or slowly the model adapts to the problem. In the code above, eta0=0.01 sets the learning rate to 0.01.

  2. learning_rate Parameter:
    In SGDRegressor, the learning_rate parameter controls how the learning rate changes over iterations (a short side-by-side sketch follows this list). Its options include:

    • 'constant': Keeps the learning rate fixed at the value of eta0 for all iterations.

    • 'optimal': Decreases the learning rate over iterations based on a heuristic involving the regularization strength alpha; eta0 is ignored.

    • 'invscaling': Decreases the learning rate over time using an inverse scaling schedule.

    • 'adaptive': Keeps the learning rate at eta0 as long as the training loss keeps decreasing; whenever the loss stops improving for several consecutive epochs, the rate is divided by 5.

  3. learning_rate='constant' Explained:
    When you set learning_rate='constant', the SGD algorithm uses the same learning rate (eta0) for every iteration, without adjusting it during training. This is useful when you want predictable and steady updates, especially for simple datasets or when tuning the learning rate manually.
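
As a quick side-by-side, here is a minimal sketch instantiating each schedule (the parameter values are illustrative, not tuned):

from sklearn.linear_model import SGDRegressor

# Same eta0 where it applies; only the schedule differs.
constant   = SGDRegressor(learning_rate='constant',   eta0=0.01)
optimal    = SGDRegressor(learning_rate='optimal')                        # eta0 is ignored
invscaling = SGDRegressor(learning_rate='invscaling', eta0=0.01, power_t=0.25)
adaptive   = SGDRegressor(learning_rate='adaptive',   eta0=0.01)          # divided by 5 when loss stalls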


Example Behavior:

For the data X and y above, setting a constant learning rate means that in each iteration:

weight_new = weight_old - eta0 * gradient

The value of eta0 remains 0.01 throughout the training process. The model updates its weights at a steady pace until the stopping criterion (tol) is met or max_iter passes over the data are completed.
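
To make the update rule concrete, here is a hand-rolled sketch of constant-step SGD on the sample data (plain squared-error loss, no regularization; it mirrors the idea, not scikit-learn's exact internals):

import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]], dtype=float)
y = np.array([3, 6, 9, 12, 15], dtype=float)

eta0 = 0.01        # constant step size, never changed below
w = np.zeros(2)    # weights
b = 0.0            # intercept

for epoch in range(200):
    for xi, yi in zip(X, y):
        error = (w @ xi + b) - yi   # prediction error on one sample
        w -= eta0 * error * xi      # gradient of 0.5 * error**2 w.r.t. w
        b -= eta0 * error           # gradient w.r.t. b

print(np.max(np.abs(X @ w + b - y)))   # residuals shrink toward 0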


Why It Matters:

  • Predictability: A constant learning rate means every update uses the same, known step size.

  • Simplicity: Useful for beginners or when tuning the learning rate manually.

  • Control: You can experiment with different constant values to see which leads to faster convergence.


Key Takeaways:

  1. With learning_rate='constant', the learning rate does not change during training.

  2. The learning rate is set by the eta0 parameter.

  3. This option is best for controlled and steady training scenarios.


Related Questions Students Might Ask:

Q1: What happens if learning_rate='invscaling'?
A: The learning rate decreases over time, allowing the model to make big updates initially and smaller updates as it converges.
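
Concretely, scikit-learn's inverse-scaling schedule is eta = eta0 / t**power_t, where t is the update counter and power_t defaults to 0.25, so the steps shrink steadily:

eta0, power_t = 0.1, 0.25
for t in (1, 10, 100, 1000):
    print(t, eta0 / t ** power_t)   # 0.1, ~0.056, ~0.032, ~0.018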

Q2: What is the difference between eta0 and learning_rate?
A: eta0 sets the initial value of the learning rate, while learning_rate determines how (or whether) that value changes over time.

Q3: Can a constant learning rate that is too large cause problems?
A: Yes. If eta0 is too high, the model may overshoot the minimum and fail to converge. If too small, training becomes slow.
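
You can observe both failure modes by fitting the same model with several constant step sizes (exact numbers vary from run to run; on these unscaled features, the largest value typically fails outright):

import numpy as np
from sklearn.linear_model import SGDRegressor

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
y = np.array([3, 6, 9, 12, 15])

for eta0 in (0.0001, 0.01, 0.1):
    model = SGDRegressor(learning_rate='constant', eta0=eta0, max_iter=1000)
    try:
        model.fit(X, y)
        mse = np.mean((model.predict(X) - y) ** 2)
        print(f"eta0={eta0}: training MSE = {mse:.3g}")
    except ValueError as exc:   # scikit-learn raises if the weights overflow
        print(f"eta0={eta0}: diverged ({exc})")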


Summary:

In SGDRegressor, learning_rate='constant' keeps the learning rate steady throughout training at the value set by eta0. It provides predictability and simplicity when optimizing models with gradient descent.

Correct Option from the Question:

It defines the initial value for the learning rate and keeps it constant throughout training.


