Wednesday, April 29, 2015

Linear Regression : Learning Rate

How should the learning rate $\alpha$ be chosen?

A correct value of $\alpha$ makes the gradient descent algorithm converge, so that the cost J settles at a minimum. If $\alpha$ is too large, the algorithm may overshoot the minimum and fail to converge (J may even increase); if $\alpha$ is too small, convergence will be very slow. It can be shown mathematically that for a sufficiently small value of $\alpha$, the algorithm always converges, though it may take many iterations.
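As a minimal sketch (not from the original post), here is batch gradient descent for linear regression in NumPy, where $\alpha$ controls the step size and the returned cost history lets you inspect whether J is decreasing:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for linear regression.

    Assumes X already includes a bias column of ones.
    Returns the learned parameters and the cost J at every iteration.
    """
    m, n = X.shape
    theta = np.zeros(n)
    J_history = []
    for _ in range(num_iters):
        predictions = X @ theta
        gradient = (X.T @ (predictions - y)) / m
        theta -= alpha * gradient                     # step size set by alpha
        cost = np.sum((X @ theta - y) ** 2) / (2 * m)  # squared-error cost J
        J_history.append(cost)
    return theta, J_history
```

With a well-chosen $\alpha$, the values in `J_history` decrease steadily; with an overly large $\alpha$ they oscillate or blow up.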

A practical way is to check the value of the cost function J every few iterations and verify that it is decreasing. For example, compare the values of J after 100, 200, and 300 iterations; it should decrease at each check if the algorithm is converging. A common rule of thumb is to declare convergence once J decreases by less than $10^{-3}$ between successive checks.
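A hedged sketch of that stopping rule, using the cost history produced by the `gradient_descent` sketch above (the helper name is my own, not from the post):

```python
def iterations_to_converge(J_history, tol=1e-3, check_every=100):
    """Return the first checkpoint (in iterations) at which J has
    decreased by less than tol since the previous checkpoint."""
    for i in range(check_every, len(J_history), check_every):
        if J_history[i - check_every] - J_history[i] < tol:
            return i
    return None  # stopping rule never satisfied within this run
```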

To choose $\alpha$, try values spaced by factors of 10, such as 0.001, 0.01, 0.1, 1, 10, 100, ...; or use a finer grid spaced roughly by factors of 3, such as 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, ..., and keep the largest value for which J still decreases steadily.
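A short sketch of such a sweep, assuming `X` (with a bias column) and `y` are already defined and reusing the `gradient_descent` sketch from above:

```python
candidate_alphas = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30]
results = {}
for alpha in candidate_alphas:
    _, J_history = gradient_descent(X, y, alpha=alpha, num_iters=300)
    # Keep only rates for which J stayed finite and actually decreased
    # (a rising or infinite cost is a sign the learning rate is too large).
    if np.isfinite(J_history[-1]) and J_history[-1] < J_history[0]:
        results[alpha] = J_history[-1]

best_alpha = min(results, key=results.get)  # lowest final cost among stable rates
```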

