Friday, May 15, 2015

Implementation Summary: Neural Networks

Training a Neural Network:

Pick a network architecture (connectivity pattern between neurons)

No. of input units: Dimension of features $x^{(i)}$
No. of output units: Number of classes

Reasonable default: 1 hidden layer; if you use more than 1 hidden layer, use the same number of hidden units in every layer (usually, the more hidden units the better)

Training:

  1. Randomly initialize weights to small random values (to break symmetry between hidden units)
  2. Implement forward propagation to get $h_\Theta(x^{(i)})$ for any $x^{(i)}$
  3. Implement code to calculate the cost function $J(\Theta)$
  4. Implement backpropagation to compute the partial derivatives $\frac{\partial}{\partial\Theta_{jk}^{(l)}}J(\Theta)$:
Set $\Delta^{(l)} := 0$ for all $l$
for i = 1 to m
     Perform forward propagation and backpropagation using example $(x^{(i)}, y^{(i)})$
     (Get activations $a^{(l)}$ and delta terms $\delta^{(l)}$ for $l = 2, 3, 4, \ldots, L$)
     Calculate $\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)}(a^{(l)})^T$
     ...
end;

Compute $\frac{\partial}{\partial\Theta_{jk}^{(l)}}J(\Theta) = \frac{1}{m}\Delta_{jk}^{(l)}$ (plus a regularization term for $k \ge 1$); a NumPy sketch of steps 1-4 appears after this list

  5. Use gradient checking to compare $\frac{\partial}{\partial\Theta_{jk}^{(l)}}J(\Theta)$ computed via backpropagation against a numerical estimate of the gradient of $J(\Theta)$, then disable the checking code once they agree (see the gradient-checking sketch below)
  6. Use gradient descent or an advanced optimization method with backpropagation to try to minimize $J(\Theta)$ as a function of the parameters $\Theta$ (see the training sketch below)
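
The backprop loop above maps naturally onto vectorized NumPy code. Here is a minimal sketch of steps 1-4 for a network with one hidden layer: the per-example loop is replaced by matrix operations over the whole training set, and the names (init_weights, cost_and_gradients) are illustrative, not from any particular library.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_weights(n_in, n_out, epsilon=0.12):
    # Step 1: small random values in [-epsilon, epsilon] break symmetry;
    # the +1 column holds the bias weights
    return np.random.uniform(-epsilon, epsilon, (n_out, n_in + 1))

def cost_and_gradients(Theta1, Theta2, X, Y, lam=0.0):
    # X is (m, n) features, Y is (m, K) one-hot labels
    m = X.shape[0]

    # Step 2: forward propagation
    A1 = np.hstack([np.ones((m, 1)), X])            # input activations + bias
    Z2 = A1 @ Theta1.T
    A2 = np.hstack([np.ones((m, 1)), sigmoid(Z2)])  # hidden activations + bias
    H = sigmoid(A2 @ Theta2.T)                      # h_Theta(x), shape (m, K)

    # Step 3: regularized cross-entropy cost J(Theta); bias columns not penalized
    J = -np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H)) / m
    J += lam / (2 * m) * (np.sum(Theta1[:, 1:]**2) + np.sum(Theta2[:, 1:]**2))

    # Step 4: backpropagate delta terms, accumulate Delta, scale by 1/m
    D3 = H - Y                                                    # delta^(3)
    D2 = (D3 @ Theta2[:, 1:]) * sigmoid(Z2) * (1 - sigmoid(Z2))   # delta^(2)
    Theta2_grad = D3.T @ A2 / m                                   # Delta^(2)/m
    Theta1_grad = D2.T @ A1 / m                                   # Delta^(1)/m
    Theta1_grad[:, 1:] += lam / m * Theta1[:, 1:]                 # regularization
    Theta2_grad[:, 1:] += lam / m * Theta2[:, 1:]
    return J, Theta1_grad, Theta2_grad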
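
For step 5, a two-sided finite difference gives the numerical estimate; $\epsilon \approx 10^{-4}$ is the usual choice. This reuses the helpers from the sketch above, and numerical_gradient is again an illustrative name.

def numerical_gradient(cost_fn, theta, eps=1e-4):
    # Estimate each partial derivative as (J(theta+eps) - J(theta-eps)) / (2*eps)
    grad = np.zeros_like(theta)
    flat, g = theta.ravel(), grad.ravel()   # views, so writes hit theta / grad
    for k in range(flat.size):
        old = flat[k]
        flat[k] = old + eps
        J_plus = cost_fn()
        flat[k] = old - eps
        J_minus = cost_fn()
        flat[k] = old                       # restore the original value
        g[k] = (J_plus - J_minus) / (2 * eps)
    return grad

# Compare the backprop gradient with the numerical estimate on a tiny network
np.random.seed(0)
X = np.random.randn(5, 3)                             # 5 examples, 3 features
Y = np.eye(4)[np.random.randint(4, size=5)]           # 4 classes, one-hot
Theta1, Theta2 = init_weights(3, 6), init_weights(6, 4)
_, g1, _ = cost_and_gradients(Theta1, Theta2, X, Y, lam=1.0)
num_g1 = numerical_gradient(
    lambda: cost_and_gradients(Theta1, Theta2, X, Y, lam=1.0)[0], Theta1)
print(np.max(np.abs(g1 - num_g1)))                    # should be ~1e-9 or smaller

The check is expensive (two full cost evaluations per parameter), which is why it should be disabled before actual training.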
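
Step 6, sketched as plain batch gradient descent on the same helpers; the learning rate and iteration count here are arbitrary placeholders. An advanced optimizer (e.g. scipy.optimize.minimize with method="L-BFGS-B" on flattened parameters) typically converges faster and needs no learning rate.

def train(X, Y, n_hidden=25, alpha=1.0, n_iters=500, lam=1.0):
    # Step 1: random initialization, then step 6: gradient descent on J(Theta)
    Theta1 = init_weights(X.shape[1], n_hidden)
    Theta2 = init_weights(n_hidden, Y.shape[1])
    for _ in range(n_iters):
        J, g1, g2 = cost_and_gradients(Theta1, Theta2, X, Y, lam)
        Theta1 -= alpha * g1                # step downhill; J can be logged
        Theta2 -= alpha * g2                # to monitor convergence
    return Theta1, Theta2

Theta1, Theta2 = train(X, Y)                # X, Y from the check above

Since $J(\Theta)$ is non-convex, this converges to a local (not necessarily global) minimum, which usually works well in practice.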

