Showing posts from August, 2012

Backpropagating dE/dy by Geoffrey Hinton

1. Convert the discrepancy between each output and its target value into an error derivative.

E = 1/2 * Sigma(j in output) (Tj - Yj)^2

dE/dYj = -(Tj - Yj)

2. Compute the error derivatives in each hidden layer from the error derivatives in the layer above.

dE/dZj = (dYj/dZj) * (dE/dYj)

, where Yi is the output of hidden unit i
, where Zj = Sigma(i) Wij * Yi is the total input to unit j, i.e. the weighted sum of the outputs of the hidden units i
, where Yj is the output of unit j

For a nonlinear logistic unit, y = 1 / (1 + e^-z) and dy/dz = y (1 - y), so

dE/dZj = Yj (1 - Yj) * (dE/dYj)

dE/dYi = Sigma(j) (dZj/dYi) * (dE/dZj)

dE/dYi = Sigma(j) Wij * (dE/dZj)

, where dE/dZj has already been computed for the layer above.

Thus,

dE/dWij = (dZj/dWij) * (dE/dZj)

dE/dWij = Yi * (dE/dZj)

Proof:

y = 1 / (1 + e^-z) = (1 + e^-z)^-1

thus,

dy/dz = -1 * (-e^-z) / (1 + e^-z)^2

dy/dz = (1 / (1 + e^-z)) * (e^-z / (1 + e^-z)) = y (1 - y)

because,

(e^-z) / (1 + e^-z) = ((1 + e^-z) - 1) / (1 + e^-z) = (1 + e^-
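
The derivatives above can be checked numerically. The following is only a minimal sketch, assuming a single layer of logistic output units j fed by hidden outputs Yi, with no bias term. The function names (`sigmoid`, `forward`, `backprop`, `numeric_dE_dw`) and the list-of-lists weight layout `weights[i][j]` are illustrative choices, not part of the original post.

```python
import math


def sigmoid(z):
    """Logistic unit: y = 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(y_hidden, weights):
    """Compute Yj for every output unit j.

    weights[i][j] is Wij, the weight from hidden unit i to output unit j.
    Assumption: Zj = Sigma(i) Wij * Yi, with no bias term.
    """
    n_out = len(weights[0])
    z = [sum(y_hidden[i] * weights[i][j] for i in range(len(y_hidden)))
         for j in range(n_out)]
    return [sigmoid(zj) for zj in z]


def error(y_out, targets):
    """E = 1/2 * Sigma(j) (Tj - Yj)^2."""
    return 0.5 * sum((t - y) ** 2 for t, y in zip(targets, y_out))


def backprop(y_hidden, weights, targets):
    """Return (dE/dYi, dE/dWij) using the chain rule steps from the post."""
    y_out = forward(y_hidden, weights)
    # Step 1: dE/dYj = -(Tj - Yj)
    dE_dy = [-(t - y) for t, y in zip(targets, y_out)]
    # Step 2: dE/dZj = Yj (1 - Yj) * dE/dYj
    dE_dz = [y * (1.0 - y) * g for y, g in zip(y_out, dE_dy)]
    # dE/dYi = Sigma(j) Wij * dE/dZj
    dE_dyi = [sum(weights[i][j] * dE_dz[j] for j in range(len(dE_dz)))
              for i in range(len(y_hidden))]
    # dE/dWij = Yi * dE/dZj
    dE_dw = [[y_hidden[i] * dE_dz[j] for j in range(len(dE_dz))]
             for i in range(len(y_hidden))]
    return dE_dyi, dE_dw


def numeric_dE_dw(y_hidden, weights, targets, i, j, eps=1e-6):
    """Central-difference estimate of dE/dWij, used only as a sanity check."""
    w_plus = [row[:] for row in weights]
    w_minus = [row[:] for row in weights]
    w_plus[i][j] += eps
    w_minus[i][j] -= eps
    e_plus = error(forward(y_hidden, w_plus), targets)
    e_minus = error(forward(y_hidden, w_minus), targets)
    return (e_plus - e_minus) / (2.0 * eps)
```

Comparing `backprop(...)[1][i][j]` with `numeric_dE_dw(..., i, j)` for a few arbitrary inputs is a quick way to confirm that dE/dWij = Yi * (dE/dZj) matches the slope of E measured directly.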