- Backpropagation is a supervised learning algorithm for training neural networks.
- Every node in a neural network represents a neuron, so a neural network can be described as a circuit of neurons.
- A neural network consists of an input layer, one or more hidden layers, and an output layer, as the diagram shows.
What is the Role of Backpropagation?
- To create a neural network, I first have to initialize some weights.
- I do not know in advance how good the weight values I selected are.
- To check whether they are good or bad, I have to calculate the error of the model.
- Suppose the model's error turns out to be large.
- That means the predicted output is very different from the actual output, so I will try to minimize the error.
- How do we minimize the error?
- What we really want is for the model to learn to change its weights automatically so that the error becomes as small as possible.
- As shown in the diagram above, we first calculate the error of the model; if the error is small enough, the model is ready for prediction.
- If the error is not yet small enough, we update the parameters (weights) and calculate the error again.
- This process repeats until the error of the model is minimized.
- Several optimizers are available, but here we use the gradient descent optimizer.
- Gradient descent is an optimizer that iteratively searches for a (local) minimum of a function.
- In our case, we update the weights using gradient descent to minimize the error function.
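The update rule behind gradient descent can be sketched on a one-variable problem. This is only an illustration: the function name `gradient_descent`, the error function E(w) = (w - 3)^2 and the starting point are assumptions chosen for the example, not part of the model trained later.

```python
def gradient_descent(grad, start, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(steps):
        w = w - lr * grad(w)  # move a small step downhill
    return w

# Hypothetical error function E(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(w_min)  # approaches 3, the value of w that minimizes E
```

In a neural network the same step is applied to every weight at once, with the gradient supplied by backpropagation.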
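Note: Backpropagation computes the gradient of the loss with respect to each weight; gradient descent then uses those gradients to move toward a loss minimum (ideally the global minimum, although this is not guaranteed).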
How does the backpropagation algorithm work?
Suppose we have a neural network with an input layer, a hidden layer and an output layer.
step1: Initialize the model with random weights.
step2: Forward propagation (the normal neural network calculation).
step3: Calculate the total error.
step4: Backward propagation: compute the gradients and update the parameters (weights and biases) with gradient descent.
step5: Repeat steps 2 to 4 until the error is minimized (the predicted output is approximately equal to the actual output).
The formulas that we are using here
1. To calculate value of h1
2. To calculate the output of h1
3. To calculate error of output of h1
4. To calculate total error of the model
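The four formulas listed above can be written out as follows. This is a sketch under assumptions: it supposes two inputs i_1 and i_2 feeding h1 through weights w_1 and w_2 with bias b_1, a sigmoid activation, and a squared-error loss measured at each output neuron o_k. These symbols are illustrative and may differ from the notation in the original diagram.

```latex
\begin{align}
net_{h1} &= w_1 i_1 + w_2 i_2 + b_1 \\
out_{h1} &= \frac{1}{1 + e^{-net_{h1}}} \\
E_{o1} &= \tfrac{1}{2}\,(target_{o1} - out_{o1})^2 \\
E_{total} &= \sum_{k} E_{ok}
\end{align}
```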
Now we will propagate backward
- Here we write out the process and formulas for updating the weight w5.
- For that, we need to know how much the total error changes with respect to w5.
1. Calculate the change in total error with respect to output 1.
2. Calculate the change in output 1 with respect to net output 1.
3. Calculate the change in net output 1 with respect to weight w5.
4. Calculate the updated weight.
Similarly, we can calculate the other weight values as well (the model does all of this behind the scenes).
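The four steps for w5 amount to one application of the chain rule followed by a gradient descent update. The sketch below assumes that w5 connects hidden neuron h1 to output neuron o1, that net_{o1} is the weighted sum entering o1, that the activation is a sigmoid, that the loss is the squared error, and that \eta denotes the learning rate. All of these symbols are assumptions made for illustration.

```latex
\begin{align}
\frac{\partial E_{total}}{\partial w_5}
  &= \frac{\partial E_{total}}{\partial out_{o1}}
     \cdot \frac{\partial out_{o1}}{\partial net_{o1}}
     \cdot \frac{\partial net_{o1}}{\partial w_5} \\
\frac{\partial E_{total}}{\partial out_{o1}} &= -(target_{o1} - out_{o1}) \\
\frac{\partial out_{o1}}{\partial net_{o1}} &= out_{o1}\,(1 - out_{o1}) \\
\frac{\partial net_{o1}}{\partial w_5} &= out_{h1} \\
w_5^{new} &= w_5 - \eta \, \frac{\partial E_{total}}{\partial w_5}
\end{align}
```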
How is back-propagation implemented?
Initializing variable values
import numpy as np
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array((, , ), dtype=float)
X = X/np.amax(X,axis=0) #maximum along the first axis
#Defining Sigmoid Function for output
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
#Derivative of Sigmoid Function (x is expected to be a sigmoid output)
def derivatives_sigmoid(x):
    return x * (1 - x)
epoch=7000 #Setting training iterations
lr=0.1 #Setting learning rate
inputlayer_neurons = 2 #number of input layer neurons
hiddenlayer_neurons = 3 #number of hidden layer neurons
output_neurons = 1 #number of neurons at output layer
- In this code, we have defined the sigmoid function and its derivative.
- We train the neural network on the same data many times, so we need to set the number of epochs (training iterations).
- Below that, we have defined the number of neurons in each layer.
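One detail is easy to miss: `derivatives_sigmoid` expects a value that has already passed through `sigmoid`, not the raw input. The check below is a small sketch that compares it against a numerical derivative. The test points in `z` and the step `eps` are arbitrary choices made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def derivatives_sigmoid(s):
    # s must already be a sigmoid output, not the raw input
    return s * (1 - s)

z = np.array([-2.0, 0.0, 1.5])   # arbitrary raw inputs
eps = 1e-6
# central-difference approximation of d(sigmoid)/dz
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
analytic = derivatives_sigmoid(sigmoid(z))
matches = np.allclose(numeric, analytic)
print(matches)
```

This is why the training loop passes `output` and `hlayer_act` (both sigmoid outputs) to `derivatives_sigmoid`.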
#Defining weight and biases for hidden and output layer
- Here we define random weights and biases.
- We first define the weights and bias for the hidden layer (here we have only one hidden layer).
- After that, we define the weights and bias for the output layer.
- Keep in mind the size of a weight matrix: (number of neurons in the previous layer, number of neurons in the layer the weights belong to).
- The size of a bias: (1, number of neurons in the layer the bias belongs to).
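The shapes described above can be sketched as follows. The names `wh`, `bh`, `wout` and `bout` are the ones used by the training loop. Drawing the values uniformly from [0, 1) and fixing the seed are assumptions made so the example is repeatable.

```python
import numpy as np

inputlayer_neurons, hiddenlayer_neurons, output_neurons = 2, 3, 1

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable
# weights: (neurons in previous layer, neurons in this layer)
wh = rng.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))   # (2, 3)
wout = rng.uniform(size=(hiddenlayer_neurons, output_neurons))     # (3, 1)
# biases: (1, neurons in this layer), broadcast across all samples
bh = rng.uniform(size=(1, hiddenlayer_neurons))                    # (1, 3)
bout = rng.uniform(size=(1, output_neurons))                       # (1, 1)
print(wh.shape, bh.shape, wout.shape, bout.shape)
```

With these shapes, `np.dot(X, wh) + bh` works for any number of input rows, because the single bias row is broadcast across every sample.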
for i in range(epoch):
    #Forward propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)
- Here we are just calculating the output of our model: first for the hidden layer, then for the output layer, which gives the final output.
- np.dot computes the dot product (matrix product) of two matrices.
    #Backpropagation
    EO = y - output #error at the output layer
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T) #error propagated back to the hidden layer
    hiddengrad = derivatives_sigmoid(hlayer_act)
    #how much hidden layer wts contributed to error
    d_hiddenlayer = EH * hiddengrad
    # dotproduct of nextlayererror and currentlayerop
    wout += hlayer_act.T.dot(d_output) * lr
    bout += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr #hidden layer bias update
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)
- In this code, we first calculate the error of the output layer and then the error of the hidden layer.
- Following the formulas, we find out how much the hidden layer contributes to the total error, and how much each weight contributes.
- After that, we update the weights and biases, and repeat until the error is minimized.
- X.T gives the transpose of the matrix X.
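Putting the pieces together, the whole procedure might look like the sketch below. Several things here are assumptions rather than part of the original listing: the `train` wrapper function, the seeded uniform initialization, the recorded `errors` list, and above all the target values in `y`, which are hypothetical numbers in (0, 1) chosen only so the script runs.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def derivatives_sigmoid(s):
    return s * (1 - s)  # s is a sigmoid output

def train(X, y, hidden=3, epoch=7000, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    wh = rng.uniform(size=(X.shape[1], hidden))
    bh = rng.uniform(size=(1, hidden))
    wout = rng.uniform(size=(hidden, y.shape[1]))
    bout = rng.uniform(size=(1, y.shape[1]))
    errors = []
    for _ in range(epoch):
        # forward propagation
        hlayer_act = sigmoid(X.dot(wh) + bh)
        output = sigmoid(hlayer_act.dot(wout) + bout)
        # backpropagation
        EO = y - output
        errors.append(float(np.mean(EO ** 2)))
        d_output = EO * derivatives_sigmoid(output)
        d_hiddenlayer = d_output.dot(wout.T) * derivatives_sigmoid(hlayer_act)
        # gradient descent updates
        wout += hlayer_act.T.dot(d_output) * lr
        bout += np.sum(d_output, axis=0, keepdims=True) * lr
        wh += X.T.dot(d_hiddenlayer) * lr
        bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr
    return output, errors

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
X = X / np.amax(X, axis=0)  # scale each column by its maximum
y = np.array(([0.92], [0.86], [0.89]), dtype=float)  # hypothetical targets
output, errors = train(X, y)
print("first error:", errors[0], "last error:", errors[-1])
```

Recording the mean squared error at every epoch makes it easy to confirm that training is actually reducing the error rather than just running for a fixed number of iterations.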