Question:
Consider a neural network that has hidden layers $h_1 \ldots h_t$, inputs $x_1 \ldots x_t$ into each layer, and outputs $o$ from the final layer $h_t$. The recurrence equation for the $p$th layer is as follows:
$$o = U h_t, \qquad h_p = \tanh(W h_{p-1} + V x_p) \quad \forall p \in \{1 \ldots t\}$$
The vector output $o$ has dimensionality $k$, each $h_p$ has dimensionality $m$, and each $x_p$ has dimensionality $d$. The "tanh" function is applied in element-wise fashion. The notations $U$, $V$, and $W$ are matrices of sizes $k \times m$, $m \times d$, and $m \times m$, respectively.
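The forward recurrence stated above can be sketched in plain Python to make the shapes concrete. The sizes `k`, `m`, `d`, `t` and the helper names `matvec`, `forward`, and `random_matrix` are illustrative assumptions, not part of the exercise, and the weights are random placeholders.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def forward(U, V, W, xs):
    """Run h_p = tanh(W h_{p-1} + V x_p) for p = 1..t, then o = U h_t."""
    m = len(W)
    h = [0.0] * m              # h_0 is the zero vector
    hs = []
    for x in xs:               # xs holds x_1, ..., x_t in order
        pre = [a + b for a, b in zip(matvec(W, h), matvec(V, x))]
        h = [math.tanh(z) for z in pre]   # element-wise tanh
        hs.append(h)
    return hs, matvec(U, h)    # hidden states h_1..h_t and output o

def random_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Illustrative sizes only (assumed, not given in the exercise).
k, m, d, t = 2, 3, 4, 5
rng = random.Random(0)
U = random_matrix(k, m, rng)   # k x m
V = random_matrix(m, d, rng)   # m x d
W = random_matrix(m, m, rng)   # m x m
xs = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(t)]

hs, o = forward(U, V, W, xs)
print(len(hs), len(hs[0]), len(o))   # prints: 5 3 2
```

The printed triple is the number of hidden states, the length of each hidden state, and the length of the output, that is, t, m, and k for the sizes chosen here.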
The vector $h_0$ is set to the zero vector. Start by drawing a (vectored) computational graph for this system. Show that node-to-node backpropagation uses the following recurrence:
$$\frac{\partial o}{\partial h_t} = U^T$$
$$\frac{\partial o}{\partial h_{p-1}} = W^T \Delta_{p-1} \frac{\partial o}{\partial h_p} \quad \forall p \in \{2 \ldots t\}$$
Here, $\Delta_p$ is a diagonal matrix in which the diagonal entries contain the components of the vector $1 - h_p \odot h_p$. What you have just derived contains the node-to-node backpropagation equations of a recurrent neural network. What is the size of each matrix $\frac{\partial o}{\partial h_p}$?
Step by Step Answer:
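One way to approach the size question is a dimension count over the stated recurrence. The sketch below assumes the transposed convention implied by $\frac{\partial o}{\partial h_t} = U^T$, and writes $\Delta$ generically for the $m \times m$ diagonal factor that appears in each step.

```latex
\begin{align*}
\frac{\partial o}{\partial h_t} &= U^T \in \mathbb{R}^{m \times k}
  && \text{since } U \in \mathbb{R}^{k \times m}, \\
\frac{\partial o}{\partial h_{p-1}} &=
  \underbrace{W^T}_{m \times m}\,
  \underbrace{\Delta}_{m \times m}\,
  \underbrace{\frac{\partial o}{\partial h_p}}_{m \times k}
  \in \mathbb{R}^{m \times k}.
\end{align*}
```

By induction from $p = t$ downward, each $\frac{\partial o}{\partial h_p}$ is therefore of size $m \times k$: one row per component of $h_p$ and one column per component of $o$.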