Question:

Consider a simple CNN consisting of two hidden layers, each of which is composed of convolution and ReLU. These two hidden layers are then followed by a max-pooling layer and a softmax output layer.

Assume each convolution uses K kernels of 5 × 5 with a stride of 1 in each direction (no zero padding). All these kernels are represented as a multidimensional array, denoted as W(f1, f2, p, k, l), where 1 ≤ f1, f2 ≤ 5, 1 ≤ k ≤ K, l indicates the layer number l ∈ {1, 2}, and p indexes the feature maps feeding layer l. The max-pooling layer uses 4 × 4 patches with a stride of 4 in each direction. Derive the back-propagation procedure to compute the gradients for all kernels W(f1, f2, p, k, l) in this network when CE loss is used.


Step by Step Answer:
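A sketch of the derivation follows. The notation below is assumed for the write-up (the question fixes only the kernel array W): x(i, j, p) is the input, z_l and a_l = ReLU(z_l) are the pre- and post-activation maps of hidden layer l, ŷ is the softmax output, and y is the one-hot target. The softmax layer is taken to read the flattened pooled maps through its own weight matrix V; the question does not specify this, so it is an assumption here.

Step 1 (softmax + CE). With E = −Σ_c y_c log ŷ_c and ŷ = softmax(V a_pool), the error at the softmax input simplifies to δ_out = ŷ − y, and the gradient reaching the pooled maps is δ_pool = Vᵀ δ_out, reshaped to the pooled dimensions.

Step 2 (max-pooling, 4 × 4, stride 4). Pooling has no parameters. Each patch contributed only its maximum, so the incoming gradient is routed entirely to the argmax position of each 4 × 4 patch and is zero elsewhere: δ₂(i, j, k) = δ_pool(u, v, k) if (i, j) is the argmax of patch (u, v) of map k, and 0 otherwise.

Step 3 (ReLU of layer 2). Apply the ReLU mask: δ₂(i, j, k) ← δ₂(i, j, k) · 1[z₂(i, j, k) > 0].

Step 4 (kernel gradients, layer 2). Since z₂(i, j, k) = Σ_p Σ_{f1, f2} W(f1, f2, p, k, 2) a₁(i + f1 − 1, j + f2 − 1, p), the chain rule gives

∂E/∂W(f1, f2, p, k, 2) = Σ_{i, j} δ₂(i, j, k) a₁(i + f1 − 1, j + f2 − 1, p),

i.e. a cross-correlation of the layer-1 activations with the layer-2 delta maps.

Step 5 (propagate to layer 1). δ₁(i, j, p) = Σ_k Σ_{f1, f2} W(f1, f2, p, k, 2) δ₂(i − f1 + 1, j − f2 + 1, k), with out-of-range δ₂ entries taken as zero (a "full" convolution with spatially flipped kernels), followed by the layer-1 ReLU mask δ₁ ← δ₁ · 1[z₁ > 0].

Step 6 (kernel gradients, layer 1). Exactly as in Step 4, with the input in place of a₁: ∂E/∂W(f1, f2, p, k, 1) = Σ_{i, j} δ₁(i, j, k) x(i + f1 − 1, j + f2 − 1, p), where p now indexes the input channels.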

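To make the pooling routing and the index arithmetic concrete, here is a small self-contained NumPy sketch of the forward and backward passes. Every specific choice in it (the 16 × 16 single-channel input, K = 3, 4 classes, and the softmax weight matrix V) is an illustrative assumption rather than part of the question; it is one consistent reading of the architecture, not the definitive solution.

```python
# Minimal sketch of the derivation above. All sizes, and the softmax
# weights V, are illustrative assumptions, not fixed by the question.
import numpy as np

rng = np.random.default_rng(0)

K, C = 3, 4                                      # K kernels per layer, C classes (assumed)
x  = rng.standard_normal((16, 16, 1))            # assumed 16x16 single-channel input
W1 = 0.1 * rng.standard_normal((5, 5, 1, K))     # W(f1, f2, p, k, 1)
W2 = 0.1 * rng.standard_normal((5, 5, K, K))     # W(f1, f2, p, k, 2)

def conv(a, Wl):
    """'Valid' convolution (cross-correlation), stride 1, no padding."""
    h, w, _ = a.shape
    f = Wl.shape[0]
    z = np.zeros((h - f + 1, w - f + 1, Wl.shape[3]))
    for i in range(z.shape[0]):
        for j in range(z.shape[1]):
            z[i, j, :] = np.tensordot(a[i:i + f, j:j + f, :], Wl,
                                      axes=([0, 1, 2], [0, 1, 2]))
    return z

# ----- forward -----
z1 = conv(x, W1);  a1 = np.maximum(z1, 0.0)          # 12x12xK
z2 = conv(a1, W2); a2 = np.maximum(z2, 0.0)          # 8x8xK
pooled = a2.reshape(2, 4, 2, 4, K).max(axis=(1, 3))  # 4x4 max-pool, stride 4 -> 2x2xK
V = 0.1 * rng.standard_normal((C, pooled.size))      # assumed output-layer weights
logits = V @ pooled.ravel()
yhat = np.exp(logits - logits.max()); yhat /= yhat.sum()
y = np.eye(C)[1]                                     # arbitrary one-hot target

# ----- backward -----
d_logits = yhat - y                                  # Step 1: softmax + CE
d_pool = (V.T @ d_logits).reshape(pooled.shape)

d_a2 = np.zeros_like(a2)                             # Step 2: route to patch argmax
for u in range(2):
    for v in range(2):
        for k in range(K):
            patch = a2[4*u:4*u + 4, 4*v:4*v + 4, k]
            i, j = np.unravel_index(patch.argmax(), patch.shape)
            d_a2[4*u + i, 4*v + j, k] = d_pool[u, v, k]

d_z2 = d_a2 * (z2 > 0)                               # Step 3: ReLU mask

dW2 = np.zeros_like(W2)                              # Step 4: dE/dW(f1, f2, p, k, 2)
h2, w2, _ = d_z2.shape
for f1 in range(5):
    for f2 in range(5):
        dW2[f1, f2] = np.tensordot(a1[f1:f1 + h2, f2:f2 + w2, :], d_z2,
                                   axes=([0, 1], [0, 1]))

d_a1 = np.zeros_like(a1)                             # Step 5: full conv, flipped kernels
for i in range(h2):
    for j in range(w2):
        d_a1[i:i + 5, j:j + 5, :] += np.tensordot(W2, d_z2[i, j, :],
                                                  axes=([3], [0]))
d_z1 = d_a1 * (z1 > 0)

dW1 = np.zeros_like(W1)                              # Step 6: dE/dW(f1, f2, p, k, 1)
h1, w1, _ = d_z1.shape
for f1 in range(5):
    for f2 in range(5):
        dW1[f1, f2] = np.tensordot(x[f1:f1 + h1, f2:f2 + w1, :], d_z1,
                                   axes=([0, 1], [0, 1]))
```

The arrays dW1 and dW2 computed at the end are exactly the sums from Steps 4 and 6; comparing any single entry against a finite-difference estimate of the loss is a quick way to verify the index arithmetic.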