Question:
Consider a simple CNN consisting of two hidden layers, each of which is composed of convolution and ReLU. These two hidden layers are then followed by a max-pooling layer and a softmax output layer.
Assume each convolution uses $K$ kernels of size $5 \times 5$ with a stride of 1 in each direction (no zero padding). All these kernels are represented as a multidimensional array, denoted as $W(f_1, f_2, p, k, l)$, where $1 \le f_1, f_2 \le 5$, $1 \le k \le K$, $l$ indicates the layer number ($l \in \{1, 2\}$), and $p$ indexes the feature maps feeding each layer. The max-pooling layer uses $4 \times 4$ patches with a stride of 4 in each direction. Derive the back-propagation procedure to compute the gradients for all kernels $W(f_1, f_2, p, k, l)$ in this network when the cross-entropy (CE) loss is used.
Step by Step Answer:
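A sketch of the derivation, assuming the pooled feature maps are flattened and fed directly to the softmax (the question asks only for the kernel gradients, so no additional output weights are assumed; if a fully-connected layer sits between the pooling and the softmax, its Jacobian simply multiplies the gradient in step 2). Write $a^{(0)} = x$ for the input feature maps and $z^{(l)}, a^{(l)}$ for the pre- and post-ReLU activations of hidden layer $l$.

Step 1 (forward pass). Each hidden layer computes

$$z_k^{(l)}(i, j) = \sum_{p} \sum_{f_1=1}^{5} \sum_{f_2=1}^{5} W(f_1, f_2, p, k, l)\, a_p^{(l-1)}(i+f_1-1,\, j+f_2-1), \qquad a_k^{(l)}(i, j) = \max\!\big(0,\, z_k^{(l)}(i, j)\big).$$

The max-pooling layer keeps the maximum of each non-overlapping $4 \times 4$ patch $P_{uv}$,

$$m_k(u, v) = \max_{(i, j) \in P_{uv}} a_k^{(2)}(i, j),$$

and the flattened $m$ is passed through a softmax to give the outputs $\hat{y}$.

Step 2 (softmax + CE). With a one-hot target $y$ and $L = -\sum_c y_c \log \hat{y}_c$, the softmax and CE derivatives combine into the familiar form

$$\frac{\partial L}{\partial m_k(u, v)} = \hat{y}_{k,u,v} - y_{k,u,v}.$$

Step 3 (max-pooling). Pooling routes each gradient entirely to the arg-max location of its patch and sends zero elsewhere:

$$\frac{\partial L}{\partial a_k^{(2)}(i, j)} = \begin{cases} \dfrac{\partial L}{\partial m_k(u, v)}, & (i, j) = \operatorname*{arg\,max}_{(i', j') \in P_{uv}} a_k^{(2)}(i', j'), \\[4pt] 0, & \text{otherwise.} \end{cases}$$

Step 4 (ReLU). The ReLU gates the gradient with the indicator of a positive pre-activation:

$$\delta_k^{(l)}(i, j) \triangleq \frac{\partial L}{\partial z_k^{(l)}(i, j)} = \frac{\partial L}{\partial a_k^{(l)}(i, j)}\, \mathbb{1}\big[z_k^{(l)}(i, j) > 0\big].$$

Step 5 (kernel gradients). Differentiating the convolution in step 1 with respect to a kernel weight gives a correlation between the error map and the layer input:

$$\frac{\partial L}{\partial W(f_1, f_2, p, k, l)} = \sum_{i} \sum_{j} \delta_k^{(l)}(i, j)\, a_p^{(l-1)}(i+f_1-1,\, j+f_2-1).$$

Step 6 (propagating to layer 1). Differentiating the same convolution with respect to its input yields a "full" convolution with the kernels:

$$\frac{\partial L}{\partial a_p^{(1)}(i', j')} = \sum_{k} \sum_{f_1=1}^{5} \sum_{f_2=1}^{5} \delta_k^{(2)}(i'-f_1+1,\, j'-f_2+1)\, W(f_1, f_2, p, k, 2),$$

where out-of-range $\delta^{(2)}$ entries are taken as zero. Apply steps 4-5 once more with $l = 1$ and $a^{(0)} = x$ to obtain $\partial L / \partial W(f_1, f_2, p, k, 1)$.

A minimal NumPy sketch of this procedure for a single example is given below. The shapes (a $16 \times 16$ input with 3 feature maps, $K = 2$ kernels per layer) and the helper names `conv_forward`, `maxpool_forward`, `conv_grad_W`, `conv_grad_input`, and `grads` are illustrative assumptions, not from the book.

```python
import numpy as np

def conv_forward(x, W):
    """Valid convolution, stride 1, no zero padding.
    x: (H, Wd, P) feature maps; W: (5, 5, P, K) kernels -> z: (H-4, Wd-4, K)."""
    f1s, f2s, _, K = W.shape
    Ho, Wo = x.shape[0] - f1s + 1, x.shape[1] - f2s + 1
    z = np.zeros((Ho, Wo, K))
    for i in range(Ho):
        for j in range(Wo):
            patch = x[i:i + f1s, j:j + f2s, :]                  # (5, 5, P)
            z[i, j, :] = np.tensordot(patch, W, axes=([0, 1, 2], [0, 1, 2]))
    return z

def maxpool_forward(a, s=4):
    """Non-overlapping s x s max pooling; also records arg-max positions."""
    H, Wd, K = a.shape
    Ho, Wo = H // s, Wd // s
    m = np.zeros((Ho, Wo, K))
    arg = np.zeros((Ho, Wo, K, 2), dtype=int)
    for u in range(Ho):
        for v in range(Wo):
            patch = a[u * s:(u + 1) * s, v * s:(v + 1) * s, :].reshape(-1, K)
            idx = patch.argmax(axis=0)                          # flat index in s*s patch
            m[u, v, :] = patch[idx, np.arange(K)]
            arg[u, v, :, 0] = u * s + idx // s
            arg[u, v, :, 1] = v * s + idx % s
    return m, arg

def conv_grad_W(x, dz):
    """Step 5: dL/dW(f1,f2,p,k) = sum_{i,j} dz(i,j,k) * x(i+f1-1, j+f2-1, p)."""
    Ho, Wo, K = dz.shape
    f1s, f2s = x.shape[0] - Ho + 1, x.shape[1] - Wo + 1
    dW = np.zeros((f1s, f2s, x.shape[2], K))
    for f1 in range(f1s):
        for f2 in range(f2s):
            patch = x[f1:f1 + Ho, f2:f2 + Wo, :]                # (Ho, Wo, P)
            dW[f1, f2] = np.tensordot(patch, dz, axes=([0, 1], [0, 1]))
    return dW

def conv_grad_input(dz, W, xshape):
    """Step 6: scatter each dz(i,j,:) back over the 5x5xP window it came from."""
    dx = np.zeros(xshape)
    Ho, Wo, _ = dz.shape
    f1s, f2s = W.shape[0], W.shape[1]
    for i in range(Ho):
        for j in range(Wo):
            dx[i:i + f1s, j:j + f2s, :] += np.tensordot(W, dz[i, j], axes=([3], [0]))
    return dx

def grads(x, W1, W2, y):
    """Forward + backward pass; returns dL/dW1 and dL/dW2 for one example."""
    z1 = conv_forward(x, W1); a1 = np.maximum(z1, 0.0)          # layer 1: conv + ReLU
    z2 = conv_forward(a1, W2); a2 = np.maximum(z2, 0.0)         # layer 2: conv + ReLU
    m, arg = maxpool_forward(a2)                                # 4x4 max pooling
    o = m.ravel()                                               # pooled units -> softmax
    p = np.exp(o - o.max()); p /= p.sum()
    dm = (p - y).reshape(m.shape)                               # step 2: dL/dm = y_hat - y
    da2 = np.zeros_like(a2)                                     # step 3: unpool to arg-max
    for u in range(m.shape[0]):
        for v in range(m.shape[1]):
            for k in range(m.shape[2]):
                da2[arg[u, v, k, 0], arg[u, v, k, 1], k] += dm[u, v, k]
    dz2 = da2 * (z2 > 0)                                        # step 4: ReLU gate, layer 2
    dW2 = conv_grad_W(a1, dz2)                                  # step 5, l = 2
    da1 = conv_grad_input(dz2, W2, a1.shape)                    # step 6: back to layer 1
    dz1 = da1 * (z1 > 0)                                        # step 4: ReLU gate, layer 1
    dW1 = conv_grad_W(x, dz1)                                   # step 5, l = 1
    return dW1, dW2

rng = np.random.default_rng(0)
K = 2
x = rng.standard_normal((16, 16, 3))                            # 16x16 input, P = 3 maps
W1 = 0.1 * rng.standard_normal((5, 5, 3, K))
W2 = 0.1 * rng.standard_normal((5, 5, K, K))
y = np.zeros(2 * 2 * K); y[0] = 1.0                             # one-hot target, 8 units
dW1, dW2 = grads(x, W1, W2, y)
print(dW1.shape, dW2.shape)                                     # (5, 5, 3, 2) (5, 5, 2, 2)
```

The explicit Python loops mirror the index sums in steps 1, 5, and 6 one-for-one; a practical implementation would vectorize them, but the correspondence to the derivation is easier to check this way.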
Source: Hui Jiang, Machine Learning Fundamentals: A Concise Introduction, 1st Edition. ISBN 9781108940023.