
Question



[Eisenstein Chapter 3 Problem 8] The ReLU activation function can lead to "dead neurons", which can never be activated on any input. Consider a feedforward neural network with a single hidden layer and ReLU nonlinearity, assuming a binary input vector $x \in \{0,1\}^D$ and a scalar output $y$:

$$z_i = \mathrm{ReLU}\left(\theta_i^{(x \to z)} \cdot x + b_i\right)$$
$$y = \theta^{(z \to y)} \cdot z$$

Assume the above function is optimized to minimize a loss function (e.g., mean squared error) using stochastic gradient descent.

1. (2 pts) Under what condition is node $z_i$ "dead"? Your answer should be expressed in terms of the parameters $\theta_i^{(x \to z)}$ and $b_i$.
2. (2 pts) Suppose that the gradient of the loss on a given instance is $\frac{\partial \ell}{\partial y} = 1$. Derive the gradients $\frac{\partial \ell}{\partial b_i}$ and $\frac{\partial \ell}{\partial \theta_{j,i}^{(x \to z)}}$ for such an instance.
3. (2 pts) Using your answers to the previous two parts, explain why a "dead" neuron can never be brought back to life during gradient-based learning.
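The three parts can be sanity-checked numerically. The sketch below is not the textbook's solution. It assumes that a node counts as dead when its pre-activation is non-positive for every binary input, which for $x \in \{0,1\}^D$ reduces to $b_i + \sum_j \max(0, \theta_{j,i}^{(x \to z)}) \le 0$. It also assumes that the ReLU derivative is taken to be 0 at 0, and that $\frac{\partial \ell}{\partial y} = 1$ as in part 2. The function names `is_dead` and `hidden_grads`, the variable `theta_zy_i` (standing for the $i$-th entry of $\theta^{(z \to y)}$), and the parameter values are all hypothetical and chosen only for illustration.

```python
from itertools import product


def is_dead(theta_i, b_i):
    """Assumed dead condition: the largest pre-activation over x in {0,1}^D is <= 0."""
    # The pre-activation is maximized by switching on exactly the inputs
    # whose weights are positive, so only those weights enter the sum.
    return b_i + sum(max(w, 0.0) for w in theta_i) <= 0.0


def hidden_grads(theta_i, b_i, theta_zy_i, x):
    """Chain-rule gradients w.r.t. b_i and theta_{j,i}, assuming dl/dy = 1."""
    pre = sum(w * x_j for w, x_j in zip(theta_i, x)) + b_i
    active = 1.0 if pre > 0.0 else 0.0  # ReLU derivative, taken as 0 at pre == 0
    grad_b = theta_zy_i * active
    grad_theta = [theta_zy_i * active * x_j for x_j in x]
    return grad_b, grad_theta


# Illustrative parameters: no positive weights and a negative bias.
theta_i = [-0.5, -1.0, 0.0]
b_i = -0.1
theta_zy_i = 2.0

# Enumerate every binary input and check that no gradient reaches the node.
all_zero = True
for x in product([0.0, 1.0], repeat=len(theta_i)):
    grad_b, grad_theta = hidden_grads(theta_i, b_i, theta_zy_i, x)
    all_zero = all_zero and grad_b == 0.0 and all(g == 0.0 for g in grad_theta)

print(is_dead(theta_i, b_i), all_zero)
```

Under these assumptions, both gradients carry the same indicator factor as the ReLU itself. A node that is inactive on every input therefore receives a zero update on every instance, so its incoming weights and bias never change. That is the argument part 3 asks for.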
