Question:

a. Derivatives of various activation functions. Show in mathematical detail how the derivatives of the activation functions in Table 10.6, including sigmoid, tanh, and ELU, are derived.

b. Backpropagation algorithm. Consider a multilayer feed-forward neural network as shown in Fig. 10.5 with sigmoid activation and mean-squared loss function $L$. Prove (1) Eq. (10.3) for computing the error in the output unit ($\delta_{10}$); (2) Eq. (10.4) for computing the errors in hidden units $\delta_9$ and $\delta_6$ (hint: consider the chain rule); and (3) Eq. (10.5) for updating weights (hint: consider the derivative of the loss with respect to weights).

$\delta_j = O_j (1 - O_j)(O_j - T_j)$, (10.3)

$\delta_j = O_j (1 - O_j) \sum_k \delta_k w_{jk}$, (10.4)

$w_{ij} = w_{ij} - \eta \, \delta_j O_i$. (10.5)

c. Relation between different activation functions. Given the sigmoid function $\sigma(I) = \frac{1}{1 + e^{-I}}$ and the hyperbolic tangent function $\tanh(I) = \frac{e^{I} - e^{-I}}{e^{I} + e^{-I}}$, show mathematically how $\tanh(I)$ can be obtained from the sigmoid through shifting and re-scaling.
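For part (c), one candidate relation to prove is $\tanh(I) = 2\sigma(2I) - 1$: the input is scaled by 2, the output is stretched by 2, and the result is shifted down by 1. The short Python sketch below only checks that identity numerically at a few hand-picked points; it is a sanity check, not the requested proof, and the helper name `tanh_from_sigmoid` is made up for illustration.

```python
import math

def sigmoid(x):
    # Logistic sigmoid: 1 / (1 + e^{-x})
    return 1.0 / (1.0 + math.exp(-x))

def tanh_from_sigmoid(x):
    # Hypothetical helper: rescale the input by 2, stretch the output
    # by 2, then shift it down by 1.
    return 2.0 * sigmoid(2.0 * x) - 1.0

# Arbitrary sample points, chosen only for illustration.
points = [-3.0, -1.0, 0.0, 0.5, 2.0]

# Largest absolute gap between the rescaled sigmoid and math.tanh.
max_gap = max(abs(tanh_from_sigmoid(x) - math.tanh(x)) for x in points)
print("max |2*sigmoid(2x) - 1 - tanh(x)| =", max_gap)
```

The gap should be at the level of floating-point rounding, which is consistent with the identity holding exactly.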

Table 10.6 Summary of common activation functions. Columns: Name, Definition ($f(I)$), Plot. Rows listed: Sigmoid, Tanh, ReLU, Leaky ReLU.
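For part (a), the closed-form derivatives one would expect to derive are $\sigma'(I) = \sigma(I)(1 - \sigma(I))$ and $\tanh'(I) = 1 - \tanh^2(I)$. The sketch below compares them with central finite differences. The ELU form used here, $f(I) = I$ for $I > 0$ and $\alpha(e^{I} - 1)$ otherwise, is the commonly used definition and is an assumption, since the table entries are not reproduced here; `alpha = 1.0` and the sample points are likewise arbitrary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def d_sigmoid(x):
    # Candidate closed form: sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1.0 - s)

def d_tanh(x):
    # Candidate closed form: 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2

def elu(x, alpha=1.0):
    # Assumed ELU definition (the table entry is not available here).
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def d_elu(x, alpha=1.0):
    # Candidate closed form: 1 for x > 0, alpha * e^x otherwise.
    return 1.0 if x > 0 else alpha * math.exp(x)

def central_diff(f, x, h=1e-6):
    # Numerical derivative used as an independent reference.
    return (f(x + h) - f(x - h)) / (2.0 * h)

def max_abs_error(f, df, points):
    return max(abs(central_diff(f, x) - df(x)) for x in points)

# Sample points avoid 0, where ELU switches branches.
points = [-2.0, -0.5, 0.3, 1.7]
errors = {
    "sigmoid": max_abs_error(sigmoid, d_sigmoid, points),
    "tanh": max_abs_error(math.tanh, d_tanh, points),
    "elu": max_abs_error(elu, d_elu, points),
}
for name, err in errors.items():
    print(name, "max abs error:", err)
```

Small errors here only suggest the closed forms are right; the exercise still asks for the derivation itself, for example via the quotient rule for the sigmoid.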

Figure 10.5 Forward propagation: net input and output of each unit, $O_j = f(I_j)$.
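For part (b), the error terms and the update rule can be sanity-checked numerically before attempting the proof. The sketch below assumes the forms $\delta_j = O_j(1 - O_j)(O_j - T_j)$ for the output unit, $\delta_j = O_j(1 - O_j)\sum_k \delta_k w_{jk}$ for hidden units, and $w_{ij} = w_{ij} - \eta\,\delta_j O_i$ for the update, with loss $L = \frac{1}{2}(O - T)^2$. It does not use the network of Fig. 10.5: the 2-2-1 network, its weights, the input, the target, and the learning rate are all made-up values, and biases are omitted for brevity.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical 2-2-1 network (not the one in Fig. 10.5), no biases.
x = [0.6, -0.4]                        # input unit outputs O_i
target = 1.0                           # T for the single output unit
w_hidden = [[0.1, -0.3], [0.5, 0.2]]   # w_hidden[i][j]: input i -> hidden j
w_out = [0.7, -0.6]                    # w_out[j]: hidden j -> output

def forward(wh, wo):
    hidden = [sigmoid(sum(x[i] * wh[i][j] for i in range(2))) for j in range(2)]
    out = sigmoid(sum(hidden[j] * wo[j] for j in range(2)))
    return hidden, out

def loss(wh, wo):
    # Assumed loss: half squared error on the single output.
    return 0.5 * (forward(wh, wo)[1] - target) ** 2

hidden, out = forward(w_hidden, w_out)

# Output-unit error, then hidden-unit errors via the chain rule.
delta_out = out * (1 - out) * (out - target)
delta_hidden = [hidden[j] * (1 - hidden[j]) * delta_out * w_out[j] for j in range(2)]

# Gradients implied by the update rule: dL/dw_ij = delta_j * O_i.
grad_out = [delta_out * hidden[j] for j in range(2)]
grad_hidden = [[delta_hidden[j] * x[i] for j in range(2)] for i in range(2)]

def numeric_grad_hidden(i, j, h=1e-6):
    # Central difference on one input-to-hidden weight.
    plus = [row[:] for row in w_hidden]
    minus = [row[:] for row in w_hidden]
    plus[i][j] += h
    minus[i][j] -= h
    return (loss(plus, w_out) - loss(minus, w_out)) / (2 * h)

def numeric_grad_out(j, h=1e-6):
    # Central difference on one hidden-to-output weight.
    plus, minus = w_out[:], w_out[:]
    plus[j] += h
    minus[j] -= h
    return (loss(w_hidden, plus) - loss(w_hidden, minus)) / (2 * h)

max_err = max(
    [abs(grad_out[j] - numeric_grad_out(j)) for j in range(2)]
    + [abs(grad_hidden[i][j] - numeric_grad_hidden(i, j))
       for i in range(2) for j in range(2)]
)

# One gradient-descent step with a made-up learning rate eta.
eta = 0.5
new_w_out = [w_out[j] - eta * grad_out[j] for j in range(2)]
new_w_hidden = [[w_hidden[i][j] - eta * grad_hidden[i][j] for j in range(2)]
                for i in range(2)]
loss_before = loss(w_hidden, w_out)
loss_after = loss(new_w_hidden, new_w_out)
print("max gradient error:", max_err)
print("loss before:", loss_before, "loss after:", loss_after)
```

Agreement between the delta-based gradients and the finite differences is what the chain-rule proof should explain: each $\delta_j$ is $\partial L / \partial I_j$, so $\partial L / \partial w_{ij} = \delta_j O_i$.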

Related Book:

Data Mining Concepts And Techniques

ISBN: 9780128117613

4th Edition

Authors: Jiawei Han, Jian Pei, Hanghang Tong
