Question

Posted on Sep 22, 2024

(15pts) We will be doing multiclass classification on the CIFAR-100 dataset. For this we will use the cross-entropy loss as our objective function and softmax as the output layer. In our network, we will have a hidden layer between the input and output that consists of $J$ units with the tanh activation function. You can assume that you have $K$ classes and hence $K$ units in the output layer. So this network has three layers: an input layer, a hidden layer and a softmax output layer. For the purpose of this particular question of the assignment, you will be assuming one hidden layer, but the programming part of the assignment (Problem 3) will require you to use more than one hidden layer.

Notation: We use index $k$ to represent a node in the output layer, index $j$ to represent a node in the hidden layer, and index $i$ to represent a node in the input layer. Additionally, the weight from node $i$ in the input layer to node $j$ in the hidden layer is $w_{ij}$. Similarly, the weight from node $j$ in the hidden layer to node $k$ in the output layer is $w_{jk}$.

(a) (10pts) Derivation. In the following discussion, $n$ denotes the $n$th input pattern. Derive the expression for $\delta$ for both the units of the output layer ($\delta_k^n$) and the hidden layer ($\delta_j^n$). Recall that the definition of $\delta$ is $\delta_i^n = -\frac{\partial E^n}{\partial a_i^n}$, where $a_i^n$ is the weighted sum of the inputs to unit $i$ and $E^n$ is the cross-entropy loss for the $n$th example. Show that if our output activation function is softmax and the hidden layer activation function is tanh then, for the output layer, $\delta_k^n = t_k^n - y_k^n$, and for the hidden layer, $\delta_j^n = \left(1 - \tanh^2(a_j^n)\right) \sum_k (t_k^n - y_k^n)\, w_{jk}$. Use the cross-entropy cost function in your derivation:

$$E^n = -\sum_{i=1}^{K} t_i^n \ln y_i^n$$

There are two "hard parts" to this: 1) taking the derivative of the softmax; and 2) figuring out how to apply the chain rule to get the hidden deltas. Bishop and Chapter 8 of the PDP books (Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1 and Vol. 2) both have good hints on the latter, and Bishop on the former. However, crucial steps have been left out of the Bishop derivation (Chapter 6). Our main hint here is: break it up into two parts (see equation 6.161 in Bishop), when $k = k'$ and when it doesn't. Note that Bishop (Equation 4.31) defines $\delta_j^n$ without a minus sign, which is the opposite of the way that we defined it above, and different from the PDP book chapter 8.
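One way to sanity-check a finished derivation is to compare the two target formulas against finite-difference estimates of $-\partial E^n / \partial a_i^n$. The sketch below is only a numerical check, not the requested derivation. It assumes a tiny network ($J = 4$, $K = 3$) with arbitrary random values, starts from the hidden-layer inputs $a_j$ (so the input layer and $w_{ij}$ are omitted), and uses hypothetical helper names such as `numeric_delta` that are not part of the assignment.

```python
import math
import random


def softmax(a):
    """Softmax with the max subtracted for numerical stability."""
    m = max(a)
    exps = [math.exp(v - m) for v in a]
    s = sum(exps)
    return [e / s for e in exps]


def loss_from_output_inputs(a_out, t):
    """Cross-entropy E = -sum_k t_k ln y_k, with y = softmax(a_out)."""
    y = softmax(a_out)
    return -sum(tk * math.log(yk) for tk, yk in zip(t, y))


def output_inputs(a_hid, W2):
    """a_k = sum_j tanh(a_j) * w_jk, where W2[j][k] holds w_jk."""
    z = [math.tanh(a) for a in a_hid]
    return [sum(z[j] * W2[j][k] for j in range(len(z)))
            for k in range(len(W2[0]))]


def loss_from_hidden_inputs(a_hid, W2, t):
    """Cross-entropy as a function of the hidden-layer inputs a_j."""
    return loss_from_output_inputs(output_inputs(a_hid, W2), t)


def numeric_delta(loss_fn, a, eps=1e-6):
    """Estimate delta_i = -dE/da_i by central differences."""
    deltas = []
    for i in range(len(a)):
        up, down = a[:], a[:]
        up[i] += eps
        down[i] -= eps
        deltas.append(-(loss_fn(up) - loss_fn(down)) / (2 * eps))
    return deltas


# Assumed toy sizes and values, chosen only for illustration.
random.seed(0)
J, K = 4, 3
a_hid = [random.uniform(-1, 1) for _ in range(J)]
W2 = [[random.uniform(-1, 1) for _ in range(K)] for _ in range(J)]
t = [0.0, 1.0, 0.0]  # one-hot target

a_out = output_inputs(a_hid, W2)
y = softmax(a_out)

# Closed-form deltas from the problem statement.
delta_out = [t[k] - y[k] for k in range(K)]
delta_hid = [
    (1 - math.tanh(a_hid[j]) ** 2)
    * sum(delta_out[k] * W2[j][k] for k in range(K))
    for j in range(J)
]

# Finite-difference estimates of the same quantities.
num_out = numeric_delta(lambda a: loss_from_output_inputs(a, t), a_out)
num_hid = numeric_delta(lambda a: loss_from_hidden_inputs(a, W2, t), a_hid)

max_err_out = max(abs(p - q) for p, q in zip(delta_out, num_out))
max_err_hid = max(abs(p - q) for p, q in zip(delta_hid, num_hid))
print("max |analytic - numeric|, output layer:", max_err_out)
print("max |analytic - numeric|, hidden layer:", max_err_hid)
```

If both printed errors are tiny, the closed forms agree with the definition $\delta_i^n = -\partial E^n / \partial a_i^n$, including its minus sign. A sign error in the derivation would show up as an error roughly twice the size of the deltas themselves.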
