Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

You may use MNIST data for the first assignment. You can train and test a classifier on this data. But the core challenge is still

You may use MNIST data for the first assignment. You can train and test a classifier on this data. But the core challenge is still to figure out what it is that the hidden nodes are responding to, and making the task more complex will not change this as the core focus. You need to conduct a minimum of the following 5 experiments for this data, in order to get some useful insights. You NEED to conduct more experiments (eg. variations of Experiment 3) in order to get better insights.
EXPERIMENT 1: Our dense neural network will consist of 784 input nodes, a hidden layer with 1 node and 10 output nodes (corresponding to the 10 digits). We use mnist.load_data() to get the 70,000 images divided into a set of 60,000 training images and 10,000 test images. We hold back 5,000 of the 60,000 training images for validation. After training the model, we group the 60,000 activation values of the hidden node for the (original) set of training images by the 10 predicted classes and visualize these sets of values using a boxplot. We expect the overlap between the range of values in the "boxes" to be minimal. In addition, we find the pattern that maximally activates the hidden node as a "warm up" exercise for similar analysis we will perform on CNN models in Assignment 2.
EXPERIMENT 2: This time our dense neural network will have 784 input nodes, a hidden layer with 2 nodes and 10 output nodes (corresponding to the 10 digits). For each of the 60,000 images, the output of the two hidden nodes are plotted using a scatterplot. We color code the points according to which of the 10 classes the the output of the two nodes predicts. Ideally, just like in EXPERIMENT 1, the color clusters should have very little overlap. Also compare the accuracy % & confusion matrix of Experiments 1 & 2. Again, the goal is to get more insights.
EXPERIMENT 3: You can explore with more hidden nodes. At least 5 more variations of this architecture NEEDS to be tried. Then you end up with 1final model. Say the best model.
EXPERIMENT 4: Use PCA decomposition to reduce the number of dimensions of our training set of 28x28 dimensional MNIST images from 784 to 154(with 95% of training images variance lying along these components). We also reduce the number of dimensions of 'best' model from Experiment 3 to 154 inputs nodes and train it on the new lower dimensional data. We then compare the performance of Experiments 3 and 4.
EXPERIMENT 5: We use a Random Forest classifier to get the relative importance of the 784 features (pixels) of the 28x28 dimensional images in training set of MNIST images and select the top 70 features (pixels). We train our 'best' dense neural network using these 70 features and compare its performance to the the dense neural network models from EXPERIMENTS 3 and 4.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

SQL Server Query Performance Tuning

Authors: Sajal Dam, Grant Fritchey

4th Edition

1430267429, 9781430267423

More Books

Students also viewed these Databases questions

Question

=+3. Who can provide information for evaluation?

Answered: 1 week ago