Question

In this project, we will be developing a basic neural network from the ground up to classify various types of fashion items. The primary objective of this project is to gain a comprehensive understanding of neural network architecture, including its theory and implementation details.
# Initializing
# Notice that you don't need any other packages for this mid-term
import numpy as np
import pandas as pd
import random
from matplotlib import pyplot as plt
random.seed(42) # NEVER change this line
# Reading the dataset
data = pd.read_csv('./fashion_data.csv')
# The data pre-processing is done for you. Please do NOT edit the cell
# However, you should understand what this code is doing
data = np.array(data)
m, n = data.shape
np.random.shuffle(data) # shuffle before splitting into dev and training sets
data_dev = data[0:400].T
Y_dev = data_dev[-1]
X_dev = data_dev[0:n-1]
X_dev = X_dev / 255.
data_train = data[400:m].T
Y_train = data_train[-1]
X_train = data_train[0:n-1]
X_train = X_train / 255.
_, m_train = X_train.shape
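To see what the pre-processing produces, the sketch below repeats the dev-set split on a synthetic array. The row count, the 784 pixel columns, and the 0-9 label range are assumptions standing in for fashion_data.csv, which is not shown here. Note also that `random.seed(42)` seeds Python's `random` module, while `np.random.shuffle` draws from NumPy's own generator, so the split may differ between runs.

```python
import numpy as np

# Synthetic stand-in for fashion_data.csv (all sizes here are assumptions):
# 1000 rows, 784 pixel columns, and a final label column.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(1000, 784))
labels = rng.integers(0, 10, size=(1000, 1))
data = np.hstack([pixels, labels])

m, n = data.shape                 # m examples, n = 784 pixels + 1 label
data_dev = data[0:400].T          # transpose: each column is now one example
Y_dev = data_dev[-1]              # the last row holds the labels
X_dev = data_dev[0:n-1] / 255.    # remaining rows are pixels scaled to [0, 1]

print(X_dev.shape, Y_dev.shape)   # (784, 400) (400,)
```

The key point is the transpose: every later function treats one column of X as one image.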
Part 1: Building the Neural Network
# Initializing parameters
# Initialize the parameters in the neural network
# Based on the figure above, we need the weight and bias matrices.
# W1, b1 are the matrices for the first layer
# W2, b2 are the matrices for the second layer
# You should think about the sizes of the matrices
# then initialize elements in the matrix to be random numbers between -0.5 and +0.5
def init_params():
    W1 = # Your code here
    b1 = # Your code here
    W2 = # Your code here
    b2 = # Your code here
    return W1, b1, W2, b2
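One possible way to fill in `init_params` is sketched below. The figure that fixes the layer sizes is not reproduced here, so the sizes (784 inputs, 10 hidden units, 10 output classes) are assumptions, exposed as hypothetical default arguments so the function can still be called as `init_params()`.

```python
import numpy as np

def init_params(n_input=784, n_hidden=10, n_output=10):
    # Layer sizes are assumptions; adjust them to match the assignment's figure.
    # np.random.rand draws from [0, 1), so subtracting 0.5 gives [-0.5, 0.5).
    W1 = np.random.rand(n_hidden, n_input) - 0.5
    b1 = np.random.rand(n_hidden, 1) - 0.5
    W2 = np.random.rand(n_output, n_hidden) - 0.5
    b2 = np.random.rand(n_output, 1) - 0.5
    return W1, b1, W2, b2
```

Each weight matrix has shape (units in this layer, units in the previous layer), and each bias is a column vector so that it broadcasts across all examples.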
# As a starting point, you only need a ReLU function, its derivative, and the softmax function
def ReLU(Z):
    # Your code here
def ReLU_deriv(Z):
    # Your code here
def softmax(Z):
    # Your code here
    return A
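A minimal sketch of the three activation helpers follows. Subtracting the column maximum inside `softmax` is a common numerical-stability step that the assignment does not ask for; it leaves the result unchanged mathematically.

```python
import numpy as np

def ReLU(Z):
    # Element-wise max(0, z)
    return np.maximum(Z, 0)

def ReLU_deriv(Z):
    # Slope is 1 where z > 0 and 0 elsewhere
    return (Z > 0).astype(float)

def softmax(Z):
    # Each column of Z holds the scores for one example.
    # Subtracting the column max avoids overflow in np.exp.
    exp_Z = np.exp(Z - Z.max(axis=0, keepdims=True))
    A = exp_Z / exp_Z.sum(axis=0, keepdims=True)
    return A
```

Normalizing along `axis=0` matters because examples are stored as columns, so each column of `A` sums to 1.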
# In the forward propagation function, X is the inputs (the image in vector form), and we pass all the weights and biases
def forward_prop(W1, b1, W2, b2, X):
    Z1 = # Your code here
    A1 = # Your code here
    Z2 = # Your code here
    A2 = # Your code here
    return Z1, A1, Z2, A2
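The forward pass could be sketched as below, assuming a ReLU hidden layer followed by a softmax output layer. Compact versions of the two activations are included only so the block runs on its own.

```python
import numpy as np

def ReLU(Z):
    return np.maximum(Z, 0)

def softmax(Z):
    exp_Z = np.exp(Z - Z.max(axis=0, keepdims=True))
    return exp_Z / exp_Z.sum(axis=0, keepdims=True)

def forward_prop(W1, b1, W2, b2, X):
    Z1 = W1 @ X + b1    # (hidden, m); the bias column broadcasts over m examples
    A1 = ReLU(Z1)       # hidden activations
    Z2 = W2 @ A1 + b2   # (classes, m) raw class scores
    A2 = softmax(Z2)    # each column is a probability distribution over classes
    return Z1, A1, Z2, A2
```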
# Backward propagation
# This one-hot function converts a numeric label into a one-hot vector
def one_hot(Y):
    # Your code here
    return one_hot_Y
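A possible `one_hot` is sketched below. The `num_classes` argument is a hypothetical addition with a default of 10, which assumes the labels run from 0 to 9; the result is laid out as (classes, examples) to match the column-per-example convention of the data.

```python
import numpy as np

def one_hot(Y, num_classes=10):
    # num_classes=10 is an assumption about the label range (0-9)
    Y = np.asarray(Y, dtype=int)
    one_hot_Y = np.zeros((num_classes, Y.size))
    # For example j, set row Y[j] of column j to 1
    one_hot_Y[Y, np.arange(Y.size)] = 1
    return one_hot_Y

print(one_hot(np.array([0, 2, 1]), num_classes=3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```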
# Now perform the backward propagation
# Each gradient is only one line of code, but there is a lot of calculus behind it
def backward_prop(Z1, A1, Z2, A2, W1, W2, X, Y):
    one_hot_Y = one_hot(Y)
    dZ2 = # Your code here
    dW2 = # Your code here
    db2 = # Your code here
    dZ1 = # Your code here
    dW1 = # Your code here
    db1 = # Your code here
    return dW1, db1, dW2, db2
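The gradients could be sketched as follows, assuming the loss is cross-entropy averaged over the m examples (the assignment text does not name the loss). Under that assumption the softmax and the loss combine so that the output-layer error is simply `A2 - one_hot_Y`. The helper definitions are repeated in compact form so the block is self-contained.

```python
import numpy as np

def ReLU_deriv(Z):
    return (Z > 0).astype(float)

def one_hot(Y, num_classes=10):
    # num_classes=10 is an assumption about the label range
    Y = np.asarray(Y, dtype=int)
    one_hot_Y = np.zeros((num_classes, Y.size))
    one_hot_Y[Y, np.arange(Y.size)] = 1
    return one_hot_Y

def backward_prop(Z1, A1, Z2, A2, W1, W2, X, Y):
    m = Y.size                                    # number of examples
    one_hot_Y = one_hot(Y)
    dZ2 = A2 - one_hot_Y                          # softmax + cross-entropy error
    dW2 = dZ2 @ A1.T / m                          # average over examples
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * ReLU_deriv(Z1)           # push error back through ReLU
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2
```

Every gradient has the same shape as the parameter it belongs to, which is a quick sanity check when debugging.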
# Finally, we are ready to update the parameters
def update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, alpha):
    W1 = # Your code here
    b1 = # Your code here
    W2 = # Your code here
    b2 = # Your code here
    return W1, b1, W2, b2
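A plain gradient-descent update, with `alpha` as the learning rate, could look like the sketch below. The 1x1 arrays in the usage lines are illustrative values only.

```python
import numpy as np

def update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, alpha):
    # Move each parameter a step of size alpha against its gradient
    W1 = W1 - alpha * dW1
    b1 = b1 - alpha * db1
    W2 = W2 - alpha * dW2
    b2 = b2 - alpha * db2
    return W1, b1, W2, b2

# Tiny check with 1x1 "matrices" so the arithmetic is easy to follow
W1, b1, W2, b2 = update_params(
    np.array([[1.0]]), np.array([[2.0]]), np.array([[3.0]]), np.array([[4.0]]),
    np.array([[0.5]]), np.array([[0.5]]), np.array([[0.5]]), np.array([[0.5]]),
    alpha=0.1,
)
print(W1)  # 1.0 - 0.1 * 0.5 = 0.95
```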
# Gradient descent
# Implement the helper functions. We need to convert the softmax output into a numeric label
# This is done through the get_predictions function
def get_predictions(A2):
    # Your code here
# We also want to have a simple function to compute the accuracy. Notice that "predictions" and "Y" are the same shape
def get_accuracy(predictions, Y):
    return # Your code here
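The two helpers could be sketched as below: the predicted label is the row index of the largest probability in each column, and accuracy is the fraction of predictions that match the labels. The small `A2` matrix is made-up illustration data.

```python
import numpy as np

def get_predictions(A2):
    # Index of the most probable class for each example (column)
    return np.argmax(A2, axis=0)

def get_accuracy(predictions, Y):
    # Fraction of examples where the prediction equals the label
    return np.sum(predictions == Y) / Y.size

A2 = np.array([[0.7, 0.1, 0.2],
               [0.2, 0.8, 0.3],
               [0.1, 0.1, 0.5]])
Y = np.array([0, 1, 1])
predictions = get_predictions(A2)    # array([0, 1, 2])
print(get_accuracy(predictions, Y))  # 2 of 3 correct
```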
# Finally, we are ready to implement gradient descent
def gradient_descent(X, Y, alpha, iterations):
    W1, b1, W2, b2 = # Your code here - using the function you have implemented
    for i in range(iterations):
        Z1, A1, Z2, A2 = # Your code here - using the function you have implemented
        dW1, db1, dW2, db2 = # Your code here - using the function you have implemented
        W1, b1, W2, b2 = # Your code here - using the function you have implemented
        if i % 10 == 0:
            print("Iteration: ", i)
            predictions = get_predictions(A2)
            print(get_accuracy(predictions, Y))
    return W1, b1, W2, b2
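Putting the pieces together, the training loop might look like the sketch below. To run on its own it carries condensed versions of every helper, and it makes several assumptions: 10 hidden units, 10 classes, a cross-entropy loss, and an `init_params` that takes the input size as a hypothetical argument so the loop also works on small synthetic data.

```python
import numpy as np

def init_params(n_input=784, n_hidden=10, n_output=10):  # sizes are assumptions
    W1 = np.random.rand(n_hidden, n_input) - 0.5
    b1 = np.random.rand(n_hidden, 1) - 0.5
    W2 = np.random.rand(n_output, n_hidden) - 0.5
    b2 = np.random.rand(n_output, 1) - 0.5
    return W1, b1, W2, b2

def forward_prop(W1, b1, W2, b2, X):
    Z1 = W1 @ X + b1
    A1 = np.maximum(Z1, 0)                            # ReLU
    Z2 = W2 @ A1 + b2
    E = np.exp(Z2 - Z2.max(axis=0, keepdims=True))    # stable softmax
    A2 = E / E.sum(axis=0, keepdims=True)
    return Z1, A1, Z2, A2

def backward_prop(Z1, A1, Z2, A2, W1, W2, X, Y):
    m = Y.size
    one_hot_Y = np.zeros_like(A2)
    one_hot_Y[Y.astype(int), np.arange(m)] = 1
    dZ2 = A2 - one_hot_Y
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (Z1 > 0)                     # ReLU derivative
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

def update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, alpha):
    return (W1 - alpha * dW1, b1 - alpha * db1,
            W2 - alpha * dW2, b2 - alpha * db2)

def get_predictions(A2):
    return np.argmax(A2, axis=0)

def get_accuracy(predictions, Y):
    return np.sum(predictions == Y) / Y.size

def gradient_descent(X, Y, alpha, iterations):
    W1, b1, W2, b2 = init_params(n_input=X.shape[0])
    for i in range(iterations):
        Z1, A1, Z2, A2 = forward_prop(W1, b1, W2, b2, X)
        dW1, db1, dW2, db2 = backward_prop(Z1, A1, Z2, A2, W1, W2, X, Y)
        W1, b1, W2, b2 = update_params(W1, b1, W2, b2,
                                       dW1, db1, dW2, db2, alpha)
        if i % 10 == 0:
            print("Iteration: ", i)
            print(get_accuracy(get_predictions(A2), Y))
    return W1, b1, W2, b2

# Synthetic smoke test: random "images" with 20 features and random labels
np.random.seed(0)
X_demo = np.random.rand(20, 50)
Y_demo = np.random.randint(0, 10, 50)
W1, b1, W2, b2 = gradient_descent(X_demo, Y_demo, alpha=0.1, iterations=20)
```

On the real data the call would take `X_train` and `Y_train`; the learning rate and iteration count are left to the reader, since the assignment does not fix them.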
# Validation set
def make_predictions(X, W1, b1, W2, b2):
    _, _, _, A2 = forward_prop(W1, b1, W2, b2, X)
    predictions = get_predictions(A2)
    return predictions
# W1, b1, W2, b2 below must be the trained parameters returned by gradient_descent
dev_predictions = make_predictions(X_dev, W1, b1, W2, b2)
get_accuracy(dev_predictions, Y_dev)
# Exploring some samples
def test_prediction(index, W1, b1, W2, b2):
    current_image = X_train[:, index, None]
    prediction = make_predictions(X_train[:, index, None], W1, b1, W2, b2)
    label = Y_train[index]
    print("Prediction: ", prediction)
    print("Label: ", label)
    current_image = current_image.reshape((28, 28)) * 255
    plt.gray()
    plt.imshow(current_image, interpolation='nearest')
    plt.show()
test_prediction(0, W1, b1, W2, b2)
test_prediction(1, W1, b1, W2, b2)
Part 2: Error Analysis and Performance Improvements
You will now try to improve the model's performance through, for example, different activation functions, learning rate changes, expanding the network complexity, regularization, and dropout. Solve Part 2 in detail, explaining why each change did or did not improve the model.
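As a starting point for these experiments, the sketch below shows three small building blocks: a leaky ReLU as an alternative activation, inverted dropout for a hidden layer, and an L2 penalty added to a weight gradient. The function names, the `slope`, `keep_prob`, and `lam` parameters, and the choice of inverted dropout are all illustrative assumptions, not part of the assignment template.

```python
import numpy as np

def leaky_ReLU(Z, slope=0.01):
    # Like ReLU, but negative inputs keep a small slope instead of going to 0
    return np.where(Z > 0, Z, slope * Z)

def dropout(A, keep_prob, rng):
    # Inverted dropout: zero each unit with probability 1 - keep_prob and
    # rescale the survivors so the expected activation stays the same
    mask = (rng.random(A.shape) < keep_prob) / keep_prob
    return A * mask, mask

def add_l2(dW, W, lam, m):
    # Gradient of the penalty (lam / (2 * m)) * sum(W ** 2), added to dW
    return dW + (lam / m) * W

rng = np.random.default_rng(0)
A1 = np.ones((4, 5))
A1_dropped, mask = dropout(A1, keep_prob=0.8, rng=rng)
print(A1_dropped)
```

In training, the same `mask` would also multiply the corresponding `dZ1` in the backward pass, and dropout would be switched off when evaluating on the dev set. Comparing training accuracy against dev accuracy before and after each change is one way to argue why a change helped or not: a large gap suggests overfitting, which regularization and dropout target, while low accuracy on both suggests the network or learning rate is the bottleneck.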
