Question: Can you please help me with my assignment? I am failing to create the train_nn function.
Hi,
Can you please help me with the assignment? I am failing to create the train_nn function. Please also advise how I can get the data to you; my previous efforts have failed.
Tensorflow_NeuralNetworks.pdf
May 1, 2020

Programming Assignment 8: Neural Networks with Tensorflow

0.1 Problem Statement
In this programming assignment, you will write Tensorflow code to distinguish between a signal process which produces Higgs bosons and a background process which does not. We model this problem as a binary classification problem.
Note: This assignment is not designed to make you a professional Tensorflow programmer, but rather to introduce you to, and make you practice, the basic constructs and functionalities of Tensorflow.

0.1.1 CPU vs GPU
You may want to read this article to know more about the CPU vs GPU discussion. This is totally optional, but highly recommended for those who are interested. You do not need to write any "GPU specific" code. Tensorflow automatically recognizes the underlying hardware and optimizes your code to run accordingly. The most common bottleneck to training faster with a GPU is usually the speed at which data is fed to the GPU for processing, so the input data pipeline is an important construct when writing efficient, scalable code to train neural networks using Tensorflow.

0.1.2 Dataset
For this assignment, we will use sampled data from a well-known dataset: the Higgs Dataset. Some information regarding the data and the problem:
- This is a classification problem to distinguish between a signal process which produces Higgs bosons and a background process which does not.
- The data has been produced using Monte Carlo simulations.
- The first 21 features (columns 2-22) are kinematic properties measured by the particle detectors in the accelerator.
- The last seven features are functions of the first 21 features; these are high-level features derived by physicists to help discriminate between the two classes.
The train and test files have the following characteristics:
- The first row is a header that contains a comma-separated list of the names of the label and attributes.
- Each successive row represents a single example.
- The first column of each example is the label to be learned, and all other columns are attribute values.
- All attributes are numerical, i.e. real numbers.

0.1.3 Testing and Evaluation
For local testing and for two of the three submissions, you will use the small training and test datasets that have been provided along with this notebook. When submitting on EdX, your code will be trained and evaluated on a much larger sample of the full dataset. Some suggestions you should keep in mind while implementing the functions:
- Avoid doing repeated work, i.e. anything that could be done outside a loop should be outside it.
- Read the markdown of this notebook carefully.

0.1.4 Configuration File
To make your code more robust, and to aid the process of grading your work, most of the required parameters for the network training and testing will come from a YAML config file named "nn_config.yaml". This file is present in the same directory as the notebook. We have added default values to the parameters, but you may modify them for debugging purposes. Information regarding what these variables mean, and how you should use them, is present as comments in the yaml file. Information regarding how to read variables from the YAML config file is mentioned later in this notebook. However, remember that for grading your work we will use our own config files, so your code should always refer to variable values from the config file.
In [1]: # Let's look at the contents of the YAML config file
        !cat nn_config.yaml

## Training data file path
training_data_path: ../resource/lib/publicdata/hw8/higgs_train_large.csv
## Testing data file path
test_data_path: ../resource/lib/publicdata/hw8/higgs_test_large.csv
## Location in which you will save the pickle file containing predictions on test data
output_predictions_pickle_path: ./test_predictions.pkl
## How to split the input training data into train and validation sets. Value of 0.8 means that
training_to_validation_ratio: 0.8
## the learning rate you should use for your optimizer
learning_rate: 0.05
## the total number of epochs or iterations to run over the (80) training examples
epochs: 200
## the number of mini batches in which you should split your training examples. Continuing with
num_mini_batches: 5
## this variable is for your own use to modify when you would like to print any debug statement
display_step: 1
## the size of the network. If first_layer: 20 and second_layer: 8 then you should set the numb
hidden_layer_sizes:
  first_layer: 20
  second_layer: 8
## for grading purposes, you may ignore
dataset_size: large
grading_script_path: ../resource/lib/publicdata/hw8/grade_test_submission.py

0.2 Gameplan
You will write robust code that builds a feedforward neural network and trains it according to the given set of parameters.
1. We will first load the training and test data using the parameters from the config file.
2. We will then split the training data into training and validation sets using the value of the "training_to_validation_ratio" parameter in the config. For example, if the param is 0.8, it means that the initial 80% of the data should be kept for training, while the remaining 20% should be used for validation.
3. We will use Cross Entropy Loss as our cost function and minimize it using AdamOptimizer as our optimizer.
4. We will train our model in batches inside our main training loop. You will divide the training data into num_batches mini batches, and for each epoch you will iterate and train over that many batches.
5. You can use the "display_step" param to control the frequency of print statements.
6. You will maintain a list of training accuracies and losses (one value for each epoch).
7. You will maintain a list of validation accuracies and losses (one value for each epoch). The function tf.reduce_sum will allow you to sum across all instances.
8. You should train your network using the inputted learning rate and for the inputted number of iterations. The iterations are simply a loop that calls backpropagation a fixed number of times.

0.3 Initialization

In [2]: ## Tensorflow produces a lot of warnings. We generally want to suppress them.
        import warnings
        warnings.filterwarnings('ignore')

In [3]: import tensorflow as tf
        import numpy as np
        from matplotlib import pyplot as plt
        ## Pretty Print
        import pprint as pp

In [4]: import yaml

        def import_config():
            with open("nn_config.yaml", 'r') as ymlfile:
                try:
                    cfg = yaml.load(ymlfile)
                except yaml.YAMLError as err:
                    print(err)
            return cfg

In [5]: if 'session' in locals() and session is not None:
            print('Close interactive session')
            session.close()

        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        session = tf.Session(config=config)  # Dynamically grow the memory

        ## The below function tests if Tensorflow has access to GPU or not.
def test_cpu_gpu():
    if tf.test.gpu_device_name():
        print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
    else:
        print('''Your hardware either does not have a GPU or is not configured to use the GPU version of TF.
However, you do not need a GPU for this assignment as you will be completing this assignment in a
CPU environment, but evaluating it on a GPU environment.''')

test_cpu_gpu()

Your hardware either does not have a GPU or is not configured to use the GPU version of TF.
However, you do not need a GPU for this assignment as you will be completing this assignment in a
CPU environment, but evaluating it on a GPU environment.

In [6]: cfg = import_config()
        ## Is it loaded correctly?
        pp.pprint(cfg)

{'dataset_size': 'large',
 'display_step': 1,
 'epochs': 200,
 'grading_script_path': '../resource/lib/publicdata/hw8/grade_test_submission.py',
 'hidden_layer_sizes': {'first_layer': 20, 'second_layer': 8},
 'learning_rate': 0.05,
 'num_mini_batches': 5,
 'output_predictions_pickle_path': './test_predictions.pkl',
 'test_data_path': '../resource/lib/publicdata/hw8/higgs_test_large.csv',
 'training_data_path': '../resource/lib/publicdata/hw8/higgs_train_large.csv',
 'training_to_validation_ratio': 0.8}

In [7]: # Removes the old test_predictions.pkl file
        # so that it will not affect your final tests at the end of the notebook
        !rm test_predictions.pkl 2> /dev/null

0.4 Reading in Data

In [8]: train_file_name = cfg['training_data_path']
        test_file_name = cfg['test_data_path']

In [9]: # =====================================================================================
        # Uncomment this to test on the smaller dataset. This is faster and can be used to debug quickly.
        # PLEASE COMMENT THIS BEFORE SUBMITTING. YOUR NOTEBOOK IS EVALUATED ON THE LARGE DATASET.
        # =====================================================================================
        train_file_name = '../resource/lib/publicdata/hw8/higgs_train_small.csv'
        test_file_name = '../resource/lib/publicdata/hw8/higgs_test_small.csv'

In [10]: ## Loading the Data
         training_data = np.loadtxt(train_file_name, delimiter=',')
         test_data = np.loadtxt(test_file_name, delimiter=',')

Now we have loaded the training and test data. However, we cannot use it directly. We first need to standardize it.

0.4.1 Exercise: Implement the Standardize Function
Neural networks work best when all features are roughly on the same scale and centered around the mean. This is done by standardizing the feature vectors. Feature standardization makes the values of each feature in the data have zero mean (by subtracting the mean in the numerator) and unit variance.
The function standardize takes the input data and determines the distribution mean and standard deviation for each feature. Next, the mean is subtracted from each feature. Then the mean-subtracted values of each feature are divided by its standard deviation.

Example Input (3 training examples with 4 features each):
np.array([[-0.22, -0.19, -0.17, -0.13],
          [-0.10, -0.05,  0.02,  0.10],
          [ 0.03,  0.11,  0.12,  0.15]])

Example Output (the 3 training examples standardized along each of the 4 features):
array([[-1.20809282, -1.19664225, -1.33025759, -1.39425471],
       [-0.03265116, -0.05439283,  0.2494233 ,  0.4920899 ],
       [ 1.24074398,  1.25103507,  1.08083429,  0.90216481]])

Refer to the "Standardization" section of the Wikipedia Feature Scaling article.
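For reference, here is a minimal sketch of what the standardize function could look like, using NumPy's column-wise mean and (population) standard deviation. This is just one possible implementation, not the official solution; the notebook cell below still contains only the placeholder, which is why it shows a SyntaxError.

def standardize(data):
    # Per-feature (column-wise) mean and standard deviation of the input matrix
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    # Subtract each feature's mean and divide by its standard deviation
    return (data - mean) / std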
In [12]: def standardize(data):
             ###
             ### YOUR CODE HERE
             ###

  File "", line 4
SyntaxError: unexpected EOF while parsing

In [ ]: dummy = np.array([[-0.22, -0.19, -0.17, -0.13], [-0.1, -0.05, 0.02, 0.10], [0.03, 0.11, 0.12, 0.15]])
        assert standardize(dummy).__class__ == np.ndarray, "should return numpy array"
        assert standardize(dummy).shape == dummy.shape, "should have the same shape as the input"
        dummy_ans = np.round(np.array([[-1.20809282, -1.19664225, -1.33025759, -1.39425471],
                                       [-0.03265116, -0.05439283, 0.2494233, 0.4920899],
                                       [1.24074398, 1.25103507, 1.08083429, 0.90216481]]), 3)
        assert (np.round(standardize(dummy)[0], 3) == dummy_ans[0]).all(), "check for correct return values"
        assert (np.round(standardize(dummy)[2], 3) == dummy_ans[2]).all(), "check for correct return values"
        del dummy, dummy_ans

In [ ]: # Hidden Tests Here
        ###
        ### AUTOGRADER TEST - DO NOT REMOVE
        ###

0.4.2 Exercise: Implement the parse_training_data function
The function parse_training_data takes the input data and returns labels and features.
- Remember that the first column of the training data is the labels, and the remaining columns are the features.
- The labels should be reshaped to a 2-D numpy matrix of shape (dataset_size, 1).
- The features should be standardized and be a 2-D numpy matrix of shape (dataset_size, 28).

Example Input (3 training examples with the label and 3 features each):
np.array([[1, -0.19, -0.17, -0.13], [0, -0.05, 0.02, 0.10], [0, 0.11, 0.12, 0.15]])

Example Output (a tuple: the 1st element is the labels, the 2nd element is the standardized features):
(array([[1.], [0.], [1.]]),
 array([[-1.4688735 , -1.3105518 , -0.99390842],
        [-0.36062164,  0.19350429,  0.82679107],
        [ 0.90595192,  0.98511277,  1.22259531]]))

Remember to use the standardize function appropriately inside this function, and use the visible assert statements to fine-tune the shape of your returned data.

In [ ]: numpy_matrix = np.array([[1, -0.19, -0.17, -0.13], [0, -0.05, 0.02, 0.10], [0, 0.11, 0.12, 0.15]])

        def parse_training_data(numpy_matrix):
            ###
            ### YOUR CODE HERE
            ###
            return labels, features

        parse_training_data(numpy_matrix)

In [ ]: # Parse Training Data. You will later split the `labels` and `features` into training and validation sets.
        labels, features = parse_training_data(training_data)

In [ ]: assert labels.shape[1] == 1
        assert features.shape[1] == 28

In [ ]: ###
        ### AUTOGRADER TEST - DO NOT REMOVE
        ###

0.4.3 Exercise: Implement the parse_test_data function
The function parse_test_data takes the input data and returns the features. We do not have access to labels while predicting the classes that our test examples belong to; the input data files for the test data do not have the labels column, so we need a different function to parse the test data. This function should only return the standardized features.
The features should be standardized and be a 2-D numpy matrix of shape (dataset_size, 28).

Example Input (3 test examples with 3 features each):
np.array([[-0.19, -0.17, -0.13], [-0.05, 0.02, 0.10], [0.11, 0.12, 0.15]])

Example Output (the standardized features):
array([[-1.4688735 , -1.3105518 , -0.99390842],
       [-0.36062164,  0.19350429,  0.82679107],
       [ 0.90595192,  0.98511277,  1.22259531]])

Remember to use the standardize function appropriately inside this function.
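In case it helps, here is a hedged sketch of how the two parse functions could be built on top of standardize, following the written spec (first column of the training data is the label; the test data has no label column). Again, this is only one possible reading, not the official solution.

def parse_training_data(numpy_matrix):
    # First column is the label; reshape it to a 2-D matrix of shape (dataset_size, 1)
    labels = numpy_matrix[:, 0].reshape(-1, 1)
    # Remaining columns are the raw features; standardize them column-wise
    features = standardize(numpy_matrix[:, 1:])
    return labels, features

def parse_test_data(numpy_matrix):
    # Test files carry no label column, so every column is a feature
    return standardize(numpy_matrix)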
In [ ]: features = np.array([[-0.19, -0.17, -0.13], [-0.05, 0.02, 0.10], [0.11, 0.12, 0.15]])

In [ ]: def parse_test_data(numpy_matrix):
            ###
            ### YOUR CODE HERE
            ###
            return test_features

In [ ]: test_features = parse_test_data(test_data)

0.5 Building the Neural Network

0.5.1 Initializing important parameters
Use the below params appropriately inside the train_nn() function. We have initialized these variables in order to assist you in your implementation.

In [ ]: learning_rate = cfg['learning_rate']
        training_epochs = cfg['epochs']
        train_valid_split = cfg['training_to_validation_ratio']
        num_batches = cfg['num_mini_batches']
        display_step = cfg['display_step']

        num_examples = training_data.shape[0]
        # The first `num_train_examples` should be used for training, the rest for validation.
        num_train_examples = int(num_examples * train_valid_split)
        batch_size = num_train_examples // num_batches

        # Network Parameters
        n_hidden_1 = cfg['hidden_layer_sizes']['first_layer']   # 1st layer number of features
        n_hidden_2 = cfg['hidden_layer_sizes']['second_layer']  # 2nd layer number of features
        n_input = 28
        n_classes = 1

        print("Total Training examples: %d, Number of Batches: %d, Batch Size: %d"
              % (num_train_examples, num_batches, batch_size))

0.5.2 Initializing placeholders for feeding into the TF graph
Define the TF placeholders which will receive data for each mini batch. Similarly, define weights and biases as TF variables.

In [ ]: # TF Graph input
        ## Use the below placeholders appropriately inside the train_nn() function
        x = tf.placeholder("float", [None, n_input])
        y = tf.placeholder("float", [None, 1])

        # Store layers weight & bias
        weights = {
            'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
            'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
            'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
        }
        biases = {
            'b1': tf.Variable(tf.random_normal([n_hidden_1])),
            'b2': tf.Variable(tf.random_normal([n_hidden_2])),
            'out': tf.Variable(tf.random_normal([n_classes]))
        }

0.5.3 Exercise: Implement the calc_num_total_learnable_params function
This function calculates the number of learnable parameters of the network model. This number directly relates to the complexity of your model, as well as to the training time.
The function calc_num_total_learnable_params takes the weights dictionary and the biases dictionary and returns an integer which is equal to the total number of parameters in the network.
You can make use of get_dims_as_tuple as a helper function to access the shapes of the weight and bias matrices easily.

In [ ]: # Helper function which you may use in implementing `calc_num_total_learnable_params(weights, biases)`
        def get_dims_as_tuple(x):
            shape = x.get_shape()
            dims = []
            for dim in shape:
                dims.append(dim.value)
            return tuple(dims)

        # example usage: get_dims_as_tuple(weights['h1'])

In [ ]: def calc_num_total_learnable_params(weights, biases):
            num_total = 0
            # Iterate over the Variable objects (the dict values) and multiply out each shape
            for w in weights.values():
                num_total += int(np.prod(get_dims_as_tuple(w)))
            for b in biases.values():
                num_total += int(np.prod(get_dims_as_tuple(b)))
            return num_total

In [ ]: # def calc_num_total_learnable_params(weights, biases):
        #     ### YOUR CODE HERE
        #     layer_1 = tf.sigmoid(tf.add(tf.matmul(x, get_dims_as_tuple(weights['h1'])), get_d
        #     layer_2 = tf.sigmoid(tf.add(tf.matmul(layer_1, get_dims_as_tuple(weights['h2'])),
        #     out_layer = tf.matmul(layer_2, get_dims_as_tuple(weights['out'])) + get_dims_as_t
        #     return
        #     ###

In [ ]: ## Hidden Tests Here
        ###
        ### AUTOGRADER TEST - DO NOT REMOVE
        ###

0.5.4 Exercise: Create FeedForward Network Model
This function needs to be filled up with code to construct the remaining two layers of the neural network.
You have to add one more hidden layer and also the output layer. You should use the sigmoid activation function; Tensorflow's tf.nn.sigmoid() function should be helpful. We have partially implemented this function; complete the rest of it. Remember not to apply the sigmoid activation at the last layer, as we will be using tf.nn.sigmoid_cross_entropy_with_logits() later, which does that.

In [ ]: def create_feedforward_nn_model(x, weights, biases):
            # Hidden layer with SIGMOID activation
            layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
            layer_1 = tf.nn.sigmoid(layer_1)
            ###
            ### YOUR CODE HERE
            ###
            return out_layer

0.5.5 Exercise: Stitch the Neural Network Model
Using the appropriate Tensorflow libraries, implement each of the following operations:
- loss as the CrossEntropyLoss
- train_op as the AdamOptimizer that minimizes the loss
As inputs to these operators, you can use:
- pred_raw, which is the output of your neural network's last layer
- pred, the predicted label, which is the output of rounding pred_raw
You might want to look at the Tensorflow Section notebooks as well as the TensorFlow API. Two of the returned values have been implemented as a hint for you.
Functions that could be useful here: tf.nn.sigmoid_cross_entropy_with_logits(), tf.reduce_mean(), tf.round(), tf.sigmoid(), tf.train.AdamOptimizer().minimize()

In [ ]: # Construct model
        def stitch_network(x, y, weights, biases, learning_rate):
            pred_raw = create_feedforward_nn_model(x, weights, biases)
            pred = tf.round(tf.nn.sigmoid(pred_raw))
            ###
            ### YOUR CODE HERE
            ###
            return pred_raw, pred, cost, train_op

        pred_raw, pred, cost, train_op = stitch_network(x, y, weights, biases, learning_rate)

In [ ]: assert cost.__class__ == tf.Tensor
        assert cost.get_shape() == (), "Make sure you have used reduce_mean"

In [ ]: # Initializing the variables - IMPORTANT
        init = tf.global_variables_initializer()

0.6 Training and Testing the Neural Network

0.6.1 Exercise: Writing the Train function
This is where you will train your network. Your goal is to complete the following function named train_nn(). To help you structure your implementation, we have provided some starter code. We have also detailed each of the steps you need to pay attention to inside the main training loop. Remember that you have access to all the parameters we initialized early on in the notebook, as well as to the parameters defined in the config file.
train_nn() should return 5 python lists:
1. training_costs
2. validation_costs
3. training_accs
4. validation_accs
5. test_predictions

In [ ]: correct_prediction = tf.equal(pred, y)
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

        def train_nn():
            with tf.Session() as sess:
                sess.run(init)
                ## this is needed to print debug statements during training.
                coord = tf.train.Coordinator()
                threads = tf.train.start_queue_runners(coord=coord)

                x_train, x_valid = features[:num_train_examples], features[num_train_examples:]
                y_train, y_valid = labels[:num_train_examples], labels[num_train_examples:]

                training_costs = []
                training_accs = []
                validation_losses = []
                validation_accs = []
                best_acc = 0.

                weights
                biases

                for epoch in range(training_epochs):
                    '''
                    We recommend you first think about how you will implement this on your own
                    before proceeding to read any further.

                    HINT: You should implement the following procedure here.
                    An epoch is one pass through your training data.
                    1. Keep a counter of this epoch's total loss. You will need this to average over the batches.
                    2. Keep a counter of the number of correct predictions in your epoch.
                       You will need this to sum over the batches to calculate per-epoch accuracy.
                    -- For each batch -- (you should have `num_batches` number of batches in total)
                    3. Subset your features and labels from x_train and y_train,
                       e.g. for batch 1 you'd select all examples in the interval [0, batch_size),
                       for batch 2 it would be between [batch_size, 2*batch_size).
                       Make sure to account for a possible fractional batch as your last batch.
                    4. Massage your x_batch and y_batch into numpy arrays of shape (size_of_batch, 28)
                       and (size_of_batch, 1) respectively.
                    5. Feed the x_batch and y_batch into your tensorflow graph and execute the optimizer,
                       loss, and pred in order to train your model using the current batch and also get back
                       the loss for the batch and the predictions for the batch.
                       The contribution of each batch's loss towards the epoch's loss will
                    6. Count the number of correct predictions for this batch and add it to the count of
                       correct predictions in the epoch.
                    7. Append the average epoch loss to `training_losses`.
                    8. Calculate your epoch's accuracy as the total number of correct predictions
                       divided by the number of training examples.
                    9. Append the epoch's accuracy to `training_accs`.
                    -- Validation at end of every epoch --
                    10. Massage your validation labels (y_valid) into a numpy array of shape (number_of_validation_examples, 1).
                    11. With y_valid and x_valid as input to your graph, calculate the validation loss and validation predictions.
                    12. Calculate the number of correct validation predictions by comparing against y_valid.
                    13. Append validation loss and validation accuracy to their respective lists.
                    14. NOTE: Avoid printing a lot of debug information when you submit the assignment.
                        This reduces the speed of execution. If you want to print some information every so often,
                        you can use the following at the end of your epoch loop:
                        if epoch % display_step == 0:
                            print("Epoch %d | Tr loss: %f | Tr accuracy %f | Va loss: %f | Va accuracy %f"
                                  % (epoch + 1, , , ,
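Since my question is specifically about train_nn, here is a minimal sketch of how the missing pieces could fit together, following the hints above. This is a hedged sketch under several assumptions, not the official solution: it assumes create_feedforward_nn_model and stitch_network are completed roughly as shown, it counts correct predictions in NumPy instead of using the accuracy op, it omits the coordinator/queue-runner lines and the best_acc bookkeeping from the starter code, and helper names such as x_batch, y_batch, epoch_loss and epoch_correct are purely illustrative. Whether the test predictions need to be flattened (or pickled to output_predictions_pickle_path) depends on what the grading script expects.

# A possible completion of create_feedforward_nn_model: second hidden layer plus a linear output layer
def create_feedforward_nn_model(x, weights, biases):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['h1']), biases['b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']))
    # No sigmoid on the output layer; sigmoid_cross_entropy_with_logits applies it later
    out_layer = tf.add(tf.matmul(layer_2, weights['out']), biases['out'])
    return out_layer

# A possible completion of stitch_network: mean sigmoid cross-entropy loss plus Adam optimizer
def stitch_network(x, y, weights, biases, learning_rate):
    pred_raw = create_feedforward_nn_model(x, weights, biases)
    pred = tf.round(tf.nn.sigmoid(pred_raw))
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=pred_raw, labels=y))
    train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
    return pred_raw, pred, cost, train_op

# A possible shape for the training loop itself
def train_nn():
    with tf.Session() as sess:
        sess.run(init)

        # Split the standardized training data into train / validation parts
        x_train, x_valid = features[:num_train_examples], features[num_train_examples:]
        y_train, y_valid = labels[:num_train_examples], labels[num_train_examples:]

        training_costs, training_accs = [], []
        validation_costs, validation_accs = [], []

        for epoch in range(training_epochs):
            epoch_loss = 0.0      # running total of batch losses for this epoch
            epoch_correct = 0     # running count of correct predictions for this epoch

            for batch in range(num_batches):
                start = batch * batch_size
                # The last batch picks up any leftover (fractional) examples
                end = num_train_examples if batch == num_batches - 1 else start + batch_size
                x_batch = x_train[start:end].reshape(-1, n_input)
                y_batch = y_train[start:end].reshape(-1, 1)

                # Run the optimizer, loss and predictions on this mini batch
                _, batch_loss, batch_pred = sess.run([train_op, cost, pred],
                                                     feed_dict={x: x_batch, y: y_batch})
                epoch_loss += batch_loss
                epoch_correct += np.sum(batch_pred == y_batch)

            training_costs.append(epoch_loss / num_batches)
            training_accs.append(epoch_correct / float(num_train_examples))

            # Validation at the end of every epoch
            y_valid_2d = y_valid.reshape(-1, 1)
            valid_loss, valid_pred = sess.run([cost, pred],
                                              feed_dict={x: x_valid, y: y_valid_2d})
            validation_costs.append(valid_loss)
            validation_accs.append(np.mean(valid_pred == y_valid_2d))

            if epoch % display_step == 0:
                print("Epoch %d | Tr loss: %f | Tr accuracy %f | Va loss: %f | Va accuracy %f"
                      % (epoch + 1, training_costs[-1], training_accs[-1],
                         validation_costs[-1], validation_accs[-1]))

        # Predictions on the standardized test features
        test_predictions = sess.run(pred, feed_dict={x: test_features}).flatten()

    return training_costs, validation_costs, training_accs, validation_accs, test_predictions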