Question

Assignment - exercise: (100 marks)

Load & check the data:
1. Load the MNIST data into a pandas dataframe named MINST_firstname, where firstname is your first name.
2. List the keys.
3. Assign the data to an ndarray named X_firstname, where firstname is your first name.
4. Assign the target to a variable named y_firstname, where firstname is your first name.
5. Print the types of X_firstname and y_firstname.
6. Print the shapes of X_firstname and y_firstname.
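Steps 1 to 6 might be sketched as follows. This is only an illustration under stated assumptions: the real data set is usually fetched with scikit-learn's `fetch_openml`, which needs network access and returns a dict-like `Bunch` rather than a plain dataframe, so the sketch builds a tiny synthetic stand-in with the same layout (the helper `make_stand_in` is hypothetical) and shows the real call only in a comment.

```python
import numpy as np
from sklearn.utils import Bunch


def make_stand_in(n_samples=10, seed=0):
    """Build a tiny synthetic Bunch laid out like MNIST (784 pixels per row)."""
    rng = np.random.default_rng(seed)
    data = rng.integers(0, 256, size=(n_samples, 784)).astype(np.float64)
    # Labels fetched from OpenML arrive as strings, so the stand-in mimics that.
    target = rng.integers(0, 10, size=n_samples).astype(str)
    return Bunch(data=data, target=target)


# With network access, the real data could be loaded instead (assumption):
#   from sklearn.datasets import fetch_openml
#   MINST_firstname = fetch_openml("mnist_784", version=1, as_frame=False)
MINST_firstname = make_stand_in()

print(list(MINST_firstname.keys()))                  # step 2: list the keys
X_firstname = np.asarray(MINST_firstname["data"])    # step 3: features as ndarray
y_firstname = MINST_firstname["target"]              # step 4: labels
print(type(X_firstname), type(y_firstname))          # step 5: types
print(X_firstname.shape, y_firstname.shape)          # step 6: shapes
```

With the real data set, the shapes would be (70000, 784) for the features and (70000,) for the labels.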

7. Create three variables named as follows:
   a. If your first name starts with A through L, name the variables some_digit1, some_digit2, some_digit3. Store in these variables the values from X_firstname at indices 7, 5, 0, in that order.
   b. If your first name starts with M through Z, name the variables some_digit12, some_digit13, some_digit14. Store in these variables the values from X_firstname at indices 3, 8, 1, in that order.
8. Use the imshow method to plot the values of the three variables you defined in the point above. Note the values in your analysis report (written response).

Pre-process the data
9. Change the type of y to uint8.
10. The current target values range from 0 to 9, i.e. 10 classes. Transform the target variable to 3 classes as follows:
   a. Any digit between 0 and 3 inclusive should be assigned a target value of 0.
   b. Any digit between 4 and 6 inclusive should be assigned a target value of 1.
   c. Any digit between 7 and 9 inclusive should be assigned a target value of 9.
   (Hint: you can use numpy.where to carry out the transformation on the target.)
11. Print the frequencies of each of the three target classes and note them in your written report; in addition, provide a screenshot showing a bar chart.
12. Split your data into train and test sets. Assign the first 50,000 records for training and the last 20,000 records for testing. (Hint: you don't need the sklearn train/test split, as the data is already randomized.)

Build Classification Models

Naive Bayes
13. Train a Naive Bayes classifier using the training data. Name the classifier NB_clf_firstname.
14. Use 3-fold cross validation to validate the training process, and note the results in your written response.
15. Use the model to score the accuracy against the test data, and note the result in your written response.
16. Generate the accuracy matrix.
17. Use the classifier to predict the three variables you defined in point 7 above. Note the results in your written response and compare them against the actual results.

Logistic regression
18. Train a logistic regression classifier using the same training data. Name the classifier LR_clf_firstname. (Note: this is a multi-class problem; make sure to check all the parameters and set multi_class='multinomial'.) Try training the classifier using two solvers, first lbfgs, then saga. Set max_iter to 1200 and tolerance to 0.1 in both cases. Make sure you note the results in both cases in your written response, along with the main differences. Carry out some quick research on the difference between the lbfgs and saga solvers and see how it applies to the results; note the size and dimensions of the dataset. Don't worry if one doesn't converge; your research should explain why. Note the results of your research in your analysis report.
19. Use 3-fold cross validation on the training data and note the results in your written response.
20. Use the model to score the accuracy against the test data, and note the result in your written response.
21. Generate the accuracy matrix, precision, and recall of the model and note them in your written response.
22. Use the classifier that worked from the above point to predict the three variables you defined in point 7 above. Note the results in your written response and compare them against the actual results.
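The pre-processing in steps 9 to 12 could look roughly like the sketch below. It is not the instructor's solution: it uses small random stand-in arrays (70 rows instead of 70,000) so that it runs without downloading anything, and the helper name `group_digits` is made up for illustration.

```python
import numpy as np

# Stand-in data (assumption): 70 rows of 784 pixels, with string digit labels
# like the ones OpenML delivers for MNIST.
rng = np.random.default_rng(42)
X = rng.integers(0, 256, size=(70, 784))
y_raw = rng.integers(0, 10, size=70).astype(str)

# Step 9: cast the labels to unsigned 8-bit integers.
y = y_raw.astype(np.uint8)


def group_digits(labels):
    """Step 10: map digits 0-3 to 0, 4-6 to 1 and 7-9 to 9 with nested np.where."""
    return np.where(labels <= 3, 0, np.where(labels <= 6, 1, 9)).astype(np.uint8)


y_grouped = group_digits(y)

# Step 11: frequency of each of the three target classes.
classes, counts = np.unique(y_grouped, return_counts=True)
print(dict(zip(classes.tolist(), counts.tolist())))

# Step 12: positional split. The assignment uses the first 50,000 and the last
# 20,000 of 70,000 rows; the stand-in scales this down to 50 and 20 of 70.
n_train = 50
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y_grouped[:n_train], y_grouped[n_train:]
```

For the bar chart, the `classes` and `counts` arrays can be passed to any plotting library, for example as the x and height values of a bar plot.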

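Steps 13 to 22 can be approached along the lines of the sketch below, which makes several assumptions. The assignment does not say which Naive Bayes variant to use, so `GaussianNB` is just one option. "Accuracy matrix" is read here as the confusion matrix. The data is a small synthetic three-class set rather than MNIST, and `evaluate` is a hypothetical helper. The `multi_class` argument is left out because recent scikit-learn releases deprecate it and already fit a multinomial model for these solvers when there are three or more classes; on older releases it can be passed as the assignment states.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in (assumption): 700 rows, 20 features, classes 0, 1 and 9,
# each class shifted by a different offset so the problem is learnable.
rng = np.random.default_rng(0)
y = rng.choice(np.array([0, 1, 9], dtype=np.uint8), size=700)
offsets = {0: 0.0, 1: 2.0, 9: 4.0}
X = rng.normal(size=(700, 20)) + np.array([offsets[int(c)] for c in y])[:, None]
X_train, X_test = X[:500], X[500:]
y_train, y_test = y[:500], y[500:]


def evaluate(clf):
    """3-fold CV on the training rows, then fit and score on the test rows."""
    cv_scores = cross_val_score(clf, X_train, y_train, cv=3, scoring="accuracy")
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    return {
        "cv": cv_scores,
        "test_accuracy": clf.score(X_test, y_test),
        "confusion": confusion_matrix(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_test, y_pred, average="macro", zero_division=0),
    }


# Steps 13-17: Naive Bayes.
NB_clf_firstname = GaussianNB()
nb = evaluate(NB_clf_firstname)
some_digits = X[[7, 5, 0]]  # rows 7, 5, 0, as in point 7a
print(NB_clf_firstname.predict(some_digits), y[[7, 5, 0]])

# Steps 18-22: logistic regression with both solvers, same settings.
lr_results = {}
for solver in ("lbfgs", "saga"):
    LR_clf_firstname = LogisticRegression(solver=solver, max_iter=1200, tol=0.1)
    lr_results[solver] = evaluate(LR_clf_firstname)
    print(solver, lr_results[solver]["test_accuracy"])
```

On the real 784-dimensional data the two solvers may behave quite differently in run time and convergence, which is what the research task in point 18 asks you to explain; the toy data here is too easy to show that.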