
Question


Machine Learning: This is the question as posted.


Predict the age of abalone (a common name for any of a group of small to very large marine gastropod molluscs in the family Haliotidae) from physical measurements. The age of abalone is normally determined by cutting the shell through the cone, staining it, and counting the number of rings through a microscope, a boring and time-consuming task. Determine whether other measurements, which are easier to obtain, can be used to predict the age.

Data

Refer to the abalone.names file (specifically section 7) on Canvas for a description of the data. Note that the last column, "Rings", is used to determine the age of the abalone. Using pandas.read_csv (without a header), read the dataset:

       0      1      2      3       4       5       6      7   8
    0  M  0.455  0.365  0.095  0.5140  0.2245  0.1010  0.150  15
    1  M  0.350  0.265  0.090  0.2255  0.0995  0.0485  0.070   7
    2  F  0.530  0.420  0.135  0.6770  0.2565  0.1415  0.210   9
    3  M  0.440  0.365  0.125  0.5160  0.2155  0.1140  0.155  10
    4  I  0.330  0.255  0.080  0.2050  0.0895  0.0395  0.055   7

The first column (sex) will have to be encoded prior to training your model. You can use the LabelEncoder (just on that column, not the entire dataset). You will need to find a way to combine the encoded column with the other columns 1 - 7 (8 is the target label). One method to consider is to use hstack to combine the encoded 'sex' column with columns 1 - 7.

Separate the data into a training set and a test set, where training is 70% and testing is 30% of the data. You can use the train_test_split function in scikit-learn.

Gaussian Naive Bayes Classifier

Create a Gaussian Naive Bayes classifier and train it with the training data. Test the performance with the test dataset. The performance below may be different from yours (but should be similar).

Accuracy: 0.2844976076555

The performance of this classifier is not very good given the complexity of the data. But you realize that your problem doesn't require predicting the exact age of an abalone. Rather, you just need to classify each abalone into one of three categories, 'young', 'middle age' and 'old', based on the following criteria for the 'Rings' column:

Young: 1 - 4
Middle Age: 5 - 15
Old: > 15

With this knowledge, restructure your target labels and retrain your model. Your accuracy should improve quite a bit (note that your accuracy may differ from what's shown, but it should still show an improvement over the previous value).

Accuracy: 0.70414673046252
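
The steps described in the question can be sketched roughly as follows. This is a minimal sketch, not a verified solution: the function names (build_features, bin_rings, nb_accuracy, main), the file name "abalone.data", and the fixed random seed are assumptions made for illustration and do not come from the question. Accuracy values will depend on the split, so none are claimed here.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import LabelEncoder


def build_features(df):
    """Return (X, rings): encoded sex stacked with columns 1-7, plus column 8."""
    # Encode only the sex column (column 0); each letter becomes an integer.
    sex = LabelEncoder().fit_transform(df[0])
    # hstack needs 2-D inputs, so reshape the encoded column to shape (n, 1).
    # df.loc[:, 1:7] is label-based and therefore includes column 7.
    X = np.hstack([sex.reshape(-1, 1), df.loc[:, 1:7].to_numpy()])
    return X, df[8].to_numpy()


def bin_rings(rings):
    """Map ring counts to 'young' (1-4), 'middle age' (5-15) or 'old' (>15)."""
    rings = np.asarray(rings)
    return np.where(rings <= 4, "young",
                    np.where(rings <= 15, "middle age", "old"))


def nb_accuracy(X, y, seed=0):
    """Train GaussianNB on a 70/30 split and return accuracy on the test set."""
    # seed is an assumption; the question does not fix a random state.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    model = GaussianNB().fit(X_train, y_train)
    return model.score(X_test, y_test)


def main(path="abalone.data"):
    """Run both experiments; the default path is a hypothetical file name."""
    df = pd.read_csv(path, header=None)  # columns are labelled 0..8
    X, rings = build_features(df)
    print("Accuracy (exact rings):", nb_accuracy(X, rings))
    print("Accuracy (3 categories):", nb_accuracy(X, bin_rings(rings)))
```

Calling main() with the path to the data file prints one accuracy for the exact ring count and one for the three age categories. The second is expected to be higher, since the classifier then only has to separate three classes instead of one class per ring count.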
