Question
Machine Learning: This is the question as posted.
Predict the age of abalone (a common name for any of a group of small to very large marine gastropod molluscs in the family Haliotidae) from physical measurements. The age of abalone is normally determined by cutting the shell through the cone, staining it, and counting the number of rings through a microscope -- a boring and time-consuming task. Determine if other measurements, which are easier to obtain, can be used to predict the age.

Data

Refer to the abalone.names file (specifically section 7) on Canvas for a description of the data. Note that the last column, "Rings", is used to determine the age of the abalone. Using pandas.read_csv (without a header), read the dataset:

0  M  0.455  0.365  0.095  0.5140  0.2245  0.1010  0.150  15
1  M  0.350  0.265  0.090  0.2255  0.0995  0.0485  0.070   7
2  F  0.530  0.420  0.135  0.6770  0.2565  0.1415  0.210   9
3  M  0.440  0.365  0.125  0.5160  0.2155  0.1140  0.155  10
4  I  0.330  0.255  0.080  0.2050  0.0895  0.0395  0.055   7

The first column (sex) will have to be encoded prior to training your model. You can use the LabelEncoder (just on that column, not the entire dataset). You will need to find a way to combine the encoded column with the other columns 1 - 7 (8 is the target label). One method to consider is to use hstack to combine the encoded 'sex' column with columns 1 - 7.

Separate the data into a training and test set, where training is composed of 70% and testing is 30% of the data. You can use the train_test_split function in scikit-learn.

Gaussian Naive Bayes Classifier

Create a Gaussian Naive Bayes classifier and train it with the training data. Test the performance with the test dataset. The performance below may be different from yours (but should be similar).

Accuracy: 0.28 44976076555

The performance of this classifier is not very good given the complexity of the data. But you realize that your problem doesn't require predicting the exact age of an abalone. Rather, you just need to classify into one of three categories, 'young', 'middle age' and 'old', based on the following criteria for the 'Rings' column:

Young: 1 - 4
Middle Age: 5 - 15
Old: > 15

With this knowledge, restructure your target labels and retrain your model. Your accuracy should improve quite a bit (note your accuracy may differ from what's shown, but should still show an improvement over the previous value).

Accuracy: 0.70414673046252
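One possible way to approach the steps in the question is sketched below. It is not the posted expert solution. The helper names (build_features, bin_rings, nb_accuracy, run), the local file name abalone.data, and the random seed are all assumptions made for illustration. The accuracies it prints depend on the random split, so they may not match the figures quoted in the question.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import LabelEncoder


def build_features(df):
    """Encode column 0 (sex) and hstack it with numeric columns 1-7."""
    # LabelEncoder is applied to the sex column only, not the whole dataset.
    sex = LabelEncoder().fit_transform(df[0]).reshape(-1, 1)
    # .loc slicing is label-based and inclusive, so 1:7 selects columns 1..7.
    measurements = df.loc[:, 1:7].to_numpy(dtype=float)
    X = np.hstack([sex, measurements])
    y = df[8].to_numpy()  # column 8 is the ring count (target)
    return X, y


def bin_rings(rings):
    """Map ring counts to 0 = young (1-4), 1 = middle age (5-15), 2 = old (>15)."""
    return np.digitize(rings, bins=[5, 16])


def nb_accuracy(X, y, seed=0):
    """Train GaussianNB on a 70/30 split and return test-set accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed  # seed is an arbitrary choice
    )
    model = GaussianNB().fit(X_train, y_train)
    return model.score(X_test, y_test)


def run(path="abalone.data"):
    """Hypothetical driver; the path is an assumed local file name."""
    df = pd.read_csv(path, header=None)
    X, y = build_features(df)
    print("Accuracy (exact ring count):", nb_accuracy(X, y))
    print("Accuracy (three age classes):", nb_accuracy(X, bin_rings(y)))
```

Calling run() with the path to your copy of the data prints both accuracies. The second should be noticeably higher, since the model only has to separate three broad classes instead of every distinct ring count.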