Question
Download the CSV file named AA3. Write a script in Python (you can use any IDE) to load the CSV file AA3 into a pandas data frame, and name the frame df_firstname (where firstname is your first name). In your script carry out the following, and then answer the last set of questions in point 5 (Analysis) in the HTML box:
(Note: Once your script is ready please attach the python script to this question by clicking the "Add file" button and then follow the notes to upload your script).
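A minimal sketch of the loading step. It assumes the download is saved as AA3.csv in the working directory (the exact file name is an assumption) and keeps firstname as a literal placeholder. The first few rows of the AA3 content are inlined so the snippet runs on its own; pandas reads both empty fields and the literal text nan as missing values by default.

```python
import io

import pandas as pd

# A few rows copied from the AA3 content, inlined so the sketch is standalone.
# In the real script, pass the file path instead, e.g. pd.read_csv("AA3.csv")
# (the file name "AA3.csv" is an assumption).
SAMPLE = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Ford,sedan,2015,nan,,16,no damage,black,0
"""

df_firstname = pd.read_csv(io.StringIO(SAMPLE))
print(df_firstname.shape)
```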
Explore the data (6 marks)
Print the names of columns
Print the types of columns
Print the unique values in each column.
Print the statistics (count, min, mean, standard deviation, 1st quartile, median, 3rd quartile, max) of all the numeric columns (use one command).
Print the first three records.
Print a summary of all missing values in all columns (use one command).
Print the total number (count) of each unique value in the following categorical columns:
Model
Color
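The exploration steps above could be sketched as follows. This is one possible approach, not the required one: describe() covers all the listed statistics in a single command, and isnull().sum() gives a per-column summary of missing values. The column names in the AA3 content are lower case, so "Model" and "Color" are addressed here as model and color. A small inline sample stands in for the full file.

```python
import io

import pandas as pd

# Small sample of AA3 rows; replace with pd.read_csv("AA3.csv") for the full data
# (the file name is an assumption).
SAMPLE = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Ford,sedan,2015,nan,,16,no damage,black,0
"""
df_firstname = pd.read_csv(io.StringIO(SAMPLE))

# Names and types of the columns.
print(df_firstname.columns.tolist())
print(df_firstname.dtypes)

# Unique values in each column.
for column in df_firstname.columns:
    print(column, df_firstname[column].unique())

# count, mean, std, min, quartiles and max of the numeric columns in one command.
stats = df_firstname.describe()
print(stats)

# First three records.
print(df_firstname.head(3))

# Missing values per column in one command.
missing_summary = df_firstname.isnull().sum()
print(missing_summary)

# Count of each unique value in the two categorical columns.
model_counts = df_firstname["model"].value_counts()
color_counts = df_firstname["color"].value_counts()
print(model_counts)
print(color_counts)
```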
Visualize the data (10 marks)
Plot a histogram of the millage using 10 bins; name the x and y axes appropriately, and give the plot the title "firstname_millage".
Create a scatterplot showing "millage" versus "value"; name the x and y axes appropriately, and give the plot the title "firstname_millage_scatter".
Plot a "scatter matrix" showing the relationship between all columns of the dataset; on the diagonal of the matrix, plot the kernel density function.
Create a "boxplot" for the value column; name the x and y axes appropriately, and give the plot the title "firstname_box_value".
Create a "bar chart" indicating stolen vehicles by type of vehicle, i.e. for each type plot two bars, one in red showing the total stolen and one in blue showing the total not stolen; name the x and y axes appropriately, and give the plot the title "firstname_stolen_by_type".
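The five plots above might be produced along these lines with matplotlib and pandas. This is a sketch under a few assumptions: the axis labels are simply the column names (the units of millage and value are not stated in the question), the Agg backend is selected only so the snippet runs without a display, and the kernel density diagonal relies on SciPy being installed. The first twelve rows of AA3 are inlined as sample data.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() to see windows
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

SAMPLE = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Toyota,sedan,2017,30,,21,medium damage,white,0
Toyota,SUV,2017,36,,34,no damage,black,1
Ford,SUV,2017,43,,32,no damage,black,1
Ford,sedan,2015,nan,,16,no damage,black,0
Ford,SUV,2016,21,,25,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
Toyota,sedan,2016,55,,17,no damage,white,0
Ford,sedan,2018,35,4,33,no damage,white,0
"""
df_firstname = pd.read_csv(io.StringIO(SAMPLE))

# 1. Histogram of millage with 10 bins (missing values are skipped).
fig_hist, ax_hist = plt.subplots()
ax_hist.hist(df_firstname["millage"].dropna(), bins=10)
ax_hist.set_xlabel("millage")
ax_hist.set_ylabel("number of vehicles")
ax_hist.set_title("firstname_millage")

# 2. Scatter plot of millage versus value.
fig_scatter, ax_scatter = plt.subplots()
ax_scatter.scatter(df_firstname["millage"], df_firstname["value"])
ax_scatter.set_xlabel("millage")
ax_scatter.set_ylabel("value")
ax_scatter.set_title("firstname_millage_scatter")

# 3. Scatter matrix of the numeric columns, kernel density on the diagonal.
numeric_only = df_firstname.select_dtypes(include="number")
matrix_axes = scatter_matrix(numeric_only, diagonal="kde", figsize=(10, 10))

# 4. Box plot of the value column.
fig_box, ax_box = plt.subplots()
df_firstname.boxplot(column="value", ax=ax_box)
ax_box.set_xlabel("column")
ax_box.set_ylabel("value")
ax_box.set_title("firstname_box_value")

# 5. Bar chart: per type, a red bar for stolen and a blue bar for not stolen.
stolen_by_type = pd.crosstab(df_firstname["type"], df_firstname["stolen"])
stolen_by_type = stolen_by_type.reindex(columns=[1, 0], fill_value=0)
stolen_by_type.columns = ["stolen", "not stolen"]
ax_bar = stolen_by_type.plot(kind="bar", color=["red", "blue"])
ax_bar.set_xlabel("type")
ax_bar.set_ylabel("number of vehicles")
ax_bar.set_title("firstname_stolen_by_type")
```

The cross-tabulation keeps the counting separate from the drawing, so the same table can be printed to check the bar heights.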
Pre-process the data (8 marks)
Remove (drop) properly the column with the most missing values. (hint: make sure you review and set the right arguments)
Replace the missing values in the "millage" column with the mean average of the column value.
Check that there are no missing values.
Convert all the categorical columns into numeric values and drop/delete the original columns. (hint: use get_dummies)
Make sure your new data frame is completely numeric, name it df_firstname_numeric.
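A possible sketch of the pre-processing steps above, run on five rows copied from AA3. Reassigning the result of drop(columns=...) is used here in place of axis=1 with inplace=True; both remove the column from the frame. The list of categorical columns is written out explicitly, and dtype=int is passed to get_dummies so the indicator columns are integers rather than booleans.

```python
import io

import pandas as pd

SAMPLE = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Ford,sedan,2015,nan,,16,no damage,black,0
"""
df_firstname = pd.read_csv(io.StringIO(SAMPLE))

# 1. Drop the column with the most missing values.
most_missing = df_firstname.isnull().sum().idxmax()
df_firstname = df_firstname.drop(columns=[most_missing])

# 2. Replace missing millage values with the column mean.
millage_mean = df_firstname["millage"].mean()
df_firstname["millage"] = df_firstname["millage"].fillna(millage_mean)

# 3. Check that no missing values remain.
total_missing = int(df_firstname.isnull().sum().sum())
print(total_missing)

# 4. One-hot encode the categorical columns; get_dummies removes the originals.
categorical_columns = ["model", "type", "damage", "color"]
df_firstname_numeric = pd.get_dummies(
    df_firstname, columns=categorical_columns, dtype=int
)

# 5. Confirm the new frame is completely numeric.
print(df_firstname_numeric.dtypes)
print(df_firstname_numeric.shape)
```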
Build a model and validate (10 marks)
Build a predictive model, namely a logistic regression classifier, using sklearn. Take the following into consideration:
Name the model model_firstname (where firstname is your firstname).
The class attribute is "stolen".
Use "train_test_split" from sklearn to split your data 60% for training and 40% for testing. Set the random seed to be the last two digits of your student number.
Fit your training data to the logistic regression model you defined in points 1 and 2 above; set solver='lbfgs' and max_iter=1400 as arguments.
Validate the model on your training set using k-fold cross validation, set the number of folds to 10 and print the mean of the "accuracy" for all ten runs.
Use the model you created using the training data to test the 40% testing data, and print:
The accuracy of the test.
The confusion matrix.
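The modelling steps above could look roughly like this. It is a sketch, not a reference answer: SEED is a hypothetical stand-in for the last two digits of a student number, the pre-processing is repeated in compact form so the snippet is self-contained, and only the first 45 rows of AA3 are inlined, so the printed scores are not those of the full dataset. With so few rows, scikit-learn may warn that a class has fewer members than the 10 folds.

```python
import io

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

SAMPLE = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Toyota,sedan,2017,30,,21,medium damage,white,0
Toyota,SUV,2017,36,,34,no damage,black,1
Ford,SUV,2017,43,,32,no damage,black,1
Ford,sedan,2015,nan,,16,no damage,black,0
Ford,SUV,2016,21,,25,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
Toyota,sedan,2016,55,,17,no damage,white,0
Ford,sedan,2018,35,4,33,no damage,white,0
Toyota,SUV,2017,43,,31,low damage,white,1
Ford,SUV,2018,12,,46,no damage,black,1
Ford,sedan,2015,63,,18,no damage,black,0
Ford,SUV,2016,22,,29,no damage,black,1
Toyota,sedan,2018,34,2,27,medium damage,black,0
Toyota,sedan,2016,32,,25,no damage,white,0
Ford,sedan,2015,48,,17,no damage,white,0
Toyota,SUV,2017,38,,34,low damage,white,0
Toyota,SUV,2017,23,4,37,no damage,black,1
Ford,sedan,2015,56,,18,medium damage,black,0
Ford,SUV,2016,30,,25,no damage,black,1
Toyota,sedan,2018,18,,22,low damage,white,0
Toyota,sedan,2016,68,2,18,no damage,white,0
Ford,sedan,2018,35,,27,no damage,white,0
Toyota,SUV,2017,41,,36,no damage,white,1
Ford,SUV,2018,12,,39,no damage,black,1
Ford,sedan,2015,53,,19,no damage,black,0
Ford,SUV,2016,23,,18,low damage,white,1
Toyota,sedan,2018,31,4,22,medium damage,black,0
Toyota,sedan,2016,50,,21,no damage,white,0
Toyota,sedan,2017,nan,,23,medium damage,white,0
Toyota,SUV,2017,43,,21,no damage,white,1
Toyota,SUV,2017,40,,34,no damage,black,1
Ford,sedan,2015,70,,17,no damage,black,0
Ford,SUV,2016,19,,16,low damage,white,1
Toyota,sedan,2018,16,,25,medium damage,black,0
Toyota,sedan,2016,nan,,21,no damage,white,0
Ford,sedan,2018,35,2,17,no damage,white,0
Toyota,SUV,2017,40,2,33,low damage,white,0
Ford,SUV,2018,10,,31,no damage,black,1
Toyota,sedan,2015,62,,18,no damage,white,0
Ford,SUV,2016,29,,18,low damage,black,1
Toyota,sedan,2018,33,,32,medium damage,black,0
"""
SEED = 42  # hypothetical: use the last two digits of your student number

# Compact version of the pre-processing: drop motor, fill millage, one-hot encode.
df = pd.read_csv(io.StringIO(SAMPLE)).drop(columns=["motor"])
df["millage"] = df["millage"].fillna(df["millage"].mean())
df_firstname_numeric = pd.get_dummies(
    df, columns=["model", "type", "damage", "color"], dtype=int
)

# "stolen" is the class attribute; everything else is a feature.
X = df_firstname_numeric.drop(columns=["stolen"])
y = df_firstname_numeric["stolen"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=SEED
)

model_firstname = LogisticRegression(solver="lbfgs", max_iter=1400)
model_firstname.fit(X_train, y_train)

# 10-fold cross validation on the training set; report the mean accuracy.
cv_scores = cross_val_score(model_firstname, X_train, y_train, cv=10, scoring="accuracy")
print(cv_scores.mean())

# Evaluate on the held-out 40%.
y_pred = model_firstname.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
test_confusion = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(test_accuracy)
print(test_confusion)
```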
Analysis (6 marks)
In the below box answer the following three questions, number your responses based on the question numbers:
What are the key highlights of the original dataset you loaded?
After carrying out the pre-processing steps in point 3 above, what is the new number of columns and what is the new number of rows?
Looking at the confusion matrix you generated in point 4.6, what are the key findings? (Hint: think in terms of precision, recall, true negatives, ...)
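For the confusion-matrix question, a small helper like the one below may help turn the four cells into precision, recall and accuracy. It assumes the layout that sklearn's confusion_matrix uses for labels [0, 1], namely [[TN, FP], [FN, TP]] with rows as true classes and columns as predictions. The function name and the example counts are hypothetical and are not results from AA3.

```python
def summarize_confusion(cm):
    """Return precision, recall and accuracy for a 2x2 matrix [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    # Precision: of the vehicles predicted stolen, how many really were stolen.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the vehicles really stolen, how many the model caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "true_negatives": tn,
        "false_positives": fp,
        "false_negatives": fn,
        "true_positives": tp,
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only.
example_matrix = [[20, 4], [2, 10]]
summary = summarize_confusion(example_matrix)
print(summary)
```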
AA3 CONTENT
model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Toyota,sedan,2017,30,,21,medium damage,white,0
Toyota,SUV,2017,36,,34,no damage,black,1
Ford,SUV,2017,43,,32,no damage,black,1
Ford,sedan,2015,nan,,16,no damage,black,0
Ford,SUV,2016,21,,25,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
Toyota,sedan,2016,55,,17,no damage,white,0
Ford,sedan,2018,35,4,33,no damage,white,0
Toyota,SUV,2017,43,,31,low damage,white,1
Ford,SUV,2018,12,,46,no damage,black,1
Ford,sedan,2015,63,,18,no damage,black,0
Ford,SUV,2016,22,,29,no damage,black,1
Toyota,sedan,2018,34,2,27,medium damage,black,0
Toyota,sedan,2016,32,,25,no damage,white,0
Ford,sedan,2015,48,,17,no damage,white,0
Toyota,SUV,2017,38,,34,low damage,white,0
Toyota,SUV,2017,23,4,37,no damage,black,1
Ford,sedan,2015,56,,18,medium damage,black,0
Ford,SUV,2016,30,,25,no damage,black,1
Toyota,sedan,2018,18,,22,low damage,white,0
Toyota,sedan,2016,68,2,18,no damage,white,0
Ford,sedan,2018,35,,27,no damage,white,0
Toyota,SUV,2017,41,,36,no damage,white,1
Ford,SUV,2018,12,,39,no damage,black,1
Ford,sedan,2015,53,,19,no damage,black,0
Ford,SUV,2016,23,,18,low damage,white,1
Toyota,sedan,2018,31,4,22,medium damage,black,0
Toyota,sedan,2016,50,,21,no damage,white,0
Toyota,sedan,2017,nan,,23,medium damage,white,0
Toyota,SUV,2017,43,,21,no damage,white,1
Toyota,SUV,2017,40,,34,no damage,black,1
Ford,sedan,2015,70,,17,no damage,black,0
Ford,SUV,2016,19,,16,low damage,white,1
Toyota,sedan,2018,16,,25,medium damage,black,0
Toyota,sedan,2016,nan,,21,no damage,white,0
Ford,sedan,2018,35,2,17,no damage,white,0
Toyota,SUV,2017,40,2,33,low damage,white,0
Ford,SUV,2018,10,,31,no damage,black,1
Toyota,sedan,2015,62,,18,no damage,white,0
Ford,SUV,2016,29,,18,low damage,black,1
Toyota,sedan,2018,33,,32,medium damage,black,0
Toyota,sedan,2016,35,,23,no damage,white,0
Ford,sedan,2017,32,,21,no damage,white,0
Toyota,SUV,2017,39,,28,low damage,white,1
Toyota,SUV,2017,38,,34,no damage,black,1
Ford,sedan,2015,59,,23,medium damage,black,0
Ford,SUV,2016,19,,25,low damage,white,1
Toyota,sedan,2018,14,,21,medium damage,black,0
Toyota,sedan,2016,54,,23,no damage,white,0
Ford,sedan,2018,36,,27,no damage,white,0
Toyota,SUV,2017,39,,28,no damage,white,0
Ford,SUV,2018,12,,36,no damage,black,0
Toyota,SUV,2017,45,,25,low damage,white,1
Toyota,SUV,2017,27,,26,no damage,black,1
Ford,sedan,2015,60,,14,medium damage,black,0
Ford,SUV,2016,20,,26,low damage,white,1
Toyota,sedan,2018,13,,21,medium damage,black,0
Toyota,sedan,2016,57,,19,low damage,white,0
Ford,sedan,2018,nan,,23,no damage,white,0
Toyota,SUV,2017,41,,33,medium damage,black,1
Ford,SUV,2018,10,,38,no damage,black,1
Toyota,sedan,2016,51,,18,low damage,white,0
Ford,sedan,2018,35,,25,no damage,white,0
Toyota,SUV,2017,40,,32,low damage,white,1
Ford,SUV,2018,nan,,31,no damage,black,1
Ford,sedan,2015,67,,16,no damage,white,0
Ford,SUV,2016,20,,31,low damage,black,1
Toyota,sedan,2018,15,,22,medium damage,black,0
Toyota,sedan,2016,55,,19,no damage,white,0
Ford,sedan,2018,32,4,29,no damage,white,0
Toyota,SUV,2018,14,,27,medium damage,black,1
Toyota,sedan,2016,55,2,25,no damage,white,0
Ford,sedan,2018,35,,28,no damage,white,0
Toyota,sedan,2018,17,,34,medium damage,black,0
Toyota,sedan,2016,55,,37,no damage,white,0
Ford,SUV,2018,35,4,19,no damage,black,1
Toyota,sedan,2018,15,,25,medium damage,black,0
Toyota,sedan,2016,45,2,20,no damage,white,0
Ford,SUV,2018,35,,21,no damage,white,0
Toyota,sedan,2018,15,,24,medium damage,black,0
Toyota,SUV,2016,55,2,23,no damage,white,0
Ford,sedan,2018,35,,38,no damage,white,0
Ford,sedan,2017,30,,22,no damage,white,0
Toyota,SUV,2017,40,,25,low damage,white,0
Toyota,SUV,2017,39,4,29,no damage,black,1
Ford,sedan,2015,67,,17,medium damage,black,0
Ford,SUV,2016,40,,27,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
Toyota,SUV,2016,55,2,34,no damage,white,0
Ford,sedan,2018,35,,24,no damage,white,0
Toyota,SUV,2017,40,,31,no damage,white,0
Ford,SUV,2018,10,,34,no damage,black,1
Toyota,SUV,2017,40,4,30,no damage,black,0
Ford,sedan,2015,75,,17,medium damage,black,0
Ford,SUV,2016,41,,31,low damage,white,0
Ford,sedan,2018,12,,26,low damage,black,1
Ford,SUV,2015,72,,22,low damage,black,0
Toyota,sedan,2016,66,,20,low damage,black,0
Ford,SUV,2017,35,,36,no damage,black,0
Toyota,SUV,2017,40,,31,no damage,black,1
Toyota,sedan,2016,45,,22,low damage,black,0
Toyota,sedan,2016,55,,21,low damage,white,0
Ford,SUV,2017,42,,34,no damage,white,1