Question

Download the CSV file named AA3. Write a script in Python (you can use any IDE) to load the CSV file AA3 into a pandas data frame, and name the frame df_firstname (where firstname is your first name). In your script, carry out the following, and then answer the last set of questions in point 5 (Analysis) in the HTML box:

(Note: Once your script is ready please attach the python script to this question by clicking the "Add file" button and then follow the notes to upload your script).

Explore the data (6 marks)

Print the names of columns

Print the types of columns

Print the unique values in each column.

Print the statistics (count, min, mean, standard deviation, 1st quartile, median, 3rd quartile, max) of all the numeric columns (use one command).

Print the first three records.

Print a summary of all missing values in all columns (use one command).

Print the total number (count) of each unique value in the following categorical columns:

Model

Color
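The exploration steps above could be sketched roughly as follows. This is a minimal sketch, not a reference answer: it assumes the first ten records of the AA3 content are inlined through `io.StringIO` so the snippet runs without the file (with the real file, `pd.read_csv("AA3.csv")` would be the likely call, where the filename is an assumption). Note that the columns in the data are lowercase (`model`, `color`).

```python
import io

import pandas as pd

# Assumption: the first ten AA3 records, inlined so the sketch is self-contained.
SAMPLE_CSV = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Toyota,sedan,2017,30,,21,medium damage,white,0
Toyota,SUV,2017,36,,34,no damage,black,1
Ford,SUV,2017,43,,32,no damage,black,1
Ford,sedan,2015,nan,,16,no damage,black,0
Ford,SUV,2016,21,,25,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
"""

# Empty fields and the literal text "nan" are both read as missing values.
df_firstname = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df_firstname.columns.tolist())      # names of columns
print(df_firstname.dtypes)                # types of columns

for column in df_firstname.columns:       # unique values in each column
    print(column, df_firstname[column].unique())

print(df_firstname.describe())            # count, mean, std, min, quartiles, max
print(df_firstname.head(3))               # first three records
print(df_firstname.isnull().sum())        # missing values per column

print(df_firstname["model"].value_counts())
print(df_firstname["color"].value_counts())
```

`describe()` reports only numeric columns by default, which is why a single call covers the statistics item.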

Visualize the data (10 marks)

Plot a histogram of the millage column using 10 bins; name the x and y axes appropriately and give the plot the title "firstname_millage".

Create a scatterplot showing "millage" versus "value"; name the x and y axes appropriately and give the plot the title "firstname_millage_scatter".

Plot a "scatter matrix" showing the relationships between all columns of the dataset; on the diagonal of the matrix, plot the kernel density function.

Create a "boxplot" for the value column; name the x and y axes appropriately and give the plot the title "firstname_box_value".

Create a "bar chart" of stolen vehicles by type of vehicle, i.e. for each type plot two bars, one in red showing the total stolen and one in blue showing the total not stolen; name the x and y axes appropriately and give the plot the title "firstname_stolen_by_type".
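One possible way to produce the five plots with matplotlib and pandas is sketched below. It assumes the same ten inlined AA3 records as a stand-in for the file, a headless `Agg` backend so it runs without a display, and axis labels that are only suggestions. The kernel density diagonal of `scatter_matrix` relies on SciPy being installed.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

# Assumption: the first ten AA3 records, inlined so the sketch is self-contained.
SAMPLE_CSV = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Toyota,sedan,2017,30,,21,medium damage,white,0
Toyota,SUV,2017,36,,34,no damage,black,1
Ford,SUV,2017,43,,32,no damage,black,1
Ford,sedan,2015,nan,,16,no damage,black,0
Ford,SUV,2016,21,,25,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
"""
df_firstname = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Histogram of millage with 10 bins (missing values dropped first).
fig_hist, ax_hist = plt.subplots()
ax_hist.hist(df_firstname["millage"].dropna(), bins=10)
ax_hist.set_xlabel("millage")
ax_hist.set_ylabel("number of vehicles")
ax_hist.set_title("firstname_millage")

# Scatter plot of millage versus value.
fig_scatter, ax_scatter = plt.subplots()
ax_scatter.scatter(df_firstname["millage"], df_firstname["value"])
ax_scatter.set_xlabel("millage")
ax_scatter.set_ylabel("value")
ax_scatter.set_title("firstname_millage_scatter")

# Scatter matrix of the numeric columns with kernel density on the diagonal.
axes_matrix = scatter_matrix(df_firstname, diagonal="kde", figsize=(10, 10))

# Box plot of the value column, drawn on its own axes.
fig_box, ax_box = plt.subplots()
df_firstname.boxplot(column="value", ax=ax_box)
ax_box.set_xlabel("column")
ax_box.set_ylabel("value")
ax_box.set_title("firstname_box_value")

# Bar chart: per type, blue = not stolen (0), red = stolen (1).
stolen_by_type = pd.crosstab(df_firstname["type"], df_firstname["stolen"])
ax_bar = stolen_by_type.plot(kind="bar", color=["blue", "red"])
ax_bar.legend(["not stolen", "stolen"])
ax_bar.set_xlabel("type of vehicle")
ax_bar.set_ylabel("number of vehicles")
ax_bar.set_title("firstname_stolen_by_type")

# In an interactive session, plt.show() would display the figures.
```

`scatter_matrix` silently ignores the text columns, so "all columns" in practice means all numeric columns unless the categorical ones are encoded first.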

Pre-process the data (8 marks)

Properly remove (drop) the column with the most missing values. (Hint: make sure you review and set the right arguments.)

Replace the missing values in the "millage" column with the mean of that column.

Check that there are no missing values.

Convert all the categorical columns into numeric values and drop/delete the original columns. (Hint: use get_dummies.)

Make sure your new data frame is completely numeric, name it df_firstname_numeric.
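The pre-processing steps above might look something like this. It is a sketch under assumptions: the same ten inlined AA3 records stand in for the file, and the column with the most missing values is found programmatically rather than hard-coded (in this sample it turns out to be `motor`).

```python
import io

import pandas as pd

# Assumption: the first ten AA3 records, inlined so the sketch is self-contained.
SAMPLE_CSV = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Toyota,sedan,2017,30,,21,medium damage,white,0
Toyota,SUV,2017,36,,34,no damage,black,1
Ford,SUV,2017,43,,32,no damage,black,1
Ford,sedan,2015,nan,,16,no damage,black,0
Ford,SUV,2016,21,,25,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
"""
df_firstname = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Drop the column with the most missing values (column-wise, not row-wise).
most_missing = df_firstname.isnull().sum().idxmax()
df_firstname = df_firstname.drop(columns=[most_missing])

# Fill the remaining gaps in millage with the column mean.
df_firstname["millage"] = df_firstname["millage"].fillna(df_firstname["millage"].mean())

# Check that no missing values are left.
print(df_firstname.isnull().sum())

# One-hot encode every non-numeric column; get_dummies removes the originals.
categorical_columns = list(df_firstname.select_dtypes(exclude="number").columns)
df_firstname_numeric = pd.get_dummies(df_firstname, columns=categorical_columns, dtype=int)

print(df_firstname_numeric.dtypes)
print(df_firstname_numeric.shape)
```

Passing `columns=` explicitly to `drop` avoids the axis ambiguity the hint alludes to, and `dtype=int` keeps the dummy columns as 0/1 integers rather than booleans.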

Build a model and validate (10 marks)

Build a predictive model, namely a logistic regression classifier, using sklearn. Take the following into consideration:

Name the model model_firstname (where firstname is your firstname).

The class attribute is "stolen".

Use "train_test_split" from sklearn to split your data 60% for training and 40% for testing. Set the random seed to be the last two digits of your student number.

Fit your training data to the logistic regression model you defined in points 1 & 2 above, using solver='lbfgs' and max_iter=1400 as arguments.

Validate the model on your training set using k-fold cross validation, set the number of folds to 10 and print the mean of the "accuracy" for all ten runs.

Use the model you created using the training data to test the 40% testing data, and print:

The accuracy of the test.

The confusion matrix.
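A rough sketch of the modelling steps follows. Several things here are assumptions: only the first 46 AA3 records are inlined (so the numbers it prints are not the results for the full file), the seed `42` is a placeholder for the last two digits of a student number, and the pre-processing is condensed from the earlier steps.

```python
import io

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

# Assumption: the first 46 AA3 records, inlined so the sketch is self-contained.
SAMPLE_CSV = """model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Toyota,sedan,2017,30,,21,medium damage,white,0
Toyota,SUV,2017,36,,34,no damage,black,1
Ford,SUV,2017,43,,32,no damage,black,1
Ford,sedan,2015,nan,,16,no damage,black,0
Ford,SUV,2016,21,,25,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
Toyota,sedan,2016,55,,17,no damage,white,0
Ford,sedan,2018,35,4,33,no damage,white,0
Toyota,SUV,2017,43,,31,low damage,white,1
Ford,SUV,2018,12,,46,no damage,black,1
Ford,sedan,2015,63,,18,no damage,black,0
Ford,SUV,2016,22,,29,no damage,black,1
Toyota,sedan,2018,34,2,27,medium damage,black,0
Toyota,sedan,2016,32,,25,no damage,white,0
Ford,sedan,2015,48,,17,no damage,white,0
Toyota,SUV,2017,38,,34,low damage,white,0
Toyota,SUV,2017,23,4,37,no damage,black,1
Ford,sedan,2015,56,,18,medium damage,black,0
Ford,SUV,2016,30,,25,no damage,black,1
Toyota,sedan,2018,18,,22,low damage,white,0
Toyota,sedan,2016,68,2,18,no damage,white,0
Ford,sedan,2018,35,,27,no damage,white,0
Toyota,SUV,2017,41,,36,no damage,white,1
Ford,SUV,2018,12,,39,no damage,black,1
Ford,sedan,2015,53,,19,no damage,black,0
Ford,SUV,2016,23,,18,low damage,white,1
Toyota,sedan,2018,31,4,22,medium damage,black,0
Toyota,sedan,2016,50,,21,no damage,white,0
Toyota,sedan,2017,nan,,23,medium damage,white,0
Toyota,SUV,2017,43,,21,no damage,white,1
Toyota,SUV,2017,40,,34,no damage,black,1
Ford,sedan,2015,70,,17,no damage,black,0
Ford,SUV,2016,19,,16,low damage,white,1
Toyota,sedan,2018,16,,25,medium damage,black,0
Toyota,sedan,2016,nan,,21,no damage,white,0
Ford,sedan,2018,35,2,17,no damage,white,0
Toyota,SUV,2017,40,2,33,low damage,white,0
Ford,SUV,2018,10,,31,no damage,black,1
Toyota,sedan,2015,62,,18,no damage,white,0
Ford,SUV,2016,29,,18,low damage,black,1
Toyota,sedan,2018,33,,32,medium damage,black,0
Toyota,sedan,2016,35,,23,no damage,white,0
"""
df = pd.read_csv(io.StringIO(SAMPLE_CSV)).drop(columns=["motor"])
df["millage"] = df["millage"].fillna(df["millage"].mean())
categorical_columns = list(df.select_dtypes(exclude="number").columns)
df_firstname_numeric = pd.get_dummies(df, columns=categorical_columns, dtype=int)

# The class attribute is "stolen"; everything else is a feature.
X = df_firstname_numeric.drop(columns=["stolen"])
y = df_firstname_numeric["stolen"]

SEED = 42  # hypothetical: replace with the last two digits of the student number
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=SEED
)

model_firstname = LogisticRegression(solver="lbfgs", max_iter=1400)
model_firstname.fit(X_train, y_train)

# 10-fold cross validation on the training set only.
scores = cross_val_score(model_firstname, X_train, y_train, cv=10, scoring="accuracy")
print("mean CV accuracy:", scores.mean())

# Evaluate on the held-out 40%.
y_pred = model_firstname.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
test_confusion = confusion_matrix(y_test, y_pred, labels=[0, 1])
print("test accuracy:", test_accuracy)
print(test_confusion)
```

On a subset this small, scikit-learn may warn that a class has fewer members than the number of folds; that warning should be less likely with the full AA3 file.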

Analysis (6 marks)

In the below box answer the following three questions, number your responses based on the question numbers:

What are the key highlights of the original dataset you loaded?

After carrying out the pre-processing steps in point 3 above, what is the new number of columns and what is the new number of rows?

Looking at the confusion matrix you generated in point 4.6, what are the key findings? (Hint: think in terms of precision, recall, true negatives, etc.)
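To make the hint concrete, the quantities it names can be read off a 2x2 confusion matrix as sketched below. The helper name `summarize_confusion` and the example counts are hypothetical and chosen only for illustration; they are not results from the AA3 data. The layout `[[TN, FP], [FN, TP]]` is the scikit-learn convention when the labels are ordered `[0, 1]`.

```python
import numpy as np


def summarize_confusion(cm):
    """Derive common metrics from a 2x2 matrix laid out as [[TN, FP], [FN, TP]]."""
    tn, fp, fn, tp = np.asarray(cm).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),    # of predicted stolen, share truly stolen
        "recall": tp / (tp + fn),       # of truly stolen, share that was caught
        "specificity": tn / (tn + fp),  # of not stolen, share correctly cleared
    }


# Hypothetical counts for illustration only, not results from AA3.
example_cm = [[25, 2], [3, 13]]
metrics = summarize_confusion(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

If the model never predicts the positive class, `tp + fp` is zero and precision is undefined, which is worth stating explicitly in the analysis rather than reporting a number.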

AA3 CONTENT

model,type,year,millage,motor,value,damage,color,stolen
Ford,sedan,2015,64,2,16,no damage,white,0
Ford,SUV,2016,22,4,23,low damage,black,1
Toyota,sedan,2018,28,2,25,medium damage,black,0
Toyota,sedan,2016,52,,19,no damage,white,0
Toyota,sedan,2017,30,,21,medium damage,white,0
Toyota,SUV,2017,36,,34,no damage,black,1
Ford,SUV,2017,43,,32,no damage,black,1
Ford,sedan,2015,nan,,16,no damage,black,0
Ford,SUV,2016,21,,25,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
Toyota,sedan,2016,55,,17,no damage,white,0
Ford,sedan,2018,35,4,33,no damage,white,0
Toyota,SUV,2017,43,,31,low damage,white,1
Ford,SUV,2018,12,,46,no damage,black,1
Ford,sedan,2015,63,,18,no damage,black,0
Ford,SUV,2016,22,,29,no damage,black,1
Toyota,sedan,2018,34,2,27,medium damage,black,0
Toyota,sedan,2016,32,,25,no damage,white,0
Ford,sedan,2015,48,,17,no damage,white,0
Toyota,SUV,2017,38,,34,low damage,white,0
Toyota,SUV,2017,23,4,37,no damage,black,1
Ford,sedan,2015,56,,18,medium damage,black,0
Ford,SUV,2016,30,,25,no damage,black,1
Toyota,sedan,2018,18,,22,low damage,white,0
Toyota,sedan,2016,68,2,18,no damage,white,0
Ford,sedan,2018,35,,27,no damage,white,0
Toyota,SUV,2017,41,,36,no damage,white,1
Ford,SUV,2018,12,,39,no damage,black,1
Ford,sedan,2015,53,,19,no damage,black,0
Ford,SUV,2016,23,,18,low damage,white,1
Toyota,sedan,2018,31,4,22,medium damage,black,0
Toyota,sedan,2016,50,,21,no damage,white,0
Toyota,sedan,2017,nan,,23,medium damage,white,0
Toyota,SUV,2017,43,,21,no damage,white,1
Toyota,SUV,2017,40,,34,no damage,black,1
Ford,sedan,2015,70,,17,no damage,black,0
Ford,SUV,2016,19,,16,low damage,white,1
Toyota,sedan,2018,16,,25,medium damage,black,0
Toyota,sedan,2016,nan,,21,no damage,white,0
Ford,sedan,2018,35,2,17,no damage,white,0
Toyota,SUV,2017,40,2,33,low damage,white,0
Ford,SUV,2018,10,,31,no damage,black,1
Toyota,sedan,2015,62,,18,no damage,white,0
Ford,SUV,2016,29,,18,low damage,black,1
Toyota,sedan,2018,33,,32,medium damage,black,0
Toyota,sedan,2016,35,,23,no damage,white,0
Ford,sedan,2017,32,,21,no damage,white,0
Toyota,SUV,2017,39,,28,low damage,white,1
Toyota,SUV,2017,38,,34,no damage,black,1
Ford,sedan,2015,59,,23,medium damage,black,0
Ford,SUV,2016,19,,25,low damage,white,1
Toyota,sedan,2018,14,,21,medium damage,black,0
Toyota,sedan,2016,54,,23,no damage,white,0
Ford,sedan,2018,36,,27,no damage,white,0
Toyota,SUV,2017,39,,28,no damage,white,0
Ford,SUV,2018,12,,36,no damage,black,0
Toyota,SUV,2017,45,,25,low damage,white,1
Toyota,SUV,2017,27,,26,no damage,black,1
Ford,sedan,2015,60,,14,medium damage,black,0
Ford,SUV,2016,20,,26,low damage,white,1
Toyota,sedan,2018,13,,21,medium damage,black,0
Toyota,sedan,2016,57,,19,low damage,white,0
Ford,sedan,2018,nan,,23,no damage,white,0
Toyota,SUV,2017,41,,33,medium damage,black,1
Ford,SUV,2018,10,,38,no damage,black,1
Toyota,sedan,2016,51,,18,low damage,white,0
Ford,sedan,2018,35,,25,no damage,white,0
Toyota,SUV,2017,40,,32,low damage,white,1
Ford,SUV,2018,nan,,31,no damage,black,1
Ford,sedan,2015,67,,16,no damage,white,0
Ford,SUV,2016,20,,31,low damage,black,1
Toyota,sedan,2018,15,,22,medium damage,black,0
Toyota,sedan,2016,55,,19,no damage,white,0
Ford,sedan,2018,32,4,29,no damage,white,0
Toyota,SUV,2018,14,,27,medium damage,black,1
Toyota,sedan,2016,55,2,25,no damage,white,0
Ford,sedan,2018,35,,28,no damage,white,0
Toyota,sedan,2018,17,,34,medium damage,black,0
Toyota,sedan,2016,55,,37,no damage,white,0
Ford,SUV,2018,35,4,19,no damage,black,1
Toyota,sedan,2018,15,,25,medium damage,black,0
Toyota,sedan,2016,45,2,20,no damage,white,0
Ford,SUV,2018,35,,21,no damage,white,0
Toyota,sedan,2018,15,,24,medium damage,black,0
Toyota,SUV,2016,55,2,23,no damage,white,0
Ford,sedan,2018,35,,38,no damage,white,0
Ford,sedan,2017,30,,22,no damage,white,0
Toyota,SUV,2017,40,,25,low damage,white,0
Toyota,SUV,2017,39,4,29,no damage,black,1
Ford,sedan,2015,67,,17,medium damage,black,0
Ford,SUV,2016,40,,27,low damage,white,1
Toyota,sedan,2018,15,,21,medium damage,black,0
Toyota,SUV,2016,55,2,34,no damage,white,0
Ford,sedan,2018,35,,24,no damage,white,0
Toyota,SUV,2017,40,,31,no damage,white,0
Ford,SUV,2018,10,,34,no damage,black,1
Toyota,SUV,2017,40,4,30,no damage,black,0
Ford,sedan,2015,75,,17,medium damage,black,0
Ford,SUV,2016,41,,31,low damage,white,0
Ford,sedan,2018,12,,26,low damage,black,1
Ford,SUV,2015,72,,22,low damage,black,0
Toyota,sedan,2016,66,,20,low damage,black,0
Ford,SUV,2017,35,,36,no damage,black,0
Toyota,SUV,2017,40,,31,no damage,black,1
Toyota,sedan,2016,45,,22,low damage,black,0
Toyota,sedan,2016,55,,21,low damage,white,0
Ford,SUV,2017,42,,34,no damage,white,1
