
Question


file:///C:/Users/14093/OneDrive/credit2.pdf Use RStudio and give 3 sentences explaining each step of the code. This assignment asks you to examine the Support Vector Machine (SVM) for classification. Provide your answers to the questions in a Word document named Assign4_LastName.doc, along with your source code saved as Assign4_LastName, and click the title link to upload and submit them.
The attached dataset contains German credit data. To apply Support Vector Machines, the data requires preprocessing, such as data type transformation and normalization. For data type transformation, we mainly factor the categorical variables, converting their data type from numeric to factor. For example:
    # data preprocessing
    # data type transformation - factoring
    to.Factor <- function(df, variables) {
      for (variable in variables) {
        df[[variable]] <- as.factor(df[[variable]])
      }
      return(df)
    }
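As a rough illustration of how to.Factor might be called, the sketch below applies it to a tiny made-up data frame. The column names credit.rating and account.balance are assumptions chosen for the example; substitute the categorical columns that actually appear in your copy of the credit data.

```r
# Helper from the assignment: convert the named columns to factors.
to.Factor <- function(df, variables) {
  for (variable in variables) {
    df[[variable]] <- as.factor(df[[variable]])
  }
  return(df)
}

# Toy stand-in for the credit data. The column names are hypothetical.
toy <- data.frame(
  credit.rating   = c(1, 0, 1, 1),
  account.balance = c(1, 2, 3, 1),
  age             = c(23, 45, 31, 52)
)

# Only the categorical columns are passed; age stays numeric.
categorical.vars <- c("credit.rating", "account.balance")
toy <- to.Factor(toy, categorical.vars)

# Inspect the resulting column types.
str(toy)
```

Calling str() afterwards is a quick way to confirm that only the intended columns changed type.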
There are several numeric variables, including credit.amount, age, and credit.duration.months. Please check their distributions using histograms. If the distributions are skewed, normalize the data. One possible way is z-normalization, as follows:
    # Normalization - scaling
    scale.features <- function(df, variables) {
      for (variable in variables) {
        df[[variable]] <- scale(df[[variable]], center = TRUE, scale = TRUE)
      }
      return(df)
    }
You can pass some variables to the above functions to transform and normalize the data, such as the following example:
    # Normalize variables
    numeric.var <- c("your variables")
    yourData <- scale.features(yourData, numeric.var)
You can apply a similar method to transform your data.
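The histogram check and the z-normalization step can be sketched together as follows. The data here are simulated stand-ins with the three variable names from the assignment, not the real credit data, and the as.numeric() wrapper is a small addition of this sketch so that each scaled column stays a plain numeric vector rather than a one-column matrix.

```r
# z-normalization helper, following the assignment's scale.features.
# as.numeric() is an addition in this sketch (scale() returns a matrix).
scale.features <- function(df, variables) {
  for (variable in variables) {
    df[[variable]] <- as.numeric(scale(df[[variable]], center = TRUE, scale = TRUE))
  }
  return(df)
}

set.seed(1)
# Simulated, right-skewed stand-ins for the three numeric variables.
yourData <- data.frame(
  credit.amount          = rlnorm(200, meanlog = 8, sdlog = 0.8),
  age                    = 19 + rgamma(200, shape = 2, scale = 8),
  credit.duration.months = round(rexp(200, rate = 1 / 20)) + 4
)
numeric.var <- c("credit.amount", "age", "credit.duration.months")

scaled <- scale.features(yourData, numeric.var)

# Top row: raw histograms. Bottom row: histograms after scaling.
op <- par(mfrow = c(2, 3))
for (v in numeric.var) hist(yourData[[v]], main = paste(v, "(raw)"), xlab = v)
for (v in numeric.var) hist(scaled[[v]], main = paste(v, "(scaled)"), xlab = v)
par(op)

# After scaling, each column should have mean ~0 and standard deviation 1.
round(sapply(scaled[numeric.var], mean), 10)
sapply(scaled[numeric.var], sd)
```

Because z-normalization is a linear rescaling, the shape of each histogram stays the same and only the axis changes. If the goal were to reduce skew itself, a different transformation would be needed, which goes beyond what the assignment describes.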
Once the preprocessing is completed, partition the data randomly into training and testing sets using a 6:4 ratio. Use the training set to fit a model and the testing set to assess the model performance.
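One common way to do the random 6:4 partition in base R is to sample row indices, as in the minimal sketch below. The data frame is a placeholder, and the seed value is arbitrary; it is set only so the split is reproducible.

```r
set.seed(42)  # arbitrary seed, for a reproducible split

# Placeholder data frame standing in for the preprocessed credit data.
yourData <- data.frame(id = 1:1000, x = rnorm(1000))

n <- nrow(yourData)

# Draw 60% of the row indices at random, without replacement.
train.idx <- sample(seq_len(n), size = round(0.6 * n))

train.data <- yourData[train.idx, ]   # 60% used to fit the model
test.data  <- yourData[-train.idx, ]  # remaining 40% used for evaluation

c(train = nrow(train.data), test = nrow(test.data))
```

Negative indexing with -train.idx guarantees that no row lands in both sets.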
Develop a model using the various techniques you've learned from the lecture, and propose and explain the best model.
(a) There are three numeric variables: credit duration (months), credit amount, and age. Plot a histogram of each variable. (20 pts)
(b) Normalize the above variables, then plot their histograms again. (20 pts)
(c) Normalize and factorize the appropriate variables, and split the data in a 6:4 ratio for train:test. Using the training set, create an SVM model with svm from the e1071 package. Test the generated model on the test dataset, and explain the results of the trained model and the comparison of the original data with the model's predictions using pred. (20 pts)
(d) Repeat (c), but with a linear kernel and a non-linear kernel. (40 pts)
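The SVM fitting and kernel comparison steps might look roughly like the sketch below. It assumes the e1071 package is installed, uses simulated data in place of the real credit data, and treats credit.rating as a hypothetical name for the class label. The radial kernel is used here as one example of a non-linear kernel; e1071 also offers "polynomial" and "sigmoid".

```r
library(e1071)  # assumes install.packages("e1071") has already been run

set.seed(123)
n <- 400

# Simulated stand-in for the preprocessed data (already on a z-score scale).
dat <- data.frame(
  credit.amount          = rnorm(n),
  age                    = rnorm(n),
  credit.duration.months = rnorm(n)
)
# Hypothetical class label; it must be a factor for svm() to do classification.
dat$credit.rating <- factor(
  ifelse(dat$credit.amount + dat$credit.duration.months + rnorm(n, sd = 0.5) > 0, 1, 0)
)

# 6:4 train/test split.
train.idx  <- sample(seq_len(n), size = round(0.6 * n))
train.data <- dat[train.idx, ]
test.data  <- dat[-train.idx, ]

# Fit an SVM with the given kernel and score it on the held-out test set.
fit.and.score <- function(kernel) {
  model <- svm(credit.rating ~ ., data = train.data, kernel = kernel)
  pred  <- predict(model, newdata = test.data)
  list(
    confusion = table(predicted = pred, actual = test.data$credit.rating),
    accuracy  = mean(pred == test.data$credit.rating)
  )
}

linear.result <- fit.and.score("linear")
radial.result <- fit.and.score("radial")

linear.result$confusion
radial.result$confusion
c(linear = linear.result$accuracy, radial = radial.result$accuracy)
```

The confusion table puts the predictions in pred next to the true test labels, which is the comparison the assignment asks you to explain. Which kernel performs better depends on the data, so the numbers from this simulated example say nothing about the real credit dataset.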
