Question
Use RStudio and give 3 sentences explaining each step of code. This assignment asks you to examine the Support Vector Machine for classification. Provide your answers to the questions in a Word document named AssignLastName.doc, along with your source code, which should be saved as AssignLastName, and click the title link to upload and submit them.
The attached dataset contains German credit data. To apply Support Vector Machines, the data requires preprocessing, such as data type transformation and normalization. The data type transformation consists mainly of factoring the categorical variables, that is, converting the categorical features from numeric to factor. For example:
    # Data preprocessing
    # Data type transformation: factoring
    toFactor <- function(df, variables) {
      for (variable in variables) {
        df[[variable]] <- as.factor(df[[variable]])
      }
      return(df)
    }
There are several numeric variables, including credit.amount, age, and credit.duration.months. Check their distributions using histograms. If the distributions are skewed, normalize the data. One possible way is z-normalization, as follows:
    # Normalization: scaling
    scale.features <- function(df, variables) {
      for (variable in variables) {
        df[[variable]] <- scale(df[[variable]], center = T, scale = T)
      }
      return(df)
    }
You can pass some variables to the above functions to transform and normalize the data, such as the following example:
    # Normalize variables
    numeric.var <- c("your variables")
    yourData <- scale.features(yourData, numeric.var)
You can apply a similar method to transform your data.
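As a rough illustration of the preprocessing described above, the sketch below applies both helpers to simulated data and plots the histograms before and after scaling. The data frame, its values, the `credit.rating` column, and the `plot.histograms` helper are all assumptions made for this example; the real German credit file is not reproduced here. The scaling helper also wraps `scale()` in `as.numeric()`, a small variation on the version above, so that each column stays a plain numeric vector.

```r
# Simulated stand-in for the credit data (ASSUMPTION: not the real dataset)
set.seed(42)
n <- 1000
yourData <- data.frame(
  credit.rating          = sample(c(0, 1), n, replace = TRUE),  # hypothetical categorical column
  credit.amount          = rlnorm(n, meanlog = 8, sdlog = 0.8), # right-skewed by construction
  age                    = round(runif(n, 19, 75)),
  credit.duration.months = sample(seq(6, 72, by = 6), n, replace = TRUE)
)

# Convert the named columns to factors
toFactor <- function(df, variables) {
  for (variable in variables) {
    df[[variable]] <- as.factor(df[[variable]])
  }
  return(df)
}

# z-normalize the named columns: subtract the mean, divide by the standard deviation
scale.features <- function(df, variables) {
  for (variable in variables) {
    # as.numeric() drops the one-column matrix that scale() returns
    df[[variable]] <- as.numeric(scale(df[[variable]], center = TRUE, scale = TRUE))
  }
  return(df)
}

# Hypothetical helper: one histogram per variable, side by side
plot.histograms <- function(df, variables, label) {
  old.par <- par(mfrow = c(1, length(variables)))
  on.exit(par(old.par))
  for (variable in variables) {
    hist(df[[variable]], main = paste(variable, label), xlab = variable)
  }
}

categorical.var <- c("credit.rating")
numeric.var <- c("credit.amount", "age", "credit.duration.months")

plot.histograms(yourData, numeric.var, "(raw)")
yourData <- toFactor(yourData, categorical.var)
yourData <- scale.features(yourData, numeric.var)
plot.histograms(yourData, numeric.var, "(scaled)")

str(yourData)
```

Note that z-normalization only shifts and rescales each variable to mean 0 and standard deviation 1. The shape of each histogram stays the same; only the x-axis changes.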
Once the preprocessing is completed, partition the data randomly into training and testing sets using a : ratio. Use the training set to fit a model and the testing set to assess the model performance.
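A random partition of this kind can be sketched with `sample()`. The ratio is not legible in the text above, so `train.fraction` below is a placeholder assumption (0.7) to be replaced by the ratio the assignment specifies. The data frame is simulated for illustration only.

```r
# Simulated stand-in data (ASSUMPTION: replace with the preprocessed credit data)
set.seed(123)
n <- 1000
yourData <- data.frame(id = seq_len(n), x = rnorm(n))

# ASSUMPTION: placeholder fraction; use the ratio given in the assignment
train.fraction <- 0.7

# Draw row indices for the training set without replacement
train.idx <- sample(seq_len(n), size = floor(train.fraction * n))

train.data <- yourData[train.idx, ]   # rows used to fit the model
test.data  <- yourData[-train.idx, ]  # remaining rows, held out for evaluation

cat("train rows:", nrow(train.data), " test rows:", nrow(test.data), "\n")
```

Calling `set.seed()` before `sample()` makes the split reproducible, which helps when comparing models on the same partition.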
Develop a model using the various techniques you've learned from the lecture, then propose and explain the best model.
a. There are three numeric variables: credit duration in months, credit amount, and age. Plot the histograms of these variables. (pts)
b. Normalize the above variables, then plot the histograms of the variables again. (pts)
c. Normalize and factorize the appropriate variables, and split the data in a : ratio for train:test. Using the training set, create an SVM model with svm() from the e1071 package. Test the generated model on the test dataset, and explain the results of the trained model and the comparison of the original dataset and the tested model using pred. (pts)
d. Repeat c, but with a linear kernel and a nonlinear kernel. (pts)
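The model-fitting tasks above could be approached along the following lines. This is a minimal sketch on simulated data, not a solution for the real credit file. The two predictors, the `credit.rating` response, the 0.7 split fraction, and the `fit.and.score` helper are all assumptions for illustration, and the radial kernel is used here as one example of a nonlinear kernel. It relies on `svm()` and its `predict()` method from the e1071 package.

```r
library(e1071)

# Simulated two-class data with a circular class boundary (ASSUMPTION: illustrative only)
set.seed(1)
n <- 400
x1 <- rnorm(n)
x2 <- rnorm(n)
credit.rating <- factor(ifelse(x1^2 + x2^2 + rnorm(n, sd = 0.3) > 1.5, 1, 0))
sim.df <- data.frame(x1, x2, credit.rating)

# Random train/test partition (ASSUMPTION: 0.7 is a placeholder fraction)
train.idx  <- sample(seq_len(n), size = floor(0.7 * n))
train.data <- sim.df[train.idx, ]
test.data  <- sim.df[-train.idx, ]

# Hypothetical helper: fit an SVM with the given kernel and score it on the test set
fit.and.score <- function(kernel) {
  model <- svm(credit.rating ~ ., data = train.data, kernel = kernel)
  pred <- predict(model, newdata = test.data)
  # Confusion matrix: predicted classes against the actual test labels
  confusion <- table(predicted = pred, actual = test.data$credit.rating)
  accuracy <- sum(diag(confusion)) / sum(confusion)
  list(model = model, confusion = confusion, accuracy = accuracy)
}

linear.result <- fit.and.score("linear")
radial.result <- fit.and.score("radial")

print(linear.result$confusion)
print(radial.result$confusion)
cat("linear accuracy:", linear.result$accuracy, "\n")
cat("radial accuracy:", radial.result$accuracy, "\n")
```

Comparing the two confusion matrices shows where each kernel misclassifies. On data whose class boundary is not a straight line, as in this simulation, a nonlinear kernel would be expected to do better, but on the real credit data the comparison has to be made empirically.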