Question
R language
In this assignment, download the train.csv from
https://www.kaggle.com/code/gadigevishalsai/credit-score-classification-eda-classification/data (you do NOT need the test.csv file).
1. Coding:
(a) Read the .csv file into R.
(b) Use the str() function to obtain variable types.
(c) Create a new object that is a subset of this dataset. The subset includes only sample units whose Credit_Score variable equals Poor or Good.
(d) Create a new object that includes 80% of the sample units from 5.1c; this 80% should be randomly sampled.
(e) For a non-numerical variable that you identified in 5.1b, count the number of unique values in this variable.
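One possible way to approach steps (a) through (e) is sketched below. It assumes train.csv has been saved in the current working directory; the path, the object names (credit, poor_good, train80), the seed, and the small stand-in data frame used when the file is missing are all illustrative choices, not part of the assignment. Only the Credit_Score column name and its Poor/Good values come from the question itself.

```r
path <- "train.csv"  # assumed location: the current working directory

if (file.exists(path)) {
  # (a) Read the .csv file, keeping text columns as character vectors
  credit <- read.csv(path, stringsAsFactors = FALSE)
} else {
  # Hypothetical stand-in rows so the sketch still runs without the download
  credit <- data.frame(
    Occupation    = c("Engineer", "Teacher", "Lawyer", "Engineer", "Doctor",
                      "Teacher", "Lawyer", "Doctor", "Engineer", "Teacher"),
    Annual_Income = c(52000, 31000, 88000, 47000, 120000,
                      29000, 76000, 99000, 64000, 35000),
    Credit_Score  = c("Good", "Poor", "Standard", "Poor", "Good",
                      "Standard", "Good", "Poor", "Poor", "Good"),
    stringsAsFactors = FALSE
  )
}

# (b) Show the type R assigned to each variable
str(credit)

# (c) Keep only rows whose Credit_Score is Poor or Good
poor_good <- credit[credit$Credit_Score %in% c("Poor", "Good"), ]

# (d) Randomly sample 80% of those rows, without replacement
set.seed(123)  # arbitrary seed, used only to make the draw reproducible
n_keep  <- floor(0.8 * nrow(poor_good))
train80 <- poor_good[sample(nrow(poor_good), n_keep), ]

# (e) Count unique values in one non-numerical (character) variable;
#     here the first character column found is used as an example
char_cols <- names(credit)[vapply(credit, is.character, logical(1))]
n_unique  <- length(unique(credit[[char_cols[1]]]))
n_unique
```

Rounding the 80% down with floor() is one convention among several; round() or ceiling() would also be defensible, as long as the choice is stated.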
Answer the following questions:
(a) If you were to use all variables in this dataset to explain customers' credit scores (the Credit_Score variable), which variables should NOT be included in the analysis, and why?
(b) Based on the output from 5.1b, which variable(s) have types that went against your expectation (e.g., you thought a variable was categorical, but according to str(), R stored it as numerical)?
(c) From a modeling perspective, what is the potential consequence of treating a numerical variable as a categorical variable? What about treating a categorical variable as a numerical variable?
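For question (c), a small simulation may help make the trade-off concrete. The sketch below uses made-up data (not train.csv), and every variable name in it is hypothetical. It fits the same linear model twice for each case: once with the predictor treated as numeric and once as a factor.

```r
set.seed(1)  # arbitrary seed for reproducibility

# Case 1: a truly numerical predictor with five distinct values
x <- rep(1:5, each = 20)
y <- 2 * x + rnorm(length(x))

fit_num <- lm(y ~ x)          # one slope plus an intercept
fit_cat <- lm(y ~ factor(x))  # one coefficient per level (after the baseline)

n_coef_num <- length(coef(fit_num))
n_coef_cat <- length(coef(fit_cat))
c(numeric = n_coef_num, categorical = n_coef_cat)

# Case 2: a truly categorical predictor coded as arbitrary integers
group       <- rep(c("A", "B", "C"), each = 20)
group_means <- c(A = 5, B = 1, C = 3)  # no natural ordering in the means
z           <- unname(group_means[group]) + rnorm(length(group))
code        <- as.integer(factor(group))  # A = 1, B = 2, C = 3

fit_wrong <- lm(z ~ code)           # imposes order and equal spacing on A, B, C
fit_right <- lm(z ~ factor(group))  # estimates a separate mean per group

r2_wrong <- summary(fit_wrong)$r.squared
r2_right <- summary(fit_right)$r.squared
c(as_numeric = r2_wrong, as_factor = r2_right)
```

In the first case the factor version spends extra parameters and discards the ordering and spacing of the values, which costs degrees of freedom and makes it impossible to predict at values not seen in the data. In the second case the numeric version forces a straight line through arbitrary codes, so the fitted effect depends on how the categories happened to be numbered.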