Question

Task
First, copy the code below to an R script. Enter your student ID into the command set.seed(.)
and run the whole code. The code will create a sub-sample that is unique to you.
Use the str(.) command to check that the data type for each feature is correctly specified.
Address the issue if this is not the case.
You are to clean and perform basic data analysis on the relevant features in mydata, as
well as principal component analysis (PCA) on the continuous variables. This is to be done
using R. You will report on your findings.
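
The PCA step described above can be sketched on toy data as follows. This is a minimal illustration, not the assignment's prescribed method: the columns x1, x2 and x3 are hypothetical stand-ins for the continuous features in mydata, and dropping incomplete rows and scaling to unit variance are assumptions that the brief does not specify.

```r
# Toy data standing in for mydata; x1, x2 and x3 are hypothetical
# continuous features, not columns known to exist in the real file.
set.seed(1)  # arbitrary seed, used only to make the toy data repeatable
toy <- data.frame(
  x1 = rnorm(50),
  x2 = rnorm(50, mean = 10, sd = 5),
  x3 = runif(50),
  Classification = factor(rep(c("Normal", "Malicious"), each = 25))
)
toy$x2[3] <- NA  # introduce one missing value

# Keep only numeric columns; prcomp() cannot handle factors or NA values.
num_cols <- vapply(toy, is.numeric, logical(1))
complete <- toy[complete.cases(toy[, num_cols]), ]

# Centre and scale so that features on different scales contribute equally
# (an assumption; unscaled PCA may be preferred in some settings).
pca <- prcomp(complete[, num_cols], center = TRUE, scale. = TRUE)

summary(pca)   # proportion of variance explained by each component
pca$rotation   # loadings of each feature on each component
head(pca$x)    # scores of the first few observations
```

With scaled inputs the component variances sum to the number of features, which gives a quick sanity check on the result.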
Part 1 Exploratory Data Analysis and Data Cleaning
(i) For each of your categorical or binary variables, determine the number (%) of
instances for each of their categories and summarise them in a table as follows.
State all percentages to 1 decimal place.
Categorical Feature    Category      N (%)
Feature 1              Category 1    10 (10.0%)
                       Category 2    30 (30.0%)
                       Category 3    50 (50.0%)
                       Missing       10 (10.0%)
Feature 2 (Binary)     YES           75 (75.0%)
                       NO            25 (25.0%)
                       Missing       0 (0.0%)
...
Feature k              Category 1    25 (25.0%)
                       Category 2    25 (25.0%)
                       Category 3    15 (15.0%)
                       Category 4    30 (30.0%)
                       Missing       5 (5.0%)
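
One way to produce counts in the "N (%)" layout above is sketched here in base R. It is an illustration under stated assumptions: the toy data frame replaces mydata, the Alert column is hypothetical, summarise_feature() is a helper name invented for this example, and treating NA as the "Missing" category is an assumption about how missing values are coded.

```r
# Toy data standing in for mydata; Alert is a hypothetical feature.
toy <- data.frame(
  Classification = factor(c("Normal", "Malicious", "Normal", "Malicious", "Normal")),
  Alert = factor(c("Low", NA, "High", "Low", "Low"))
)

# Build "N (%)" rows for one categorical feature, counting NA as "Missing".
# summarise_feature() is a helper defined here, not a library function.
summarise_feature <- function(x, feature_name) {
  counts <- table(x, useNA = "always")   # always include a cell for NA
  labels <- names(counts)
  labels[is.na(labels)] <- "Missing"
  pct <- 100 * as.vector(counts) / length(x)
  data.frame(
    Feature  = feature_name,
    Category = labels,
    N_pct    = sprintf("%d (%.1f%%)", as.vector(counts), pct),
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every factor column and stack the results.
is_cat <- vapply(toy, is.factor, logical(1))
summary_table <- do.call(
  rbind,
  lapply(names(toy)[is_cat], function(nm) summarise_feature(toy[[nm]], nm))
)
print(summary_table, row.names = FALSE)
```

Using useNA = "always" keeps a Missing row even when a feature has no missing values, which matches the "Missing 0 (0.0%)" row in the template.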
# Load dplyr, which provides the %>% pipe and filter() used below
library(dplyr)
# You may need to change/include the path of your working directory
dat <- read.csv("HealthCareData_2024.csv", stringsAsFactors = TRUE)
# Separate samples of normal and malicious events
dat.class0 <- dat %>% filter(Classification == "Normal") # normal
dat.class1 <- dat %>% filter(Classification == "Malicious") # malicious
# Randomly select 400 samples from each class, then combine them to form a working dataset
set.seed(Enter your student ID here)
rand.class0 <- dat.class0[sample(1:nrow(dat.class0), size = 400, replace = FALSE), ]
rand.class1 <- dat.class1[sample(1:nrow(dat.class1), size = 400, replace = FALSE), ]
# Your sub-sample of 800 observations
mydata <- rbind(rand.class0, rand.class1)
dim(mydata) # Check the dimension of your sub-sample
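
Once the sub-sample exists, the str(.) check asked for in the task might look like the sketch below. It uses toy data rather than mydata, and both column names (Port_Code, Response_Time) are hypothetical examples of the two most common mismatches: a categorical feature stored as integer codes, and a numeric feature read in as a factor because of a stray text entry.

```r
# Toy data with two deliberately mis-typed, hypothetical columns.
toy <- data.frame(
  Port_Code     = c(1L, 2L, 2L, 3L),              # categorical stored as integer
  Response_Time = c("12.5", "7.1", "n/a", "3.3"), # numeric read as text
  stringsAsFactors = TRUE
)
str(toy)  # Port_Code shows as int, Response_Time as Factor

# Integer codes that represent categories should become a factor.
toy$Port_Code <- factor(toy$Port_Code)

# Convert a factor to numeric via character; as.numeric() applied to a factor
# directly would return the internal level codes instead of the values.
# Entries that cannot be parsed (here "n/a") become NA.
toy$Response_Time <- suppressWarnings(
  as.numeric(as.character(toy$Response_Time))
)
str(toy)  # Port_Code is now Factor, Response_Time is num
```

Which columns actually need converting depends on what str(mydata) reports for the real file.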
