Question

ALL INFORMATION IS PROVIDED! USE R CODE. ANYTHING IN BOLD IS R CODE THAT IS PROVIDED.

##### Problem 1

Consider the business school admission data available in `admission.csv`. The admission officer of a business school has used an "index" of undergraduate grade point average ($X_1$=`GPA`) and graduate management aptitude test ($X_2$=`GMAT`) scores to help decide which applicants should be admitted to the school's graduate programs. This index is used to categorize each applicant into one of three groups - `admit` (group 1), `do not admit` (group 2), and `borderline` (group 3).

1. First let's import the data set using the `read.csv()` function.

```{r}
library(caret) # load the caret package

admData <- read.csv('admission.csv') # change the path to the file if needed
```

2. Now, let's create 10 folds to be used by our models. This is done so that all the models are fit and tested on the same train/test splits of the data.

```{r}
set.seed(123) # for reproducibility of results - don't remove this line

testInd <- createFolds(admData$Group,k=10)
```

For example,

```{r}
testInd[[1]]
# 9 11 14 46 51 59 75 83
```

stores the indices of the test points of the 1st fold.
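As a minimal sketch of how such a fold can be used, the base-R snippet below splits a data frame into a test set (the rows in the fold) and a training set (all other rows). The six rows are copied from the data listing, while the names `sampleData`, `foldIdx`, `testSet`, and `trainSet` and the choice of fold indices are assumptions made purely for illustration.

```r
# Six rows taken from the admission data listing (two per group)
sampleData <- data.frame(
  GPA   = c(2.96, 3.14, 2.54, 2.43, 2.86, 2.85),
  GMAT  = c(596, 473, 446, 425, 494, 496),
  Group = c(1, 1, 2, 2, 3, 3)
)

# Hypothetical fold: row indices to hold out, in the same form as testInd[[1]]
foldIdx <- c(2, 5)

testSet  <- sampleData[foldIdx, ]   # rows inside the fold
trainSet <- sampleData[-foldIdx, ]  # negative indexing drops the fold rows
```

With the real data, the same pattern would read `admData[testInd[[k]], ]` and `admData[-testInd[[k]], ]` for the $k$-th fold.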

a. Using the `train()` function of the `caret` package, fit `Multinomial Logistic Regression` 10 times where each time you exclude the data points from the $k$-th fold during model training. Set your `method` argument to "multinom". Since the response variable `Group` is coded as numeric (with values 1, 2, and 3), convert it into a factor variable using the `as.factor()` function during model fitting. Compute an estimate for the test `Accuracy` (i.e. the accuracy on the test set) using 10-fold cross validation.

b. Repeat part (a) for `LDA` setting the `method` argument to "lda".

c. Repeat part (a) for `QDA` setting the `method` argument to "qda".

d. Repeat part (a) for `Naive Bayes` setting the `method` argument to "nb".

e. Repeat part (a) for `KNN` with $K=1,2,\ldots,10$ and setting the `method` argument to "knn". In this case, standardize your data prior to fitting the `KNN` model. Choose the optimal value of $K$ using 5-fold cross validation (_hint_: set `trControl=trainControl(method='cv',number = 5)` within the `train()` function).

f. In a single table report the **cross validation based** estimates of the test `Accuracy` for each model. Based on the results which model would you recommend?
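One possible way to organize parts (a) through (e) is a small helper that loops over the folds, trains on the rows outside each fold, predicts the rows inside it, and averages the per-fold accuracies. The sketch below is an assumption about how this could be done, not a provided solution: `cvAccuracy` is a hypothetical helper (it is not part of `caret`), and it assumes the response column is named `Group` and that `folds` is a list of test-row indices such as `testInd`.

```r
library(caret)  # provides train(); predict() dispatches on the fitted model

# Hypothetical helper (not part of caret): fold-based estimate of test accuracy.
# Assumes `data` has a response column named Group and `folds` is a list of
# test-row index vectors, e.g. the output of createFolds().
cvAccuracy <- function(data, folds, method, ...) {
  data$Group <- as.factor(data$Group)  # classification needs a factor response
  foldAcc <- sapply(folds, function(idx) {
    # Fit on every row outside the fold; extra arguments go straight to train()
    fit <- train(Group ~ ., data = data[-idx, ], method = method, ...)
    # Predict the held-out rows and compare with their true labels
    pred <- predict(fit, newdata = data[idx, ])
    mean(as.character(pred) == as.character(data$Group[idx]))
  })
  mean(foldAcc)  # simple (unweighted) average over the folds
}
```

Under these assumptions, `cvAccuracy(admData, testInd, 'lda')` would correspond to part (b). For part (e), the extra arguments `preProcess = c('center', 'scale')`, `tuneGrid = data.frame(k = 1:10)`, and `trControl = trainControl(method = 'cv', number = 5)` could be passed through `...` so that standardization and the choice of $K$ happen inside each training set.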

admission.csv

GPA GMAT Group
2.96 596 1
3.14 473 1
3.22 482 1
3.29 527 1
3.69 505 1
3.46 693 1
3.03 626 1
3.19 663 1
3.63 447 1
3.59 588 1
3.3 563 1
3.4 553 1
3.5 572 1
3.78 591 1
3.44 692 1
3.48 528 1
3.47 552 1
3.35 520 1
3.39 543 1
3.28 523 1
3.21 530 1
3.58 564 1
3.33 565 1
3.4 431 1
3.38 605 1
3.26 664 1
3.6 609 1
3.37 559 1
3.8 521 1
3.76 646 1
3.24 467 1
2.54 446 2
2.43 425 2
2.2 474 2
2.36 531 2
2.57 542 2
2.35 406 2
2.51 412 2
2.51 458 2
2.36 399 2
2.36 482 2
2.66 420 2
2.68 414 2
2.48 533 2
2.46 509 2
2.63 504 2
2.44 336 2
2.13 408 2
2.41 469 2
2.55 538 2
2.31 505 2
2.41 489 2
2.19 411 2
2.35 321 2
2.6 394 2
2.55 528 2
2.72 399 2
2.85 381 2
2.9 384 2
2.86 494 3
2.85 496 3
3.14 419 3
3.28 371 3
2.89 447 3
3.15 313 3
3.5 402 3
2.89 485 3
2.8 444 3
3.13 416 3
3.01 471 3
2.79 490 3
2.89 431 3
2.91 446 3
2.75 546 3
2.73 467 3
3.12 463 3
3.08 440 3
3.03 419 3
3 509 3
3.03 438 3
3.05 399 3
2.85 483 3
3.01 453 3
3.03 414 3
3.04 446 3
