Question

Could you help me fix a problem I have? It is a Gaussian Naive Bayes implementation written from scratch in R. The output I get is not what I expect. I am wondering whether the posterior function is correct, and whether I have gone wrong in any of the other steps.

Task and instructions:

Write a function that performs Naive Bayes classification for the iris data. The function will output probability estimates of the species for a test case.

The function will accept three inputs: a row matrix for the x values of the test case, a matrix of x values for the training data, and a vector of class labels for the training data.

The function will create the probability estimates based on the training data it has been provided.

Within the function use a Gaussian model and estimate the mean and standard deviation of the Gaussian populations based on the training data provided. (Hint: You have 24 parameters to estimate: the mean and standard deviation of each of the 4 variables for each of the three species. With the naive assumption, you do not have to estimate any covariances.)
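The parameter estimation described in the hint can be sketched as follows. This is only an illustration, not part of the assignment text: the helper name `estimate_gaussian_params` is hypothetical, and it uses the built-in `iris` data set to show the two 3 x 4 tables (12 means plus 12 standard deviations, 24 parameters in total).

```r
# Minimal sketch (hypothetical helper): per-class mean and standard deviation
# of each feature, i.e. the 24 parameters mentioned in the hint.
estimate_gaussian_params <- function(trainx, trainy) {
  # Split the feature rows by class label: one data frame per species.
  by_class <- split(as.data.frame(trainx), factor(trainy))
  # sapply() yields a features-by-classes matrix; transpose to classes-by-features.
  means <- t(sapply(by_class, colMeans))
  sds   <- t(sapply(by_class, function(d) sapply(d, sd)))
  list(means = means, sds = sds)
}

params <- estimate_gaussian_params(iris[, 1:4], iris$Species)
params$means  # 3 x 4 matrix: one row per species, one column per variable
params$sds    # 3 x 4 matrix of sample standard deviations
```

Because of the naive independence assumption, no covariances are estimated; each variable gets its own univariate Gaussian within each class.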

My code so far:

library(dplyr)

iris_nb <- function(testx, trainx, trainy){
  trainx <- as.data.frame(trainx)
  train <- cbind(trainx, trainy)
  class_virginica <- train[which(train$trainy == 'virginica'),]
  class_setosa <- train[which(train$trainy == 'setosa'),]
  class_versicolor <- train[which(train$trainy == 'versicolor'),]

  posterior <- function(testx, classtype){
    p_Sepal.Length <- dnorm(testx[1], mean(classtype[,1]), sd(classtype[,1]))
    p_Sepal.Width <- dnorm(testx[2], mean(classtype[,2]), sd(classtype[,2]))
    p_Petal.Length <- dnorm(testx[3], mean(classtype[,3]), sd(classtype[,3]))
    p_Petal.Width <- dnorm(testx[4], mean(classtype[,4]), sd(classtype[,4]))
    vec <- (1/3) * p_Sepal.Length * p_Sepal.Width * p_Petal.Length * p_Petal.Width # for each species
    return(vec)
  }

  output <- (c(sum(posterior(testx, class_setosa)),
               sum(posterior(testx, class_versicolor)),
               sum(posterior(testx, class_virginica))))
  names(output) <- c("setosa", "versicolor", "virginica")
  return(output)
}

Test case and my output:

set.seed(1)
training_rows <- sort(c(sample(1:50, 40), sample(51:100, 40), sample(101:150, 40)))
training_x <- as.matrix(iris[training_rows, 1:4])
training_y <- iris[training_rows, 5]

# test cases
test_case_a <- as.matrix(iris[24, 1:4])  # true class setosa
test_case_b <- as.matrix(iris[73, 1:4])  # true class versicolor
test_case_c <- as.matrix(iris[124, 1:4]) # true class virginica

# class predictions of test cases
iris_nb(test_case_a, training_x, training_y)
##        setosa    versicolor     virginica
##  2.943304e-02  3.031270e-15  1.206279e-19
## which is wrong; should be 1 1.029887e-13 4.098385e-18

iris_nb(test_case_b, training_x, training_y)
##        setosa    versicolor     virginica
## 3.368210e-116  1.020970e-01  1.090789e-02
## which is wrong; should be 2.980587e-115 0.9034742 0.09652578

iris_nb(test_case_c, training_x, training_y)
##        setosa    versicolor     virginica
## 5.244160e-137  8.231302e-03  7.804414e-02
## which is wrong; should be 6.078393e-136 0.09540725 0.9045928

However, the output I got does not match the results predicted by naiveBayes() from the package 'e1071'. According to the instructions, they should match, i.e. the results of iris_nb() should be the same as the results of naiveBayes().

The results predicted by naiveBayes(), which are correct, are given below:

library(e1071)
nb_model1 <- naiveBayes(training_x, training_y)

predict(nb_model1, newdata = test_case_a, type = 'raw')
##      setosa   versicolor    virginica
## [1,]      1 1.029887e-13 4.098385e-18

predict(nb_model1, newdata = test_case_b, type = 'raw')
##             setosa versicolor  virginica
## [1,] 2.980587e-115  0.9034742 0.09652578

predict(nb_model1, newdata = test_case_c, type = 'raw')
##             setosa versicolor virginica
## [1,] 6.078393e-136 0.09540725 0.9045928

Could you tell me what the issue with the function iris_nb() is, and how I can fix it?
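One explanation that is consistent with the numbers quoted in the question: iris_nb() returns prior times likelihood for each class, which is only proportional to the posterior. Dividing each value by the sum of the three appears to reproduce the expected results; for test case b, 1.020970e-01 / (3.368210e-116 + 1.020970e-01 + 1.090789e-02) is approximately 0.9035. The sketch below applies that normalization. It is a possible rewrite rather than a confirmed reference solution, and it assumes the prior should be estimated from the class frequencies in the training labels (which equals 1/3 for the balanced sample in the question).

```r
# Sketch of a corrected iris_nb(): same Gaussian likelihoods as before, but the
# per-class scores are normalized so that they sum to 1.
iris_nb <- function(testx, trainx, trainy) {
  testx  <- as.numeric(testx)   # 1 x p row matrix -> plain numeric vector
  trainx <- as.matrix(trainx)
  trainy <- factor(trainy)

  # Unnormalized score per class: prior * product of per-feature densities.
  joint <- sapply(levels(trainy), function(cl) {
    rows  <- trainx[trainy == cl, , drop = FALSE]
    prior <- nrow(rows) / nrow(trainx)   # assumption: prior = class frequency
    prior * prod(dnorm(testx, mean = colMeans(rows), sd = apply(rows, 2, sd)))
  })

  # Dividing by the total turns the scores into posterior probabilities.
  joint / sum(joint)
}

# Example call on the built-in data (trained on all 150 rows for brevity).
p <- iris_nb(as.matrix(iris[24, 1:4]), as.matrix(iris[, 1:4]), iris$Species)
p
```

The sum() calls in the original output vector have no effect, since each posterior() call already returns a single number. The hard-coded 1/3 is not the cause of the mismatch here, but computing the prior from the labels keeps the function correct for unbalanced training sets.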
