Question

Posted on Jul 08, 2024

I am wondering about the functions: regardless of the data set, what would the R functions be for questions 1 and 2 of Project 1 (Naive Bayes)?

Project 1: Naive Bayes and Logistic Regression

In this project you will code up two of the classification algorithms covered in class: Naive Bayes and Logistic Regression. The framework code for this question can be downloaded from Canvas.

Programming Language: You must write your code in R.

Submission Instructions: For each sub-question you will be given a single function signature. You will be asked to write a single R function which satisfies the signature. In the framework code, we have provided you with an R script for the functions you need to complete. Do not change the structure of the file. Complete each of these functions, compress the code and the results file, evaluation.txt, as a tar file, and submit it to Canvas. You may submit it multiple times. Each submission will overwrite the previous submission. Only the last submission before the deadline will be graded.

Presentation slides: Make slides to summarize your results. You do not need to submit the slides, but I will randomly draw a couple of groups to present their slides in class.

SUBMISSION CHECKLIST
- Submission executes in less than 20 minutes.
- Submission is smaller than 100K.
- Submission is a .tar file.
- Submission returns matrices of the exact dimensions specified.

Data: All questions will use the following data structures:
- XTrain $\in \mathbb{R}^{n \times f}$ is a matrix of training data, where each row is a training point, and each column is a feature.
- XTest $\in \mathbb{R}^{m \times f}$ is a matrix of test data, where each row is a test point, and each column is a feature.
- yTrain $\in \{1, \ldots, c\}^{n \times 1}$ is a vector of training labels.
- yTest $\in \{1, \ldots, c\}^{m \times 1}$ is a (hidden) vector of test labels.

1 Logspace Arithmetic [10 pts]

When working with very small and very large numbers (such as probabilities), it is useful to work in logspace to avoid numerical precision issues. In logspace, we keep track of the logs of numbers, instead of the numbers themselves. (We generally use natural logs for this.)
For example, if $p(x)$ and $p(y)$ are probability values, instead of storing $p(x)$ and $p(y)$ and computing $p(x) \cdot p(y)$, we work in log space by storing $\log p(x)$, $\log p(y)$, and $\log[p(x) \cdot p(y)]$, where $\log[p(x) \cdot p(y)]$ is computed as $\log p(x) + \log p(y)$. The challenge is to add and multiply these numbers while remaining in logspace, without exponentiating. Note that if we exponentiate our numbers at any point in the calculation, it completely defeats the purpose of working in log space.

1. Logspace Multiplication [5 pts] Complete logProd = function(x), which takes as input a vector of numbers in logspace (i.e., $x_i = \log p_i$), and returns the product of these numbers in logspace, i.e., $\mathrm{logProd}(x) = \log \prod_i p_i$.

2. Logspace Addition [5 pts] Complete logSum = function(x), which takes as input a vector of numbers in logspace (i.e., $x_i = \log p_i$), and returns the sum of these numbers in logspace, i.e., $\mathrm{logSum}(x) = \log \sum_i p_i$.
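Independent of the data set, here is a minimal sketch of what the two functions might look like in R, assuming only the names logProd and logSum and the single argument x given in the assignment text. The product needs no trick, since the log of a product is the sum of the logs. For the sum, the sketch uses the max-shift (log-sum-exp) identity, which is my assumption and not something the assignment specifies: it does call exp, but only on the differences x - max(x), which are all at most 0, so the largest term is exactly 1 and nothing overflows. Whether the grader accepts that reading of "without exponentiating" is worth checking against the framework code.

```r
# Sketch only: names and argument follow the assignment text; the
# max-shift approach in logSum is an assumed (standard) technique.

# Product in logspace: log(prod(p)) = sum(log(p)), so just add the logs.
logProd <- function(x) {
  sum(x)
}

# Sum in logspace via the log-sum-exp identity:
#   log(sum(exp(x))) = m + log(sum(exp(x - m))),  with m = max(x).
logSum <- function(x) {
  m <- max(x)
  # If every entry is -Inf (all probabilities are zero), the sum is zero,
  # whose log is -Inf; returning early avoids computing -Inf - (-Inf).
  if (is.infinite(m) && m < 0) {
    return(-Inf)
  }
  m + log(sum(exp(x - m)))
}

# Usage: 100 probabilities of 1e-5 each.
p <- rep(1e-5, 100)
x <- log(p)
prod(p)     # underflows to 0 in double precision
logProd(x)  # about -1151.29, i.e. 100 * log(1e-5)
logSum(x)   # about -6.9078, i.e. log(100 * 1e-5) = log(1e-3)
```

The usage lines show why logspace matters here: the direct product underflows to 0, while logProd still returns a finite value. Subtracting the maximum before exponentiating keeps logSum finite even for inputs like c(-1000, -1000), where exp(-1000) alone would underflow.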