
Question



Suppose I have a labelled dataset S of 800 instances for a binary classification problem. There are 200 positive instances and 600 negative instances. I am considering using the following ML algorithms: A and B.

2.1 I want to split the dataset S into three sets: training set T, validation set V, and test set Z. The training set should be 60% of S, the validation set 20% of S, and the test set 20% of S. Propose in general terms how I would split my dataset S. Also give the proportions of + and − instances in each set.

2.2 The algorithm A has a hypothesis space of 100 hypotheses. Which set (T, V, or Z) should I use to find the hypothesis that minimizes a loss function? What is this process called (there are three accepted terms for this)?

I use T to get the training loss minimizer from among the 100 hypotheses. This process is called training.

2.3 I have trained the algorithm A on the training set T to produce a hypothesis hA. Similarly, I have trained the algorithm B on the training set T to produce a hypothesis hB. Now I use the validation set V to select the better of the two hypotheses hA and hB, using accuracy as the performance measure. Suppose that hA is the better hypothesis based on the validation set V. I retrain algorithm A on the train+val data and compute its accuracy on the test set. Thus, I have three accuracy measures for algorithm A: the training set accuracy, the validation set accuracy, and the test set accuracy. Order the three accuracy measures (train, val, test) from closest to farthest from the true accuracy. Use the probability bounds to answer this question. For the training set accuracy, size of H is 100 hypotheses
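The arithmetic behind 2.1 and 2.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the expert solution: it assumes the split in 2.1 is stratified (each set keeps the 1:3 ratio of + to − found in S), and it assumes the "probability bounds" in 2.3 are Hoeffding's inequality combined with a union bound over a finite hypothesis set. The confidence parameter `DELTA = 0.05`, the helper names, and the hypothesis counts used for V (two candidates, hA and hB) and Z (one fixed hypothesis) are choices made here for illustration. For simplicity the sketch uses the original size of T and ignores the retraining on train+val.

```python
import math

# Class counts and split percentages as stated in the question.
N_POS, N_NEG = 200, 600
SPLIT_PCT = {"T": 60, "V": 20, "Z": 20}


def stratified_counts(n_pos, n_neg, split_pct):
    """Return per-set (positive, negative) counts that keep the class ratio of S."""
    return {
        name: (n_pos * pct // 100, n_neg * pct // 100)
        for name, pct in split_pct.items()
    }


def hoeffding_eps(m, h_size, delta):
    """Half-width eps such that, with probability at least 1 - delta, every one
    of h_size hypotheses satisfies |empirical accuracy - true accuracy| <= eps
    on m i.i.d. examples (Hoeffding's inequality plus a union bound)."""
    return math.sqrt(math.log(2 * h_size / delta) / (2 * m))


counts = stratified_counts(N_POS, N_NEG, SPLIT_PCT)
for name, (pos, neg) in counts.items():
    total = pos + neg
    print(f"{name}: {total} instances, {pos} positive, {neg} negative, "
          f"fraction positive = {pos / total:.2f}")

DELTA = 0.05  # assumed confidence parameter, not given in the question
sizes = {name: pos + neg for name, (pos, neg) in counts.items()}

# Assumed hypothesis counts: 100 for training (given in the question),
# 2 for validation (choosing between hA and hB), 1 for the test set.
eps_train = hoeffding_eps(sizes["T"], 100, DELTA)
eps_val = hoeffding_eps(sizes["V"], 2, DELTA)
eps_test = hoeffding_eps(sizes["Z"], 1, DELTA)

for label, eps in [("train", eps_train), ("val", eps_val), ("test", eps_test)]:
    print(f"{label}: deviation bound eps = {eps:.4f}")
```

A smaller eps means a tighter guarantee on how far the measured accuracy can be from the true accuracy. Each bound depends on both the number of examples and the number of hypotheses compared on that set, so changing `DELTA` or the assumed hypothesis counts changes the values.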
