Question

( points) Using the dataset below, we want to build a decision tree which classifies whether the given URL is phishing or not. Calculate the conditional entropy of all the attributes and draw the optimal decision tree. The conditional entropy of Y is defined as below:

$H(Y \mid X_i) = \sum_{x} P(X_i = x)\, H(Y \mid X_i = x)$

8. ( points) With a linear regressor defined as below:

$y = Xw + \epsilon$

we want to find the optimal $w$ that minimizes the Sum Squared Error (SSE), which is defined as below:

$\text{SumSquaredError} = \sum_{i=1}^{n} (y_i - w x_i)^2$

Let the cost function $J$ be $\sum_i e_i^2 = e^T e$, where $e = (y - Xw)$. Derive the maximum likelihood estimate of the parameter $w$ by taking the derivative of the function with respect to $w$.

9. ( points) Calculate the Average Precision of a search engine model with the predictions and relevancy as below (+ indicates a relevant document and - indicates an irrelevant document):

10. ( points) Data leakage refers to a mistake made by the creator of a machine learning model in which they accidentally share information between the test and training datasets. For example, Andrew Ng's group had 100k x-rays of 30k patients, meaning about 3 images per patient. The paper used random splitting instead of ensuring that all images of a patient were in the same split. Hence the model partially memorized the patients instead of learning to recognize pneumonia in chest x-rays (example from Wikipedia).
- Give an example of possible data leakage that can be incorporated in the machine learning training process (e.g., your term project).
- Discuss how we can avoid data leakage in ML.
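As a rough aid for the first and third problems, the sketch below shows one way the conditional entropy and the Average Precision could be computed in Python. The dataset and the ranked list from the question are not reproduced on this page, so the attribute name `has_ip`, the labels in `is_phishing`, and the ranking passed to `average_precision` are made-up toy values, not the question's data. The Average Precision here averages the precision at each relevant rank over the relevant documents that appear in the list, which is one common convention; a course may instead divide by the total number of relevant documents in the collection.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy H(Y), in bits, of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(feature, labels):
    """H(Y | X) = sum over x of P(X = x) * H(Y | X = x)."""
    n = len(labels)
    groups = {}
    # Collect the labels that co-occur with each attribute value x.
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # Weight each group's entropy by the empirical probability P(X = x).
    return sum(len(ys) / n * entropy(ys) for ys in groups.values())


def average_precision(relevance):
    """Average Precision of a ranked list.

    relevance[k] is True when the document at rank k + 1 is relevant ("+").
    Assumption: the average is taken over the relevant documents in the list.
    """
    hits = 0
    precisions = []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at this relevant rank
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical toy data, NOT the dataset from the question.
has_ip = ["yes", "yes", "no", "no"]
is_phishing = ["yes", "yes", "no", "yes"]

print(conditional_entropy(has_ip, is_phishing))
print(average_precision([True, False, True, False]))
```

For the decision tree, the attribute with the lowest conditional entropy gives the largest information gain, so it would normally be chosen as the root split, and the same calculation is repeated on each branch.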
