
Question



Programming Problem 1. Let data be a list of n unique real numbers. Furthermore, suppose that each number in data is assigned a color - it is either 'red' or 'blue'. Let colors be a list of n strings, such that colors[i] gives the color of data[i]. In all parts of this problem, you may assume for simplicity that there is at least one data point of each color.

a) To begin, suppose that all of the blue points are less than the red points. In a file named pp03.py, write an efficient function called learn_theta(data, colors) which takes in two arguments - the lists data and colors as described above - and returns a single number θ such that all of the blue points are ≤ θ and all of the red points are > θ, as is depicted in the picture below. You may not assume that data is sorted. The time complexity of your algorithm should be optimal.

[Figure: the points on a number line, with every blue point to the left of θ and every red point to the right.]

Now suppose that a small number of the red points are smaller than some blue points - that is, there is some overlap, as shown below. Assume for simplicity that the largest data point is red.

[Figure: the points on a number line with some overlap between blue and red points, and a candidate θ marked.]

We wish to find a real number θ which "best" separates the blue points and red points. Clearly the points cannot be separated perfectly. Instead, we define a loss function L(θ) which counts the number of points which are on the wrong side of θ. More precisely:

L(θ) = (# of red points ≤ θ) + (# of blue points > θ)

The loss of the θ shown above is 2, since one red point is to the left of θ and one blue point is to the right. Our goal is to design an algorithm for finding a minimizer of L(θ). This is a simple instance of the machine learning task of classification.

b) Also in pp03.py, write a function named compute_ell(data, colors, theta) which takes in lists data and colors as described above, as well as a floating-point number, theta. It should return the loss at theta as a floating-point number. Your algorithm should have the best possible time complexity.

c) Also in pp03.py, write a function named minimize_ell(data, colors) which takes in data and colors and returns a floating-point number which minimizes the loss L for that particular data set. Your algorithm should have quadratic time complexity. You may assume for simplicity that the smallest data point is blue.
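Parts a) and b) can each be handled in a single pass over the data. The following is a minimal sketch, not an official solution. It assumes the convention that a blue point is on the correct side when it is ≤ theta and a red point when it is > theta. For part a) it returns the midpoint between the largest blue point and the smallest red point, which is one valid choice among many. Because data is unsorted, every element has to be inspected at least once, so linear time is the best achievable for both functions.

```python
def learn_theta(data, colors):
    """Part a): return a threshold separating blue (left) from red (right).

    Assumes every blue point is smaller than every red point and that
    both colors occur at least once. Runs in O(n) time.
    """
    largest_blue = float("-inf")
    smallest_red = float("inf")
    for value, color in zip(data, colors):
        if color == "blue":
            largest_blue = max(largest_blue, value)
        else:
            smallest_red = min(smallest_red, value)
    # The midpoint lies between the two groups. For extremely close
    # floating-point values it could round onto an endpoint; returning
    # largest_blue instead avoids that under the assumed convention.
    return (largest_blue + smallest_red) / 2


def compute_ell(data, colors, theta):
    """Part b): count points on the wrong side of theta, in O(n) time.

    Assumed convention: red points <= theta and blue points > theta
    are counted as misclassified.
    """
    loss = 0
    for value, color in zip(data, colors):
        if color == "red" and value <= theta:
            loss += 1
        elif color == "blue" and value > theta:
            loss += 1
    return float(loss)
```

For example, with data = [5.0, 1.0, 7.0, 3.0] and colors = ['red', 'blue', 'red', 'blue'], learn_theta returns 4.0, and compute_ell at that threshold is 0.0.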

Step by Step Solution
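For part c), one way to reach the requested quadratic running time is brute force: the loss only changes when theta crosses a data point, so it is enough to try each data point as a candidate threshold and evaluate the loss for each in linear time. The sketch below is an illustration under the same assumed convention (red points ≤ theta and blue points > theta count as errors). The inner helper count_misclassified is a hypothetical name introduced here so that the block is self-contained; it performs the same count as compute_ell from part b).

```python
def minimize_ell(data, colors):
    """Part c): return a data point that minimizes the loss, in O(n^2) time.

    Tries each of the n data points as a threshold and spends O(n)
    evaluating the loss for each one.
    """

    def count_misclassified(theta):
        # Hypothetical helper: same count as compute_ell in part b).
        errors = 0
        for value, color in zip(data, colors):
            if color == "red" and value <= theta:
                errors += 1
            elif color == "blue" and value > theta:
                errors += 1
        return errors

    best_theta = data[0]
    best_loss = count_misclassified(best_theta)
    for candidate in data[1:]:
        candidate_loss = count_misclassified(candidate)
        # Strict comparison keeps the first minimizer found on ties.
        if candidate_loss < best_loss:
            best_theta = candidate
            best_loss = candidate_loss
    return float(best_theta)
```

A threshold below every data point is not tried. Under the stated assumption that the smallest point is blue, moving theta up to that smallest point can only lower the loss, so nothing is missed. When several thresholds tie for the minimum, this sketch returns whichever appears first in data.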
