Question
Programming Problem 1. Let data be a list of n unique real numbers. Furthermore, suppose that each number in data is assigned a color - it is either 'red' or 'blue'. Let colors be a list of n strings, such that colors[i] gives the color of data[i]. In all parts of this problem, you may assume for simplicity that there is at least one data point of each color.

a) To begin, suppose that all of the blue points are less than the red points. In a file named pp03.py, write an efficient function called learn_theta(data, colors) which takes in two arguments - the lists data and colors as described above - and returns a single number θ such that all of the blue points are ≤ θ and all of the red points are > θ, as is depicted in the picture below. You may not assume that data is sorted. The time complexity of your algorithm should be optimal.

[Figure: number line with the blue points (B) to the left of θ and the red points (R) to the right.]

Now suppose that a small number of the red points are smaller than some blue points - that is, there is some overlap, as shown below. Assume for simplicity that the largest data point is red.

[Figure: number line with overlapping blue (B) and red (R) points and a candidate θ.]

We wish to find a real number θ which "best" separates the blue points and red points. Clearly the points cannot be separated perfectly. Instead, we define a loss function L(θ) which counts the number of points which are on the wrong side of θ. More precisely:

L(θ) = (# of red points ≤ θ) + (# of blue points > θ)

The loss of the θ shown above is 2, since one red point is to the left of θ and one blue point is to the right. Our goal is to design an algorithm for finding a minimizer of L(θ). This is a simple instance of the machine learning task of classification.

b) Also in pp03.py, write a function named compute_ell(data, colors, theta) which takes in lists data and colors as described above, as well as a floating-point number, theta. It should return the loss at theta as a floating-point number. Your algorithm should have the best possible time complexity.
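A minimal sketch of parts a) and b), not a verified solution. It assumes the convention that a blue point is correctly placed when it is ≤ theta and a red point when it is > theta, so the loss counts reds at or below theta plus blues above it. Under that assumption the largest blue point is a valid answer for part a), and both functions make a single pass over the data, which is linear time. No algorithm can do better, since every point has to be examined at least once.

```python
def learn_theta(data, colors):
    """Return theta with every blue point <= theta and every red point > theta.

    Assumes all blue points are smaller than all red points, and that
    ties go to the blue side (an assumption about the intended convention).
    """
    # The largest blue point satisfies both conditions; one O(n) pass.
    return max(x for x, c in zip(data, colors) if c == 'blue')


def compute_ell(data, colors, theta):
    """Return the number of points on the wrong side of theta, as a float."""
    loss = 0
    for x, c in zip(data, colors):
        if c == 'red' and x <= theta:
            loss += 1  # red point at or to the left of theta
        elif c == 'blue' and x > theta:
            loss += 1  # blue point to the right of theta
    return float(loss)


# Separable example: blue points {1, 2, 3}, red points {4, 5}.
theta_sep = learn_theta([3.0, 1.0, 5.0, 2.0, 4.0],
                        ['blue', 'blue', 'red', 'blue', 'red'])

# Overlapping example: one red (3) left of 3.5 and one blue (4) right of it.
loss_overlap = compute_ell([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                           ['blue', 'blue', 'red', 'blue', 'red', 'red'],
                           3.5)
```

Returning the largest blue point rather than a midpoint between the two groups avoids a floating-point rounding case in which the midpoint of two adjacent floats lands exactly on the smallest red point.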
c) Also in pp03.py, write a function named minimize_ell(data, colors) which takes in data and colors and returns a floating-point number which minimizes the loss L for that particular data set. Your algorithm should have quadratic time complexity. You may assume for simplicity that the smallest data point is blue.
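One possible quadratic-time approach to part c), offered as a sketch under the same assumed convention (blue is correct when ≤ theta, red when > theta). The loss only changes when theta crosses a data point, so it is enough to try each data point as a candidate theta and keep the best one. A theta below every data point misclassifies all blue points, and because the smallest point is assumed to be blue, setting theta to that point is strictly better, so no candidate outside the data is needed. Each of the n candidates costs one linear-time loss evaluation, giving quadratic time overall. The compute_ell helper is repeated here so the block runs on its own.

```python
def compute_ell(data, colors, theta):
    """Loss at theta: reds at or below theta plus blues above theta."""
    loss = 0
    for x, c in zip(data, colors):
        if c == 'red' and x <= theta:
            loss += 1
        elif c == 'blue' and x > theta:
            loss += 1
    return float(loss)


def minimize_ell(data, colors):
    """Return a data point that minimizes the loss, trying every candidate."""
    best_theta = data[0]
    best_loss = compute_ell(data, colors, best_theta)
    for theta in data[1:]:
        loss = compute_ell(data, colors, theta)  # O(n) per candidate
        if loss < best_loss:
            best_theta, best_loss = theta, loss
    return float(best_theta)


# Smallest point is blue and largest is red, as the problem allows us to assume.
example_data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
example_colors = ['blue', 'blue', 'red', 'blue', 'red', 'red']
best = minimize_ell(example_data, example_colors)
```

In this example the minimum loss is 1, reached at both theta = 2 and theta = 4. The sketch keeps the first minimizer it meets in list order, so a different ordering of the same points could return the other one.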