Question
Consider the line $y = mx + b$ and the point $(x_0, y_0)$. Here we look only at the vertical distance, which is the difference between the $y$ value of the point and the $y$ value of the line at the same $x$ value. This difference is sometimes called a residual.

Specific case: You have done three experiments, leading to the following three results correlating the $x$ value and the $y$ value: $(1, 2)$, $(2, 4)$, $(3, 5)$. We are going to fit a line to the data as follows: we shall find the line that minimizes the sum of the squares of the residuals between these points and the line. We use the square partly because the square is never negative, so we do not have to worry about signs. It is also much easier to work with squares than with absolute values.

i. Consider the line $y = mx + b$ and the point $(x_0, y_0)$. Find an expression for the vertical distance between the line and the point, i.e. the residual.
ii. Now assume we have a line $y = mx + b$ and the points above. What are the three residuals? Note that your answers will have $m$'s and $b$'s in them.
iii. The function $D(m, b)$ represents the sum of the squares of the residuals, i.e., you square each residual and add the results. Write a formula for $D(m, b)$.
iv. Optimize $D(m, b)$ by taking the partial derivative with respect to each of the two variables and setting them equal to zero. This will lead to two linear equations in two unknowns.
v. Solve the equations to find the values of $m$ and $b$.
vi. Draw a graph with the three points and the line to make sure it looks reasonable. If it doesn't, return to ii.
vii. Check your answer by going to the Wolfram Alpha website and typing: 'best fit line (1,2), (2,4), (3,5)'. If you have the wrong answer, return to ii.

General case: Do exactly what you did above, but instead of the three specific points, use $k$ points with unknown values: $(x_0, y_0), (x_1, y_1), \ldots, (x_{k-1}, y_{k-1})$.

i. Again, the function $D(m, b)$ represents the sum of the squares of the residuals. Write a formula for $D(m, b)$.
It is easier now, and will be much easier in the next part, if you work with these quantities using sigma notation. For example, write the sum $x_0 + x_1 + \cdots + x_{k-1}$ as $\sum x_i$.
ii. Now optimize $D(m, b)$. Take the partial derivatives with respect to each of the two variables and set the results equal to zero. This, again, will lead to two linear equations in two unknowns. Note that it is very important that we think of the $(x, y)$ points as constants, even though we do not know their values.
iii. Solve the two equations to the extent that they are each written in the following form: $b =$ a fraction that involves $m$, $x_i$, $y_i$, $k$, and preferably sigma signs. Note that not all symbols may be needed to present the equations in their required form.
iv. Use your equations from iii to find the equation of the best-fit line to the following data: $(1, 2)$, $(1, 3)$, $(2, 4)$, $(2, 2)$, $(4, 8)$, $(3, 5)$, $(4, 5)$. When you plug in the data, you should end up with two linear equations in two unknowns.
v. Manipulate your equations from iii to end up with one of the standard equations for linear regression. Take your two equations of the form $b =$ something and set the two somethings equal to each other. Cross multiply and manipulate.
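The general-case steps can be sketched in Python. Setting both partial derivatives of $D(m, b)$ to zero gives the two linear equations $m \sum x_i^2 + b \sum x_i = \sum x_i y_i$ and $m \sum x_i + k b = \sum y_i$. The sketch below solves them in closed form. The function name `best_fit_line` and the variable names are illustrative choices, not part of the question, and exact fractions are used only so the result can be compared with hand calculation.

```python
from fractions import Fraction


def best_fit_line(points):
    """Return (m, b) minimizing D(m, b) = sum of (y_i - (m*x_i + b))**2.

    Hypothetical helper: it solves the two linear equations obtained by
    setting the partial derivatives of D with respect to m and b to zero.
    """
    k = len(points)
    # The sigma-notation sums; Fraction keeps the arithmetic exact.
    sum_x = sum(Fraction(x) for x, _ in points)
    sum_y = sum(Fraction(y) for _, y in points)
    sum_xy = sum(Fraction(x) * Fraction(y) for x, y in points)
    sum_xx = sum(Fraction(x) ** 2 for x, _ in points)

    denom = k * sum_xx - sum_x ** 2
    if denom == 0:
        # All x values coincide, so the two equations have no unique solution.
        raise ValueError("x values are all equal; no unique best-fit line")

    m = (k * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - m * sum_x) / k
    return m, b


# The three points from the specific case.
m, b = best_fit_line([(1, 2), (2, 4), (3, 5)])
print(m, b)  # 3/2 2/3
```

For the three experimental points this gives $m = 3/2$ and $b = 2/3$, which is the value to compare against the Wolfram Alpha check in step vii.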
Step by Step Solution
Step: 1
i. The vertical distance between the line and the point is the residual, expressed as residual $= y_0 - m x_0 - b$.
ii. The three residuals for the given poin...
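As a numerical check on the residual expression in part i, here is a minimal Python sketch that evaluates the residuals and the sum of squares $D(m, b)$ for the three points in the question. The names `residual`, `D`, and `POINTS` are illustrative. The candidate values $m = 3/2$, $b = 2/3$ come from solving the two equations $14m + 6b = 25$ and $6m + 3b = 11$ by hand, not from the truncated solution text.

```python
from fractions import Fraction

# The three experimental results given in the question.
POINTS = [(1, 2), (2, 4), (3, 5)]


def residual(m, b, x0, y0):
    """Vertical distance from the point (x0, y0) to the line y = m*x + b."""
    return y0 - (m * x0 + b)


def D(m, b, points=POINTS):
    """Sum of the squared residuals over all points."""
    return sum(residual(m, b, x, y) ** 2 for x, y in points)


# Candidate minimizer (an assumption worked out by hand, see the lead-in).
m, b = Fraction(3, 2), Fraction(2, 3)
print(D(m, b))  # 1/6

# Nudging m or b in either direction should only increase D.
step = Fraction(1, 10)
for dm, db in [(step, 0), (-step, 0), (0, step), (0, -step)]:
    assert D(m + dm, b + db) > D(m, b)
```

The nudging loop is only a sanity check, not a proof of minimality; the partial-derivative argument in step iv is what establishes the minimum.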