Question

1 Approved Answer

Posted on Sep 25, 2024

please answer all the question shows below. The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill

please answer all the question shows below.

image text in transcribed

The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill a vending machine, in minutes. The second column, cases, is the number of cases of soft drinks, and finally, distance is the distance in feet that the worker had to walk to carry the cases from the truck to the vending machine. We might expect that lots of cases and a large distance will imply a large time. Let's fit a model of the form (time) = a (cases) +b x (distance) +c+ (noise) (1) to the data? (a) [2 marks] Load the data using read.csv() to obtain a data frame called data. For simplicity, extract its individual columns as vectors: time = data$time cases = data$cases distance = data$distance (b) [3 marks) Write a function called resid() that takes potential values for the unknown quantities a, b, and c passed to function as a single named vector. The function should compute and return the sum of squared residuals. Within the scope of the function, treat the data vectors as global variables. The sum of squared residuals is: S=[(time); [a(cases)i + b(distance)i + c]]?. (2) (c) [5 marks] Use R's optim() function to find the values of a, b, and c that minimise S. Write a sentence or two describing the results and their interpretation in plain English. (d) [2 marks] Make a different function, called absresid(), that uses the sum of absolute residuals instead: S= |(time); [a(cases)i + b(distance); +c| (3) i=1 (e) [3 marks] Use R's optim() function to find the values of a, b, and c that minimise S'. In practical terms, is there much difference between this result and the result in (b)? N The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill a vending machine, in minutes. The second column, cases, is the number of cases of soft drinks, and finally, distance is the distance in feet that the worker had to walk to carry the cases from the truck to the vending machine. We might expect that lots of cases and a large distance will imply a large time. Let's fit a model of the form (time) = a (cases) +b x (distance) +c+ (noise) (1) to the data? (a) [2 marks] Load the data using read.csv() to obtain a data frame called data. For simplicity, extract its individual columns as vectors: time = data$time cases = data$cases distance = data$distance (b) [3 marks) Write a function called resid() that takes potential values for the unknown quantities a, b, and c passed to function as a single named vector. The function should compute and return the sum of squared residuals. Within the scope of the function, treat the data vectors as global variables. The sum of squared residuals is: S=[(time); [a(cases)i + b(distance)i + c]]?. (2) (c) [5 marks] Use R's optim() function to find the values of a, b, and c that minimise S. Write a sentence or two describing the results and their interpretation in plain English. (d) [2 marks] Make a different function, called absresid(), that uses the sum of absolute residuals instead: S= |(time); [a(cases)i + b(distance); +c| (3) i=1 (e) [3 marks] Use R's optim() function to find the values of a, b, and c that minimise S'. In practical terms, is there much difference between this result and the result in (b)? N