Answered step by step
Verified Expert Solution
Question
1 Approved Answer
please answer all the question shows below. The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill
please answer all the question shows below.
The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill a vending machine, in minutes. The second column, cases, is the number of cases of soft drinks, and finally, distance is the distance in feet that the worker had to walk to carry the cases from the truck to the vending machine. We might expect that lots of cases and a large distance will imply a large time. Let's fit a model of the form (time) = a (cases) +b x (distance) +c+ (noise) (1) to the data? (a) [2 marks] Load the data using read.csv() to obtain a data frame called data. For simplicity, extract its individual columns as vectors: time = data$time cases = data$cases distance = data$distance (b) [3 marks) Write a function called resid() that takes potential values for the unknown quantities a, b, and c passed to function as a single named vector. The function should compute and return the sum of squared residuals. Within the scope of the function, treat the data vectors as global variables. The sum of squared residuals is: S=[(time); [a(cases)i + b(distance)i + c]]?. (2) (c) [5 marks] Use R's optim() function to find the values of a, b, and c that minimise S. Write a sentence or two describing the results and their interpretation in plain English. (d) [2 marks] Make a different function, called absresid(), that uses the sum of absolute residuals instead: S= |(time); [a(cases)i + b(distance); +c| (3) i=1 (e) [3 marks] Use R's optim() function to find the values of a, b, and c that minimise S'. In practical terms, is there much difference between this result and the result in (b)? N The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill a vending machine, in minutes. The second column, cases, is the number of cases of soft drinks, and finally, distance is the distance in feet that the worker had to walk to carry the cases from the truck to the vending machine. We might expect that lots of cases and a large distance will imply a large time. Let's fit a model of the form (time) = a (cases) +b x (distance) +c+ (noise) (1) to the data? (a) [2 marks] Load the data using read.csv() to obtain a data frame called data. For simplicity, extract its individual columns as vectors: time = data$time cases = data$cases distance = data$distance (b) [3 marks) Write a function called resid() that takes potential values for the unknown quantities a, b, and c passed to function as a single named vector. The function should compute and return the sum of squared residuals. Within the scope of the function, treat the data vectors as global variables. The sum of squared residuals is: S=[(time); [a(cases)i + b(distance)i + c]]?. (2) (c) [5 marks] Use R's optim() function to find the values of a, b, and c that minimise S. Write a sentence or two describing the results and their interpretation in plain English. (d) [2 marks] Make a different function, called absresid(), that uses the sum of absolute residuals instead: S= |(time); [a(cases)i + b(distance); +c| (3) i=1 (e) [3 marks] Use R's optim() function to find the values of a, b, and c that minimise S'. In practical terms, is there much difference between this result and the result in (b)? NStep by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started