Question 1 [20 marks]

School administrators want to study the attendance behaviours of high school students. For each of 316 students the following variables are available:

* daysabs: The number of days the student was absent from school (the response variable).
* daysatt: The number of days the student attended school.
* id: An identification number assigned to each student.
* male: An indicator variable of the student's gender (1 = male, 0 = female).
* math: The student's standardized test score in mathematics (a continuous variable).
* langarts: The student's standardized test score in language arts (a continuous variable).

Here is the data for the first five students:

id    male  math       langarts  daysatt  daysabs
1001  1     56.986830  42.45086  73       4
1002  1     37.094160  46.82059  73       4
1003  0     32.275460  43.56657  76
1004  0     29.056720  43.56657  74
1005  0      6.748048  27.24847  73

We will consider modelling the data as a time-homogeneous Poisson process. The R code and output from fitting a log-linear model appear on page 13.

(a) Explain why we include an offset term in log-linear models for Poisson processes. What would you choose as an appropriate offset term for this data? What would be the implications of omitting the offset term? [4 marks]

(b) Consider the main effects model

$\log \mu_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \text{offset}_i$

where $x_{i1} = I[\text{male} = 1]$ is the indicator of male gender and $x_{i2} = \text{math}$ is the math score for subject $i$. Assume the appropriate offset has been included. The fitted version of this model is given in the R output. Conduct a Wald-based test of the null hypothesis that the rate of absenteeism for males is half that of females. Be sure to carefully state the null and alternative hypotheses in terms of the regression parameters, and give the formula of the test statistic and its asymptotic distribution under the null hypothesis. What is the conclusion of the test?
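
As a rough sketch of how the Wald test in part (b) could be set up, the block below writes $\beta_1$ for the coefficient of the male indicator in the log-linear model (this symbol is an assumption about the notation). $\hat\beta_1$ and $\operatorname{se}(\hat\beta_1)$ stand for the estimate and standard error that would be read off the R output, which is not reproduced here, so no numerical conclusion is given.

```latex
% With the math score and the exposure held fixed, the male-to-female
% rate ratio is exp(beta_1), so "half the rate" means exp(beta_1) = 1/2.
\begin{align*}
H_0 &: e^{\beta_1} = \tfrac{1}{2}
      \iff \beta_1 = -\log 2,
&
H_1 &: \beta_1 \neq -\log 2, \\[4pt]
Z &= \frac{\hat\beta_1 - (-\log 2)}{\operatorname{se}(\hat\beta_1)}
   = \frac{\hat\beta_1 + \log 2}{\operatorname{se}(\hat\beta_1)},
&
Z &\;\overset{\text{approx.}}{\sim}\; N(0,1) \quad \text{under } H_0 .
\end{align*}
```

Equivalently, $Z^2$ can be compared with a $\chi^2_1$ distribution. At the 5% level, $H_0$ would be rejected when $|Z| > 1.96$.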