
Question


StateName DrivLic AveIncome RoadMiles Popul GasTax FuelCons
AL 3559897 23471 94440 3451586 4.8 9018766
AK 472211 30064 13628 457728 2.1 891086
AZ 3550367 25578 55245 3907526 4.8 9192603
AR 1961883 22257 98132 2072622 5.7 5141245
CA 21623793 32275 168771 25599275 4.8 55614309
CO 3287922 32949 85854 3322455 5.8 7755033
CT 2650374 40640 20910 2651452 6.6 5520184
DE 564099 31255 5814 610269 6.1 1446189
DC 328094 37383 1534 468575 5.3 563152
FL 12743403 28145 117299 12741821 3.6 28281241
GA 5833802 27940 115534 6250708 2 17767590
HI 787820 28221 4278 949184 4.2 1531895
ID 896666 24180 46310 969166 6.6 2305508
IL 7809500 32259 138359 9530327 5 18984653
IN 4116924 27011 94038 4682392 4 11814208
IA 1978748 26723 113437 2281002 5.3 5586554
KS 1871301 27816 134725 2058489 5.5 4682367
KY 2756634 24294 78914 3161283 4.3 7894961
LA 2718209 23334 60829 3394854 5.3 8144071
ME 942556 25623 22672 1010273 5.8 2233744
MD 3451966 33872 30622 4085342 6.2 9314172
MA 4610666 37992 35408 5008007 5.5 10298246
MI 6976982 29612 121790 7628170 5 18566259
MN 2961236 32101 132280 3782817 5.3 9635875
MS 1859487 20993 73701 2160165 4.9 5589071
MO 3862300 27445 124324 4292175 4.5 11200574
MT 683351 22569 69503 701423 7.1 1769933
NE 1267284 27829 92766 1314974 6.5 3074688
NV 1420714 30529 38658 1537896 6.5 3579646
NH 941829 33332 15508 960593 5.2 2507739
NJ 5715089 36983 36175 6545471 2.8 14807907
NM 1231701 22203 59883 1370134 4.9 3353226
NY 11014805 34547 112961 14797284 5.8 20958346
NC 5884651 27194 101195 6291182 6.4 15371006
ND 455921 25068 86591 502176 5.5 1266386
OH 7736115 28400 117267 8789530 5.8 19034086
OK 2172394 23517 112694 2665966 4.5 6630906
OR 2534464 28350 66784 2673283 6.3 5629923
PA 8226202 29539 119985 9693987 6.9 19020440
RI 660435 29685 6053 827474 7.7 1510806
SC 2849885 24321 66167 3115130 4.2 8392788
SD 544997 26115 83560 577391 5.8 1523522
TN 4188317 26239 87826 4445987 5.3 10741354
TX 13045727 27871 300767 15618097 5.3 40267253
UT 1495887 23907 42208 1598531 6.5 3579223
VT 515348 26901 14291 479265 5.3 1253663
VA 4920753 31162 70721 5529436 4.6 14254787
WA 4237845 31528 80985 4552631 6.1 9927741
WV 1316955 21915 36997 1455370 6.8 3098419
WI 3667497 28232 112663 4156609 7.2 9154215
WY 370713 27230 27292 381882 3.7 1218323

1. File "Fuel_Consumption_2001.txt" reports the consumption of gasoline in 2001 for all states in the US (plus DC). It also reports several features for each state. Specifically, for each state in 2001, column "StateName" reports the state name; "DrivLic" the number of licensed drivers; "AveIncome" the average income per person in dollars; "RoadMiles" the miles of highway; "Popul" the population of age 16 and over; "GasTax" the state tax on gasoline, in cents per liter (one liter is 0.001 cubic meter); "FuelCons" the cubic meters of gasoline sold (for transportation). Derive two new features from the original ones, as follows:
"logMiles" = log of "RoadMiles", in base 2;
"Fuel_pp_lit" = "FuelCons"/"Popul" x 1000, the fuel consumption in liters per person.

2. Analyze again the dataset in file "Fuel_Consumption_2001.txt" (the features are explained in problem 1). Derive two more new features from the original ones, defined as follows:
"PercDrLic" = "DrivLic"/"Popul" x 100, the number of driver's licenses as a percentage of the population;
"AveIncomeK" = "AveIncome"/1000, the average income per person in thousands of dollars.

a) Can you guess why "PercDrLic" may be more appropriate than "DrivLic" as a feature in a linear model for predicting "Fuel_pp_lit"?
b) Build a model for predicting "Fuel_pp_lit" using four features (plus the constant one associated with the intercept coefficient): "GasTax", "PercDrLic", "AveIncomeK", "logMiles". Estimate the five coefficients of the linear model.
c) Identify a (1 − α) = 95% confidence interval for each of the five coefficients (use either the normal approximation or the t-based approximation for the error).
d) Estimate σ², the noise variance, which quantifies the model error, and compute the coefficient of determination R².
e) Consider the state of Pennsylvania (PA). The values of all features for Pennsylvania are reported in the file. What is the actual value of "Fuel_pp_lit" for Pennsylvania?
f) Using the model calibrated above, predict "Fuel_pp_lit" for Pennsylvania. What is the point estimate?
g) What is the (1 − α) = 95% confidence interval for this prediction? Is the actual value inside or outside the interval?
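
The derived features and the least-squares fit asked for above could be set up roughly as follows. This is only a sketch under stated assumptions, not the posted expert solution: the helper names `derive`, `solve`, `fit_ols` and `predict` are hypothetical, the intervals use the normal approximation (z = 1.96) that part c) permits, and the interval in `predict` is read as a prediction interval for a new observation (it includes the noise term σ²), which is one possible interpretation of part g). Only the Pennsylvania row is copied from the table; fitting the full model would require passing all 51 rows.

```python
import math

# Pennsylvania row copied from the table above.
PA = {"DrivLic": 8226202, "AveIncome": 29539, "RoadMiles": 119985,
      "Popul": 9693987, "GasTax": 6.9, "FuelCons": 19020440}

def derive(row):
    """Derived features as defined in problems 1 and 2."""
    return {
        "logMiles": math.log2(row["RoadMiles"]),
        "Fuel_pp_lit": row["FuelCons"] / row["Popul"] * 1000,  # liters per person
        "PercDrLic": row["DrivLic"] / row["Popul"] * 100,
        "AveIncomeK": row["AveIncome"] / 1000,
    }

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_ols(X, y, z=1.96):
    """Ordinary least squares with an intercept (normal-approximation CIs).

    X is a list of feature rows, y the list of targets.
    """
    Xd = [[1.0] + list(r) for r in X]          # prepend the constant feature
    n, p = len(Xd), len(Xd[0])
    XtX = [[sum(r[i] * r[j] for r in Xd) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(Xd, y)) for i in range(p)]
    beta = solve(XtX, Xty)                     # normal equations
    resid = [t - sum(b * v for b, v in zip(beta, r)) for r, t in zip(Xd, y)]
    rss = sum(e * e for e in resid)
    sigma2 = rss / (n - p)                     # unbiased noise-variance estimate
    ybar = sum(y) / n
    r2 = 1.0 - rss / sum((t - ybar) ** 2 for t in y)
    # Diagonal of (X^T X)^{-1}, obtained by solving against unit vectors.
    diag = [solve(XtX, [float(k == i) for k in range(p)])[i] for i in range(p)]
    se = [math.sqrt(max(sigma2 * d, 0.0)) for d in diag]
    ci = [(b - z * s, b + z * s) for b, s in zip(beta, se)]
    return {"beta": beta, "ci": ci, "sigma2": sigma2, "r2": r2, "XtX": XtX}

def predict(fit, x0, z=1.96):
    """Point estimate and interval for a new observation with features x0."""
    x = [1.0] + list(x0)
    yhat = sum(b * v for b, v in zip(fit["beta"], x))
    w = solve(fit["XtX"], x)                   # (X^T X)^{-1} x
    leverage = sum(a * b for a, b in zip(x, w))
    half = z * math.sqrt(max(fit["sigma2"] * (1.0 + leverage), 0.0))
    return yhat, (yhat - half, yhat + half)
```

With all rows loaded, one would build `X` from `[GasTax, PercDrLic, AveIncomeK, logMiles]` per state and `y` from `Fuel_pp_lit`, call `fit_ols(X, y)`, and then call `predict` with Pennsylvania's four feature values. For the t-based alternative, `z` would be replaced by the appropriate quantile of the t distribution with n − 5 degrees of freedom.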
