
Question


airquality data (the 111 complete rows; row numbers are those of the full dataset):

    Ozone Solar.R Wind Temp Month Day
1      41     190  7.4   67     5   1
2      36     118    8   72     5   2
3      12     149 12.6   74     5   3
4      18     313 11.5   62     5   4
7      23     299  8.6   65     5   7
8      19      99 13.8   59     5   8
9       8      19 20.1   61     5   9
12     16     256  9.7   69     5  12
13     11     290  9.2   66     5  13
14     14     274 10.9   68     5  14
15     18      65 13.2   58     5  15
16     14     334 11.5   64     5  16
17     34     307   12   66     5  17
18      6      78 18.4   57     5  18
19     30     322 11.5   68     5  19
20     11      44  9.7   62     5  20
21      1       8  9.7   59     5  21
22     11     320 16.6   73     5  22
23      4      25  9.7   61     5  23
24     32      92   12   61     5  24
28     23      13   12   67     5  28
29     45     252 14.9   81     5  29
30    115     223  5.7   79     5  30
31     37     279  7.4   76     5  31
38     29     127  9.7   82     6   7
40     71     291 13.8   90     6   9
41     39     323 11.5   87     6  10
44     23     148    8   82     6  13
47     21     191 14.9   77     6  16
48     37     284 20.7   72     6  17
49     20      37  9.2   65     6  18
50     12     120 11.5   73     6  19
51     13     137 10.3   76     6  20
62    135     269  4.1   84     7   1
63     49     248  9.2   85     7   2
64     32     236  9.2   81     7   3
66     64     175  4.6   83     7   5
67     40     314 10.9   83     7   6
68     77     276  5.1   88     7   7
69     97     267  6.3   92     7   8
70     97     272  5.7   92     7   9
71     85     175  7.4   89     7  10
73     10     264 14.3   73     7  12
74     27     175 14.9   81     7  13
76      7      48 14.3   80     7  15
77     48     260  6.9   81     7  16
78     35     274 10.3   82     7  17
79     61     285  6.3   84     7  18
80     79     187  5.1   87     7  19
81     63     220 11.5   85     7  20
82     16       7  6.9   74     7  21
85     80     294  8.6   86     7  24
86    108     223    8   85     7  25
87     20      81  8.6   82     7  26
88     52      82   12   86     7  27
89     82     213  7.4   88     7  28
90     50     275  7.4   86     7  29
91     64     253  7.4   83     7  30
92     59     254  9.2   81     7  31
93     39      83  6.9   81     8   1
94      9      24 13.8   81     8   2
95     16      77  7.4   82     8   3
99    122     255    4   89     8   7
100    89     229 10.3   90     8   8
101   110     207    8   90     8   9
104    44     192 11.5   86     8  12
105    28     273 11.5   82     8  13
106    65     157  9.7   80     8  14
108    22      71 10.3   77     8  16
109    59      51  6.3   79     8  17
110    23     115  7.4   76     8  18
111    31     244 10.9   78     8  19
112    44     190 10.3   78     8  20
113    21     259 15.5   77     8  21
114     9      36 14.3   72     8  22
116    45     212  9.7   79     8  24
117   168     238  3.4   81     8  25
118    73     215    8   86     8  26
120    76     203  9.7   97     8  28
121   118     225  2.3   94     8  29
122    84     237  6.3   96     8  30
123    85     188  6.3   94     8  31
124    96     167  6.9   91     9   1
125    78     197  5.1   92     9   2
126    73     183  2.8   93     9   3
127    91     189  4.6   93     9   4
128    47      95  7.4   87     9   5
129    32      92 15.5   84     9   6
130    20     252 10.9   80     9   7
131    23     220 10.3   78     9   8
132    21     230 10.9   75     9   9
133    24     259  9.7   73     9  10
134    44     236 14.9   81     9  11
135    21     259 15.5   76     9  12
136    28     238  6.3   77     9  13
137     9      24 10.9   71     9  14
138    13     112 11.5   71     9  15
139    46     237  6.9   78     9  16
140    18     224 13.8   67     9  17
141    13      27 10.3   76     9  18
142    24     238 10.3   68     9  19
143    16     201    8   82     9  20
144    13     238 12.6   64     9  21
145    23      14  9.2   71     9  22
146    36     139 10.3   81     9  23
147     7      49 10.3   69     9  24
148    14      20 16.6   63     9  25
149    30     193  6.9   70     9  26
151    14     191 14.3   75     9  28
152    18     131    8   76     9  29
153    20     223 11.5   68     9  30

Stat 302, Assignment 3, Due Thursday March 10, 2016 at 4:30pm

There are 5 multi-part questions and a set of questions based on 7 pages of reading. Please see the Stats Workshop for help, or see me in office hours (Tue 1-2, Thur 3:30-4:30), or e-mail me at jackd@sfu.ca

Explanations and relevant computer output should be included, but plots do not need to be. Necessary R code is included for every question, after the reading questions. There are also four practice problems, mostly surrounding review material.

Name ______________________________ Student Number _____________________

Total / 67   Q1 / 11   Q2 / 11   Q3 / 6   Q4 / 16   Q5 / 6   Reading / 17

1) Consider the dataset airquality, which is built into R. This real dataset has daily measurements of temperature (in Fahrenheit), ozone concentration, and average wind speed (in miles per hour) from 111 of 153 days over a single summer. (The remaining 42 days are removed due to missing data.)
A 2pts) Construct a regression using Temp as a response to Wind, Ozone, and Solar.R. Give the regression equation.
B 1pt) What proportion of the variation in daily temperature is explained by this model (wind, ozone concentration, and Solar.R)?
C 1pt) What is the Akaike information criterion (AIC) for this model?
D 2pts) Would the AIC improve if you dropped a variable from this model? Which variable(s) could be dropped? Are these improvements large enough that they couldn't be due to randomness?
Comment on how well this does (or doesn't) line up with the p-values from the model summary.
E 2pts) Do the variance inflation factors indicate that there is a collinearity problem in this model?
F 3pts) Construct a regression using Temp as a response to only Wind and Ozone. Give the regression equation and the proportion of variance in temperature that is explained by this new model. Compare your answers to those in 1A and 1B. Is the removal of Solar.R justified?

2) Consider again the dataset airquality. This time we're interested in the factors behind ozone concentration.
A 2pts) Create a scatterplot with Ozone on the Y axis and Temp on the X axis. Does there appear to be a relationship between Ozone and Temperature? Is it a linear relationship?
B 1pt) Create a regression model with Ozone as a response and Temperature and Wind as explanatory variables. Report the two variance inflation factors (VIFs) from the model and the AIC.
C 3pts) Add the interaction between Temperature and Wind to the model from 2B. Report the three VIFs and the AIC. Does the AIC indicate that the model with the interaction term is better? Why are these VIFs so large?
D 3pts) Replace the interaction term from the model in 2C with a Wind-squared term. Report the three VIFs and the AIC. Does the AIC indicate that including the squared term is better than not including it? Why is the VIF for Temperature still small?
E 2pts) Construct a model that includes both the Wind-Temperature interaction term and the Wind-squared term. Report the FOUR VIFs and the AIC. Does the AIC indicate that the model including both the squared term and the interaction term is best of all?

3) Consider the dataset airquality one last time. Also consider the set of explanatory variables (Ozone, Wind, Solar.R), the squares of these variables, and the interactions between these variables (9 variables in all).
A 3pts) Starting with this set of explanatory variables, find the model with the best AIC using the stepwise method.
Give the regression equation of the final model and the proportion of variance explained.
B 3pts) Repeat 3a, but use BIC instead of AIC as your optimality criterion. Mention any differences and explain them by using the practical difference between AIC and BIC. Reminder for explanation: There are n = 111 observations.

4) Consider the dataset farms.csv.
A 2pts) Construct a crosstab of the variables 'fertilizer' and 'land'. Which category within each variable will be considered the baseline?
B 3pts) Construct a regression with Yield as the response and 'fertilizer' and 'land' as the explanatory variables. Interpret the values of the intercept and each of the four dummy variables.
C 4pts) Do a hypothesis test on each of the three dummy variables for fertilizer at alpha = 0.05. Construct a TukeyHSD of the regression in 4b. Do the relevant hypothesis tests in the TukeyHSD analysis agree with those from the dummy variables?
D 1pt) Why are the p-values for (nature touch vs. none) and for (skotz vs. none) larger in the TukeyHSD analysis than the equivalent dummy variables in the regression?
E 2pts) Construct a regression with Yield as the response and 'fertilizer', 'land', and the fertilizer:land interaction as explanatory terms. Do a hypothesis test at alpha = 0.05 for each of the interaction dummy variables.
F 2pts) Run an ANOVA on this model with interactions. Do the ANOVA results agree with the hypothesis tests in 4e?
G 2pts) Compare the AICs of the model with and without the interaction term. Which model is better according to the AIC? Do you agree? Briefly justify your answer.

5) Consider the dataset gapminder.csv and the model of birth rates in response to agri_in_gdp, co2_emit, female_work, GINI, HDI, and health_spending.
A 3pts) Starting with this set of explanatory variables, find the model with the best AIC using the stepwise method. Give the regression equation of the final model and the proportion of variance explained.
B 3pts) Repeat 5a, but use BIC instead of AIC as your optimality criterion. Mention any differences and explain them by using the practical difference between AIC and BIC. Reminder for explanation: There are n = 150 observations.

Reading questions) These questions pertain to "Model selection in ecology and evolution" by Jerald B. Johnson and Kristian S. Omland, Trends in Ecology and Evolution Vol.19 No.2, February 2004. The answer to R1 appears first in the text, and so on. Every question can be answered in 25 words or fewer. For these reading questions, ignore the boxes and tables and focus on the main article. The answers are quite short; they are given a lot of marks to reward you for the effort it takes to read the article and find the answers.
R1, 2 pts) From the abstract (the first paragraph in bold), how is model selection used?
R2, 2 pts) What does model selection offer in contrast to a single null hypothesis test?
R3, 3 pts) What are three primary advantages of model selection?
R4, 2 pts) Name two commonly used criteria for model selection.
R5, 3 pts) Name a method to address the problem of several models all being equally (or nearly equally) viable (in other words, when more than one model has equal support from the data). What are two advantages of using this method?
R6, 3 pts) What is a more recent application of model selection in evolutionary biology? What about model selection makes it well suited to this application?
R7, 2 pts) The authors suggest a requirement of the model being selected, in order to ensure that the parameter estimates are biologically plausible. Describe this requirement.
For interest only, 0 pts) What framework does model selection offer to ecosystem science?

############ PREAMBLE CODE
## Load the car package so we can use VIF
install.packages("car") ### Only needed once.
library(car) ## Needed every time you open R
library(MASS) ## Needed every time; provides stepAIC() for questions 3 and 5

########### EXAMPLE CODE QUESTION 1
### Load the air quality data
Q1 = airquality
## Remove the rows of data with a missing value
Q1 = Q1[!is.na(Q1$Ozone) & !is.na(Q1$Wind) & !is.na(Q1$Solar.R),]
### Create a model of temp in response to Ozone, Wind, and Solar.R
mod = lm(Temp ~ Ozone + Wind + Solar.R, data=Q1)
summary(mod)
### Get the Akaike information criteria values for this model,
### and the models with 1 variable removed
drop1(mod)
### Get the Variance Inflation Factors of variables in this model
vif(mod)
### Create a model of temp in response to Ozone and Wind only
mod = lm(Temp ~ Ozone + Wind, data=Q1)
summary(mod)

########### EXAMPLE CODE QUESTION 2
Q2 = Q1
### Scatterplot with Temp on the X axis and Ozone on the Y axis
plot(Q2$Temp, Q2$Ozone)
### Make a linear model and get the AIC and VIFs
mod = lm(Ozone ~ Temp + Wind, data=Q2)
AIC(mod)
vif(mod)
### Make a linear model with an interaction term, get AIC and VIFs
mod = lm(Ozone ~ Temp + Wind + Temp:Wind, data=Q2)
AIC(mod)
vif(mod)
### Make a model with a squared term, get AIC and VIFs
### (this example squares Temp; question 2D asks for Wind squared, I(Wind^2))
mod = lm(Ozone ~ Temp + Wind + I(Temp^2), data=Q2)
AIC(mod)
vif(mod)
### Make a dummy variable that is 1 when Temp is over 80.
Q2$Temp80 = 0
Q2$Temp80[Q2$Temp > 80] = 1
### Make a model with a dummy variable, get AIC and VIFs
mod = lm(Ozone ~ Temp + Wind + Temp80, data=Q2)
AIC(mod)
vif(mod)

########### EXAMPLE CODE QUESTION 3
Q3 = Q1
### Use the stepwise method to find the best model by AIC
### Start from the 9 terms in the question: 3 main effects,
### 3 two-way interactions, and 3 squared terms
mod_start = lm(Temp ~ (Ozone + Wind + Solar.R)^2 + I(Ozone^2) + I(Wind^2) + I(Solar.R^2), data=Q3)
mod_end = stepAIC(mod_start)
summary(mod_end)
### Use the stepwise method to find the best model by BIC instead
n = nrow(Q3)
mod_end = stepAIC(mod_start, k=log(n))
summary(mod_end)

########### EXAMPLE CODE QUESTION 4
Q4 = read.csv("farms.csv")
### Get a crosstab
table(Q4$fertilizer, Q4$land)
### Get a regression
mod = lm(yield ~ land + fertilizer, data=Q4)
summary(mod)
## Get a Tukey HSD for comparison
mod2 = aov(yield ~ land + fertilizer, data=Q4)
TukeyHSD(mod2)
## Get a regression with interactions
mod3 = lm(yield ~ land + fertilizer + land:fertilizer, data=Q4)
summary(mod3)
### Get the ANOVA
anova(mod3)
### Compare AICs
AIC(mod)
AIC(mod3)

########### EXAMPLE CODE QUESTION 5
### Very similar to question 3.
Q5 = read.csv("gapminder.csv")
mod_start = lm(birth_rate ~ agri_in_gdp + co2_emit + female_work + GINI + HDI + health_spending, data=Q5)
### AIC
mod_end = stepAIC(mod_start)
summary(mod_end)
### BIC
n = nrow(Q5)
mod_end = stepAIC(mod_start, k=log(n))
summary(mod_end)

farms.csv data:

    fertilizer    land    yield
1   skotz         Flat       74
2   skotz         Flat       79
3   skotz         Flat       83
4   skotz         Flat       69
5   skotz         Flat       68
6   nature touch  Flat       69
7   nature touch  Flat       83
8   nature touch  Flat       77
9   nature touch  Flat       76
10  nature touch  Flat       78
11  greeno        Flat       90
12  greeno        Flat       91
13  greeno        Flat       89
14  greeno        Flat      101
15  greeno        Flat       74
16  A-none        Flat       60
17  A-none        Flat       56
18  A-none        Flat       77
19  A-none        Flat       74
20  A-none        Flat       66
21  skotz         Sloped     56
22  skotz         Sloped     55
23  skotz         Sloped     57
24  skotz         Sloped     59
25  skotz         Sloped     60
26  nature touch  Sloped     75
27  nature touch  Sloped     72
28  nature touch  Sloped     59
29  nature touch  Sloped     63
30  nature touch  Sloped     70
31  greeno        Sloped     61
32  greeno        Sloped     74
33  greeno        Sloped     72
34  greeno        Sloped     86
35  greeno        Sloped     79
36  A-none        Sloped     56
37  A-none        Sloped     58
38  A-none        Sloped     49
39  A-none        Sloped     49
40  A-none        Sloped     56

gapminder.csv data:

Country Afghanistan Albania Algeria Angola Argentina Armenia Australia Austria Azerbaijan Bahamas Bahrain Bangladesh Belarus Belgium Belize Benin Bolivia Botswana Brazil Bulgaria Burkina Faso Burundi Cambodia Cameroon Canada Cape Verde Chad Chile China Colombia Comoros Congo, Dem. Rep. Congo, Rep. 
Costa Rica Cote d'Ivoire Croatia Cuba Cyprus Denmark Ecuador Egypt El Salvador Equatorial Guinea Eritrea Estonia Ethiopia Fiji Finland France Gabon Gambia Georgia Germany Ghana Greece Guatemala Guinea Guyana Haiti Honduras Hungary India Indonesia Iran Iraq Ireland Israel Italy Jamaica Japan Jordan Kazakhstan Kenya Kuwait Latvia Lebanon Lesotho Liberia Lithuania Luxembourg Macedonia, FYR Madagascar Malawi Malaysia Mali Mauritania Mauritius Mexico Moldova Mongolia Morocco Mozambique Myanmar Namibia agri_in_gdp birth_rate breast_cancer calories_consumed co2_emit crude_death dtp_immune energypercap female_work GDPpercap GINI HDI health_spending imports literacy_female marriage_age new_births poverty_rate schoolyearsf1544 sugar_g teen_fertility tobacco_adult 34.49483268 42.779 11.7 NA 78734333.33 19.68 63 NA 29.5 NA NA 0.363 28.80876746 59.00013812 NA 17.83968322 1101497 NA 0.7 NA 119 15.2 21.11725576 11.631 6.54 2879.57 227322333.3 6.186 98 0.646798636 56.09999847 1681.61391 NA 0.729 232.1804388 54.96472314 NA 23.32650948 34964 NA 10.5 65.75 11.2 NA 8.025345596 22.541 16.7 3153.38 2814925667 4.939 95 1.086021971 38.09999847 2155.485231 NA 0.68 140.8509706 23.29305746 NA 29.6 777471 NA 6.8 84.93 7 NA 7.860272469 48.888 17.1 1973.29 305605666.7 17.127 83 0.625843499 76.40000153 562.9876848 NA 0.471 85.29308127 43.51255039 NA NA 939691 NA 4.3 35.62 171 NA 9.394874713 18.55 19.42 2940.98 6011269000 7.779 96 1.87138944 57.09999847 9388.688523 47.37 0.78 562.4089344 20.33712598 NA 23.26396179 741619 5.46 11.3 112.33 64.5 16.6 20.28206022 14.161 22.09 2279.87 56686666.67 8.694 88 0.925555132 65.19999695 1424.190562 30.23 0.715 116.2641049 39.15081444 NA 22.98603439 42365 20.51 11.3 57.53 26.4 NA 2.367258175 13.569 16.43 3227.05 13351785333 6.851 92 5.683007332 69 24765.5489 NA 0.922 3956.468724 20.92946989 NA 28.93125534 285882 NA 12.6 128.77 15.4 7.4 1.746160423 9.304 17.52 3818.8 4567625333 9.253 85 4.022449074 67.30000305 27036.48733 NA 0.87 4604.068222 53.17035673 NA 
28.93756866 77176 NA 11.3 123.29 11.17 NA 7.002185706 20.495 10.34 2961.19 574372333.4 6.783 74 1.411960892 66.5 1945.637549 NA NA 187.4280861 28.51369901 99.39787584 23.88630676 180302 NA 11.7 43.84 34 9.4 2.326734733 15.519 21.5 2712.5 140473666.7 5.993 95 2.15592606 73 21721.61841 NA 0.77 1789.869492 53.96552661 NA 27.19277191 5316 NA 11.8 126.03 32 20.7 NA 17.347 17.7 NA 519064333.3 2.628 97 9.456593856 35.20000076 13373.21994 NA 0.804 725.2666466 68.35164202 NA 25.90460396 17852 NA 10.6 NA 15 6.6 19.24346501 22.858 7.3 2281.18 652428333.4 6.633 95 0.183203413 59.70000076 483.970868 NA 0.478 NA 26.70242715 NA 18.66999817 3347251 NA 4.4 16.44 79 9 9.343416174 10.491 13.72 3145.64 974365333.3 14.678 95 2.891955576 67 2253.464111 28.74 0.738 299.6155558 67.21082526 NA 22.77720261 100559 0.25 12.3 90.41 20.548 8.1 0.881363407 11.667 27.7 3693.57 10990166000 9.732 99 5.366718146 59.59999847 25034.66692 NA 0.88 4169.854734 78.7515524 NA 30.34197617 124933 NA 12.4 150.69 10 NA 12.27138114 24.974 11.9 2717.57 12448333.33 3.656 96 0.571428571 48.5 3680.916429 NA 0.692 208.6497981 61.68572063 NA 26.17763329 7461 NA 9.2 136.99 79 9.1 NA 39.422 19.8 2532.92 43211666.67 9.29 82 0.395343376 59.59999847 366.0449665 NA 0.414 31 31.55261277 NA 20.33333206 343284 NA 2.6 16.44 112 4.3 12.87739439 27.043 11.6 2064.13 288570333.4 7.596 82 0.567544323 67.80000305 1132.21388 57.44 0.645 65.08928224 34.26625468 85.98593929 22.59545898 255423 24.66 8.7 76.71 78 NA 2.033574354 25.438 25 2263.82 83614666.67 11.731 96 1.054540502 50.59999847 4160.662597 NA 0.618 504.8069298 35.42164901 NA 27.11158562 49398 NA 7.8 68.49 52 16.2 5.562714119 16.836 10.8 3112.51 9863685333 6.363 97 1.240032188 64.09999847 4297.823854 55.89 0.7 610.0095454 11.84675863 90.22800731 23.09729576 3240562 13.19 8.6 153.43 76 5.5 5.58366328 9.7 14.36 2766.11 3216994000 14.601 95 2.626143051 58.20000076 2494.352654 28.19 0.758 374.6623996 79.18341967 NA 24.15979195 73424 0.41 12.3 79.45 40 NA NA 43.905 22.1 2676.91 
25105666.67 13.131 89 NA 79.90000153 263.0149271 NA 0.313 29.80479692 NA 21.57956529 19.3993969 628028 NA 1.5 16.44 125 14.7 37.81013941 44 14.2 1684.72 8272000 14.011 99 NA 90.19999695 133.286878 NA 0.289 17.64229141 NA NA 22.46302795 375973 NA 3 5.48 19 NA 31.88137487 26.043 9.5 2267.55 49848333.33 8.402 82 0.25468562 77.40000153 510.0197663 44.37 0.508 31.83455106 72.94217882 NA 22.30000114 358541 60.13 4.5 24.66 42 NA 19.46522064 39.788 22.1 2269.12 130390333.3 14.308 75 0.345489504 53 647.9445442 38.91 0.459 52.88653437 21.24620766 63.02205763 20.1510601 760500 30.36 6.2 27.4 128 NA 1.686431041 10.981 17.6 3532.47 25613067334 7.406 94 8.251849719 75 26229.74308 NA 0.9 4340.355972 33.01900053 NA 26.75705528 362807 NA 14.5 172.6 13.952 15 7.199958975 22.935 19.8 2571.92 5518333.333 5.011 98 0.213240363 49.79999924 1735.883891 NA 0.56 137.2683213 77.77201412 NA 25.68690109 11036 NA 5.9 87.67 82 11.2 12.51624817 48.668 12.1 2055.82 8092333.333 16.803 30 NA 72.30000305 292.1434649 NA 0.313 28.34740527 52.3 NA 18.34000015 524977 NA 1.4 19.18 165 13.1 3.913278895 14.627 11.26 2920.42 1888795333 5.382 96 1.837582352 43.70000076 6410.806732 NA 0.789 683.0753431 33.24350028 NA 23.37175751 240952 NA 11.7 126.03 58 18.5 10.76971034 12.125 5.5 2980.54 109545000000 6.992 93 1.528830173 77.09999847 1864.102702 NA 0.656 114.4807539 29.61397918 NA 23.31162262 16012125 NA 8.2 21.92 8 12.8 7.937396848 18.247 9.33 2684.65 2335589667 5.514 93 0.629483567 69 3068.477266 58.88 0.691 336.5918728 19.93788485 92.84562398 23.00436401 809298 17.66 8.2 134.25 74 24.9 45.34372178 36.115 14.2 1884.49 2563000 6.782 75 0.060418331 64.5 347.1000331 NA 0.428 29.26288731 41.32570635 NA 23.61499977 23512 NA 4.5 21.92 58 NA 42.46523139 44.99 7.4 1605.08 173609333.3 12.89 70 0.353022349 55.40000153 97.91018296 NA 0.271 10.21864016 37.93899246 46.10425444 NA 2698744 NA 5.3 5.48 201 7.1 4.327298473 38.658 14 2511.88 47197333.34 4.089 80 0.324156465 55.40000153 1101.219981 NA 0.512 60.21649344 
53.52311323 NA NA 144305 NA 7.5 30.14 119 NA 8.48263769 16.574 11.58 2839.76 156273333.3 10.913 89 1.012895813 47 5137.485256 49.25 0.735 524.3388835 53.5160789 NA 23.39002419 72531 4.98 10.2 156.16 66 10.7 23.85368261 38.566 18.3 2527.52 235048000 11.512 76 0.548777819 39.90000153 572.0962695 NA 0.388 61.25118446 41.94078409 NA 22.00292991 730778 NA 3.2 32.88 129 NA 4.874757404 9.884 16.95 2989.91 325559666.7 6.831 96 2.097819883 56.5 6651.741476 NA 0.791 1008.04587 49.75982554 NA 26.23072433 42977 NA 11.2 164.38 13 18.4 4.970419617 11.371 13.87 3273.96 1313165333 7.149 93 0.899606582 52.5 4165.600383 NA 0.759 585.9437956 17.63137014 NA NA 128288 NA 12 117.81 45 8.9 2.215325263 11.818 29.6 3180.69 191726333.3 17.161 97 2.292581468 63.29999924 15248.86529 NA 0.819 1674.059966 54.19620502 NA 25.21969414 12555 NA 13.4 128.77 7 NA 1.176883599 11.711 24.29 3415.88 3523538333 10.292 87 3.618512377 76.30000305 32767.40349 NA 0.89 5664.948176 49.93428752 NA 30.8473587 64100 NA 12.6 158.9 5.9 12.1 6.958102318 22.607 5.47 2300.87 718457666.7 5.154 99 0.818234028 54 1618.868002 54.31 0.702 228.4813241 34.4341453 81.6805873 21.55466843 321734 14.43 9.6 101.37 83 14.6 14.07021828 25.445 17.3 3194.55 3575051333 5.873 98 0.923271167 25.70000076 1765.869427 NA 0.626 84.63868769 34.82814178 NA 23.63006955 1980692 NA 7.6 76.71 47 13.4 11.86587908 19.406 4.23 2589.61 151418666.7 6.821 99 0.729736326 49.70000076 2609.089915 46.97 0.668 206.9868724 48.29469433 79.66212821 22.28621101 116184 13.4 7.7 93.15 83 15 2.677879664 36.973 12.1 NA 37836333.33 15.118 33 2.731637927 44.20000076 7994.386395 NA 0.526 371.5516395 30.28807233 NA NA 24675 NA 6.1 NA 123 30 25.38314991 37.805 14.2 1605.4 7883333.333 8.579 96 0.15017648 56.5 167.7992226 NA NA 9.04065069 35.94535087 NA 20.55499649 165578 NA 3.7 13.7 67 NA NA 11.241 18 3154.03 284166666.7 12.866 95 4.191668306 66.90000153 7072.435864 NA 0.834 835.697326 78.130426 NA 27.58296585 15138 NA 11.6 167.12 23 NA 46.23862365 36.903 17.9 1979.68 
123900333.3 11.961 42 0.392598276 82.09999847 176.1074039 NA 0.337 11.79987036 32.02634067 28.9216362 20.5055027 2988776 NA 1.9 10.96 72 31.7 13.90617849 22.586 14.5 3041.15 34536333.33 6.593 99 0.62734431 40.5 2295.4383 NA 0.681 151.9280292 63.66774737 NA 22.89528847 18946 NA 11.1 120.55 45 NA 3.00667159 11.134 14.62 3220.91 2475491333 9.371 99 6.957264518 73.30000305 28839.22241 NA 0.881 3743.795262 40.68842796 NA 30.53718948 58980 NA 12.3 93.15 9.2 21.6 2.217736921 12.694 18.18 3532.24 33646041000 8.576 98 4.116148208 64.19999695 23516.22317 NA 0.877 4483.031211 28.40466676 NA 30.96426582 785953 NA 11.5 109.59 7 37.9 4.846147892 31.582 13.1 2754.9 155701333.3 9.78 45 1.396994459 64.09999847 4143.33458 NA 0.656 233.6901881 32.88143798 NA 22.10285187 45638 NA 8 46.58 90 NA 21.01758889 43.828 4.6 2384.59 7758666.667 11.402 95 0.08357647 70.80000305 598.0822683 NA 0.399 21.49527066 49.23193799 NA 19.64669609 67543 NA 3.1 73.97 77 16.8 10.69548298 13.554 25.1 2859.39 88678333.33 11.902 98 0.761399371 58.79999924 1219.448319 39.37 0.724 188.1498862 57.95058282 NA 22.46738815 59317 34.92 12.6 90.41 45 35.9 0.961878557 8.279 18.92 3546.96 81411858000 10.33 99 4.025569865 68.69999695 25297.38539 NA 0.901 4230.816509 40.16594606 NA 30.12885094 670061 NA 12.1 123.29 10 14.9 29.04994198 33.727 19.8 2907.02 197923000 11.154 94 0.399081726 73.59999847 317.7363633 NA 0.508 65.56446253 40.82908463 NA 22.44979858 761207 NA 6.7 19.18 71 NA NA 10.223 15.52 3724.69 2806085333 10.195 99 2.699663256 56.09999847 14801.55423 NA 0.86 2677.09805 37.00303447 NA 26.94385338 113681 NA 12.1 95.89 11 14.4 12.40599497 30.085 4.1 2159.26 257873000 5.677 85 0.638335194 46.79999924 1877.751467 NA 0.565 183.8884144 42.33322802 NA 20.45073509 415226 NA 5.2 109.59 107 NA 25.34559765 39.948 10.8 2568.03 47542000 11.166 63 NA 82.59999847 387.7516347 39.35 0.337 22.75084471 34.67742447 NA 18.71999931 408271 69.59 1.7 27.4 157 NA 24.40702576 20.123 11.9 2758.51 71041666.67 8.18 94 0.666606228 
51.79999924 1107.351968 NA 0.619 122.5612952 NA NA 27.79589272 15040 NA 10.9 98.63 68 NA NA 27.951 2 1869.77 44289666.67 9.112 59 0.28903019 39.70000076 392.0942297 NA 0.443 34.5391947 38.94189269 NA 22.27551079 267367 NA 5.3 65.75 46 15 13.00735474 25.514 12.1 2623.41 140294000 5.061 94 0.655371647 38.20000076 1409.979046 56.16 0.613 102.2212315 81.56164347 83.45277365 20.35041809 182021 29.73 7.4 109.59 93 24.7 4.024421334 9.568 21.44 3465.18 4331180333 13.436 99 2.658032793 55.5 5884.140962 31.18 0.809 1023.258195 80.41307231 NA 27.85076332 96305 0.35 11.9 123.29 19 NA 18.25627018 23.144 10.4 2351.86 31926495333 8.494 71 0.510064486 35.59999847 673.004691 NA 0.523 40.39964904 24.44855293 NA 19.93957901 27294707 NA 5.3 65.75 86 NA 13.7166831 21.364 11.3 2538.42 7256036333 6.327 72 0.794458134 51.70000076 1003.364434 NA 0.591 50.56796819 25.39353149 NA 22.11224365 4967640 56.13 8 43.84 45 24 10.21669258 18.001 7.4 3043.61 9934943334 5.741 99 2.661023907 33 2125.030252 NA 0.694 219.9136381 21.53822451 NA 22.14466476 1294797 NA 7.9 71.23 29 NA NA 35.395 13.9 NA 2613222333 6.085 54 0.764519887 14.89999962 699.7055596 30.86 0.558 107.9483148 NA NA 24.8413105 1011769 21.41 5.1 NA 98 NA 1.426170897 16.12 20.4 3612.26 1664604333 6.443 92 3.463176259 62.40000153 31543.60253 NA 0.909 4552.412765 71.25033187 NA 30.71268082 70442 NA 12.6 115.07 18 22.1 NA 21.144 21.62 3527.47 1608453000 5.476 96 2.885649782 58.70000076 21469.53451 NA 0.882 1737.43088 43.99207009 NA 25.87360382 146850 NA 13.1 104.11 14 23.9 2.046093966 9.434 16.96 3645.65 19483053333 9.923 97 3.024803871 52.29999924 20291.22664 NA 0.869 3094.566014 29.10163551 NA 29.16757965 556487 NA 11.7 84.93 7 32.6 5.31713658 18.719 18.3 2851.57 354082666.7 7.439 99 1.751704911 60 NA NA 0.717 230.1144195 65.9513189 NA 33.20291901 50634 NA 11.4 147.95 77 29.8 1.146307523 8.693 8.18 2812.14 46044404999 9.149 98 4.032163465 61 40837.26664 NA 0.894 2805.832815 15.94466485 NA 28.57218933 1105089 NA 13.3 76.71 5 NA 2.842237843 
28.888 14.6 3015.43 398478666.7 4.227 98 1.273219181 16.39999962 2380.50934 NA 0.685 251.9591288 88.3765045 88.90331226 25.86478233 168585 NA 11.3 98.63 26 14.6 6.09658892 21.063 14.99 3490.13 2702916333 11.251 93 4.277298034 72.69999695 2332.29493 30.88 0.727 231.724614 42.75072114 NA 23.35925102 332749 1.48 12.1 71.23 30 15.1 25.01120041 38.026 18.1 2089.31 283396666.7 11.715 81 0.458268755 76.09999847 461.0193747 NA 0.486 32.12121197 37.69921457 66.8631175 21.4477787 1419728 NA 8.1 54.8 100 17.6 NA 22.749 14.31 3064.09 1671274000 1.86 99 10.78060215 44.29999924 25100.0281 NA 0.756 1000.044175 28.31711734 91.50349488 25.06534195 59453 NA 9.1 101.37 14 14.2 3.580362536 10.11 18.45 2962.25 136499000 13.829 98 2.051686657 67.19999695 6296.227116 36.27 0.8 881.7164139 62.38043655 NA 28.15731812 21957 0.47 12.2 93.15 18 36.5 7.119005151 13.127 23.4 3107.09 441650000 6.943 80 1.017481557 27.20000076 5436.651885 NA 0.721 510.0909272 48.19050645 85.96778157 NA 53983 NA 10.5 93.15 16 NA 7.662720872 28.508 9.9 2476.17 NA 16.852 84 0.009021294 69.09999847 446.3204159 NA 0.429 63.94193623 118.0038765 NA 21.30995369 55917 NA 9.2 41.1 73 18.1 NA 38.969 13.8 2203.79 39746666.67 10.603 60 NA 56.70000076 217.1365282 38.16 0.319 24.83121951 190.864416 27.03490666 20.16080284 138998 94.88 3.3 10.96 143 15 3.940478062 9.488 17.78 3436.38 242150333.3 13.123 95 2.800302048 65.19999695 5839.145602 NA 0.803 724.8675005 67.43478961 NA 26.13404846 30806 NA 12.6 106.85 19 13.1 0.399027426 11.386 17.74 3681.07 607189000 8.15 99 8.767504943 59.29999924 56285.27685 NA 0.868 7625.16871 143.6270522 NA 27.83593941 5455 NA 11.3 NA 10 21.1 10.56182571 11.264 17.87 3105.47 180231333.3 9.315 95 1.485416925 84.09999847 2110.685735 NA 0.712 274.1615466 70.79939395 NA 22.8783989 23109 NA 10.9 95.89 22 NA 25.68786084 36.574 14.2 2159.66 58523666.67 12.413 84 NA 75.69999695 255.0941521 NA 0.476 16.00357522 52.06714314 NA 19.84911537 709793 NA 5.2 21.92 134 14.7 30.30488491 42.499 7.4 2171.62 28603666.67 
4.469 87 NA 47.29999924 157.6213676 NA 0.367 19.74197454 42.4699572 NA 18.94499969 576373 NA 4.9 24.66 119 NA 10.11542733 17.171 13.5 2923.14 2828004667 15.888 90 2.586567658 38.40000153 4905.121369 46 0.746 261.8976262 89.42806546 NA 25.11786079 459993 2.93 10.8 112.33 14 NA 36.54104686 47.287 13.1 2614.23 18095000 10.459 74 NA 62.59999847 258.0305073 NA 0.338 29.60668871 35.57664234 NA 18.48999977 652188 NA 1.5 30.14 186 26.6 25.57429647 36.146 19.8 2841.06 58938000 7.163 75 NA 46.79999924 609.9344005 NA 0.442 38.6917721 61.18734684 NA 21.77551079 120273 NA 3 109.59 79 14.4 4.480466392 13.55 12.54 2965.39 63283000 4.745 97 0.947316057 44 4650.98538 NA 0.714 316.6177674 67.1708677 NA 22.55157471 16706 NA 9.8 117.81 35 22.4 3.642019206 21.065 8.97 3266.31 13066698333 6.665 98 1.610838263 60.29999924 6333.082389 NA 0.755 564.1853027 29.50796866 91.35555126 22.73022842 2389980 NA 9.6 131.51 71 29.6 12.0098031 10.704 15.29 2771.3 123900333.3 5.833 96 0.820790829 26.60000038 547.6737738 35.27 0.638 133.9265162 97.14084547 NA 20.8733387 44126 6.61 12.3 82.19 34 43.3 20.45809624 21.769 3.5 2285.34 311439333.3 16.079 95 1.186677203 89.09999847 709.7234231 NA 0.631 73.57494515 58.26597799 NA 23.65853691 56761 NA 9.9 32.88 21 NA 13.73079735 20.703 16 3236.03 1005337667 9.925 95 0.462741928 70.69999695 1658.8554 40.88 0.565 125.4839168 44.8641307 NA 26.44129944 644386 14.03 4.1 98.63 15 42.6 27.70962126 42.277 2.8 2066.59 109453666.7 8.468 75 0.423540962 50.29999924 339.063814 NA 0.299 17.68416765 45.15001699 NA 18.73860168 946813 NA 2.6 19.18 149 27.1 NA 21.544 8.9 2464.91 344784000 6.47 97 0.332394063 61.90000153 NA NA 0.459 7.52487452 0.078929169 NA 24.54446983 1091646 NA 6.4 35.62 16 37.7 9.36166507 29.554 18.8 2382.88 29388333.34 8.383 86 0.62290318 70 2598.512655 NA 0.607 283.3106948 52.04490907 78.35097848 27.52589226 62021 NA 8.5 84.93 74 34 Nepal Netherlands New Zealand Nicaragua Niger Nigeria Norway Oman Pakistan Panama Papua New Guinea Paraguay Peru Philippines 
Poland Portugal Qatar Romania Russia Rwanda Samoa Saudi Arabia Senegal Sierra Leone Singapore Slovak Republic Slovenia Solomon Islands South Africa Spain Sri Lanka Sudan Suriname Swaziland Sweden Switzerland Syria Tajikistan Tanzania Thailand Togo Trinidad and Tobago Tunisia Turkey Turkmenistan Uganda Ukraine United Arab Emirates United Kingdom United States Uruguay Uzbekistan Vanuatu Venezuela Zambia Zimbabwe 33.56069831 2.077809514 NA 18.2833263 NA 32.71415892 1.34005475 NA 20.46411342 5.950865062 36.00582916 22.00143881 7.006707324 12.4967356 4.327878598 2.450810074 NA 8.776638255 4.410792238 35.66098231 12.18202299 2.783500581 13.38326176 49.86088903 0.044424876 4.062297444 2.507475744 44.34413367 3.368429275 2.875400374 11.68316433 28.12599069 11.26399568 7.970969272 1.714620464 1.209249528 17.9391583 22.43210923 29.96573466 10.67570638 35.82352891 0.372330776 9.406179401 8.676405808 12.3 23.62934773 7.46048326 0.97578624 0.688806181 1.130169679 10.18882832 23.95121304 20.48611753 4.194157925 21.76434111 21.59790674 25.607 11.365 14.758 23.444 50.984 41.9 12.514 21.409 30.111 21.02 31.659 23.338 21.508 26.025 10.277 9.868 13.725 10.197 11.164 37.643 29.148 22.695 39.071 41.539 10.104 10.26 9.656 34.063 22.355 10.596 18.415 36.444 20.033 31.357 11.726 10.01 26.564 29.38 42.023 12.323 38.814 15.299 16.951 18.836 22.231 46.322 10.025 12.814 12.366 13.862 15.188 22.344 28.475 21.532 43.172 35.397 9.6 22.99 19.82 5.66 16.7 21.9 15.99 5.8 NA 22 8.01 8 NA 13.9 14 27.1 14.91 15.46 14.6 NA 16.3 16.98 6.2 15.8 10.9 13.3 19.8 11.87 NA 16.95 21 13.9 11.49 14.04 10.3 16.6 12.9 8.9 14.66 17.32 19.9 5.46 15.5 4.13 19.8 21.39 14.2 9.7 8.5 13.4 18.04 10.5 20.6 16.65 21.91 7.62 11.1 12 10 14.1 2359.81 3277.52 3159.44 2403.44 2376.26 2740.8 3464.18 2292.76 2484.04 2634.37 2457.04 2564.91 3420.74 3583.9 3455.47 3375.93 2085.1 2885.92 3143.95 2347.71 2170.26 2892.6 3223.47 2422.27 2998.5 3271.77 2360.61 2282.06 2492.45 2292.25 3110.22 3465.33 3034.26 2117.52 2032.39 2538.62 
2161.06 2725.15 3326.45 3516.73 2731.4 2211.3 3223.72 3171.2 3458.41 3748.36 2829.47 2581.11 2740.23 2631.9 1873.04 2237.75 54908333.33 9610381000 1397014667 114125000 29201333.33 2463105333 1991289667 530112000 2506683667 176484000 93236000 99392333.34 1183130667 1961740000 23410849000 1902101667 912101666.7 7215541667 26571977667 20280333.33 4513666.667 8392376667 133411666.7 65090666.67 1508965333 646378333.3 238777000 5408333.333 15060540000 11048004000 266376000 240573666.7 88788333.33 23767333.33 4320646000 2442542667 1335235000 46695000 125359666.7 4491014000 33649000 913374000 549758000 5830520667 594194333.3 52983333.34 6147078667 2287578333 72915267333 338848000000 281094000 1840303667 2959000 5718573667 133243000 584052333.4 7.038 4.735 15.175 16.492 8.729 2.7 6.986 5.041 7.998 5.554 5.398 4.818 10.004 10.141 2.032 12.946 12.329 15.136 14.67 5.349 3.646 10.95 15.998 5.127 9.962 9.455 6.247 15.138 8.725 6.477 10.324 7.565 15.658 10.051 8.265 3.404 6.411 9.148 8.928 8.211 8.002 5.911 5.952 7.71 12.857 16.041 1.508 9.932 11.512 7.841 9.303 6.503 5.043 5.104 17.486 16.19 82 0.327445797 96 4.843796088 88 4.049825935 93 0.553200648 57 NA 42 0.732717093 93 5.849373762 99 5.70241923 83 0.505578848 85 0.895707818 60 NA 92 0.690421854 81 0.508911961 87 0.434438737 99 2.539933464 97 2.385064009 94 17.44572871 96 1.84239908 98 4.733220844 97 NA 71 0.319952338 96 5.650426973 94 0.26143256 64 NA 96 4.735847099 99 3.306946709 97 3.62744621 79 0.128904912 72 2.845049106 96 3.204821949 98 0.462166575 84 0.373954166 84 1.399626766 95 0.414664416 98 5.472135392 94 3.412123134 80 1.184846715 86 0.390774762 83 0.446626093 98 1.547075731 82 0.434839941 88 15.22226463 98 0.883981281 96 1.428791115 98 4.433502791 73 NA 98 2.953032992 92 9.61290239 92 3.459246203 95 7.758205892 94 0.954342271 96 1.813492966 68 0.157390378 62 2.310919659 80 0.613704632 72 0.778871659 71.19999695 39.70000076 39.90000153 39.40000153 75.30000305 27.10000038 21.5 52.09999847 72.19999695 74.19999695 
65.40000153 51.20000076 57.09999847 68.40000153 42.29999924 51.09999847 54.59999847 68.90000153 82.19999695 44.70000076 20 62.90000153 67.69999695 59.90000153 62.29999924 65.90000153 56 49.79999924 60.20000076 46 32.79999924 41.20000076 64.69999695 77 74.69999695 21.89999962 59.09999847 89.30000305 70.09999847 49 52.70000076 60.29999924 27.89999962 26 63.79999924 83.59999847 63.70000076 40.79999924 70.09999847 68.5 64.09999847 62.40000153 80.19999695 54.5 60.29999924 61.40000153 244.5372613 NA 0.437 NA 26968.6177 NA 0.902 15392.50271 NA 0.903 1172.860136 NA 0.577 171.4210995 NA 0.273 476.214166 NA 0.441 41904.21021 NA 0.942 10392.28594 NA 0.697 643.8585182 NA 0.493 5210.719076 NA 0.752 656.3571364 NA 0.447 1450.184566 53.31 0.643 2725.815245 51.65 0.704 1283.46737 NA 0.63 5932.47441 34.02 0.8 11965.99661 NA 0.798 31424.32229 41.1 0.825 2595.596086 32.1 0.767 2888.847355 43.71 0.742 304.5930038 NA 0.401 1798.385778 NA 0.684 9364.488672 NA 0.755 550.6476376 NA 0.445 251.4170171 NA 0.319 31246.99685 NA 0.85 8094.874509 28.13 0.825 13377.91533 NA 0.868 1092.939555 NA 0.515 3704.084266 NA 0.604 16351.11108 NA 0.866 1138.366839 40.26 0.673 498.5701577 NA 0.395 2520.586745 NA 0.668 1773.456589 NA 0.507 33259.26285 NA 0.899 38983.87532 NA 0.893 1418.051146 NA 0.628 216.5445832 32.55 0.588 410.6836189 37.58 0.44 2562.72216 NA 0.67 257.57179 NA 0.424 10714.74422 NA 0.752 2921.993577 NA 0.681 5323.682988 39.26 0.688 944.6692397 NA 0.666 339.0620829 NA 0.42 1125.959706 29.56 0.725 28427.09848 NA 0.827 29771.30335 NA 0.856 38710.88544 NA 0.905 7685.2245 47.63 0.764 783.0274969 NA 0.619 1472.19553 NA NA 5777.628568 NA 0.72 385.1228226 NA 0.405 339.6548829 NA 0.35 NA 31.72365136 NA 18.97565842 4625.903792 65.97618286 NA 30.51453972 2694.32672 29.15909128 NA 30.00538063 92.67874475 68.17224794 NA 20.44891739 16.47749282 NA NA 17.60019875 67.78948131 25.93938983 NA 20.90817261 7312.73739 30.45236284 NA 31.6225338 402.9545194 40.16510459 NA 21.67170334 22.55588066 21.34207894 NA 
22.12841225 397.5557164 73.93957765 NA 21.91863823 32.16138336 68.13451908 NA 20.80160321 121.5376861 53.91644483 93.45326956 22.6956501 193.6027857 22.41190582 84.64687778 23.56531143 57.27511105 43.35861064 NA 23.18324852 717.1826709 43.6312447 NA 25.29973412 2183.25345 40.18723216 NA 23.93018661 1640.149398 33.61140124 90.43612089 25.78560257 415.3733183 42.85208286 NA 25.30126953 487.9707197 21.54206226 NA 23.58974075 36.48873496 25.53934032 NA 22.57836342 165.2194424 57.05631644 NA 23.94953156 577.8849755 37.74050793 NA 25.83797646 56.47056094 47.70533516 NA 21.53653717 38.98560439 27.79155241 NA 19.75572205 1232.186941 186.6161154 NA 26.5067749 1077.314651 87.96031825 NA 26.3646183 1824.555123 71.27955136 NA 30.29513741 69.12114438 56.38105366 NA 21.22816467 495.2205578 34.22644762 87.04310202 27.90380096 2724.872631 33.6123933 97.26032273 29.30578423 59.48507825 39.49151197 NA 25.26371193 69.44721553 23.72933291 NA 22.66632843 345.0518815 NA NA NA 155.7530679 76.96011675 NA 25.95965004 4508.450278 44.40924524 NA 32.40498352 6050.911859 45.96577908 NA 29.11519051 79.38885487 37.83303084 NA NA 30.11119553 68.68683956 NA 20.9141407 23.33057817 41.09960019 NA 19.96926498 129.841031 65.03645405 NA 24.05609512 30.34908331 61.84473692 NA 21.34904671 787.4770891 35.81570252 NA 26.78060722 216.5993699 52.97076851 68.50937859 29.2 553.41473 27.48387248 81.26382688 23.04790878 133.5036367 38.69974815 NA 23.36683464 37.47946695 30.05234812 NA 20.2470417 195.9739908 50.36261795 NA 22.81653786 1204.346704 64.35569207 NA 23.08673668 3880.817041 29.64021807 NA 29.75162125 7437.292298 16.96891747 NA 25.3 562.0518256 30.11996059 98.2419335 23.30049324 46.58888447 36.52842143 NA 20.58316231 89.52633008 46.0640347 NA 22.6080265 476.5804172 25.29664651 94.92964339 22.74840355 55.8525079 35.24382704 51.78696654 20.53106117 46.38753112 NA 21.03118896 667643 NA 186922 NA 62397 NA 129634 NA 744038 NA 6182016 NA 59261 NA 57755 NA 4828080 NA 72374 NA 202385 NA 139266 609481 2311945 NA 
395848 103821 NA 16607 NA 213523 1602363 360329 NA 5317 NA 592825 NA 466777 NA 221123 NA 47887 NA 55356 19516 NA 16768 NA 1110433 NA 475844 NA 364479 1225131 NA 10092 NA 35867 NA 107586 NA 75903 NA 509783 NA 209343 1752555 815614 NA 229089 NA 20034 NA 175198 NA 1312832 108276 NA 1392711 NA 464419 77994 NA 757718 NA 4183204 NA 50861 597017 NA 6267 NA 595878 NA 552311 NA 474454 NA 15.95 18.2 0.29 3.56 0.29 0.19 29.13 36.95 87.87 4.54 0.21 2.92 2.9 10.96 11.8 142.47 13.1 164.38 7.4 98.63 1.2 16.44 6.1 30.14 14.1 120.55 7.8 NA 3.9 73.97 11.2 87.67 5.1 NA 9 63.01 10 104.11 10.3 76.71 12.3 123.29 9.5 93.15 10.2 NA 12.3 71.23 13.1 120.55 4.1 5.48 12.7 68.49 8.1 73.97 2.6 38.36 2.2 10.96 9.3 NA 12.1 84.93 11.9 41.1 6.7 19.18 9.7 90.41 10.7 93.15 10.3 84.93 5.4 57.53 8.3 150.69 8.7 136.99 13.1 128.77 12 164.38 7.2 117.81 11.6 41.1 5.7 19.18 8.6 87.67 3.9 16.44 11.7 156.16 6.7 95.89 7.8 65.75 11.8 27.4 5.2 24.66 12.8 120.55 11.4 104.11 13 112.33 13.4 191.78 11 104.11 11.8 10.96 7.7 38.36 10.2 98.63 6.5 46.58 8.7 104.11 103 NA 5 31 113 207 118 9.1 9 32 83 67 72 55 54 15 17 16 35 30 39 28 12 106 144 5 20 5 70 59 NA 13 NA 24 62 NA 39 NA 84 6 5 43 28 130 43 65 35 6 39 20 150 30 27 30 41 61 NA 14 54 90 NA 147 NA 65 33.4 38.8 28.1 31.7 31.9 31.6 51.8 39.8 26.3 24.6 26.1 26.6 39.4 33 34.7 34.3 32 35.6 35.8 26 32.6 48.5 30.9 26.5 33.7 22 26.5 35.5 35.7 12.8 25.6 18.6 35.4 30.2 30.6 16.5 21.7 24.8 23.6 31.8 14.4 29.4 28.8 26.3 28.6 26 41 28.8 Review TRENDS in Ecology and Evolution Vol.19 No.2 February 2004 Model selection in ecology and evolution Jerald B. Johnson1 and Kristian S. 
Omland2

1 Conservation Biology Division, National Marine Fisheries Service, 2725 Montlake Boulevard East, Seattle, WA 98112, USA
2 Vermont Cooperative Fish & Wildlife Research Unit, School of Natural Resources, University of Vermont, Burlington, VT 05405, USA

Recently, researchers in several areas of ecology and evolution have begun to change the way in which they analyze data and make biological inferences. Rather than the traditional null hypothesis testing approach, they have adopted an approach called model selection, in which several competing hypotheses are simultaneously confronted with data. Model selection can be used to identify a single best model, thus lending support to one particular hypothesis, or it can be used to make inferences based on weighted support from a complete set of competing models. Model selection is widely accepted and well developed in certain fields, most notably in molecular systematics and mark-recapture analysis. However, it is now gaining support in several other areas, from molecular evolution to landscape ecology. Here, we outline the steps of model selection and highlight several ways that it is now being implemented. By adopting this approach, researchers in ecology and evolution will find a valuable alternative to traditional null hypothesis testing, especially when more than one hypothesis is plausible.

framework that supports most modern statistical approaches. Moreover, this approach is rapidly gaining support across several fields in ecology and evolution as a preferred alternative to null hypothesis testing [1,3,4]. Advocates of model selection argue that it has three primary advantages. First, practitioners are not restricted to evaluating a single model where significance is measured against some arbitrary probability threshold. Instead, competing models are compared to one another by evaluating the relative support in the observed data for each model.
Second, models can be ranked and weighted, thereby providing a quantitative measure of relative support for each competing hypothesis. Third, in cases where models have similar levels of support from the data, model averaging can be used to make robust parameter estimates and predictions. Here, we review the steps of model selection, overview several fields where model selection is commonly used, indicate how model selection could be more broadly implemented and, finally, discuss caveats and areas of future development in model selection (Box 1).

How model selection works

Science is a process for learning about nature in which competing ideas about how the world works are evaluated against observations [1]. These ideas are usually expressed first as verbal hypotheses, and then as mathematical equations, or models. Models depict biological processes in simplified and general ways that provide insight into factors that are responsible for observed patterns. Hence, the degree to which observed data support a model also reflects the relative support for the associated hypothesis. Two basic approaches have been used to draw biological inferences. The dominant paradigm is to generate a null hypothesis (typically one with little biological meaning [2]) and ask whether the hypothesis can be rejected in light of observed data. Rejection occurs when a test statistic generated from observed data falls beyond an arbitrary probability threshold (usually P < 0.05), which is interpreted as tacit support for a biologically more meaningful alternative hypothesis. Hence, the actual hypothesis of interest (the alternative hypothesis) is accepted only in the sense that the null hypothesis is rejected. By contrast, model selection offers a way to draw inferences from a set of multiple competing hypotheses.
Model selection is grounded in likelihood theory, a robust

Generating biological hypotheses as candidate models

Model selection is underpinned by a philosophical view that understanding can best be approached by simultaneously weighing evidence for multiple working hypotheses [1,3,5]. Consequently, the first step in model selection lies in articulating a reasonable set of competing hypotheses. Ideally, this set is chosen before data collection and represents the best understanding of factors thought to be involved in the process of interest. Hypotheses that originate in verbal or graphical form must be translated to mathematical equations (i.e. models) before being fit to

Box 1. The big picture

Biologists rely on statistical approaches to draw inferences about biological processes. In many fields, the approach of null hypothesis testing is being replaced by model selection as a means of making inferences. Under the model selection approach, several models, each representing one hypothesis, are simultaneously evaluated in terms of support from observed data. Models can be ranked and assigned weights, providing a quantitative measure of relative support for each hypothesis. Where models have similar levels of support, model averaging can be used to make robust parameter estimates and predictions.

Corresponding author: Jerald B. Johnson (jerry.johnson@noaa.gov).
www.sciencedirect.com 0169-5347/$ - see front matter © 2004 Published by Elsevier Ltd. doi:10.1016/j.tree.2003.10.013

Box 2. From multiple working hypotheses to a set of candidate models

To use model selection, verbal hypotheses must be translated to mathematical models. Ideally, the parameters of such models have direct biological interpretation, but translating hypotheses to meaningful models (as opposed to statistically arbitrary models, e.g. ANOVA or linear regression) is not always intuitive.
Hence, we offer some guidance about how to get from multiple working hypotheses to a set of candidate models [2,6]. The first step is to specify variables in the model. Variables should correspond directly to causal factors outlined in the verbal hypotheses. The second step is to decide on the functions that define the relationship between independent variables and the response variable in terms of mathematical operators and parameters. In fields where model selection is commonly used (Box 5), appropriate functions can be found in published literature or tailored software [45,46]. In other fields, suitable models can be found in theoretical literature or borrowed from other disciplines. The third step is to define the error structure of the model.

Generating hypotheses and translating them to models is an iterative process. For example, one hypothesis might seem to be equally well depicted by two or more models, including different error structures. In such cases, the verbal rendition of the hypothesis must be refined so that there is a one-to-one mapping from hypothesis to model. This can lead to an increase in the number of working hypotheses; however, care should be taken not to include models with functional relationships among variables that are not interpretable. In this regard, model selection differs from data dredging, where the analyst explores all possible models regardless of the interpretability of their functions, or continues to develop models to be tested after analysis is underway [3].

Ultimately, the number of candidate models should be small (some argue, on philosophical grounds, that this should be fewer than 20 [3]). The guiding principle at this step is to avoid generating so many models that spurious findings become likely. Moreover, one should avoid relying on computing power to fit all available models in lieu of identifying a bona fide candidate set.

data [1,6].
Translating hypotheses to models requires identifying variables and selecting mathematical functions that depict the biological processes through which those variables are related (Box 2).

Fitting models to data

Once a set of candidate models is specified, each model must be fit to the observed data. At an early stage of the analysis, one can examine the goodness-of-fit of the most heavily parameterized (i.e. global) model in the candidate set [3]. Such goodness-of-fit can be assessed using conventional statistical tests (e.g. χ² tests or G-tests) [7] or a PARAMETRIC BOOTSTRAP procedure (see Glossary). If the global model provides a reasonable fit to the data, then the analysis proceeds by fitting each of the models in the candidate set to the observed data using the method of MAXIMUM LIKELIHOOD or the method of LEAST SQUARES.

Selecting a best model or best set of models

Model selection is frequently employed as a way to identify the model that is best supported by the data (referred to as the 'best model') from among the candidate set. In other words, it can be used to identify the hypothesis that is best supported by observations. Two fundamentally different approaches are frequently used to address this in ecology and evolution (Box 3). One is to use a series of null

Glossary

Akaike information criterion (AIC): an estimate of the expected Kullback-Leibler information [3] lost by using a model to approximate the process that generated observed data (full reality). AIC has two components: negative log-likelihood, which measures lack of model fit to the observed data, and a bias correction factor, which increases as a function of the number of model parameters.

Akaike weight: the relative likelihood of the model given the data. Akaike weights are normalized across the set of candidate models to sum to one, and are interpreted as probabilities.
A model whose Akaike weight approaches 1 is unambiguously supported by the data, whereas models with approximately equal weights have a similar level of support in the data. Akaike weights provide a basis for model averaging (Box 4).

Least squares: a method of fitting a model to data by minimizing the squared differences between observed and predicted values.

Likelihood ratio test: a test frequently used to determine whether data support a fuller model over a reduced model (Box 3). The fuller model is accepted as best when the likelihood ratio (reduced model negative log-likelihood: full model negative log-likelihood) is sufficiently large that the difference is unlikely to have occurred by chance (i.e. P < 0.05).

Maximum likelihood: a method of fitting a model to data by maximizing an explicit likelihood function, which specifies the likelihood of the unknown parameters of the model given the model form and the data. Parameter values associated with the maximum of the likelihood function are termed the maximum likelihood estimates of that model.

Model averaging: a procedure that accounts for model selection uncertainty (defined below) in order to obtain robust estimates of model parameters $\hat{\theta}$ or model predictions $\hat{y}$ (Box 4). A weighted average of the model-specific estimates of $\hat{\theta}$ or $\hat{y}$ is calculated based on the Akaike weight [3] (or posterior probabilities if estimated using a Bayesian approach [48]) of each model. Where $\hat{\theta}$ does not appear in a model, the value of zero is entered.

Model selection bias: bias favoring models with parameters that are overestimated; such bias can be overcome during model averaging by entering the value 0 for parameters when they are not already included in the particular models to be averaged.

Model selection uncertainty: uncertainty about parameter estimates or model predictions that arises from having selected the model based on observations rather than actually knowing the best approximating model.
Model selection uncertainty can be accounted for using model averaging.

Parametric bootstrap: a statistical technique in which new data are generated from Monte Carlo simulations of the fitted model. A measure of fit (typically the deviance) is then computed, both for the model fit to the observed data, and for the model fit to the simulated data. If the deviance of the model fit to the observed data falls within the core of the distribution of the deviance of model fit to the simulated data, then the model is said to fit the data adequately.

Parsimony: in statistics, a tradeoff between bias and variance. Too few parameters results in high bias in parameter estimators and an underfit model (relative to the best model) that fails to identify all factors of importance. Too many parameters results in high variance in parameter estimators and an overfit model that risks identifying spurious factors as important, and that cannot be generalized beyond the observed sample data.

Schwarz criterion (SC) (also known as the Bayesian information criterion) [10]: a model selection criterion designed to find the most probable model (from a Bayesian perspective) given the data (Box 3). Superficially similar to AICc, SC has two components: negative log-likelihood, which measures lack of fit, and a penalty term that varies as a function of sample size and the number of model parameters. SC is equivalent (under certain conditions) to the natural logarithm of the Bayes factor [48].

hypothesis tests, such as LIKELIHOOD RATIO TESTS in phylogenetic analysis or F-tests in multiple regression analysis, to compare pairs of models from among the candidate set. However, this approach is typically restricted to nested models (i.e. the simpler model is a special case of the more complex model) and, in some cases, leads to suboptimal models that are dependent upon the hierarchical order in which models are compared [8]. Moreover, such tests cannot be used to quantify the relative support for the various models.
By contrast, model selection criteria can be used to rank competing models and to weigh the relative support for each one. These techniques utilize maximum likelihood scores as a measure of fit (more precisely, negative

Box 3. Approaches to model selection

Once a set of candidate models is defined, they can be fit to observed data and compared to one another. Practitioners typically use one of three kinds of statistical approach to compare models: (i) maximizing fit; (ii) null hypothesis tests; and (iii) model selection criteria. Here, we highlight five frequently used techniques (Table I). Our list is not exhaustive (for additional examples, see [47-50]). Rather, we describe approaches most commonly used in ecology and evolutionary biology.

Maximizing fit

A naive approach to model selection is to calculate a measure of fit, such as adjusted R², and select the model that maximizes that quantity. Maximizing fit, with no consideration of model complexity, always favors fuller (i.e. more parameter rich) models. However, it neglects the principle of PARSIMONY and, consequently, can result in imprecise parameter estimates and predictions, making it a poor technique for model selection. By contrast, tests or criteria that account for both fit and complexity are better suited for selecting a model.

Null hypothesis tests

The likelihood ratio test (LRT) is the most commonly used null hypothesis approach. LRTs compare pairs of nested models. When the likelihood of the more complex model is significantly greater than that of the simpler model (as judged by a χ² statistic), the complex model is chosen, and vice versa. Selection of the more complex model indicates that the benefit of improved model fit outweighs the cost of added model complexity.
LRTs are often used hierarchically in a procedure analogous to forward selection in multiple regression, where the analyst starts with the simplest model and adds terms as LRTs indicate a significant improvement in fit. A drawback is that it requires several nonindependent tests, thus inflating type I error. In addition, hierarchical LRTs sometimes select suboptimal models that are dependent upon the order in which models are compared, in which case dynamical LRTs can be employed [8]. However, no form of LRT can be used to quantify relative support among competing models.

Model selection criteria

Model selection criteria consider both fit and complexity, and enable multiple models to be compared simultaneously. The Akaike information criterion (AIC) estimates the Kullback-Leibler information lost by approximating full reality with the fitted model. Computation entails terms representing lack of fit and a bias correction factor related to model complexity. AIC has a second-order variant, AICc, which contains a bias correction term for small sample size, and should be used when the number of free parameters, p, exceeds approximately n/40 (where n is sample size). Schwarz criterion (SC; also referred to as a Bayesian information criterion, or BIC) [10] is structurally similar to AIC (Table I), but includes a penalty term dependent on sample size. Consequently, SC tends to favor simpler models, particularly as sample size increases [47]. Under certain conditions, model selection using SC and Bayes factor are equivalent, such that choosing the model with the smallest SC is equivalent to choosing the model with the greatest posterior probability [48]. Derivation of SC rests on several stringent assumptions that are seldom satisfied with empirical data, including that one true model exists, that this model is among the candidate set, and that the true model has an equal prior probability to each of the other models in the candidate set.
Although SC superficially resembles AICc, it is not based in Kullback-Leibler information theory.

Which approach to use?

Which model selection approach is most appropriate? Techniques that maximize fit alone have clear limitations with regard to parsimony. Among approaches that consider fit and model complexity, many practitioners are moving from LRTs toward model selection criteria. For example, molecular systematists have traditionally used hierarchical LRTs to choose among competing models. However, this pattern could shift as researchers recognize the limitations of LRTs relative to the model selection criteria [4] (Box 5). Among model selection criteria, AIC is generally favored because it has its foundation in Kullback-Leibler information theory [3]. Yet, some prefer SC over AIC because the former selects simpler models [6]. An important advantage of using model selection criteria (e.g. AIC and SC) is that they can be used to make inferences from more than one model, something that cannot be done using the fit maximization or null hypothesis approaches.

Table I. Commonly used model selection methods

| Model selection method | Calculation (a) | Elements | Refs |
|---|---|---|---|
| Adjusted R² | $R^2_{adj} = 1 - \dfrac{RSS/(n - p - 1)}{\sum_i (y_i - \bar{y})^2/(n - 1)}$ | Fit | [7] |
| Likelihood ratio test | $LRT = -2\{\ln[L(\hat{\theta}_p \mid y)] - \ln[L(\hat{\theta}_{p+q} \mid y)]\} \sim \chi^2_q$ | Fit and complexity | [7] |
| Akaike information criterion (AIC) | $AIC = -2\ln[L(\hat{\theta}_p \mid y)] + 2p$ | Fit and complexity | [3] |
| Small sample unbiased AIC (AICc) | $AIC_c = -2\ln[L(\hat{\theta}_p \mid y)] + 2p\left(\dfrac{n}{n - p - 1}\right)$ | Fit and complexity (with bias correction term for small sample size) | [3] |
| Schwarz criterion | $SC = -2\ln[L(\hat{\theta}_p \mid y)] + p\ln(n)$ | Fit, complexity, and sample size | [10] |

(a) RSS, residual sum of squares for a linear model; n, sample size; p, count of free parameters ($\sigma^2$ must be included if it is estimated from the data); q, additional parameters of a fuller model; y, data; $L(\hat{\theta}_p \mid y)$, likelihood of the model parameters (more precisely, their maximum likelihood estimates, $\hat{\theta}_p$) given the data, y. For a model fitted by least squares with the usual assumptions, $\ln[L(\hat{\theta}_p \mid y)] = -(n/2)\ln(RSS/n)$, enabling computation of LRTs, AIC, AICc, and SC from standard regression output.

log-likelihood scores as a measure of lack of fit) and a term that, in effect, penalizes models for greater complexity. Two criteria commonly used in ecology and evolution are the AKAIKE INFORMATION CRITERION (AIC) [9] and the SCHWARZ CRITERION (SC; known also as the Bayesian information criterion, or BIC) [10]. The use of model selection criteria enables inference to be drawn from several models simultaneously, so that researchers can consider a 'best set' of similarly supported models.

Parameter estimation and model averaging

Often, the underlying motive for model selection is to estimate model parameters that are of particular biological interest (e.g. survival rate in mark-recapture studies, or transition:transversion ratios for phylogenetic studies), or to identify a model that can be used for prediction. When there is clear support for one model, maximum likelihood parameter estimates or predictions from that model can be used. However, there is sometimes nearly equivalent support in the observed data for multiple

Box 4. Multi-model inference

The model selection paradigm is moving beyond simply choosing a single, best model.
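The criteria listed in Table I can be computed directly from regression output. The following is a minimal sketch, assuming least-squares fits with the usual assumptions so that the log-likelihood follows from RSS as in the table footnote. The function names and the example RSS values are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import math


def log_likelihood_from_rss(rss, n):
    """Maximized log-likelihood of a least-squares fit (Table I, footnote a):
    ln L = -(n/2) * ln(RSS/n)."""
    if rss <= 0 or n <= 0:
        raise ValueError("rss and n must be positive")
    return -0.5 * n * math.log(rss / n)


def aic(log_lik, p):
    """AIC = -2 ln L + 2p, where p counts free parameters (including sigma^2)."""
    return -2.0 * log_lik + 2.0 * p


def aicc(log_lik, p, n):
    """Small-sample AIC: AICc = -2 ln L + 2p * n / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("AICc requires n > p + 1")
    return -2.0 * log_lik + 2.0 * p * (n / (n - p - 1))


def schwarz_criterion(log_lik, p, n):
    """SC (BIC) = -2 ln L + p ln(n)."""
    return -2.0 * log_lik + p * math.log(n)


def lrt_statistic(log_lik_reduced, log_lik_full):
    """LRT = -2 {ln L(reduced) - ln L(full)}; compare to chi-square with q df."""
    return -2.0 * (log_lik_reduced - log_lik_full)


if __name__ == "__main__":
    # Hypothetical example: two nested regressions fit to n = 50 observations.
    n = 50
    models = {"reduced": (120.0, 3), "full": (110.0, 4)}  # name -> (RSS, p)
    for name, (rss, p) in models.items():
        ll = log_likelihood_from_rss(rss, n)
        print(name, aic(ll, p), aicc(ll, p, n), schwarz_criterion(ll, p, n))
    ll_reduced = log_likelihood_from_rss(120.0, n)
    ll_full = log_likelihood_from_rss(110.0, n)
    print("LRT statistic (q = 1):", lrt_statistic(ll_reduced, ll_full))
```

The sketch stops at the LRT statistic; obtaining a P value would require a chi-square distribution function, which is left out here to keep the example dependency-free.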
Multi-model inference refers to a set of analysis techniques employed to enable formal inference from more than one model [3]. These techniques can be divided into two areas.

Generating a confidence set of models

How do we know which models are well supported by the data? A set of calculations based on Akaike information criterion (AIC) provides one way for making this determination. Once each model has been fit to the data and an AIC score has been computed, differences in these scores between each model and the best model are calculated (the 'best' model in the set has the minimum AIC score) (Eqn I):

$$\Delta_i = AIC_i - AIC_{min}$$ (Eqn I)

The likelihood of a model, $g_i$, given the data, $y$, is then calculated as Eqn II:

$$L(g_i \mid y) = \exp\left(-\tfrac{1}{2}\Delta_i\right)$$ (Eqn II)

In some cases, it is informative to contrast the likelihood of pairs of models, particularly that of the best model with each other model, using the evidence ratio (Eqn III):

$$ER = \frac{L(g_{best} \mid y)}{L(g_i \mid y)}$$ (Eqn III)

Model likelihood values can also be normalized across all R models so that they sum to 1 (Eqn IV):

$$w_i = \frac{\exp\left(-\tfrac{1}{2}\Delta_i\right)}{\sum_{j=1}^{R} \exp\left(-\tfrac{1}{2}\Delta_j\right)}$$ (Eqn IV)

This value, referred to as the Akaike weight, provides a relative weight of evidence for each model. Akaike weights can be interpreted as the probability that model i is the best model for the observed data, given the candidate set of models. They are additive and can be summed to provide a confidence set of models, with a particular probability that the best approximating model is contained within the confidence set. They also provide a way to estimate the relative importance of a predictor variable (or a functional form that represents some biological process). This measure of relative importance can be calculated as the sum of the Akaike weights over all of the models in which the parameter (or functional form) of interest appears [3].

models [i.e. Akaike information criterion (AIC) values are nearly equal], making it problematic to choose one model over another. MODEL AVERAGING provides a way to address this problem (Box 4). Parameter estimates or predictions obtained by model averaging are robust in the sense that they reduce MODEL SELECTION BIAS and account for MODEL SELECTION UNCERTAINTY.

Inference from model selection

Ultimately, model selection is a tool for making inference about unobserved processes based on observed patterns. Data that clearly support one model over several others lend strong support to the corresponding hypothesis (among those considered); that is, we can infer the process that is most likely to have operated in generating the observed data. However, some inferences, such as determining the relative importance of predictor variables, can be made only by examining the entire set of candidate models (Box 4).

Where model selection is being used

Model selection is well established as a basic tool in select biological disciplines. In particular, it is a prerequisite for most mark-recapture studies and for most phylogenetic studies (Box 5). Model selection is now beginning to be implemented more broadly to address a variety of additional questions in ecology and evolution (Table 1). Here, we highlight some areas where such an approach has proved useful.

Model averaging

When the underlying goal of model selection is parameter estimation or prediction, and no single model is overwhelmingly supported by the data (i.e. $w_{best} < 0.9$), then model averaging can be used. This entails calculating a weighted average of parameter estimates, $\hat{\theta}$ (Eqn V):

$$\hat{\bar{\theta}} = \sum_{i=1}^{R} w_i \hat{\theta}_i$$ (Eqn V)

(where $\hat{\theta}_i$ is the estimate of $\theta$ from the i th model) across all R models in the candidate set. The variance of these estimates can also be calculated (Eqn VI):

$$\widehat{var}(\hat{\bar{\theta}}) = \sum_{i=1}^{R} w_i \left[\widehat{var}(\hat{\theta}_i \mid g_i) + (\hat{\theta}_i - \hat{\bar{\theta}})^2\right]$$ (Eqn VI)

(where $\widehat{var}(\hat{\theta}_i \mid g_i)$ is the estimate of the variance of $\hat{\theta}$ from the i th model).
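The calculations in Box 4 (AIC differences, model likelihoods, evidence ratios, Akaike weights, relative importance, and model-averaged estimates) can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the variance calculation assumes the weighted form of Eqn VI as reconstructed from the garbled source. Following the glossary, a parameter that does not appear in a model should be entered as zero in `estimates`.

```python
import math


def akaike_weights(aic_scores):
    """Return (deltas, likelihoods, weights) for a list of AIC scores."""
    aic_min = min(aic_scores)
    deltas = [a - aic_min for a in aic_scores]             # Eqn I
    likelihoods = [math.exp(-0.5 * d) for d in deltas]     # Eqn II
    total = sum(likelihoods)
    weights = [lik / total for lik in likelihoods]         # Eqn IV
    return deltas, likelihoods, weights


def evidence_ratios(likelihoods):
    """Evidence ratio of the best model against each model (Eqn III)."""
    best = max(likelihoods)
    # A likelihood that underflows to 0.0 gives an infinite evidence ratio.
    return [best / lik if lik > 0 else math.inf for lik in likelihoods]


def relative_importance(weights, includes_term):
    """Sum of Akaike weights over the models that contain the term of interest."""
    return sum(w for w, inc in zip(weights, includes_term) if inc)


def model_average(weights, estimates, variances):
    """Model-averaged estimate (Eqn V) and its variance (Eqn VI, assumed form)."""
    theta_bar = sum(w * t for w, t in zip(weights, estimates))
    var_bar = sum(
        w * (v + (t - theta_bar) ** 2)
        for w, t, v in zip(weights, estimates, variances)
    )
    return theta_bar, var_bar


if __name__ == "__main__":
    # Hypothetical AIC scores for three candidate models.
    deltas, liks, weights = akaike_weights([100.0, 102.0, 110.0])
    print("weights:", weights)
    print("evidence ratios:", evidence_ratios(liks))
    # Hypothetical estimates of one parameter; it is absent from model 3,
    # so zero is entered for both its estimate and its variance.
    print(model_average(weights, [1.2, 0.9, 0.0], [0.04, 0.05, 0.0]))
    print(relative_importance(weights, [True, True, False]))
```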
This variance estimator can be used to assess the precision of the estimate over the set of models considered, thereby providing a way to generate a confidence interval on the parameter estimate that accounts for model selection uncertainty. Predicted values of the response variable can be averaged over the models in the candidate set in an analogous way [3].

Ecology

Mark-recapture analyses are used widely to estimate population abundance and survival probabilities [11,12]. A fundamental challenge is to separate the probability that a marked individual has died from the probability that it was not recaptured in spite of having survived. Wildlife biologists address this problem by generating a set of competing models that depict different ways in which survival and encounter probabilities could vary as a function of time, the environment, or individual traits (e.g. sex or size) (Box 5). The favored model (or set of models) is then used to estimate parameters of interest, or to infer the biological processes governing survival or abundance. This approach has been used to estimate vital rates for management and conservation [13,14], and to infer how factors, such as individual physiological status, or environmental conditions, affect vital rates [15,16]. Community ecologists [17] and paleontologists [18] have even adopted this mark-recapture model selection framework to estimate species richness and species turnover rates.

There is also a rich tradition of using models to explore population dynamics [6]. Ecologists have proposed many competing hypotheses to explain patterns of population fluctuation over time.
An increasing number of studies have fit models depicting competing hypotheses to observed time series data; applications include detecting chaotic dynamics in natural populations [19], inferring the mechanism underlying population cycles [20,21], and separating the influence of density-dependent and

Box 5. Parallel development of model selection in wildlife biology and molecular systematics

Although the initial statistical machinery and philosophical underpinnings of model selection have been available for 30 years [9], ecologists and evolutionary b
