Question
1) From the Murder data file (http://stat4ds.rwth-aachen.de/data/Murder.dat) at the book's website, use the variable murder, which is the murder rate (per 100,000 population) for each state in the U.S. in 2017 according to the FBI Uniform Crime Reports. At first, do not use the observation for D.C. (DC). Using software:
(a) Find the mean and standard deviation and interpret their values. (b) Find the five-number summary, and construct the corresponding box plot. Interpret. (c) Now include the observation for D.C. What is affected more by this outlier: the mean or the median? The range or the interquartile range?
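For part (c), a minimal sketch of the comparison is shown here, using only Python's standard library. The values in `rates` and the extra value 24.0 are hypothetical toy numbers, not taken from Murder.dat, and the helper name `summarize` is made up for illustration. With the real file, the murder column (with and without the DC row) would replace these lists.

```python
import statistics

def summarize(values):
    """Return the mean, median, range, and interquartile range of a list."""
    # method="inclusive" treats the data as the whole population of interest
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }

# Hypothetical toy rates (assumption: NOT the values in Murder.dat)
rates = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
with_outlier = rates + [24.0]  # one extreme observation, standing in for D.C.

before = summarize(rates)
after = summarize(with_outlier)

# How far each statistic moves once the outlier is included
shift = {name: after[name] - before[name] for name in before}
for name, change in shift.items():
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f} (change {change:+.2f})")
```

In this toy case the mean and the range move much more than the median and the interquartile range, which is the pattern the question asks you to check on the real data.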
2) The Income data file (http://stat4ds.rwth-aachen.de/data/Income.dat) at the book's website reports annual income values in the U.S., in thousands of dollars.
(a) Using software, construct a histogram. Describe its shape. (b) Find descriptive statistics to summarize the data. Interpret them. (c) The kernel density estimation method finds a smooth-curve approximation for a histogram. At each value, it takes into account how many observations are nearby and their distance, with more weight given to those closer. Increasing the bandwidth increases the influence of observations farther away. Plot a smooth-curve approximation for the histogram of income values. Summarize the impact of increasing and of decreasing the bandwidth substantially from the default value. (d) Construct and interpret side-by-side box plots of income by race (B = Black, H = Hispanic, W = White). Compare the incomes using numerical descriptive statistics.
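The bandwidth effect described in part (c) can be sketched without any plotting library. The code below is a hand-written Gaussian kernel density estimate, assuming a Gaussian kernel (the exercise does not name one). The `incomes` values are hypothetical toy numbers, not taken from Income.dat, and `gaussian_kde` and `count_modes` are illustrative helper names, not functions from any library.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function x -> estimated density, using a Gaussian kernel."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each observation contributes more the closer it is to x
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

def count_modes(density, lo, hi, steps=400):
    """Count interior local maxima of the density on an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [density(x) for x in grid]
    return sum(
        1 for i in range(1, steps) if ys[i] > ys[i - 1] and ys[i] > ys[i + 1]
    )

# Hypothetical toy incomes in thousands of dollars (assumption: not Income.dat)
incomes = [20, 22, 25, 60, 62, 65]

narrow = gaussian_kde(incomes, bandwidth=0.5)   # far below a typical default
wide = gaussian_kde(incomes, bandwidth=30.0)    # far above a typical default

modes_narrow = count_modes(narrow, 0, 90)
modes_wide = count_modes(wide, 0, 90)
print("modes with narrow bandwidth:", modes_narrow)
print("modes with wide bandwidth:", modes_wide)
```

A very small bandwidth gives a bumpy curve that follows individual observations, while a very large one smooths the curve into a single broad hump that can hide real features such as skew or multiple clusters.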
3) The Houses data file (http://stat4ds.rwth-aachen.de/data/Houses.dat) at the book's website lists the selling price (thousands of dollars), size (square feet), tax bill (dollars), number of bathrooms, number of bedrooms, and whether the house is new (1 = yes, 0 = no) for 100 home sales in Gainesville, Florida. Let's analyze the selling prices.
(a) Construct a frequency distribution and a histogram. Describe the shape. (b) Find the percentage of observations that fall within one standard deviation of the mean. Why is this not close to 68%? (c) Construct a box plot, and interpret. (d) Use descriptive statistics to compare selling prices according to whether the house is new.
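For part (b), the percentage within one standard deviation of the mean can be computed directly. This is a minimal sketch using Python's standard library. The `prices` list is a hypothetical, deliberately right-skewed toy sample, not the Houses.dat data, and `percent_within_one_sd` is an illustrative helper name.

```python
import statistics

def percent_within_one_sd(values):
    """Percentage of observations in [mean - sd, mean + sd]."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    inside = sum(1 for v in values if mean - sd <= v <= mean + sd)
    return 100.0 * inside / len(values)

# Hypothetical right-skewed prices in thousands of dollars (not Houses.dat)
prices = [100, 110, 120, 125, 130, 140, 150, 160, 180, 600]

pct = percent_within_one_sd(prices)
print(f"{pct:.1f}% of observations fall within one standard deviation")
```

The 68% figure applies to bell-shaped distributions. In a skewed sample like this toy one, a single large value inflates the standard deviation, so the interval is wide enough to capture far more than 68% of the observations.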