Question
Here is everything...Can you help me to solve the p2.2 please!!
Project 2.2: Recommend a City

Note that this project is a continuation of Project 2.1: Data Cleanup. You must meet the specifications for Project 2.1 before you can continue with Project 2.2.

Step 1: Linear Regression

Create a linear regression model from your training set and present your model. Visualizations are highly encouraged in this section.

Important: Make sure you have dealt with outliers and removed one city from your training set. You should have 10 rows of data before you begin modeling the dataset.

Build a linear regression model to help you predict total sales. At the minimum, answer these questions:

1. How and why did you select the predictor variables (see supplementary text) in your model? You must show that each predictor variable has a linear relationship with your target variable with a scatterplot.
2. Explain why you believe your linear model is a good model. You must justify your reasoning using the statistical results that your regression model created. For each variable you selected, please justify how each variable is a good fit for your model by using the p-values and R-squared values that your model produced.
3. What is the best linear regression equation based on the available data? Each coefficient should have no more than 2 digits after the decimal (ex: 1.28).

Step 2: Analysis

Use your model results to provide a recommendation. At the minimum, answer this question:

1. Which city would you recommend and why did you recommend this city?

Project 2.1: Data Cleanup

Step 1: Business and Data Understanding

Provide an explanation of the key decisions that need to be made. (250 word limit)

Key Decisions: Answer these questions

1. What decisions need to be made?

Answer: The decision that we are making is related to expanding Pawdacity's stores and building a 14th pet store in Wyoming for Pawdacity.
One of the main issues I am trying to uncover is where the best location is to build the store, based on a projected target revenue of $200,000 in their first year.

2. What data is needed to inform those decisions?

Answer: The data I need to collect to inform my decision are as follows:

1. The 2010 sales data of Pawdacity stores and their respective cities.
2. 2010 Census population data for the cities in which the stores are located.
3. Total households with people under 18.
4. Land area.
5. Population density of those cities.
6. Total families living in those cities.
7. Number of pets owned by families in the area.
8. Types of pets owned in the area where we want to build a store.
9. Number of parks and communal pet parks.
10. Data on the current local marketing budget spent per city on existing stores.
11. Data on the number of people projected to be living in the city where we are planning to open the store.
12. Competitor market data.

Step 2: Building the Training Set

Build your training set given the data provided to you. The column sums of your dataset should match the sums in the table below. In addition, provide the averages of your dataset here to help reviewers check your work. You should round up to two decimal places, ex: 1.24

Column                     Sum          Average
Census Population          233,862      21,260.18
Total Pawdacity Sales      3,773,304    343,027.64
Households with Under 18   34,064       3,096.73
Land Area                  33,071       3,006.49
Population Density         63           5.71
Total Families             69,653       6,332.07

Step 3: Dealing with Outliers

Answer these questions:

Are there any cities that are outliers in the training set? Which outlier have you chosen to remove or impute? Because this dataset is a small data set (11 cities), you should only remove or impute one outlier. Please explain your reasoning.

Answer:

Land_Area - There do not appear to be outliers in this data set based on the interquartile analysis.
Households_Under_18 - There do not appear to be outliers in this data set. All the data seems to follow the pattern that, for Pawdacity, the more people under 18 you have in your household, the higher the likelihood that you will purchase items from the store.

Population_Density - With population density, there is one outlier, the largest city in the state (Cheyenne), which has a population density of 20.34. However, it seems to follow the relationship between population_density and sales in the scatter plot, and therefore does not appear to skew the data. Additionally, by keeping this city in the dataset we have a more robust model for modelling big cities in the future.

Total_Families - There is one outlier in total_families (Casper). However, the city of Casper does not skew the data in terms of sales, and due to the limited number of data points I have decided to keep it.

2010_Census - Within the 2010 Census data, using the interquartile range analysis, there appears to be one outlier that we can exclude, and that is the city of Gillette. While the population data does not seem to skew the model, the sum_sales for this city appears to be the clearest outlier. Because the sales figure is an outlier for such a relatively small population, leaving this data point in the model has the potential to go against our logic of larger city, larger revenue.

Project Overview

This project is a continuation of Project 2.1, which aims to find the best city for Pawdacity's newest pet store.

Scenario

Pawdacity is a leading pet store chain in Wyoming with 13 stores throughout the state. This year, Pawdacity would like to expand and open a 14th store. Your manager has asked you to perform an analysis to recommend the city for Pawdacity's newest store, based on predicted yearly sales.

How Do I Complete this Project?

This project uses skills learned throughout the "Multivariable Linear Regression" lesson.
To complete this project:

Go through the course.
Apply the skills learned in the course to solve the business problem given in the project details section.
Use our guidelines and rubric to help build your project.
When you're ready, submit it to us for review using the submission template found in the supporting materials section.

Skills Required

In order to complete this project, you must be able to:

Choose appropriate predictor variables
Analyze for correlations between predictor variables
Build a linear model

The Business Problem

Pawdacity is a leading pet store chain in Wyoming with 13 stores throughout the state. This year, Pawdacity would like to expand and open a 14th store. Your manager has asked you to perform an analysis to recommend the city for Pawdacity's newest store, based on predicted yearly sales.

In the first part, you've already cleaned up the dataset and dealt with outliers. In this project, you will take the dataset that you cleaned up and use it to train a linear regression model in order to predict sales.

Here are the criteria given to you for choosing the right city:

1. The new store should be located in a new city. That means there should be no existing stores in the new city.
2. The total sales for the entire competition in the new city should be less than $500,000.
3. The new city where you want to build your new store must have a population over 4,000 people (based upon the 2014 US Census estimate).
4. The predicted yearly sales must be over $200,000.
5. The city chosen has the highest predicted sales from the predicted set.

Steps to Success

Step 1: Build a Linear Regression Model

Analyze the dataset you created in Project 2.1 and look at the distribution of your data. You can create histograms to look at each of your continuous and categorical data to determine the nature of the data you're working with.

Important: Make sure you have dealt with outliers and removed one city from your training set.
You should have 10 rows of data before you begin modeling the dataset. Build a linear regression model to help you predict total sales.

Step 2: Perform the Analysis

Use your regression model to calculate predicted sales for all of the cities and use the criteria given to you to make a recommendation.

Data

Please refer to the Supporting Materials section in Project 2.1 to access the data you need to complete this project.
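The two steps above (fit a model, then apply the five criteria) can be sketched roughly as follows. This is only a minimal illustration under loud assumptions: it uses a single predictor and a hand-written least-squares fit in plain Python, whereas the project asks for a multivariable model with p-values, which a statistics package would report. The function names (fit_simple_ols, recommend_city), the dictionary keys, the city names, and every number are hypothetical placeholders, not the Pawdacity data.

```python
# Toy illustration only: every number and city name below is invented,
# NOT taken from the Pawdacity data set.


def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1 - ss_res / ss_tot


def recommend_city(candidates, intercept, slope):
    """Apply the five criteria from the brief; return the best city or None."""
    eligible = []
    for c in candidates:
        predicted = intercept + slope * c["predictor"]
        if (not c["has_store"]                      # 1. no existing store
                and c["competitor_sales"] < 500_000  # 2. competition under $500,000
                and c["population_2014"] > 4_000     # 3. population over 4,000
                and predicted > 200_000):            # 4. predicted sales over $200,000
            eligible.append((predicted, c["city"]))
    # 5. highest predicted sales among the eligible cities
    return max(eligible)[1] if eligible else None


# Hypothetical training data: one predictor (say, total families) vs. yearly sales.
# The points lie exactly on a line so the fit is easy to check by hand.
train_x = [1000, 2000, 3000, 4000]
train_y = [150_000, 200_000, 250_000, 300_000]

intercept, slope, r_squared = fit_simple_ols(train_x, train_y)
# Coefficients rounded to two decimals, as the brief requires.
print(f"sales = {intercept:.2f} + {slope:.2f} * predictor (R^2 = {r_squared:.2f})")

# Hypothetical candidate cities to score with the fitted model.
candidates = [
    {"city": "City A", "predictor": 3000, "has_store": False,
     "competitor_sales": 100_000, "population_2014": 10_000},
    {"city": "City B", "predictor": 5000, "has_store": True,
     "competitor_sales": 0, "population_2014": 50_000},
    {"city": "City C", "predictor": 4500, "has_store": False,
     "competitor_sales": 600_000, "population_2014": 30_000},
    {"city": "City D", "predictor": 1500, "has_store": False,
     "competitor_sales": 0, "population_2014": 3_000},
    {"city": "City E", "predictor": 3500, "has_store": False,
     "competitor_sales": 200_000, "population_2014": 8_000},
]

best = recommend_city(candidates, intercept, slope)
print("Recommended city:", best)
```

With the real training set you would replace the toy lists with your 10 cleaned rows, fit on the predictors you justified with scatterplots, and check the p-values and R-squared before trusting the predictions.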