Question
1. You have to develop a logistic regression model with the 3000 clients to predict the Y variable (did the client subscribe to the term deposit: yes/no) with the help of the provided explanatory variables. The file pa2_score.sas7bdat contains the clients for which a prediction of Y is needed (10 394 clients). This file contains the explanatory variables but not the variable Y. It also contains a client identification variable "ID". The performance criterion to evaluate your model will be the rate of correct classification. Explain your methodology clearly and describe the chosen model (which variables were used).

2. The Utah tourist office in the United States did a survey with 200 people living in the country. A questionnaire with the following question was used. On a scale of 1 to 7, where 1 is "not important at all" and 7 is "very important", what importance would you attribute to each of the following elements: (a) National parks; (b) Museums; (c) Hiking; (d) Night clubs; (e) National forests; (f) Ski; (g) Historical sites; (h) Entertainment; (i) Fishing; (j) Orchestras; (k) Night life; (l) Camping; (m) State parks; (n) Water sports; (o) Theater.

To reduce the number of variables, a factor analysis was performed with those 15 elements. The solution with 4 factors was retained. Those 4 factors are:
- ressext: importance of the quality of outdoor resources: items (a), (e), (g) and (m) above.
- culture: importance of the quality of cultural activities: items (b), (h), (j) and (o) above.
- actext: importance of the quality of outside activities: items (c), (f), (i), (l) and (n) above.
- nightlife: importance of the quality of nightlife: items (d) and (k) above.

The 4 factors were constructed by taking the mean of the variables which are part of the factor. The data are in the file pa2q2.sas7bdat. The file also contains the age and sex (0=male, 1=female) of the person.
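The factor construction described above (each factor is the mean of its items) can be sketched as follows. This is only an illustration in Python, not part of the assignment: the function name `factor_scores` and the example ratings are hypothetical, and the item list for actext assumes the garbled letters in the statement read (i) and (l).

```python
from statistics import mean

# Item letters per factor, as listed in the question.
# Assumption: actext uses items (c), (f), (i), (l) and (n).
FACTOR_ITEMS = {
    "ressext": ["a", "e", "g", "m"],
    "culture": ["b", "h", "j", "o"],
    "actext": ["c", "f", "i", "l", "n"],
    "nightlife": ["d", "k"],
}


def factor_scores(answers):
    """Return the four factor scores for one respondent.

    `answers` maps an item letter ("a" to "o") to a rating from 1 to 7.
    Each factor is the plain mean of the ratings of its items.
    """
    return {
        name: mean(answers[item] for item in items)
        for name, items in FACTOR_ITEMS.items()
    }


# Hypothetical respondent: made-up ratings, not taken from pa2q2.sas7bdat.
example = dict(zip("abcdefghijklmno",
                   [7, 2, 6, 1, 7, 5, 6, 3, 4, 2, 1, 6, 7, 4, 3]))
scores = factor_scores(example)
print(scores)
```

In the data file the four factors are already computed, so this step would only be needed to reproduce them from the 15 raw items.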
(a) Do a cluster analysis with the 4 factors above (ressext, culture, actext and nightlife) to segment the individuals. You have to give a clear interpretation of each cluster. Present the mean of the 4 factors for each cluster. Each cluster has to contain at least 30 individuals. (6 points)

The tourism office wants to assign future visitors to the clusters created in (a). However, the variables "ressext", "culture", "actext" and "nightlife" used to do the segmentation are not available for the future visitors who have not taken the questionnaire. The variables "age" and "sex" are available. Develop a model to classify future visitors in one of your clusters using the variables "age" and "sex" only, without any transformations or model selection.
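The reporting requirements in (a), namely the mean of the 4 factors per cluster and at least 30 individuals per cluster, can be checked with a small helper once cluster labels exist. The sketch below is in Python and makes no claim about which clustering method should be used: the labels are assumed to come from whatever procedure is chosen, and the names `cluster_profile` and `too_small` as well as the toy rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean

FACTORS = ("ressext", "culture", "actext", "nightlife")
MIN_CLUSTER_SIZE = 30  # size constraint stated in part (a)


def cluster_profile(rows, labels):
    """Group rows by cluster label.

    Returns {label: (size, {factor: mean of that factor in the cluster})}.
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return {
        label: (len(members),
                {f: mean(r[f] for r in members) for f in FACTORS})
        for label, members in groups.items()
    }


def too_small(profile, min_size=MIN_CLUSTER_SIZE):
    """Return the sorted labels of clusters smaller than `min_size`."""
    return sorted(label for label, (size, _) in profile.items()
                  if size < min_size)


# Toy data: three made-up respondents, not taken from pa2q2.sas7bdat.
rows = [
    {"ressext": 6.0, "culture": 2.0, "actext": 5.0, "nightlife": 1.0},
    {"ressext": 7.0, "culture": 3.0, "actext": 6.0, "nightlife": 2.0},
    {"ressext": 2.0, "culture": 6.0, "actext": 2.0, "nightlife": 5.0},
]
labels = [1, 1, 2]  # assumed output of some clustering procedure

profile = cluster_profile(rows, labels)
for label, (size, means) in sorted(profile.items()):
    print(label, size, means)
print("clusters below the minimum size:", too_small(profile))
```

With the real data, any cluster reported by `too_small` would signal that the number of clusters or the method needs to be revisited before interpreting the segments.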