
Question


You have to develop a logistic regression model with the 3000 clients to predict the Y variable (did the client subscribe to the term deposit

You have to develop a logistic regression model with the 3000 clients to predict the Y variable (did the client subscribe to the term deposit: yes/no) with the help of the provided explanatory variables. The file pa2_score.sas7bdat contains the clients for which a prediction of Y is needed (10 394 clients). This file contains the explanatory variables but not the variable Y. It also contains a client identification variable "ID". The performance criterion to evaluate your model will be the rate of correct classification. Explain clearly your methodology and describe the chosen model (which variables were used).

2. The Utah tourist office in the United States did a survey with 200 people living in the country. A questionnaire with the following questions was used. On a scale of 1 to 7, where 1 is "not important at all" and 7 is "very important", what importance would you attribute to each of the following elements:

(a) National parks
(b) Museums
(c) Hiking
(d) Night clubs
(e) National forests
(f) Ski
(g) Historical sites
(h) Entertainment
(i) Fishing
(j) Orchestras
(k) Night life
(l) Camping
(m) State parks
(n) Water sports
(o) Theater

To reduce the number of variables, a factor analysis was performed with those 15 elements. The solution with 4 factors was retained. Those 4 factors are:

- ressext: importance of the quality of outdoor resources: items (a), (e), (g) and (m) above.
- culture: importance of the quality of cultural activities: items (b), (h), (j) and (o) above.
- actext: importance of the quality of outside activities: items (c), (f), (i), (l) and (n) above.
- nightlife: importance of the quality of nightlife: items (d) and (k) above.

The 4 factors were constructed by taking the mean of the variables which are part of the factor. The data are in the file pa2q2.sas7bdat. The file also contains the age and sex (0=male, 1=female) of the person.
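The way the 4 factors are built (the mean of the items belonging to each factor) can be sketched as follows. This is only an illustration under stated assumptions: the item lettering for actext is taken to be (c), (f), (i), (l) and (n), i.e. the five items not assigned to any other factor; the function name `factor_scores` and the example ratings are hypothetical and do not come from pa2q2.sas7bdat.

```python
from statistics import mean

# Item letters grouped by factor, following the question statement.
# Assumption: actext uses items c, f, i, l and n (the items left over
# once the other three factors are assigned).
FACTOR_ITEMS = {
    "ressext": ["a", "e", "g", "m"],
    "culture": ["b", "h", "j", "o"],
    "actext": ["c", "f", "i", "l", "n"],
    "nightlife": ["d", "k"],
}


def factor_scores(answers):
    """Return the mean rating per factor for one respondent.

    `answers` maps each item letter ("a" to "o") to a rating from 1 to 7.
    """
    return {
        name: mean(answers[item] for item in items)
        for name, items in FACTOR_ITEMS.items()
    }


# Hypothetical respondent with made-up ratings (not real survey data).
ratings = [7, 3, 6, 1, 6, 5, 5, 2, 4, 3, 2, 6, 7, 4, 2]
example = dict(zip("abcdefghijklmno", ratings))
scores = factor_scores(example)
print(scores)
```

Because each factor is a plain mean of 1-to-7 ratings, all four factors stay on the same 1-to-7 scale, which may matter when deciding whether to standardize them before clustering.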
(a) Do a cluster analysis with the 4 factors above: ressext, culture, actext and nightlife to segment the individuals. You have to give a clear interpretation of each cluster. Present the mean of the 4 factors for each cluster. Each cluster has to contain at least 30 individuals. (6 points)

The tourism office wants to assign future visitors to the clusters created in (a). However, the variables "ressext", "culture", "actext" and "nightlife" used to do the segmentation are not available for the future visitors who have not taken the questionnaire. The variables "age" and "sex" are available. Develop a model to classify future visitors in one of your clusters using the variables "age" and "sex" only, without any transformations or model selection
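The reporting requirements in (a), the mean of the 4 factors per cluster and at least 30 individuals per cluster, can be checked with a small helper once cluster labels exist. The sketch below assumes the labels have already been produced by whatever clustering method is chosen (the question does not name one); the function name `summarize_clusters`, the output layout, and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean

FACTORS = ("ressext", "culture", "actext", "nightlife")
MIN_CLUSTER_SIZE = 30  # minimum cluster size stated in the question


def summarize_clusters(rows, labels):
    """Return the size, factor means and size check for each cluster.

    `rows` is a list of dicts holding the 4 factor values per person;
    `labels` gives the cluster assigned to each row, in the same order.
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    summary = {}
    for label, members in sorted(groups.items()):
        summary[label] = {
            "n": len(members),
            "means": {f: mean(r[f] for r in members) for f in FACTORS},
            "large_enough": len(members) >= MIN_CLUSTER_SIZE,
        }
    return summary


# Made-up toy data: 40 people in cluster 0 and 10 people in cluster 1.
outdoor = {"ressext": 6.0, "culture": 2.0, "actext": 5.0, "nightlife": 1.5}
urban = {"ressext": 3.0, "culture": 6.0, "actext": 2.0, "nightlife": 4.0}
rows = [outdoor] * 40 + [urban] * 10
labels = [0] * 40 + [1] * 10

summary = summarize_clusters(rows, labels)
for label, info in summary.items():
    print(label, info["n"], info["means"], info["large_enough"])
```

In this toy example the second cluster would fail the size requirement, which in practice would suggest trying a different number of clusters or a different method.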
