Question
Client Retention Dataset:
ClientID | Phone | Internet | TVService | MoviesService | Gender | SeniorCitizen | Married | HasDependents | MonthsAsClient | SecurityService | BackupService | DeviceProtection | TechSupport | ContractType | Paperless | PaymentMethod | MonthlyCharges | TotalCharges | Quit |
PB-1 | No | DSL | No | No | Female | 0 | Yes | No | 1 | No | Yes | No | No | Monthly | Yes | e-check | 29.85 | 29.85 | No |
PB-2 | Yes | DSL | No | No | Male | 0 | No | No | 34 | Yes | No | Yes | No | 1 Year | No | Check | 56.95 | 1889.5 | No |
PB-3 | Yes | DSL | No | No | Male | 0 | No | No | 2 | Yes | Yes | No | No | Monthly | Yes | Check | 53.85 | 108.15 | Yes |
PB-4 | No | DSL | No | No | Male | 0 | No | No | 45 | Yes | No | Yes | Yes | 1 Year | No | EBT | 42.3 | 1840.75 | No |
PB-5 | Yes | FiberOptic | No | No | Female | 0 | No | No | 2 | No | No | No | No | Monthly | Yes | e-check | 70.7 | 151.65 | Yes |
PB-6 | Yes | FiberOptic | Yes | Yes | Female | 0 | No | No | 8 | No | No | Yes | No | Monthly | Yes | e-check | 99.65 | 820.5 | Yes |
PB-7 | Yes | FiberOptic | Yes | No | Male | 0 | No | Yes | 22 | No | Yes | No | No | Monthly | Yes | Credit Card | 89.1 | 1949.4 | No |
PB-8 | No | DSL | No | No | Female | 0 | No | No | 10 | Yes | No | No | No | Monthly | No | Check | 29.75 | 301.9 | No |
PB-9 | Yes | FiberOptic | Yes | Yes | Female | 0 | Yes | No | 28 | No | No | Yes | Yes | Monthly | Yes | e-check | 104.8 | 3046.05 | Yes |
PB-10 | Yes | DSL | No | No | Male | 0 | No | Yes | 62 | Yes | Yes | No | No | 1 Year | No | EBT | 56.15 | 3487.95 | No |
The dataset consists of 19 variables and the target variable. Below is the description of all variables:
- ClientID: The internal ID of PythonBell clients.
- Phone: Does the client have a phone line - Yes, No.
- Internet: Does the client have internet - DSL, FiberOptic, None.
- TVService: Does the client have TV Service - Yes, No, NotApplicable.
- MoviesService: Does the client have Movie Service - Yes, No, NotApplicable.
- Gender: The gender of the client - Male, Female.
- SeniorCitizen: Is the client a senior citizen - 1 = Yes, 0 = No.
- Married: Is the client married - Yes, No.
- HasDependents: Does the client have dependents - Yes, No.
- MonthsAsClient: How many months the client has been a customer - continuous variable.
- SecurityService: Does the client have the Security addon - Yes, No, NotApplicable.
- BackupService: Does the client have the Backup addon - Yes, No, NotApplicable.
- DeviceProtection: Does the client have the Protection addon - Yes, No, NotApplicable.
- TechSupport: Does the client have the Support addon - Yes, No, NotApplicable.
- ContractType: Contract length - Monthly, 1 Year, 2 Years.
- Paperless: Bill sent electronically - Yes, No.
- PaymentMethod: The method the client pays - e-check, Check, EBT, Credit Card.
- MonthlyCharges: The monthly bill the client receives - continuous variable.
- TotalCharges: The charges the client received so far - continuous variable.
- Quit: Did the client terminate the service - Yes, No.
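For the survival models, the variable descriptions above suggest a natural mapping: MonthsAsClient acts as the duration and Quit as the event indicator (a client who has not quit is right-censored). A minimal sketch of that mapping follows, using two rows copied from the sample table. The helper names `parse_row` and `to_survival_record` are hypothetical, and a real notebook would more likely load the full file into a data frame.

```python
# Column order taken from the header row of the sample table.
COLUMNS = [
    "ClientID", "Phone", "Internet", "TVService", "MoviesService", "Gender",
    "SeniorCitizen", "Married", "HasDependents", "MonthsAsClient",
    "SecurityService", "BackupService", "DeviceProtection", "TechSupport",
    "ContractType", "Paperless", "PaymentMethod", "MonthlyCharges",
    "TotalCharges", "Quit",
]

# Two rows copied verbatim from the sample table (PB-3 and PB-4).
RAW_ROWS = [
    "PB-3 | Yes | DSL | No | No | Male | 0 | No | No | 2 | Yes | Yes | No | No | Monthly | Yes | Check | 53.85 | 108.15 | Yes |",
    "PB-4 | No | DSL | No | No | Male | 0 | No | No | 45 | Yes | No | Yes | Yes | 1 Year | No | EBT | 42.3 | 1840.75 | No |",
]


def parse_row(line):
    """Split one pipe-delimited row into a {column: text} dictionary."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    return dict(zip(COLUMNS, cells))


def to_survival_record(row):
    """Keep the fields a survival model needs, with numeric types."""
    return {
        "client_id": row["ClientID"],
        "duration": int(row["MonthsAsClient"]),    # months observed
        "event": 1 if row["Quit"] == "Yes" else 0,  # 1 = quit, 0 = censored
        "monthly_charges": float(row["MonthlyCharges"]),
        "contract": row["ContractType"],
    }


records = [to_survival_record(parse_row(line)) for line in RAW_ROWS]
for record in records:
    print(record)
```

Clients with `event == 0` are still customers at the time of observation, so their true time to termination is unknown; survival models treat them as censored rather than discarding them.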
In a Python Jupyter Notebook, you will perform these steps:
- Import Libraries
- Load dataset
- Exploratory Data Analysis
  - Data Dimension
  - Data Types
  - Summary Statistics
  - Correlation plots
  - Data Distribution (plot features against Target variable)
- Data Pre-Processing and Wrangling
  - Check for Missing Values
  - Remove or Fill Missing values (if needed)
  - Check for Duplicate Data
  - Perform Feature Engineering
  - Check for Outliers using Box Plots
  - Perform Categorical Data Encoding
  - Perform Feature Scaling - if needed
- Survival Models Building:
  - Kaplan-Meier model
  - Exponential Model
  - Cox Proportional Hazards model
- An appropriate regression model suitable for this dataset
- Models Evaluation and Comparisons (apply each metric below only to the models it suits, as not all metrics are applicable to all models)
  - Classification Report
  - Log-rank hypothesis test
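To make the survival steps above concrete, here is a standard-library sketch of the Kaplan-Meier estimator, the maximum-likelihood rate of an exponential model, and a two-group log-rank statistic, run on the ten sample rows only. The function names are illustrative, comparing Monthly against 1 Year contracts is an assumed choice of grouping, and the Cox Proportional Hazards model is left out because it needs an iterative fit. In a real notebook a survival library such as lifelines (an assumption; the question names no library) would normally do this work.

```python
import math

# (MonthsAsClient, Quit == "Yes", ContractType) for the ten sample rows.
SAMPLE = [
    (1, 0, "Monthly"), (34, 0, "1 Year"), (2, 1, "Monthly"), (45, 0, "1 Year"),
    (2, 1, "Monthly"), (8, 1, "Monthly"), (22, 0, "Monthly"), (10, 0, "Monthly"),
    (28, 1, "Monthly"), (62, 0, "1 Year"),
]


def kaplan_meier(pairs):
    """Return [(t, S(t))] at each distinct event time for (duration, event) pairs."""
    curve, survival = [], 1.0
    for t in sorted({d for d, e in pairs if e}):
        at_risk = sum(1 for d, _ in pairs if d >= t)
        quits = sum(1 for d, e in pairs if d == t and e)
        survival *= 1.0 - quits / at_risk
        curve.append((t, survival))
    return curve


def exponential_rate(pairs):
    """MLE of a constant hazard: number of events / total observed time."""
    return sum(e for _, e in pairs) / sum(d for d, _ in pairs)


def logrank_statistic(group_a, group_b):
    """Log-rank chi-square statistic (1 degree of freedom) for two groups."""
    pooled = group_a + group_b
    diff, variance = 0.0, 0.0
    for t in sorted({d for d, e in pooled if e}):
        n_a = sum(1 for d, _ in group_a if d >= t)   # at risk in group A
        n = sum(1 for d, _ in pooled if d >= t)      # at risk overall
        d_a = sum(1 for d, e in group_a if d == t and e)
        d_all = sum(1 for d, e in pooled if d == t and e)
        diff += d_a - d_all * n_a / n                # observed minus expected
        if n > 1:
            variance += d_all * (n_a / n) * (1 - n_a / n) * (n - d_all) / (n - 1)
    return diff ** 2 / variance


pairs = [(d, e) for d, e, _ in SAMPLE]
monthly = [(d, e) for d, e, c in SAMPLE if c == "Monthly"]
yearly = [(d, e) for d, e, c in SAMPLE if c == "1 Year"]

km_curve = kaplan_meier(pairs)
rate = exponential_rate(pairs)
chi2 = logrank_statistic(monthly, yearly)
p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df

print("Kaplan-Meier:", [(t, round(s, 3)) for t, s in km_curve])
print("Exponential hazard per month:", round(rate, 4))
print("Log-rank chi2:", round(chi2, 3), "p:", round(p_value, 3))
```

With only ten rows, the chi-square approximation behind the p-value is unreliable, so these numbers illustrate the mechanics rather than support a conclusion about contract types.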
Write the report based on the interpreted results, with clear objectives and mission statements.
Problem Statement
Retaining clients, alongside acquiring new ones, is a major challenge that companies work hard to address. Current customers are especially valuable because they pay full price once their promotions have expired. Statistical analysis and modeling can be used to study when and why customers terminate their services and move their business elsewhere. This task is important for maintaining high profits.
You will build several models (2 survival and 1 regression) to identify clients who are at risk of terminating their services. You will use the expertise gained in this course: importing libraries, loading data into data frames, cleaning and manipulating data, exploring and visualizing data, feature engineering, developing models, and finally evaluating and comparing models. The goal is to build the space yacht, I mean, to keep customers.
Scenario:
As a data scientist for one of the biggest telecommunication companies, do a thorough survival analysis, based on available data from current and previous clients, to identify clients who may terminate their service, in the hope of retaining them before they switch. Then build a prediction model using a suitable regression model.
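The scenario leaves the choice of regression model open. One plausible candidate for a Yes/No target such as Quit is logistic regression, sketched below with plain gradient descent on two standardized features (MonthsAsClient and MonthlyCharges) from the ten sample rows. The choice of model, the two features, the learning rate, and the helper names are all assumptions made for illustration; the metrics are computed on the training rows, so they say nothing about performance on unseen clients.

```python
import math

# (MonthsAsClient, MonthlyCharges, Quit == "Yes") for the ten sample rows.
ROWS = [
    (1, 29.85, 0), (34, 56.95, 0), (2, 53.85, 1), (45, 42.3, 0), (2, 70.7, 1),
    (8, 99.65, 1), (22, 89.1, 0), (10, 29.75, 0), (28, 104.8, 1), (62, 56.15, 0),
]


def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]


def predict_proba(weights, x):
    """Logistic function applied to the linear score w . x."""
    return 1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(weights, x))))


def log_loss(weights, features, labels):
    """Mean negative log-likelihood of the labels under the model."""
    total = 0.0
    for x, y in zip(features, labels):
        p = min(max(predict_proba(weights, x), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def fit_logistic(features, labels, learning_rate=0.1, steps=500):
    """Batch gradient descent on the mean log-loss, starting from zero weights."""
    weights = [0.0] * len(features[0])
    for _ in range(steps):
        gradient = [0.0] * len(weights)
        for x, y in zip(features, labels):
            error = predict_proba(weights, x) - y
            for j, xi in enumerate(x):
                gradient[j] += error * xi / len(labels)
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


def classification_report(labels, predictions):
    """Precision, recall, and accuracy for the positive class (Quit = Yes)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": correct / len(labels),
    }


months = standardize([row[0] for row in ROWS])
charges = standardize([row[1] for row in ROWS])
X = [[1.0, m, c] for m, c in zip(months, charges)]  # leading 1.0 is the intercept
y = [row[2] for row in ROWS]

weights = fit_logistic(X, y)
predicted = [1 if predict_proba(weights, x) >= 0.5 else 0 for x in X]
report = classification_report(y, predicted)

print("weights (intercept, months, charges):", [round(w, 3) for w in weights])
print("training log-loss:", round(log_loss(weights, X, y), 3))
print("report:", report)
```

On the full dataset, a held-out test split and the encoded categorical features (ContractType, PaymentMethod, and so on) would be needed before the classification report means anything.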