Question

By udin

Background. When we talk about relationships between men and women, we usually refer to marriage. However, how can we identify a good relationship? Predicting divorce has been an area of interest for researchers and practitioners for many years, as it has significant social and economic implications. One of the main reasons for predicting divorce is to identify couples who may be at risk of divorce so that interventions can be put in place to help them address any issues and improve their relationship. There are several potential benefits to using machine learning to predict divorce. For example, machine learning algorithms can identify risk factors for divorce that may not be immediately obvious to human analysts.

Dataset. The Divorce Predictors Dataset is publicly available from the UCI Machine Learning Repository. It covers 170 couples and records their answers to the Divorce Predictors Scale (DPS), a 54-item questionnaire based on Gottman couples therapy. Records were collected through face-to-face interviews with couples who were either already divorced (49%) or happily married (51%) in various regions of Turkey. All responses were measured on a five-point scale (0 = Never, 1 = Seldom, 2 = Average, 3 = Frequently, 4 = Always). The dataset can be downloaded from the UCI Machine Learning Repository.
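As a starting point, the dataset description above could be turned into a small loading routine. This is only a sketch: it assumes the file is semicolon-delimited with a header row and a label column named `Class`, and the column names `Atr1`, `Atr2` in the in-memory sample are placeholders. None of these details are stated in the question, so check them against the downloaded file. The function names `load_divorce_rows` and `check_scale` are hypothetical helpers.

```python
import csv
import io
from collections import Counter


def load_divorce_rows(handle, delimiter=";", label_column="Class"):
    """Read questionnaire rows into (features, labels) lists of ints.

    The delimiter and label column name are assumptions about the file layout.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    features, labels = [], []
    for row in reader:
        # Separate the outcome from the questionnaire answers.
        labels.append(int(row.pop(label_column)))
        features.append([int(value) for value in row.values()])
    return features, labels


def check_scale(features, low=0, high=4):
    """Return True if every answer lies on the five-point scale 0..4."""
    return all(low <= value <= high for row in features for value in row)


# Tiny in-memory stand-in for the real file (made-up values, two questions only).
sample = io.StringIO("Atr1;Atr2;Class\n2;4;1\n0;1;0\n3;3;1\n")
X, y = load_divorce_rows(sample)
print("rows:", len(X), "label counts:", Counter(y))
print("all answers within 0..4:", check_scale(X))
```

For the real data, the same function can be called on an opened file, for example `with open(path, newline="") as f: X, y = load_divorce_rows(f)`, where `path` points to wherever the download was saved. The label counts give a quick check of the roughly 49%/51% class balance described above.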

Requirements.

1. Formulate the machine learning problem by identifying the task (classification or regression), data, and challenges.

2. Perform Exploratory Data Analysis. This includes:

o Plot the relationships between variables

o Identify correlated features, or those highly correlated with the label/outcome, if any

o Perform data imputation for missing values

o Encode your outcome as a one-hot vector, if needed

o Remove redundant variables using dimensionality reduction techniques

3. Consider splitting the data into Training, Validation, and Testing sets. The suggested split is 70%, 10%, and 20% (held-out testing set), respectively. Use the same split for all tasks.

4. Build and develop the following models (tasks):

o Task#1: Clustering in an unsupervised fashion.

5. For each task, run the model with and without processing the data, e.g., without normalization or dimensionality reduction, and compare the model's performance after you normalize and/or reduce the dimensionality of the data.

6. For each task, show how you performed the model selection. For example, demonstrate the performance of variants of your model with different hyper-parameters, e.g., number of clusters and initialization when it comes to clustering methods.

7. For each task, perform a 5-fold Cross Validation.

8. For each task, run the corresponding evaluation metrics on each fold to demonstrate the performance of your model on the held-out testing set.
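The 70%/10%/20% split and the 5-fold cross-validation named in the requirements can be sketched with the standard library alone. The helper names `split_indices` and `k_fold_indices`, the fixed seed, and the choice to build the folds from the training portion are assumptions made for illustration; the question does not say which portion the folds should be drawn from.

```python
import random


def split_indices(n_rows, val_frac=0.10, test_frac=0.20, seed=0):
    """Shuffle row indices and split them into train/validation/test lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed: same split for all tasks
    n_test = round(n_rows * test_frac)
    n_val = round(n_rows * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]      # remainder, about 70%
    return train, val, test


def k_fold_indices(indices, k=5):
    """Yield (fit, held_out) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most 1
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out


train, val, test = split_indices(170)
print(len(train), len(val), len(test))  # 119 17 34

for fold_number, (fit, held_out) in enumerate(k_fold_indices(train), start=1):
    # A clustering model would be fitted on `fit` and scored on `held_out` here.
    print(f"fold {fold_number}: fit on {len(fit)} rows, hold out {len(held_out)}")
```

Reusing the same seed keeps the split identical across tasks, as requirement 3 asks. The index lists can then select rows for whichever clustering implementation is chosen, once on the raw answers and once after normalization or dimensionality reduction, so the two runs are compared on identical folds.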
