Question

This assignment expects you to use multiple machine learning algorithms
to make predictions from the following datasets. Refer to our lecture notes and
practicals to choose appropriate algorithms as per the outline defined below.
Datasets
1) The dataset film_collection_dataset.csv contains information about movies and
their marketing, production expense, budget, length, critic rating, etc., and the
money earned.
2) The dataset loan_dataset.csv contains people's personal information and a
classification field, loan_status, which states whether or not their loan request was
approved based on their education, income and credit score.
3) The dataset marketing_campaign_dataset.csv contains data about people's
education, marital status, income, number of kids in the household, etc., their
preferences for multiple products, and their binary response (acceptance/rejection) to
multiple offers made in campaigns (columns AcceptedCmp1 to AcceptedCmp2
and the response column). The dataset also contains information about the amount of
money spent on products such as Gold, Fruits, Meat, Fish, Sweets and Wines in the
last two years.
Outline
1. Create an optimum training/testing split to form appropriate machine learning
models for both classification and regression problems, and use
cross-validation methods to avoid model overfitting.
2. Carry out the necessary data pre-processing steps, including outlier removal, and
appropriate visualisation steps, such as pair plots or a correlation matrix, to better
understand the data distribution.
3. Create Linear and Multiple regression models to predict the revenue of movies
from unseen input data, keeping in mind the concept of
multicollinearity. Also calculate the coefficient of determination R² (R
squared). Perform the analysis with and without data standardisation to
compare the effect on the predictions.
4. Build a decision tree model (for a regression problem) using an optimum training/
testing split and calculate the Mean Squared Error (MSE) to check the model's
accuracy. Also predict some unseen movie data and compare the model's
accuracy against the regression model to find out which model performs better.
5. Train the Logistic Regression and Decision Tree models with an optimum
train/test split for solving classification problems, using GridSearch
hyperparameter tuning, to predict whether or not a loan would be approved for
individuals with certain profiles. Also employ the Random Forest classification
model to predict the class of the same unseen data (calculating the accuracy of
the model), compare the results with the Logistic Regression and Decision
Tree models, and evaluate your analysis.
6. Use the same classification models for the Marketing campaign dataset and
predict whether individual profiles with certain characteristics (such as marital
status, income or education level) are likely to respond to the campaigns made.
7. Also use the K-means clustering algorithm to identify clusters of people with
certain characteristics (such as education, marital status or income level) and
the money they spent on products like Gold, Fruits, Meat, Fish, Sweets and
Wines etc.
8. In your report, show appropriate visualisations, a confusion matrix and a
classification report for each classification model wherever necessary.
Write me the code for each item.
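For the regression items (1 to 4), a minimal sketch might look like the following. The real column names of film_collection_dataset.csv are not given above, so the block builds a synthetic stand-in: every column name, the IQR outlier rule, the VIF helper and the max_depth value are assumptions made for illustration, not part of the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for film_collection_dataset.csv. Every column name is
# hypothetical; swap this block for pd.read_csv(...) and the real names.
rng = np.random.default_rng(0)
n = 400
films = pd.DataFrame({
    "marketing_expense": rng.uniform(10, 100, n),
    "budget": rng.uniform(50, 500, n),
    "length_min": rng.uniform(80, 180, n),
    "critic_rating": rng.uniform(1, 10, n),
})
# Deliberately near-collinear with budget, to illustrate multicollinearity.
films["production_expense"] = 0.8 * films["budget"] + rng.normal(0, 5, n)
films["collection"] = (3 * films["marketing_expense"] + 1.5 * films["budget"]
                       + 40 * films["critic_rating"] + rng.normal(0, 50, n))

def remove_outliers_iqr(df, cols, k=1.5):
    """Keep rows within k * IQR of the quartiles on every listed column."""
    q1, q3 = df[cols].quantile(0.25), df[cols].quantile(0.75)
    iqr = q3 - q1
    mask = ((df[cols] >= q1 - k * iqr) & (df[cols] <= q3 + k * iqr)).all(axis=1)
    return df[mask]

def vif(X):
    """Variance inflation factor: 1 / (1 - R^2) of each column regressed on the rest."""
    out = {}
    for col in X.columns:
        others = X.drop(columns=col)
        r2 = LinearRegression().fit(others, X[col]).score(others, X[col])
        out[col] = 1.0 / (1.0 - r2)
    return pd.Series(out)

films = remove_outliers_iqr(films, list(films.columns))
features = films.drop(columns="collection")
vifs = vif(features)
print(vifs.round(1))  # large values flag collinear predictors
features = features.drop(columns="production_expense")  # keep one of the pair

X_train, X_test, y_train, y_test = train_test_split(
    features, films["collection"], test_size=0.2, random_state=42)

models = {
    "linear_raw": LinearRegression(),
    "linear_standardised": make_pipeline(StandardScaler(), LinearRegression()),
    "tree": DecisionTreeRegressor(max_depth=5, random_state=42),
}
results = {}
for name, model in models.items():
    # 5-fold cross-validation on the training part only, then a held-out check.
    cv_r2 = cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "cv_r2": float(cv_r2),
        "test_r2": float(r2_score(y_test, pred)),
        "test_mse": float(mean_squared_error(y_test, pred)),
    }
print(pd.DataFrame(results).T.round(3))

# Predict a made-up unseen film with each fitted model.
unseen = pd.DataFrame({"marketing_expense": [60.0], "budget": [300.0],
                       "length_min": [120.0], "critic_rating": [7.5]})
unseen = unseen[features.columns]  # same column order as in training
unseen_predictions = {name: model.predict(unseen) for name, model in models.items()}
print(unseen_predictions)
```

Ordinary least squares is invariant to feature scaling, so the standardised and raw linear models should report the same R² and differ only in their coefficients. Pair plots and a correlation heatmap (for example with seaborn) would fit between the outlier step and the VIF check.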
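For the classification items (5, 6 and 8), one possible sketch is shown below. Only loan_status is named in the question, so the feature column names, the class labels "Approved"/"Rejected", the synthetic data and the hyperparameter grids are all assumptions. The same loop could be reused for the marketing campaign data by swapping in that dataframe and its response column as the target.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for loan_dataset.csv. Feature names and label values are
# hypothetical; replace with pd.read_csv(...) and the real columns.
rng = np.random.default_rng(1)
n = 600
loans = pd.DataFrame({
    "education": rng.choice(["Graduate", "Not Graduate"], n),
    "income": rng.uniform(20_000, 120_000, n),
    "credit_score": rng.integers(300, 850, n),
})
score = ((loans["credit_score"] - 575) / 100
         + (loans["income"] - 70_000) / 40_000
         + rng.normal(0, 0.5, n))
loans["loan_status"] = np.where(score > 0, "Approved", "Rejected")

X = loans.drop(columns="loan_status")
y = loans["loan_status"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# One-hot encode the categorical column and standardise the numeric ones.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["education"]),
    ("num", StandardScaler(), ["income", "credit_score"]),
])

# Illustrative grids only; widen them for a real study.
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000),
                            {"model__C": [0.1, 1, 10]}),
    "decision_tree": (DecisionTreeClassifier(random_state=42),
                      {"model__max_depth": [3, 5, None],
                       "model__min_samples_leaf": [1, 5]}),
    "random_forest": (RandomForestClassifier(n_estimators=100, random_state=42),
                      {"model__max_depth": [5, None]}),
}

labels = ["Approved", "Rejected"]
fitted, accuracy = {}, {}
for name, (estimator, grid) in candidates.items():
    pipe = Pipeline([("prep", preprocess), ("model", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    pred = search.predict(X_test)
    fitted[name] = search
    accuracy[name] = float(accuracy_score(y_test, pred))
    print(name, search.best_params_, round(accuracy[name], 3))
    print(confusion_matrix(y_test, pred, labels=labels))
    print(classification_report(y_test, pred))

# Two made-up applicant profiles, scored by every tuned model.
unseen = pd.DataFrame({
    "education": ["Graduate", "Not Graduate"],
    "income": [95_000, 30_000],
    "credit_score": [780, 420],
})
unseen_predictions = {name: list(s.predict(unseen)) for name, s in fitted.items()}
print(unseen_predictions)
```

Putting the preprocessing inside the pipeline means each cross-validation fold fits the scaler and encoder on its own training part, which avoids leaking test information into the tuning step.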
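For the clustering item (7), a minimal K-means sketch could look like this. The column names, the three synthetic customer segments and the use of the silhouette score to choose k are assumptions for illustration. Education is kept out of the distance computation here and is only cross-tabulated against the clusters afterwards; encoding it as a feature is another option.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for marketing_campaign_dataset.csv with three made-up
# customer segments. All column names and values are hypothetical.
rng = np.random.default_rng(2)
centres = [(30_000, 100, 50), (60_000, 500, 250), (90_000, 900, 450)]
customers = pd.concat([
    pd.DataFrame({
        "income": rng.normal(income, 5_000, 150),
        "spend_wines": rng.normal(wines, 40, 150),
        "spend_meat": rng.normal(meat, 25, 150),
    })
    for income, wines, meat in centres
], ignore_index=True)
customers["education"] = rng.choice(["Graduation", "Master", "PhD"], len(customers))

# K-means is distance based, so put the numeric features on a common scale.
numeric_cols = ["income", "spend_wines", "spend_meat"]
X_scaled = StandardScaler().fit_transform(customers[numeric_cols])

# Try several cluster counts and keep the one with the highest silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_scaled)
    scores[k] = float(silhouette_score(X_scaled, labels))
best_k = max(scores, key=scores.get)

final = KMeans(n_clusters=best_k, n_init=10, random_state=42)
customers["cluster"] = final.fit_predict(X_scaled)

# Describe each cluster by its average income and spending, and by education mix.
profile = customers.groupby("cluster")[numeric_cols].mean().round(1)
education_mix = pd.crosstab(customers["cluster"], customers["education"])
print(scores)
print(profile)
print(education_mix)
```

An elbow plot of inertia against k is a common alternative to the silhouette score, and a scatter plot of two features coloured by cluster would serve as the visualisation for the report.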
