Question

APPENDIX 1 - DATA DICTIONARY

Note: The meaning of each value for the internal codes of the organization is unknown. Besides blanks, 'Unkn' and '???' are expressions in the dataset that denote missing values. With this consideration, read in the dataset as a Pandas dataframe, and state the variables that contain missing values. (5 marks)

Question 2
As part of data preparation, treat the missing data, and explain your rationale of the treatments. (15 marks)

Question 3
Explain and implement three (3) other data preparation tasks required for further analysis of the data. Any appropriate Python-related libraries, functions, methods (e.g. pandas.to_datetime) can be used. (15 marks)

Question 4
Analyse the data and describe three (3) insights into the corporate claims processing of the insurance company, with at least one (1) supporting visualization created to illustrate each insight. (30 marks)

Question 5
Perform linear regression modelling to predict the delay in days (between the Planned and Actual date) in processing the claims, explaining the approach taken, including any further data pre-processing needed for modelling. (25 marks)

Question 6
Discuss the results obtained from the modelling and state the linear regression equation.
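As a starting point for the tasks above, here is a minimal sketch of three of the steps: reading the file so that 'Unkn' and '???' are treated as missing, listing the variables with missing values, and deriving the delay in days before fitting a simple least-squares line. The dataset and its data dictionary are not reproduced here, so every column name (`dept_code`, `planned_date`, `actual_date`, `amount`) and every value in the toy CSV is a made-up placeholder, and the helper functions are illustrative rather than a prescribed solution. The regression uses `numpy.linalg.lstsq` for brevity; a library such as scikit-learn or statsmodels could be swapped in.

```python
import io

import numpy as np
import pandas as pd

# Toy stand-in for the real file. All column names and values are
# hypothetical: the actual data dictionary is not available here.
TOY_CSV = """claim_id,dept_code,planned_date,actual_date,amount
1,A1,2021-01-04,2021-01-06,1200
2,Unkn,2021-01-05,2021-01-05,???
3,B2,2021-01-07,2021-01-12,800
4,,2021-01-08,2021-01-09,1500
5,A1,2021-01-11,,950
6,B2,2021-01-12,2021-01-15,2000
"""

# Blank fields already become NaN by default in pandas.read_csv.
MISSING_MARKERS = ["Unkn", "???"]


def load_claims(source):
    """Read the data, mapping 'Unkn' and '???' (plus blanks) to NaN."""
    return pd.read_csv(source, na_values=MISSING_MARKERS)


def columns_with_missing(df):
    """Return the names of columns that contain at least one missing value."""
    counts = df.isna().sum()
    return counts[counts > 0].index.tolist()


def add_delay_days(df, planned="planned_date", actual="actual_date"):
    """Add delay_days = actual date minus planned date, in whole days."""
    out = df.copy()
    out[planned] = pd.to_datetime(out[planned], errors="coerce")
    out[actual] = pd.to_datetime(out[actual], errors="coerce")
    out["delay_days"] = (out[actual] - out[planned]).dt.days
    return out


def fit_linear(df, features, target="delay_days"):
    """Ordinary least squares on complete rows; returns (intercept, coefficients)."""
    data = df.dropna(subset=features + [target])
    X = np.column_stack(
        [np.ones(len(data)), data[features].to_numpy(dtype=float)]
    )
    y = data[target].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0], dict(zip(features, beta[1:]))


claims = load_claims(io.StringIO(TOY_CSV))
missing_cols = columns_with_missing(claims)
claims = add_delay_days(claims)
intercept, coefs = fit_linear(claims, ["amount"])

print("Variables with missing values:", missing_cols)
print(f"delay_days = {intercept:.3f} + {coefs['amount']:.5f} * amount")
```

With the real file, `load_claims` would take the file path instead of the in-memory string. Dropping incomplete rows inside `fit_linear` is only one possible treatment; Question 2 asks for a justified choice per variable (for example imputation versus removal), which this sketch does not attempt.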

