Question:
Assignment 1: "Data exploration and preparation"

Dataset Description: You will work on the Credit dataset. The dataset classifies people, by a set of attributes, as good or bad credit risks. The dataset includes 750 examples, and each example is described by 20 attributes and a class label.

Data sets: Each group is assigned an individual dataset. Please download the "Credit-Dataset" that is linked to your group name (e.g. group is assigned "Credit Dataset().csv").

Tasks: Your tasks include:

A. Initial data exploration
A1. Identify the type of each attribute (nominal, ordinal, asymmetric binary, symmetric binary, interval or ratio).
A2. Using Weka, explore your data set and identify any outliers.
A3. Using Weka, explore your data set and identify any patterns. Hint: please consider scatter plots.

B. Data pre-processing
B1. Use the following binning techniques to smooth the values of the "duration" attribute:
  - equi-width binning (3 bins)
  - equi-depth binning (3 bins)
B2. Use the following techniques to normalise the "credit_amount" attribute:
  - min-max normalization to transform the values onto the range 0.0-1.0
  - z-score normalization to transform the values
B3. Discretise the "Age" attribute into the following categories: Teenager = 1-16; Young = 17-35; Mid Age = 36-55; Mature = 56-70; Old = 71. Provide the frequency of each category in your data set.

C. Association Rules Mining
Use association rule techniques to:
C1. Extract and evaluate possible associations.
C2. Explain three selected rules.

The deliverable for this assignment is a report. In the report, include a section (starting with a section title) for each of the tasks in this assignment, including tasks A, B and C.
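The two binning techniques named in task B1 can be sketched in plain Python as follows. This is a minimal illustration, not the Weka procedure the assignment may expect. The function names and the sample `durations` list are hypothetical, and smoothing by bin means is an assumption, since the task does not say whether to smooth by means, medians or boundaries.

```python
from statistics import mean

def equal_width_bins(values, k=3):
    """Assign each value to one of k bins of equal width over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)
    # The maximum value would fall into bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_depth_bins(values, k=3):
    """Assign bin indices so that each bin holds roughly the same number of values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        # Rank-based assignment; equal values can end up in different bins.
        bins[i] = rank * k // len(values)
    return bins

def smooth_by_bin_means(values, bins):
    """Replace each value with the mean of the bin it belongs to (assumed smoothing rule)."""
    groups = {}
    for v, b in zip(values, bins):
        groups.setdefault(b, []).append(v)
    means = {b: mean(vs) for b, vs in groups.items()}
    return [means[b] for b in bins]

# Hypothetical "duration" values in months, not taken from the actual dataset.
durations = [6, 12, 12, 18, 24, 24, 36, 48, 60]

width_bins = equal_width_bins(durations)   # [0, 0, 0, 0, 1, 1, 1, 2, 2]
depth_bins = equal_depth_bins(durations)   # [0, 0, 0, 1, 1, 1, 2, 2, 2]
print(smooth_by_bin_means(durations, width_bins))
print(smooth_by_bin_means(durations, depth_bins))
```

On this sample the equi-width bins each span 18 months and hold 4, 3 and 2 values, while the equi-depth bins hold 3 values each. Skewed data gives this kind of difference between the two techniques.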
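For task B2, the two normalisations follow the usual formulas: min-max maps v to (v - min) / (max - min), rescaled to the target range, and z-score maps v to (v - mean) / standard deviation. The sketch below is illustrative only. The function names and the sample `amounts` are hypothetical, and using the population standard deviation (rather than the sample one) is an assumption the assignment does not settle.

```python
from statistics import mean, pstdev

def min_max_normalise(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so that min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so map everything to new_min.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def z_score_normalise(values):
    """Centre values on the mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population, not sample, standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical "credit_amount" values, not taken from the actual dataset.
amounts = [1000, 2000, 3000, 4000, 5000]

print(min_max_normalise(amounts))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(z_score_normalise(amounts))
```

After z-score normalisation the values have mean 0 and (population) standard deviation 1. Unlike min-max, the result is not confined to a fixed range.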
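The discretisation in task B3 amounts to a lookup against the stated age ranges followed by a frequency count. A minimal sketch, assuming integer ages: the names `AGE_CATEGORIES`, `age_category` and `category_frequencies` and the sample `ages` are hypothetical, and "Old" is read here as 71 and above, which is an assumption about the open upper bound.

```python
from collections import Counter

# Category boundaries from task B3 as (label, lowest age, highest age).
# "Old" is read as 71 and above; the open upper bound is an assumption.
AGE_CATEGORIES = [
    ("Teenager", 1, 16),
    ("Young", 17, 35),
    ("Mid Age", 36, 55),
    ("Mature", 56, 70),
    ("Old", 71, None),
]

def age_category(age):
    """Return the category label whose range contains the given age."""
    for label, low, high in AGE_CATEGORIES:
        if age >= low and (high is None or age <= high):
            return label
    raise ValueError(f"age {age} is outside the defined categories")

def category_frequencies(ages):
    """Count the examples per category, including categories with no examples."""
    counts = Counter(age_category(a) for a in ages)
    return {label: counts.get(label, 0) for label, _, _ in AGE_CATEGORIES}

# Hypothetical "Age" values, not taken from the actual dataset.
ages = [19, 23, 35, 36, 42, 55, 56, 64, 71, 75]

print(category_frequencies(ages))
# {'Teenager': 0, 'Young': 3, 'Mid Age': 3, 'Mature': 2, 'Old': 2}
```

Reporting zero counts explicitly is a design choice: in a credit dataset the "Teenager" category may well be empty, and the report should show that rather than omit the category.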
