Question
2. Preliminary Wrangling

Dataset Information: The dataset records loans from Prosper, a platform that lets people invest in each other in a way that is financially and socially rewarding. On Prosper, borrowers list loan requests between $2,000 and $35,000, and individual investors invest as little as $25 in each loan listing they select. Prosper handles the servicing of the loan on behalf of the matched borrowers and investors.

A. Read the dataset called Pri-Load.csv
B. Check the data types and adjust the data type for all other categorical columns.
C. If you find any missing values in the ProsperRating column, drop them.

3. UNIVARIATE ANALYSIS

A. What are the main features of interest in your dataset?
Step 1: Apply univariate analysis using suitable charts for [Loan Status, Employment Status, Stated Monthly Income].
Step 2: Check whether any column distribution is skewed.
Step 3: Write at least 2 observations for each visualization.

4. BIVARIATE ANALYSIS

A. Check the correlation matrix for all numeric variables. Maintain the strong positive and negative correlation columns.
B. Check the relation between the LoanOriginalAmount and BorrowerAPR columns.
Step 1: Use subplots.
Plot 1: Scatter plot of the LoanOriginalAmount and BorrowerAPR columns
Plot 2: Heatmap of LoanOriginalAmount and BorrowerAPR
Step 2: Write your observations.
C. Display separate box plots for y = BorrowerAPR with x1 = LoanStatus and x2 = EmploymentStatus columns. Write your observations.

5. MULTI VARIATE ANALYSIS, FEATURE ENGINEERING

A. Write a program
Step 1: Create a condition = 'LoanStatus' == 'Completed' | 'LoanStatus' == 'Defaulted' | 'LoanStatus' == 'Chargedoff'
Step 2: Create a user-defined function using condition and the LoanStatus column.
Hint: df['LoanStatus'] = df.apply(user define function, axis=1)
Sample output:
Completed 168
Defaulted 59
B. Write a program
Step 1: Create a dictionary called categories = {1: 'Debt Consolidation', 2: 'Home Improvement', 3: 'Business', 6: 'Auto', 7: 'Other'}
Step 2: Create a user-defined function using categories and the ListingCategory (numeric) column.
Hint: df['ListingCategory (numeric)'] = df.apply(user define function, axis=1)
Sample output:
Debt Consolidation 106
Other 65
Business 25
Home Improvement 22
Auto 9
C. Display the box plot for ProsperRating (Alpha) vs LoanOriginalAmount with hue = Loan Status [Completed, Defaulted]. Write your observations.
D. Display the catplot for ProsperRating (Alpha) vs ListingCategory (numeric) [Debt Consolidation, Other, Business, Home Improvement, Auto] with hue = Loan Status [Completed, Defaulted]. Write your observations.

Exploratory Data Analysis Project

1. DESCRIPTIVE STATISTICS

1. Create a dataframe using the data below and answer the following questions:
Hourly_Income = [1000, 2009, 24418, 444478, 324235, 243242, 3434234, 7567457, 9235, 238237, 1312, 3412]
Hourly_Expense = [651361, 217371, 2746, 2356, 13436, 5732, 346346, 3463, 1132, 23534, 242235, 235235]
family_members_count = [3, 4, 2, 3, 1, 4, 5, 6, 3, 6, 3, 5]
House_rent = [1299, 2300, 3411, 3422, 4566, 4211, 4600, 736, 672, 0, 734, 2374]
Highest_income_Member = ["olivia", "George", "Isla", "Harry", "Ava", "Noah", "Sophia", "Jacobi", "Freddie", "Ella", "Grace", "Ella"]
A. Display the five point summary of the data.
B. What is the mean of the hourly expense?
C. What is the median of the hourly expense?
D. Find the family member with the maximum income using a suitable graph.
E. Calculate the IQR (the difference between the 75% and 25% quartiles) for Hourly_Income and Hourly_Expense.
F. Calculate the standard deviation for the first 2 columns.
G. Calculate the variance for the first 4 columns.
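The descriptive statistics questions (A to G) can be approached along these lines. This is a minimal sketch with pandas, not a verified solution: the variable names (`five_point`, `top_earner`, and so on) are illustrative, the last name in the list is read as "Ella", and "five point summary" is taken to mean minimum, quartiles, and maximum.

```python
import pandas as pd

# Data as listed in the question.
df = pd.DataFrame({
    "Hourly_Income": [1000, 2009, 24418, 444478, 324235, 243242,
                      3434234, 7567457, 9235, 238237, 1312, 3412],
    "Hourly_Expense": [651361, 217371, 2746, 2356, 13436, 5732,
                       346346, 3463, 1132, 23534, 242235, 235235],
    "family_members_count": [3, 4, 2, 3, 1, 4, 5, 6, 3, 6, 3, 5],
    "House_rent": [1299, 2300, 3411, 3422, 4566, 4211,
                   4600, 736, 672, 0, 734, 2374],
    "Highest_income_Member": ["olivia", "George", "Isla", "Harry", "Ava", "Noah",
                              "Sophia", "Jacobi", "Freddie", "Ella", "Grace", "Ella"],
})

numeric = df.select_dtypes("number")

# A. Five point summary: min, Q1, median, Q3, max for each numeric column.
five_point = numeric.describe().loc[["min", "25%", "50%", "75%", "max"]]

# B and C. Mean and median of the hourly expense.
mean_expense = df["Hourly_Expense"].mean()
median_expense = df["Hourly_Expense"].median()

# D. Member in the row with the largest Hourly_Income.
top_earner = df.loc[df["Hourly_Income"].idxmax(), "Highest_income_Member"]

# E. IQR = 75th percentile minus 25th percentile (pandas' default linear interpolation).
income_expense = df[["Hourly_Income", "Hourly_Expense"]]
iqr = income_expense.quantile(0.75) - income_expense.quantile(0.25)

# F and G. Sample standard deviation and variance (pandas uses ddof=1 by default).
std_first_two = df.iloc[:, :2].std()
var_first_four = df.iloc[:, :4].var()

print(five_point)
print(mean_expense, median_expense, top_earner)
print(iqr, std_first_two, var_first_four, sep="\n")
```

For the graph in part D, a bar chart of Hourly_Income against Highest_income_Member (for example via `df.plot.bar`, which needs matplotlib installed) would be one reasonable choice.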
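The feature engineering steps in section 5 (parts A and B) could be sketched as follows. Several things here are assumptions rather than part of the assignment: the five sample rows are made up for illustration, the function names `simplify_status` and `label_category` are hypothetical, charged-off loans are folded into "Defaulted" because the sample output lists only Completed and Defaulted, and category codes missing from the dictionary are labeled "Other".

```python
import pandas as pd

# Hypothetical sample rows; the real data would come from the CSV named in the assignment.
df = pd.DataFrame({
    "LoanStatus": ["Completed", "Current", "Defaulted", "Chargedoff", "Completed"],
    "ListingCategory (numeric)": [1, 2, 7, 6, 3],
})

# Part A, Step 1: boolean condition keeping only finished loans.
condition = (
    (df["LoanStatus"] == "Completed")
    | (df["LoanStatus"] == "Defaulted")
    | (df["LoanStatus"] == "Chargedoff")
)
df = df[condition].copy()

# Part A, Step 2: user-defined function applied row by row.
def simplify_status(row):
    # Assumption: a charged-off loan is treated as a defaulted one.
    if row["LoanStatus"] == "Chargedoff":
        return "Defaulted"
    return row["LoanStatus"]

df["LoanStatus"] = df.apply(simplify_status, axis=1)

# Part B, Step 1: dictionary from the assignment.
categories = {1: "Debt Consolidation", 2: "Home Improvement",
              3: "Business", 6: "Auto", 7: "Other"}

# Part B, Step 2: map numeric codes to labels.
def label_category(row):
    # Assumption: codes not present in the dictionary fall back to "Other".
    return categories.get(row["ListingCategory (numeric)"], "Other")

df["ListingCategory (numeric)"] = df.apply(label_category, axis=1)

status_counts = df["LoanStatus"].value_counts()
category_counts = df["ListingCategory (numeric)"].value_counts()
print(status_counts)
print(category_counts)
```

The row-wise `df.apply(..., axis=1)` form follows the hint in the assignment; on a large dataset, `Series.replace` or `Series.map` on the single column would likely be faster.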