Data Analysis using R: Employee Attrition Dataset and Acme Dataset

Location: Course Content - Datasets - EmployeeAttrition.csv and Acme.csv

Tool: You need to use RStudio for this assignment inside the VirtualBox VM.

  1. Open RStudio by clicking the blue circular R icon on the left-side launch bar. This opens the RStudio main screen.
  2. On the top menu, click File > Open File and locate the Assignment-5-R-Data-Analytics-Source-File.R provided with the assignment.
  3. Add the complete path of the EmployeeAttrition.csv file inside the source file. For example, if you store EmployeeAttrition.csv in the Documents folder, then your complete path should be

~/Documents/EmployeeAttrition.csv

  4. Press the Source button (green arrow icon) at the top of the RStudio window. This runs your entire script.
  5. To run a single line or statement of your code, press the Run button (green arrow icon) at the top of the RStudio window.
  6. If everything goes well, you will see output in blue or black. If your code has an error, the error message is written in red.
  7. Write your remaining code in the same source file to answer all the questions below.
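
The path step above can be sketched as follows. This is a minimal illustration, not the contents of the provided source file: it writes a tiny hypothetical CSV to a temporary location so it runs anywhere, and the names `csv_path` and `employees` are assumptions made for the example. In the assignment, `csv_path` would instead hold the complete path from step 3.

```r
# Hypothetical stand-in for EmployeeAttrition.csv: a three-row file written
# to a temporary location. The values are invented for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "EmployeeNumber,Age,Attrition",
  "1,41,Yes",
  "2,49,No",
  "4,37,Yes"
), csv_path)

# In the assignment, csv_path would be the complete path to the real file,
# for example "~/Documents/EmployeeAttrition.csv".
employees <- read.csv(csv_path, stringsAsFactors = FALSE)

# Quick checks that the file loaded as expected.
print(head(employees))  # first rows of the data frame
str(employees)          # column names and types
```

If `read.csv()` reports that it cannot open the file, the path is usually the first thing to check.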

Answer all of the questions below.

NOTES: Submit one R source code file and one output document (Word or PDF) containing all the outputs. Print the output for every question in your R source code file so that when the class TA runs your code in RStudio, it shows the output for each question.

Please rename your source code file to FullName-GNumber and submit it to Blackboard.

Part 1

  1. Use EmployeeAttrition.csv and write R code to find the following:
    1. Find the number of rows and columns in the dataset
    2. Find the maximum Age in the dataset
    3. Find the minimum DailyRate in the dataset
    4. Find the average/mean MonthlyIncome in the dataset
    5. How many employees rated WorkLifeBalance as 1?

  2. What percent of total employees have TotalWorkingYears less than or equal to 5? Also calculate the percentage for TotalWorkingYears greater than 5.
  3. Print EmployeeNumber, Department and MaritalStatus for those employees whose Attrition is Yes, RelationshipSatisfaction is 1 and YearsSinceLastPromotion is greater than 3.
  4. Find the mean, median, mode, standard deviation and frequency distribution of EnvironmentSatisfaction for males and females separately. (Hint: For frequency distribution use table() function)
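
One possible way to approach the Part 1 questions is sketched below. It runs on a small invented data frame rather than the real file, so the printed numbers are not the assignment's answers. The column names follow the assignment text, `Gender` is assumed to be the column that separates males and females, and `stat_mode()` is a hypothetical helper written for this example because base R's `mode()` returns the storage mode of an object, not the most frequent value.

```r
# Hypothetical toy data: column names follow the assignment text, values are invented.
emp <- data.frame(
  EmployeeNumber = 1:8,
  Age = c(41, 49, 37, 33, 27, 32, 59, 30),
  DailyRate = c(1102, 279, 1373, 1392, 591, 1005, 1324, 1358),
  MonthlyIncome = c(6000, 5000, 2000, 3000, 3500, 3000, 2500, 3000),
  Gender = c("Female", "Male", "Male", "Female", "Male", "Male", "Female", "Male"),
  Department = c("Sales", "Research & Development", "Research & Development",
                 "Research & Development", "Sales", "Human Resources",
                 "Research & Development", "Sales"),
  MaritalStatus = c("Single", "Married", "Single", "Married",
                    "Married", "Single", "Divorced", "Single"),
  Attrition = c("Yes", "No", "Yes", "No", "No", "Yes", "No", "Yes"),
  WorkLifeBalance = c(1, 3, 3, 3, 3, 2, 2, 1),
  TotalWorkingYears = c(8, 10, 3, 8, 5, 8, 12, 1),
  RelationshipSatisfaction = c(1, 4, 1, 3, 4, 1, 1, 2),
  YearsSinceLastPromotion = c(5, 1, 4, 3, 2, 0, 7, 6),
  EnvironmentSatisfaction = c(2, 3, 4, 4, 1, 4, 2, 4),
  stringsAsFactors = FALSE
)

# Question 1: dimensions and simple summaries.
cat("Rows:", nrow(emp), "Columns:", ncol(emp), "\n")
max_age <- max(emp$Age)
min_rate <- min(emp$DailyRate)
mean_income <- mean(emp$MonthlyIncome)
wlb_ones <- sum(emp$WorkLifeBalance == 1)   # counts TRUE values
cat(max_age, min_rate, mean_income, wlb_ones, "\n")

# Question 2: the mean of a logical vector is the proportion of TRUE values.
pct_le5 <- mean(emp$TotalWorkingYears <= 5) * 100
pct_gt5 <- mean(emp$TotalWorkingYears > 5) * 100
cat("<= 5 years:", pct_le5, "%   > 5 years:", pct_gt5, "%\n")

# Question 3: filter rows on three conditions, keep three columns.
at_risk <- emp[emp$Attrition == "Yes" &
                 emp$RelationshipSatisfaction == 1 &
                 emp$YearsSinceLastPromotion > 3,
               c("EmployeeNumber", "Department", "MaritalStatus")]
print(at_risk)

# Question 4: hypothetical helper returning the most frequent value
# (the first one if several values tie).
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Split the ratings by gender, then compute each statistic per group.
by_gender <- split(emp$EnvironmentSatisfaction, emp$Gender)
gender_stats <- sapply(by_gender, function(x) {
  c(mean = mean(x), median = median(x), mode = stat_mode(x), sd = sd(x))
})
print(gender_stats)

# Frequency distribution of the ratings for each gender.
print(table(emp$Gender, emp$EnvironmentSatisfaction))
```

With the real dataset, `emp` would come from `read.csv()` instead of `data.frame()`, and the rest of the code would stay the same as long as the column names match.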

Part 2

Acme Corporation is accused of gender bias in setting starting salaries for newly hired workers. The accompanying synthetic dataset lists recent salary data of new hires, showing college degree (BS, MS, PhD), gender (M, F), years of previous experience and starting salary (thousands of dollars).

  1. Use Acme.csv and write R code to find the following:
    1. Identify the data type of each attribute in the dataset
    2. Produce summary statistics for each attribute in the dataset
    3. Produce visualizations for each attribute (Hint: use hist() function)
  2. Display the relationship between
    1. Years of Experience and Starting Salary for all employees
    2. Years of Experience and Starting Salary for each gender
    3. Years of Experience and Starting Salary for each degree

(Hint: use Scatter Plots)
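
The exploration and scatter plot questions above might be approached as in the sketch below. The data frame is invented, and the column names `Degree`, `Gender`, `YrsExper` and `StartSal` are assumptions, since the real headers in Acme.csv are not shown here. Because `hist()` accepts only numeric input, the sketch uses `barplot(table(...))` for the two categorical attributes, which goes slightly beyond the hint.

```r
# Hypothetical toy data: the column names are assumed, the values are invented.
acme <- data.frame(
  Degree = c("BS", "BS", "MS", "MS", "PhD", "PhD", "BS", "MS", "PhD"),
  Gender = c("F", "M", "F", "M", "F", "M", "M", "F", "M"),
  YrsExper = c(1, 2, 3, 5, 4, 8, 0, 6, 2),
  StartSal = c(40, 44, 52, 60, 66, 82, 38, 61, 70),
  stringsAsFactors = FALSE
)

# Outside RStudio (for example under Rscript), draw to a null device so no
# plot file is created. Inside RStudio the plots appear in the Plots pane.
if (!interactive()) pdf(NULL)

# Data types and summary statistics for each attribute.
col_types <- sapply(acme, class)
print(col_types)
print(summary(acme))

# Histograms for the numeric attributes.
hist(acme$YrsExper, main = "Years of Experience", xlab = "Years")
hist(acme$StartSal, main = "Starting Salary", xlab = "Thousands of dollars")

# Bar plots of counts for the categorical attributes.
barplot(table(acme$Gender), main = "Gender")
barplot(table(acme$Degree), main = "Degree")

# Scatter plot for all employees.
plot(acme$YrsExper, acme$StartSal,
     main = "All employees",
     xlab = "Years of Experience", ylab = "Starting Salary")

# Scatter plot with one color per gender (integer factor codes pick the colors).
gender_f <- factor(acme$Gender)
plot(acme$YrsExper, acme$StartSal, col = as.integer(gender_f), pch = 19,
     main = "By gender",
     xlab = "Years of Experience", ylab = "Starting Salary")
legend("topleft", legend = levels(gender_f),
       col = seq_along(levels(gender_f)), pch = 19)

# Scatter plot with one color per degree.
degree_f <- factor(acme$Degree)
plot(acme$YrsExper, acme$StartSal, col = as.integer(degree_f), pch = 19,
     main = "By degree",
     xlab = "Years of Experience", ylab = "Starting Salary")
legend("topleft", legend = levels(degree_f),
       col = seq_along(levels(degree_f)), pch = 19)

# Close the null device when one was opened above.
if (!interactive()) invisible(dev.off())
```

Coloring the points in a single plot is only one option. Drawing a separate plot for each gender or degree, using a subset of the rows each time, would answer the same questions.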

  3. Find the correlation between Starting Salary and Years of Experience.
    1. Is the correlation different for each gender?
    2. Is the correlation different for each degree?
  4. What can you conclude about Acme with respect to gender bias after your overall analysis?
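
A minimal sketch of the correlation questions follows, again on invented data with assumed column names (`Degree`, `Gender`, `YrsExper`, `StartSal`). It uses `cor()`, which computes the Pearson correlation by default, and `split()` to repeat the calculation within each group. The mean salary by gender at the end is one simple comparison that could feed into the conclusion, not a complete bias analysis.

```r
# Hypothetical toy data: the column names are assumed, the values are invented.
acme <- data.frame(
  Degree = c("BS", "BS", "MS", "MS", "PhD", "PhD", "BS", "MS", "PhD"),
  Gender = c("F", "M", "F", "M", "F", "M", "M", "F", "M"),
  YrsExper = c(1, 2, 3, 5, 4, 8, 0, 6, 2),
  StartSal = c(40, 44, 52, 60, 66, 82, 38, 61, 70),
  stringsAsFactors = FALSE
)

# Correlation across all employees.
overall_cor <- cor(acme$YrsExper, acme$StartSal)
cat("Overall correlation:", overall_cor, "\n")

# Hypothetical helper: correlation computed separately within each group.
cor_by_group <- function(data, group_col) {
  groups <- split(data, data[[group_col]])
  sapply(groups, function(g) cor(g$YrsExper, g$StartSal))
}

cor_by_gender <- cor_by_group(acme, "Gender")
cor_by_degree <- cor_by_group(acme, "Degree")
print(cor_by_gender)
print(cor_by_degree)

# Mean starting salary for each gender, as one input to the conclusion.
mean_sal_by_gender <- aggregate(StartSal ~ Gender, data = acme, FUN = mean)
print(mean_sal_by_gender)
```

Correlations computed on very small groups are unstable, so group sizes are worth reporting alongside the coefficients when interpreting differences between genders or degrees.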
