Question

Posted on Aug 16, 2020

The “adult” data set at the UCI Machine Learning Repository is derived from census records. In these data, the goal is to predict whether a person’s income was large (defined in 1994 as more than $50K) or small. The predictors include educational level, type of job (e.g., never worked or local government), capital gains/losses, work hours per week, native country, and so on. After filtering out data where the outcome class is unknown, there were 48,842 records remaining. The majority of the data were associated with a small income level (75.9%). The data are contained in the arules package, and the appropriate version can be loaded using data(AdultUCI).

(a) Load the data and investigate the predictors in terms of their distributions and potential correlations.

(b) Determine an appropriate split of the data.

(c) Build several classification models for these data. Do the results favor the small income class?

(d) Is there a good trade-off that can be made between the sensitivity and specificity?

(e) Use sampling methods to improve the model fit.

(f) Do cost-sensitive models help performance?
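Parts (c) and (d) turn on the class imbalance: since 75.9% of the records are in the small income class, a model that predicts "small" for everyone would be about 75.9% accurate while never detecting a large income. The sketch below illustrates how moving the probability cutoff trades sensitivity against specificity. It is only an illustration: the function name sens_spec, the eight labels, and the scores are made-up assumptions, not output from any model fit to the adult data, and "large" is treated as the event of interest.

```python
# Minimal sketch: sensitivity/specificity at several probability cutoffs.
# All labels and scores are hypothetical, not results from the adult data.

def sens_spec(labels, scores, cutoff):
    """Return (sensitivity, specificity), treating 'large' as the event."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_large = score >= cutoff
        if label == "large":
            if predicted_large:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_large:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities of the "large" class for 8 people.
labels = ["large", "large", "small", "small", "small", "small", "small", "small"]
scores = [0.80, 0.40, 0.60, 0.30, 0.20, 0.10, 0.10, 0.05]

# Lowering the cutoff catches more "large" cases at the cost of more
# false alarms among the "small" cases.
for cutoff in (0.50, 0.35, 0.25):
    sens, spec = sens_spec(labels, scores, cutoff)
    print(f"cutoff={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy example, dropping the cutoff from 0.50 to 0.35 raises sensitivity without changing specificity, while dropping it further to 0.25 only lowers specificity. On real data the cutoff would be chosen on a held-out set, not on the test set.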

Step by Step Solution

Step: 1

(a) Load the data and investigate the predictors in terms of their distributions and potential correlations. The data can be loaded from the arules package. Once the data is loaded, it is important to inve...
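As a rough illustration of what investigating distributions can look like, the sketch below tabulates the outcome classes and cross-tabulates one categorical predictor against the outcome. It is not the arules/R workflow named above: the rows are hypothetical stand-ins invented for this example, and the variable names (rows, income_counts, crosstab) are assumptions.

```python
from collections import Counter

# Hypothetical rows (education, hours_per_week, income); not real records.
rows = [
    ("Bachelors", 45, "large"),
    ("HS-grad", 40, "small"),
    ("HS-grad", 38, "small"),
    ("Masters", 50, "large"),
    ("Bachelors", 40, "small"),
    ("HS-grad", 35, "small"),
]

# Class distribution of the outcome, shown as percentages.
income_counts = Counter(income for _, _, income in rows)
for level, n in income_counts.items():
    print(f"{level}: {n / len(rows):.1%}")

# Cross-tabulate a categorical predictor against the outcome to see
# whether some levels are dominated by one income class.
crosstab = Counter((education, income) for education, _, income in rows)
for (education, income), n in sorted(crosstab.items()):
    print(education, income, n)
```

Levels that appear with only one outcome class, or that are very rare, are worth noting before modeling, since they can affect how the data should be split.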
