Question

Many thousands of audits of taxpayers' tax returns might be performed each year. The outcome of an audit may be productive, in which case an adjustment to the information supplied was required, usually resulting in a change to the amount of tax that the taxpayer is liable to pay (an increase or a decrease). An unproductive audit is one for which no adjustment was required after reviewing the taxpayer's affairs.

The audit dataset attempts to simulate this scenario. It is supplied as a CSV file. The dataset consists of 2,000 fictional taxpayers who have been audited for tax compliance. For each case, an outcome of the audit is recorded in TARGET_Adjusted, i.e., whether the financial claims had to be adjusted or not. The actual dollar amount of any adjustment that resulted is also recorded in RISK_Adjustment (noting that adjustments can go in either direction). The audit dataset contains 12 variables, with the first variable being a unique client ID, followed by 9 input variables from Age to Hours.

Please NOTE: 1) RISK_Adjustment is only used for generating the profit curve. Do not use it as an input variable when developing a classifier; 2) TARGET_Adjusted is the target variable with 1 for Yes as the positive class and 0 for No as the negative class.
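The variable roles described above can be sketched in a few lines of Python. This is only an illustration of which columns feed the classifier and which are held back; the assignment itself is done in Rattle. The exact column names between Age and Hours are an assumption (the question only names the endpoints of that range), the ID column name is an assumption, and the two sample records are made up for the example.

```python
import csv
import io

# Column roles. Only Age, Hours, RISK_Adjustment and TARGET_Adjusted are named
# in the question; the ID name and the other input names are assumptions.
ID_COL = "ID"
INPUT_COLS = ["Age", "Employment", "Education", "Marital", "Occupation",
              "Income", "Gender", "Deductions", "Hours"]
RISK_COL = "RISK_Adjustment"      # profit curve only, never a classifier input
TARGET_COL = "TARGET_Adjusted"    # 1 = Yes (positive class), 0 = No


def split_roles(rows):
    """Split each record into classifier inputs, target label and risk amount."""
    inputs, targets, risks = [], [], []
    for row in rows:
        inputs.append({col: row[col] for col in INPUT_COLS})
        targets.append(int(row[TARGET_COL]))
        risks.append(float(row[RISK_COL]))
    return inputs, targets, risks


# Two made-up records, for illustration only (not taken from the real file).
SAMPLE_CSV = """
ID,Age,Employment,Education,Marital,Occupation,Income,Gender,Deductions,Hours,RISK_Adjustment,TARGET_Adjusted
1,38,Private,College,Unmarried,Service,81838.0,Female,0,72,0,0
2,35,Private,Associate,Absent,Transport,72099.0,Male,0,30,1250,1
""".strip()

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
inputs, targets, risks = split_roles(rows)
```

Keeping RISK_Adjustment in its own list makes it harder to leak it into the model by accident, while still leaving it available when the profit curve is built.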

You are asked to use this dataset to accomplish the following tasks:

1) Load the dataset into Rattle and impute the missing values of Employment and Occupation. Show in your report a screenshot of your variables (after imputation) under the Data tab in Rattle.

2) Choose a seed (note: show it in your report).

3) Find the BEST SVM model to predict the target variable, TARGET_Adjusted, by tuning the parameters using the validation set (of the default partition of 70/15/15) and the area under the ROC curve (AUC) as your evaluation criterion (noting that the higher the AUC score, the better the performance). Show your parameters and AUC scores in a table (no need to show the ROC plots).

4) Then set the partition to 85/0/15, and rebuild your SVM model using the parameters that give you the highest AUC score in Task 3. Show the ROC plot of this model on the testing dataset.

5) Finally, use the model in Task 4 and the testing dataset to generate a profit curve. Assume that the cost for auditing a return is $3000, and that the RISK_Adjustment column contains the amount recovered from a successful audit, which is used to estimate the benefit. Based on your profit curve, give the IRS a specific recommendation for what percentage of returns should be audited.
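The arithmetic behind the profit curve in Task 5 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Rattle workflow the assignment asks for: returns are audited in descending order of the model's predicted score, every audited return costs $3000, and the benefit of auditing a return is whatever is recorded for it in RISK_Adjustment. The function names and the five scores and amounts are made up for the example.

```python
AUDIT_COST = 3000.0  # cost of auditing one return, as stated in Task 5


def profit_curve(scores, adjustments, cost=AUDIT_COST):
    """Return (fraction_audited, cumulative_profit) points.

    Returns are audited from the highest model score to the lowest;
    each audit adds its recorded adjustment and subtracts the audit cost.
    """
    ranked = sorted(zip(scores, adjustments), key=lambda p: p[0], reverse=True)
    curve = [(0.0, 0.0)]  # auditing nothing yields zero profit
    profit = 0.0
    for k, (_, amount) in enumerate(ranked, start=1):
        profit += amount - cost
        curve.append((k / len(ranked), profit))
    return curve


def best_point(curve):
    """Pick the point with the highest cumulative profit (first one on ties)."""
    return max(curve, key=lambda p: p[1])


# Made-up test-set scores and RISK_Adjustment amounts, for illustration only.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
adjustments = [8000.0, 0.0, 5000.0, 0.0, 0.0]

curve = profit_curve(scores, adjustments)
fraction, profit = best_point(curve)
print(f"Audit the top {fraction:.0%} of returns for a profit of ${profit:,.0f}")
```

With these made-up numbers the curve peaks after the first audited return, so the recommendation would be to audit the top 20%. On the real testing dataset, the recommended percentage is read off the same way, at the maximum of the curve.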
