Question
Many thousands of audits of taxpayers' tax returns might be performed each year. The outcome
of an audit may be productive, in which case an adjustment to the information supplied was
required, usually resulting in a change (an increase or a decrease) to the amount of tax that
the taxpayer is liable to pay. An unproductive audit is one for which no adjustment was required
after reviewing the taxpayer's affairs. The audit dataset attempts to simulate this scenario. It is
supplied as a CSV file. The dataset consists of fictional taxpayers who have been audited
for tax compliance. For each case, the outcome of the audit is recorded in TARGETAdjusted,
i.e., whether the financial claims had to be adjusted or not. The actual dollar amount of any
adjustment that resulted is also recorded in RISKAdjustment, noting that adjustments can go
in either direction. The first variable in the audit dataset is a unique client ID, followed by
the input variables from Age to Hours.
Please NOTE: RISKAdjustment is used only for generating the profit curve. Do not use it as
an input variable when developing a classifier. TARGETAdjusted is the target variable, with
Yes as the positive class and No as the negative class.
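The column roles described in the note above can be illustrated with a small Python sketch. This is not part of the Rattle workflow the question asks for. The rows are made up, "ID" is an assumed name for the client identifier column, and the 1/0 coding of the target is also an assumption.

```python
import csv
import io

# Made-up rows for illustration only. Column names follow the question text,
# except "ID", which is an assumed name for the client identifier.
# The 1/0 coding of TARGETAdjusted (1 = Yes, 0 = No) is an assumption.
SAMPLE_CSV = """ID,Age,Hours,RISKAdjustment,TARGETAdjusted
1001,38,40,0,0
1002,45,50,1200,1
"""

# Columns that must never be fed to the classifier as inputs.
NON_INPUT_COLUMNS = {"ID", "RISKAdjustment", "TARGETAdjusted"}


def split_roles(row):
    """Split one CSV record into (inputs, target, risk)."""
    inputs = {k: v for k, v in row.items() if k not in NON_INPUT_COLUMNS}
    target = int(row["TARGETAdjusted"])   # class label to predict
    risk = float(row["RISKAdjustment"])   # used only for the profit curve
    return inputs, target, risk


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
roles = [split_roles(r) for r in rows]
```

Keeping the adjustment amount out of the inputs matters because it is only known after an audit has happened, so a classifier that sees it would be using information unavailable at prediction time.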
You are asked to use this dataset to accomplish the following tasks:
1. Load the dataset into Rattle and impute the missing values of Employment and
   Occupation. Show in your report a screenshot of your variables after imputation under
   the Data tab in Rattle.
2. Choose a seed (note: show it in your report).
3. Find the BEST SVM model to predict the target variable, TARGETAdjusted, by tuning
   the parameters, using the validation set of the default partition and the area under
   the ROC curve (AUC) as your evaluation criterion, noting that the higher the AUC
   score, the better the performance. Show your parameters and AUC scores in a table
   (no need to show the ROC plots).
4. Then set the partition to [...] and rebuild your SVM model using the parameters
   that gave the highest AUC score in the tuning task above. Show the ROC plot of this
   model on the testing dataset.
5. Finally, use the rebuilt model and the testing dataset to generate a profit curve. Assume
   that the cost for auditing a return is $[...] and that the RISKAdjustment column
   contains the amount recovered from a successful audit, which is used to estimate the
   benefit. Based on your profit curve, give the IRS a specific recommendation for what
   percentage of returns should be audited.
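
The two calculations behind the tuning and profit-curve tasks can be sketched in plain Python. This is only an illustration of the arithmetic, not the Rattle procedure the question requires. All labels, scores, recovered amounts, the setting names, and the audit cost of 100 are made-up assumptions (the actual cost is not given above), and the sketch treats the recovered amount of an unproductive audit as zero.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    Over all (positive, negative) pairs, counts how often the positive
    case has the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def profit_curve(labels, scores, recovered, cost_per_audit):
    """Cumulative profit when auditing cases in order of decreasing score.

    Returns (fraction_audited, profit) points. Every audit costs
    cost_per_audit; only a productive audit (label 1) recovers its amount.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points, profit = [(0.0, 0.0)], 0.0
    for rank, i in enumerate(order, start=1):
        gain = recovered[i] if labels[i] == 1 else 0.0
        profit += gain - cost_per_audit
        points.append((rank / len(order), profit))
    return points


# Toy validation data: scores from two hypothetical parameter settings.
val_labels = [1, 0, 1, 0, 0, 1]
scores_by_setting = {
    "setting_a": [0.9, 0.8, 0.7, 0.3, 0.2, 0.6],
    "setting_b": [0.9, 0.75, 0.8, 0.3, 0.2, 0.7],
}
auc_table = {name: auc(val_labels, s) for name, s in scores_by_setting.items()}
best_setting = max(auc_table, key=auc_table.get)

# Toy test data for the profit curve; the cost of 100 is an assumed value.
test_labels = [1, 0, 1, 0]
test_scores = [0.9, 0.6, 0.8, 0.1]
test_recovered = [500.0, 0.0, 300.0, 0.0]
points = profit_curve(test_labels, test_scores, test_recovered, 100.0)
best_fraction, best_profit = max(points, key=lambda p: p[1])
```

With these toy numbers the profit peaks when the top-scoring half of the cases is audited. On real data, the recommended audit percentage would be read off the peak of the curve in the same way.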