Question
Please code this in Java and explain how to use the program: show where and how the fit data and test data are supposed to go in the script. I need help, thank you.
The assignment is to prepare the datasets in a format required to be used with Weka. You need to convert the files into the ARFF format described to build and evaluate classification models:
SET: For Prediction
fit.arff file to build the prediction models. You only need to reformat the original fit file to the ARFF format without any changes and add the required labels.
test.arff file to evaluate the prediction models. Same as above.
SET: For Classification
fit.arff file to build classification models. You need to add a column describing the class of each module: fault-prone (fp) or not fault-prone (nfp).
Fault proneness is based on a threshold on the number of faults. In this assignment, modules with fewer faults than the threshold are considered nfp, and modules with that many faults or more are considered fp.
Make sure you do not use the number of faults column as an independent variable while doing classification.
test.arff file to evaluate the classification models. Same as above.
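The labelling rule above can be sketched in plain Java as follows. This is only a rough illustration, not the required solution: the class name `ClassificationRow`, the attribute name `CLASS`, and the `threshold` parameter are assumptions (the threshold value is not given here, so it is passed in), and it assumes each raw row is whitespace- or comma-separated with the number of faults as the last field.

```java
import java.util.Arrays;

/** Sketch: turn one raw row into an ARFF data row for classification. */
class ClassificationRow {
    // Hypothetical attribute declaration to use in place of the FAULTS attribute.
    static final String CLASS_ATTRIBUTE = "@attribute CLASS {nfp,fp}";

    /** nfp when faults is below the threshold, fp otherwise. */
    static String label(double faults, double threshold) {
        return faults < threshold ? "nfp" : "fp";
    }

    /**
     * Drops the last field (number of faults) so it cannot be used as an
     * independent variable, and appends the class label instead.
     */
    static String toDataRow(String rawLine, double threshold) {
        String[] fields = rawLine.trim().split("[\\s,]+");
        if (fields.length < 2) {
            throw new IllegalArgumentException("Row too short: " + rawLine);
        }
        double faults = Double.parseDouble(fields[fields.length - 1]);
        String[] independent = Arrays.copyOf(fields, fields.length - 1);
        return String.join(",", independent) + "," + label(faults, threshold);
    }
}
```

Calling `toDataRow` on every line of the original fit file and of the original test file, with the threshold from the assignment, gives the `@data` section of the two classification files.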
Please make sure you label the data correctly and comment the ARFF file (instances, attributes, date, author, ...).
The columns of the original data file are described below, in the same order as they appear in the file:
Number of unique operators (NUMUORS)
Number of unique operands (NUMUANDS)
Total number of operators (TOTOTORS)
Total number of operands (TOTOPANDS)
McCabe's cyclomatic complexity (VG)
Number of logical operators (NLOGIC)
Lines of code (LOC)
Executable lines of code (ELOC)
Number of faults (FAULTS)
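For the prediction set, the conversion could look roughly like the sketch below. It assumes the original file has no header line and that the nine values on each row are separated by whitespace or commas; the class name `PredictionArff`, the relation name, and the wording of the comment line are illustrative choices, not part of the assignment.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

/** Sketch: wrap raw numeric rows in an ARFF header for the prediction set. */
class PredictionArff {
    // Attribute names in the column order given in the assignment.
    static final String[] COLUMNS = {
        "NUMUORS", "NUMUANDS", "TOTOTORS", "TOTOPANDS",
        "VG", "NLOGIC", "LOC", "ELOC", "FAULTS"
    };

    /** Builds the whole ARFF text from non-blank raw rows. */
    static String toArff(String relationName, List<String> rawLines) {
        StringBuilder sb = new StringBuilder();
        // ARFF comment lines start with '%'; add date and author lines the same way.
        sb.append("% ").append(rawLines.size()).append(" instances, ")
          .append(COLUMNS.length).append(" attributes\n");
        sb.append("@relation ").append(relationName).append("\n\n");
        for (String column : COLUMNS) {
            sb.append("@attribute ").append(column).append(" numeric\n");
        }
        sb.append("\n@data\n");
        for (String line : rawLines) {
            String[] fields = line.trim().split("[\\s,]+");
            if (fields.length != COLUMNS.length) {
                throw new IllegalArgumentException("Expected 9 values: " + line);
            }
            sb.append(String.join(",", fields)).append("\n");
        }
        return sb.toString();
    }

    /** args[0] = original data file, args[1] = ARFF file to write. */
    public static void main(String[] args) throws IOException {
        List<String> rows = Files.readAllLines(Paths.get(args[0])).stream()
            .filter(line -> !line.trim().isEmpty())
            .collect(Collectors.toList());
        String arff = toArff("fit", rows);
        Files.write(Paths.get(args[1]), arff.getBytes(StandardCharsets.UTF_8));
    }
}
```

Run it once per file, giving the original data file as the first argument and the name of the ARFF file to create (for example `fit.arff` or `test.arff`) as the second.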
A sample of an ARFF file for classification is available here.
Modeling assignment: Prediction
No late submissions are accepted.
This part of the project will build models to predict the number of faults based on the other attributes of the instances. Each model is to be first built and evaluated using k-fold cross-validation on the fit data set, and then validated using the test data set. Use the data sets prepared for prediction in the previous assignment of the project.
Build the following prediction models:
Linear Regression
Decision Stump
For linear regression, compare the model selection methods: greedy, M, no selection. Compare the models: how many and which independent variables were selected by each? Use the statistical indicators provided by Weka to perform the comparisons.
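One possible way to run these models from Java is sketched below. It needs `weka.jar` on the classpath and assumes that the number of faults is the last attribute in both ARFF files. The class name `PredictionModels` and the command-line argument order are assumptions, and the number of folds is passed as an argument rather than fixed. Weka's `LinearRegression` offers three attribute selection settings (`SELECTION_GREEDY`, `SELECTION_M5`, `SELECTION_NONE`), which presumably correspond to the three methods to compare.

```java
import java.util.Random;

import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.LinearRegression;
import weka.classifiers.trees.DecisionStump;
import weka.core.Instances;
import weka.core.SelectedTag;
import weka.core.converters.ConverterUtils.DataSource;

/** Sketch: cross-validate on the fit data, then validate on the test data. */
class PredictionModels {

    /** Linear regression with one of Weka's attribute selection settings. */
    static LinearRegression linearRegression(int selectionMethod) {
        LinearRegression lr = new LinearRegression();
        lr.setAttributeSelectionMethod(
            new SelectedTag(selectionMethod, LinearRegression.TAGS_SELECTION));
        return lr;
    }

    static void evaluate(String name, Classifier model, Instances fit,
                         Instances test, int folds) throws Exception {
        // Cross-validation on the fit data (fixed seed so runs are repeatable).
        Evaluation cv = new Evaluation(fit);
        cv.crossValidateModel(model, fit, folds, new Random(1));
        System.out.println(cv.toSummaryString("=== " + name + ": cross-validation ===", false));

        // Train on all fit data, print the model, then evaluate on the test data.
        model.buildClassifier(fit);
        System.out.println(model);
        Evaluation onTest = new Evaluation(fit);
        onTest.evaluateModel(model, test);
        System.out.println(onTest.toSummaryString("=== " + name + ": test set ===", false));
    }

    /** args[0] = fit.arff, args[1] = test.arff, args[2] = number of folds. */
    public static void main(String[] args) throws Exception {
        Instances fit = DataSource.read(args[0]);
        Instances test = DataSource.read(args[1]);
        int folds = Integer.parseInt(args[2]);
        // Assumption: the number of faults is the last attribute.
        fit.setClassIndex(fit.numAttributes() - 1);
        test.setClassIndex(test.numAttributes() - 1);

        evaluate("LinearRegression (greedy)",
            linearRegression(LinearRegression.SELECTION_GREEDY), fit, test, folds);
        evaluate("LinearRegression (M5)",
            linearRegression(LinearRegression.SELECTION_M5), fit, test, folds);
        evaluate("LinearRegression (no selection)",
            linearRegression(LinearRegression.SELECTION_NONE), fit, test, folds);
        evaluate("DecisionStump", new DecisionStump(), fit, test, folds);
    }
}
```

The fit and test data go in as the first and second command-line arguments. With `weka.jar` in the current directory, something like `javac -cp weka.jar PredictionModels.java` followed by `java -cp weka.jar:. PredictionModels fit.arff test.arff <folds>` should work (use `;` instead of `:` on Windows). The printed linear regression model lists the attributes that were kept, which shows how many and which independent variables each selection method chose.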
Your report should include all the results based on k-fold cross-validation and on the test data set. You should also compare the results of all the methods.
This assignment is in Java, using Weka, the open-source data mining program from the University of Waikato, New Zealand.