Answered step by step

Verified Expert Solution

Link Copied!

Question

1 Approved Answer

Posted on Sep 06, 2024

Machine Learning - Project Objective. In this project you will have to: Design Implement Evaluate and Optimize a machine learning model to be applied for

Machine Learning $-$ Project

Objective. In this project you will have to:

Design

Implement

Evaluate and Optimize

a machine learning model to be applied for Sentiment Classification.

Learning Outcomes. Through this project, the student will get familiar with the experimental

evaluation of machine learning models, pre $-$ processing of data, writing a short technical report,

selecting parameters of a model, combining classifiers, use of appropriate libraries $($ RapidMiner $,$

Python $),$ and dealing with a real $-$ world machine learning problem.

Teams. This project can be done individually or in teams of two or three members.

Data. You can find the data uploaded to moodle. The file includes a set of documents marked with

the following labels: positive, neutral, negative. The data are in raw format $($ text $)$ so

you have to convert them to the form that the algorithms can process. This is part of the

pre $-$ processing step.

Goal. You have to train, optimize and evaluate the models on these data in order to get the best

possible predictive performance in new, unknown data.

Experimenting and Selecting your model. You are provided with a file named train.csv $.$ Use this

to select the best model for this type of data in your opinion. For that, you need to experiment with

multiple algorithms $($ that we have seen in class $)$ and parameters $($ as we have seen in class $)$ for each

algorithm and with a proper evaluation process $($ train $-$ test, cross $-$ validation, or whatever you think is

best $) .$ This experimentation should be presented in your Technical Report.

Preparing your model for evaluation. After you decide on the best model, you need to prepare it to

classify some new, unknown data. The new data will not be provided to you. The instructors should

be able to use your model in order to classify new data. The instructors will have a test.csv file which

will have an identical format as the train.csv file, but with no labels $($ that column will be missing $) .$

Your model $($ process $)$ should load that data, classify and store the decision in a

predictions.txt file. One prediction per line.

The predictions.txt file should look like this:The instructors, having:

a $)$ The predictions.txt of your model.

b $)$ The actual $($ correct $)$ labels of the new data $($ that you don't have $) .$

Will calculate the accuracy of your approach.

Based on the accuracy, your approach will be ranked in comparison to the other teams in the course.

This ranking will affect one part of your project grade $($ see below $) .$

Deliverables

For this project, you must submit on Moodle $($ only one member of the team $)$ :

Your trained model.

a $.$ RapidMiner

i $.$ The $.$ rmp file of your process $($ train $_$ test.rmp $) .$ This process will be

used by the evaluators. The process should load the train.csv $.$

Instructors should be able to use this process to load a

test.cv $($ with the

unknown data, as described above $)$ and generate your

predictions.txt file.

ii $.$ The $.$ rmp file of the process you used for evaluating the model.

$($ experiments $.$ rmp $) .$ If you have more than one process that you used

for your experiments you can upload multiple files $($ experiments $1 . r r m p,$

experiments $2 .$ rmp $,$ etc $) .$

b $.$ Python $-$ you should provide the files with all your code. One of your files should be

named

main.py $.$ In this file, you should have a function that will be named

train $_$ test $($ train $.$ csv $,$ test.csv $)$ taking as parameters the

train.csv file $($ that is provided to you $)$ and the test.csv file $($ that is not

provided to you $),$ so that the evaluators can easily run it and get results $($ i $.$ e $.$ the

predictions.txt file $) .$

c $.$ RapidMiner and Python:

If you think it is a good idea to add a file with instructions on how to run

your process $/$ code to the instructors please add a readme.txt file to your

submission.

Obviously, apart from the files above that are required for the final

evaluation, you will have to write code or create processes that will help you

identify what is the best model. The code $($ or the processes $)$ of these

experiments should be submitted separately and named

experiments.py or experiments.rmp $)$

A technical report

Describe in $4$ pages how you have decided to use the model you have selected. What

experiments you did, which models you tried out, your observations, how you did

the evaluation, why you selected that particular setting in the end and whatever else

you think is necessary. You are advised to professionally present your results with

tables and plots.

Marks $-$ Overview: Total: $100$ points, out of which:

A $. 14$ points: pre $-$ processing

B $. 50$ points: model evaluation $($ experiments $,$ tuning, argumentation, links with theory, etc. $)$

C $. 20$ points: quality of technical report $($ presentation of results, plots, tables, etc. $)$

D $. 16$ points: your rank

Step by Step Solution

There are 3 Steps involved in it

Step: 1

Get Instant Access with AI-Powered Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

Step: 3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Students also viewed these Databases questions

Question

__________ is a number of currency units, shares, bushels, pounds, or other units specified in a contract. a. Premium b. Margin c. Notional amount d. Collateral amount e. Settlement amount

Answered: 1 week ago

Question

★★★★★

Several years ago, Blaha Company purchased Husker Company as a subsidiary. At that time, Blaha Company recorded goodwill of $100,000 related to the purchase. Since that time, the company has not...

Answered: 1 week ago

Question

★★★★★

Custom Cabinetry has one job in process (Job 120) as of June 30: at that time, its job cost sheet reports direct materials of $7.600. direct labor of $2.900, and applied overhead of $2.610. Custom...

Answered: 1 week ago

Question

★★★★★

Identifying financing, investing, and operating transactions Required For a company like Canadian Tire Corporation, provide two examples of transactions that you would classify as financing,...

Answered: 1 week ago

Question

★★★★★

What are the different journals used in accounting and why use different journals?

Answered: 1 week ago

Question

★★★★★

Despite the three strategies the middle class used when their wages went flat, the middle class eventully started to shrink. What are three reasons why? (the Vicious Cycle) Your reasons will be the...

Answered: 1 week ago

Question

★★★★★

Write 2-3 paragraphs (300 words) We have spent our time together exploring the nuts and bolts of the Human Resources processes and their value to our organizations and to health care leaders. As we...

Answered: 1 week ago

Question

★★★★★

Bongo has a long term contract (it can be regarded as indefinite) that they must fulfil to provide swimsuits for the local swimming team. The swimsuits are "one size fits all". The year can be split...

Answered: 1 week ago

Question

★★★★★

Research another country's elder care system and address beliefs, spiritual issues, and cultural issues that are also factors in the delivery and provision of health care to this population.

Answered: 1 week ago

Question

★★★★★

A Drive - In Restaurant is considering a new marketing idea. At this restaurant, diners have the choice of dining inside or staying in their cars to eat. To serve those dining in their cars, carhops...

Answered: 1 week ago

Previous Question Next Question