Question

Project Description: Develop a spelling checker (i.e., a best-word predictor) using a 3-gram language model. Each student needs to collect an Arabic corpus of at least 1 million words. Students cannot share the same corpus with each other, fully or partially, and cannot reuse text from previous years. Tokenize the corpus into tokens/words, and then build a trigram language model for this corpus. The language model should contain each token, its count, and its probability (or log probability), and should be saved in a CSV file. Develop an interface that lets the user write text and then click a "spell" button. If the user writes "#" in the text, the program will suggest the top five words (and their probabilities) as replacements for the #, using the language model. Each student should submit his/her project via Moodle. The submission should include the source code, the corpus, and the language model. The project should be written in Java.

Example: Spell # 0.81 0.4 0.38 0.21 0.75
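As a starting point, the core of the task (tokenizing, counting trigrams, estimating probabilities, and ranking candidates for "#") might be sketched as below. This is a minimal sketch under stated assumptions, not the expected solution: the class name `TrigramModel` and its methods are hypothetical, probabilities are plain maximum-likelihood estimates count(w1 w2 w3) / count(w1 w2 *) with no smoothing, "#" is assumed to be separated from its neighbours by whitespace, and the CSV column layout is one possible reading of "token, count, probability". The graphical interface with the "spell" button is left out, and the sample sentence uses Latin placeholder words; the tokenizer keeps any Unicode letters and combining marks, so Arabic text should pass through it the same way.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TrigramModel {
    // Maps a two-word context "w1 w2" to the counts of each following word w3.
    // TreeMap keeps the CSV output in a stable, sorted order.
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();

    // Splits text on anything that is not a letter, combining mark, digit, or '#'.
    // Keeping \p{M} means Arabic diacritics stay attached to their words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{M}\\p{N}#]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Counts every trigram (w1, w2, w3) in the token sequence.
    void train(List<String> tokens) {
        for (int i = 0; i + 2 < tokens.size(); i++) {
            String context = tokens.get(i) + " " + tokens.get(i + 1);
            counts.computeIfAbsent(context, c -> new TreeMap<>())
                  .merge(tokens.get(i + 2), 1, Integer::sum);
        }
    }

    // Sum of all trigram counts that share the same two-word context.
    private static double total(Map<String, Integer> next) {
        double sum = 0;
        for (int c : next.values()) sum += c;
        return sum;
    }

    // Returns up to k candidate words for the first '#' in the text, ranked by
    // P(w3 | w1 w2) using the two tokens immediately before '#'.
    List<Map.Entry<String, Double>> suggest(String text, int k) {
        List<String> tokens = tokenize(text);
        int pos = tokens.indexOf("#");
        List<Map.Entry<String, Double>> result = new ArrayList<>();
        if (pos < 2) return result; // no '#' or fewer than two words of context
        Map<String, Integer> next = counts.get(tokens.get(pos - 2) + " " + tokens.get(pos - 1));
        if (next == null) return result; // context never seen in the corpus
        double sum = total(next);
        for (Map.Entry<String, Integer> e : next.entrySet()) {
            result.add(new AbstractMap.SimpleEntry<>(e.getKey(), e.getValue() / sum));
        }
        // Highest probability first; ties broken alphabetically for determinism.
        result.sort((x, y) -> {
            int c = Double.compare(y.getValue(), x.getValue());
            return c != 0 ? c : x.getKey().compareTo(y.getKey());
        });
        return new ArrayList<>(result.subList(0, Math.min(k, result.size())));
    }

    // One CSV line per trigram: "w1 w2 w3,count,probability" (assumed layout).
    List<String> toCsvLines() {
        List<String> lines = new ArrayList<>();
        lines.add("token,count,probability");
        for (Map.Entry<String, Map<String, Integer>> ctx : counts.entrySet()) {
            double sum = total(ctx.getValue());
            for (Map.Entry<String, Integer> e : ctx.getValue().entrySet()) {
                lines.add(ctx.getKey() + " " + e.getKey() + "," + e.getValue() + "," + (e.getValue() / sum));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        TrigramModel model = new TrigramModel();
        // Placeholder corpus; a real run would read the collected Arabic corpus from disk.
        model.train(tokenize("the boy went to school the boy went to town the boy went to school"));
        for (Map.Entry<String, Double> s : model.suggest("boy went to #", 5)) {
            System.out.println(s.getKey() + " " + s.getValue());
        }
    }
}
```

The lines returned by `toCsvLines()` could be written to disk with `java.nio.file.Files.write` using UTF-8, and a Swing or JavaFX front end could call `suggest` from the "spell" button's handler. With a corpus of a million words, many contexts typed by the user will never have been seen, so a fuller solution would probably want to fall back to bigram or unigram counts, or apply smoothing, and could store log probabilities to avoid very small numbers.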
