
Question

The task is to segment a string of characters into valid English words (by inserting spaces) in such a way that the resulting phrase makes the most sense. For example, consider the string:

"breakfastinside"

If we consider only valid English words, some of the segmentations include:

- "break fast in side"
- "breakfast in side"
- "break fast inside"

However, the most logical way to segment this phrase is "breakfast inside".

In natural language processing, one way to tell whether a phrase makes sense is to use an n-gram model. In this homework, we will use a bigram model (n = 2). Given two words w1 and w2, the bigram function b(w1, w2) gives you a score representing the likelihood that the word w2 appears after w1 in actual text. A lower score means that the likelihood is higher. For example, "red apple" will have a lower score than "dog apple". You might be wondering how a bigram function knows this likelihood. Well, one way is to learn these scores by scanning real-world newspapers, articles, etc.

Given a phrase with words [w1, w2, w3, w4], the total bigram score is computed as:

b(w0, w1) + b(w1, w2) + b(w2, w3) + b(w3, w4)

where w0 is a special token denoting the beginning of the phrase. The bigram score b(w0, w1) can be interpreted as the likelihood that w1 is the first word of a phrase. A lower total score indicates a more sensible phrase.

For this homework, assume that:

- The only valid English words are "break", "fast", "breakfast", "in", "side", and "inside". Again, assume that w0 is a special token denoting the beginning of the string.
- The bigram scores are as follows (the likelihood of the first word being immediately followed by the second word*): [the score table was given as an image and is not transcribed here]

*This is just a simplified example containing synthetic values to illustrate the intuition behind bigrams. In practice, bigram scores are usually log-likelihoods of actual word-order frequencies and contain a lot of decimal values.

Given the example above, the score of "breakfast inside" should be 3 + 3 = 6, while the score of "breakfast in side" should be 3 + 1 + 5 = 9.

1. Formulate the word segmentation task as a search problem. Make sure that your formulation works not only for the specific example above ("breakfastinside") but also for any other string (assuming you always have access to a dictionary of valid words and bigram scores).
   a. What information should a state contain?
   b. What is the initial state?
   c. Given a state, how do you get the list of actions?
   d. Given a state and an action, how do you compute the cost?
   e. What are the goal states?
2. Draw the entire search tree for the specific problem "breakfastinside" based on your formulation of the search problem. Note that since this is a short string, the search tree should not have too many nodes. If your tree has a depth of more than 7 or has more than 15 nodes, you probably need to improve your formulation of the problem. Clearly label the states, actions, action costs, and the goal state. Make sure the states and actions have descriptive labels.
3. Which algorithm (breadth-first search, depth-first search, or uniform-cost search) is most appropriate for finding a solution to this problem? Justify briefly. Again, think about the problem in general and not just the specific example above.
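
To make the bigram scoring above concrete, here is a minimal Python sketch; it is not part of the original question. The only scores it hard-codes are the ones recoverable from the worked example ("breakfast inside" = 3 + 3 and "breakfast in side" = 3 + 1 + 5); every other word pair falls back to a hypothetical default, since the full score table appeared only as an untranscribed image. The helper also enumerates every segmentation of "breakfastinside" over the six-word dictionary, which are exactly the candidate phrases a search over this problem would need to consider.

```python
# Sketch of the bigram scoring and segmentation enumeration described above.
# Only the scores visible in the worked example are real; every other pair
# uses a hypothetical DEFAULT_SCORE because the full table was an image.

START = "<w0>"  # special token marking the beginning of the phrase

WORDS = {"break", "fast", "breakfast", "in", "side", "inside"}

BIGRAM_SCORES = {
    (START, "breakfast"): 3,   # from "breakfast inside" = 3 + 3
    ("breakfast", "inside"): 3,
    ("breakfast", "in"): 1,    # from "breakfast in side" = 3 + 1 + 5
    ("in", "side"): 5,
}
DEFAULT_SCORE = 10  # hypothetical placeholder for pairs not shown in the text


def bigram(w1, w2):
    """b(w1, w2): a lower score means w2 is more likely to follow w1."""
    return BIGRAM_SCORES.get((w1, w2), DEFAULT_SCORE)


def phrase_score(words):
    """Total bigram score b(w0,w1) + b(w1,w2) + ... of a segmented phrase."""
    total, prev = 0, START
    for w in words:
        total += bigram(prev, w)
        prev = w
    return total


def segmentations(s):
    """Yield every way to split s into words from the dictionary WORDS."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        if s[:i] in WORDS:
            for rest in segmentations(s[i:]):
                yield [s[:i]] + rest


if __name__ == "__main__":
    for seg in segmentations("breakfastinside"):
        print(" ".join(seg), "->", phrase_score(seg))
    # "breakfast inside" scores 3 + 3 = 6 and "breakfast in side" scores
    # 3 + 1 + 5 = 9, matching the worked example; the other two segmentations
    # are scored with the hypothetical default for their unknown pairs.
```

Brute-force enumeration is fine at this scale; the point of parts 1-3 is to formulate the same minimization as a search problem so that an algorithm can find the lowest-scoring segmentation without listing every candidate up front.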

