Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Draw an FST that will use as input pronunciation representations, one symbol at a time, and produce the corresponding words as output. The input alphabet

Draw an FST that will use as input pronunciation representations, one symbol at a time, and produce the corresponding words as output. The input alphabet should be the set of phonemes in the CMU dictionary page (i.e. ? = {AA, AE, AH, AO, AW, B, CH, D, DH, ...}), and the output alphabetshould be exactly ? = {language, languages, student, students}. (Pay close attention: the output alphabet is not the set of letters; it is those four symbols.) The output language should be ?*.

Part 2: Prolog FST

Consider the following FST, represented in Prolog. (You don't need to know Prolog to understand this way to represent FSTs.)

transition(1, iy, 2, eps). transition(2, z, 3, eps). transition(3, iy, 1, easy). transition(1, iy, 4, eps). transition(4, z, 5, eps). transition(5, iy, 6, eps). transition(6, g, 7, eps). transition(7, ow, 8, eps). transition(8, ih, 9, eps). transition(9, ng, 1, easygoing). initial(1). final(1).

Each line represents a transition from one state to the other. The first argument is the state from where the transition starts, the second argument is the input symbol for the transition, the third argument is the state where the transition goes, and the last argument is the output symbol. Epsilon is represented as eps, and the pronunciation symbols are lowercased, so that they are not interpreted as prolog variables (we will talk about this in class). The initial state is state 1, and the only final state is also state 1.

Now examine the file fst.pl, available in the HW1 folders under Files on Canvas. Try to understand how the prolog code runs a transducer expressed like the one above.

Don't panic! This will be explained in detail in class, but an outline is provided below.

Using the predicate fst/2 (the slash 2 means that go takes two arguments), you can specify an input string and an output string. With fst(T, [easy]), prolog will find that T = [iy, z iy]. With fst([iy, z, iy], W), prolog will find that W = ["easy"]. This way, the transducer can be run in either direction.

fst(Input, Output) :- initial(State), go(State, Input, Output).

fst/2 first finds a state State that makes initial(State) true. In the example above, the knowledge base (the specification of the transducer) includes initial(1). so prolog finds that State = 1 works, and proceeds to try go(1, Input, Output), where Input and Output are exactly the same as in how go appeared in the query. If the query is fst([iy, z, iy], W), prolog tries go(1, [iy, z, iy], W), which we can interpret as: the current state is 1, the input symbols are [iy, z, iy], and the output is W (which is a variable). Prolog will then try to find a value of W that makes this true.

go(CurrentState, [A|InString], OutString) :- transition(CurrentState, A, NextState, eps), A \= eps, go(NextState, InString, OutString).

go/3 works by first finding a suitable transition from the knowledge base. Notice that the second argument is in the form [A|InString]. That means that we expect the second argument to be a list, and the first item in the list will go in the variable A, and the rest of the list will be in the variable InString. In our example above, go(1, [iy, z, iy], W), A will be iy (the first item in the list), and InString will be [z, iy] (the rest of the list). We then try transition(1, iy, NextState, eps), and we find that our knowledge base does include two transitions that match; one of them makes NextState = 2, and the other makes NextState = 4. Prolog will try them both, and see if one of them works out. A \= eps is true if A is not eps, which means that we don't want an epsilon as the input symbol at this point. Finally, go(NextState, InString, OutString) is the recursive call that continues the transduction. We move to the next state, with what's left of the input and output.

Notice that there are four versions of go/3 in the code. One has no epsilon in the input or output. One has epsilon in the input, but not in the output. One has epsilon in the output, but not in the input. The difference in each of these is what happens to the input and output lists. Your task is to add support for transitions with epsilon both in the input and output. In other words, add the code to handle transitions like transition(10, eps, 1, eps) (which cannot be handled by the code provided). You don't need to be a prolog programmer to do this.

Part 3: Beach wrecking

In this last part of the assignment you will download both fst.pl and cmudict.pl (a prolog FST built from the CMU pronunciation dictionary, with phonemes as the input symbols, and words as the output symbols, as in the example in part 2). Save fst.pl and cmudict.pl in the same directory. Open SWI-prolog and change the current directory to the directory where you saved the files. For example:

?- cd('/Users/sagae/Desktop/LIN177').

Your directory will be different. Find out what it is (if you have no idea, ask for help!), and replace it in the query above.

If that succeeds, you can now consult the files fst.pl and cmudict.pl. Consulting a file is how prolog load the contents of the file so that it can be used. Consult (load) the files like this:

?- [fst]. ?- [cmudict].

(Or, instead of using the commands above, you can just consult fst.pl and cmudict.pl by using the menu bar and choosing File -> Consult.)

Note the following: (1) you always need the period at the end; (2) do not type the "?-" part, that's only there so that you know that this is meant to be typed in the interpreter window (the one that shows you the ?- prompt).

Then try the following:

?- fst(T, [student]).

That should give you T = [s, t, uw, d, ah, n, t]. You used the transducer to find the pronunciation of the word student.

Now try

?- fst([s, t, uw, d, ah, n, t], W).

You should get W = ["student"]. This time, you used the transducer to find which word has the pronunciation S T UW D AH N T. In Prolog, we can have lists enclosed in square brackets, like the list of symbols above.

Now try

?- fst(T, [ice, cream]).

Note the value of T, which is a representation of the pronunciation of ice cream. Use this pronunciation as the input:

?- fst(enter the input here, W).

(Don't type "enter the input here"; you need to get the actual list from the previous query.)

The FST will give you the words that correspond to the pronunciation provided.

Once you get a result, press the ";" key. That will give you additional results, if there are any. Keep pressing ";" until you reach the end ("fail", which doesn't means it failed in general, it just means that you ran out of results after succeeding several times).

What values of W did you get?

Finally, write a predicate soundslike/2 that will take two arguments. You can add it to the end of fst.pl, or create a new file. Your predicate will start with:

soundslike(X, Y) :-

and enter your code after that. The soundslike/2 (remember, the slash 2 just means that soundslike takes two arguments) predicate will be true if each of the two arguments is a list of English words, and each of the lists have the same pronunciation. For example, if you enter

?- soundslike([ice, cream], W).

you should get the same values of W as in the ice cream example above. That is, each value of W will be a list of words that together sound like "ice cream".

Include your soundslike/2 code in your submission (just paste it into your document).

Finally, with your working soundslike/2, try:

?- soundslike([computational, linguistics], A).

What happens?

(And that's only part of why it's hard to wreck a nice beach!)

Submission

Submit a single PDF through canvas. Your PDF must include at least:

The FST from part 1 (a picture is fine);

Your code for part 2 (only your code is necessary, not the entire fst.pl code, unless you change it);

The values of W for part 3;

Your soundslike/2 code for part 3;

Your observations after running soundslike/2 with [computational, linguistics].

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Concepts

Authors: David Kroenke

4th Edition

0136086535, 9780136086536

More Books

Students also viewed these Databases questions