Question
Part 2: Aspire
You may have noticed that the t sound is a little different in the words top and stop, and the p sound is also a little different in pit and spit. This is because of aspiration. In top and pit, there is a stronger burst of breath that comes out with the first consonant sound.
A (simplistic) phonological rule for aspiration in English goes as follows:
Voiceless plosives are aspirated when they occur immediately before a stressed vowel, and there is no [s] immediately preceding the voiceless plosive.
For example, we get aspiration in pot [pʰɑt], but not in spot [spɑt].
Let's now consider a version of the CMU pronunciation dictionary that has stress marked in the transcriptions. For example, the word potato looks like
P AH T EY1 T OW
where the 1 appended to EY means that the vowel is stressed. Every vowel can be unstressed, e.g. AA, EY, OW, or stressed, e.g. AA1, EY1, OW1. (This is a slight simplification of how stress is represented in the standard CMU pronunciation dictionary.)
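As a small illustration of this notation, the sketch below splits a transcription into symbols and picks out the stressed vowels. It assumes the simplified convention described above, where a trailing 1 marks stress. The function names parse_transcription and is_stressed_vowel are made up for this example.

```python
def parse_transcription(line):
    """Split a space-separated transcription into a list of symbols."""
    return line.split()


def is_stressed_vowel(symbol):
    """In the simplified scheme, only stressed vowels end in '1'."""
    return symbol.endswith("1")


symbols = parse_transcription("P AH T EY1 T OW")
stressed = [s for s in symbols if is_stressed_vowel(s)]
print(symbols)   # ['P', 'AH', 'T', 'EY1', 'T', 'OW']
print(stressed)  # ['EY1']
```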
Create an FST that adds aspiration to transcriptions according to the rule above. The input alphabet is the set of symbols in the ARPABET plus the stressed vowels ending in 1. The output alphabet is the same, plus additional symbols as follows: each symbol that corresponds to a voiceless plosive gets a duplicate with _h appended to it. For example, the symbol T is in the output alphabet, and in addition, we also have the new symbol T_h in the output alphabet.
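One way to picture the two alphabets is to build them as plain lists. The sketch below uses the 39 stress-free ARPABET phonemes of the CMU dictionary and takes P, T, and K as the voiceless plosives (the usual classification of English /p/, /t/, /k/). The variable names are illustrative, not part of the assignment.

```python
# The 39 ARPABET phonemes used by the CMU dictionary, without stress marks.
VOWELS = ["AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
          "IH", "IY", "OW", "OY", "UH", "UW"]
CONSONANTS = ["B", "CH", "D", "DH", "F", "G", "HH", "JH", "K", "L", "M",
              "N", "NG", "P", "R", "S", "SH", "T", "TH", "V", "W", "Y",
              "Z", "ZH"]

# Assumption: the voiceless plosives are P, T, K.
VOICELESS_PLOSIVES = ["P", "T", "K"]

# Input: every symbol, plus a stressed copy of each vowel.
input_alphabet = VOWELS + [v + "1" for v in VOWELS] + CONSONANTS

# Output: the input alphabet plus one aspirated copy of each voiceless plosive.
output_alphabet = input_alphabet + [c + "_h" for c in VOICELESS_PLOSIVES]

print(len(input_alphabet), len(output_alphabet))  # 54 57
```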
Step 1: Find the CMU dictionary ARPABET symbols that correspond to voiceless plosives (consult the Ojeda book if needed, or the IPA chart).
Step 2: Consider now the expanded output alphabet based on step 1.
Step 3: Create a transducer that implements the phonological rule above and produces output with the new symbols that reflect aspiration when appropriate. The input is the transcription as produced by a transducer like cmudict.pl from HW 1, but with the additional symbols to denote stress (e.g. ay1, ow1, ey1), as shown above. The output should be the transcription with aspiration.
To create the transducer in Prolog format, we make all of the symbols lowercase. For example, if our input is
[p, ah, t, ey1, t, ow]
the output should be
[p, ah, t_h, ey1, t, ow]
where the t_h is the aspirated allophone of t.
Notice that the first t is aspirated, but the p and second t are not.
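The expected output can be checked against a direct, non-FST reading of the rule. The sketch below is only a reference check, not the transducer itself: it looks one symbol back and one symbol ahead for each voiceless plosive. The function name aspirate and the assumption that p, t, k are the voiceless plosives are introduced here for illustration.

```python
# Assumption: the voiceless plosives are p, t, k (lowercase, as in the Prolog format).
VOICELESS_PLOSIVES = {"p", "t", "k"}


def aspirate(symbols):
    """Apply the aspiration rule by inspecting each symbol's neighbours."""
    out = []
    for i, sym in enumerate(symbols):
        # The next symbol must be a stressed vowel (it ends in '1').
        before_stressed = i + 1 < len(symbols) and symbols[i + 1].endswith("1")
        # The previous symbol must not be s.
        after_s = i > 0 and symbols[i - 1] == "s"
        if sym in VOICELESS_PLOSIVES and before_stressed and not after_s:
            out.append(sym + "_h")
        else:
            out.append(sym)
    return out


print(aspirate(["p", "ah", "t", "ey1", "t", "ow"]))
# ['p', 'ah', 't_h', 'ey1', 't', 'ow']
```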
(It is fine if your transducer maps the input string to itself, in addition to the aspirated version. However, it is not fine if your transducer creates aspirated plosives where it is not appropriate according to the rule above.)
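Because the rule depends on the symbol that follows the plosive, a deterministic transducer has to delay its output by one symbol. The following Python simulation is a minimal sketch of one such state design. It is not written in the Prolog FST format from HW 1, which is not shown here, and the state names and the function run_fst are hypothetical.

```python
# Assumption: the voiceless plosives are p, t, k.
PLOSIVES = {"p", "t", "k"}


def run_fst(symbols):
    """Simulate a deterministic transducer with one symbol of delayed output.

    States:
      "start"      previous symbol is not s (or there is none)
      "after_s"    previous symbol is s
      ("hold", x)  plosive x was read but not yet written, because the
                   next symbol decides whether it becomes x or x_h
    """
    state, out = "start", []
    for sym in symbols:
        if isinstance(state, tuple):
            held = state[1]
            if sym.endswith("1"):
                # Stressed vowel follows: write the aspirated plosive.
                out += [held + "_h", sym]
                state = "start"
                continue
            # Anything else follows: write the plain plosive, then
            # handle the current symbol as if coming from "start".
            out.append(held)
            state = "start"
        if sym in PLOSIVES and state == "start":
            state = ("hold", sym)  # delay output until the next symbol
        else:
            out.append(sym)  # includes plosives directly after s
            state = "after_s" if sym == "s" else "start"
    if isinstance(state, tuple):
        out.append(state[1])  # flush a plosive held at the end of input
    return out


print(run_fst(["p", "ah", "t", "ey1", "t", "ow"]))
# ['p', 'ah', 't_h', 'ey1', 't', 'ow']
```

A nondeterministic design is also possible: the machine guesses at each plosive whether to aspirate it and rejects the path if the next symbol does not match the guess. The delayed-output version above avoids that guessing at the cost of one hold state per plosive.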