| | A Mandarin Text-to-Speech System |
 | | In TA, a statistical model-based method is first employed to automatically tag the input text, yielding the word sequence and the associated part-of-speech (POS) sequence. |
 | | In PIG, a four-layer recurrent neural network (RNN) is employed to generate prosodic information, including the pitch contour, energy level, initial and final durations of each syllable, and the inter-syllable pause duration. |
 | | Lastly, in PSOLA, the basic waveform sequence is modified using the prosodic information to generate the output synthetic speech. |
| rocling.iis.sinica.edu.tw/CLCLP/Vol1-1/a3.htm (248 words) |
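The data flow between the three stages named above (TA, PIG, PSOLA) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the system's actual implementation: the function names, the `Prosody` fields, the placeholder tag `"X"`, the constant prosody values, the 16 kHz sample rate, and the simplification that each whitespace-separated token is one syllable. The stand-ins contain no statistical tagger, no RNN, and no real pitch-synchronous overlap-add; they only show what each stage consumes and produces.

```python
from dataclasses import dataclass
from typing import List, Tuple

SAMPLE_RATE = 16000  # assumed sample rate (Hz), not taken from the source


@dataclass
class Prosody:
    """Per-syllable prosodic information (hypothetical field layout)."""
    pitch_contour: List[float]  # placeholder pitch values in Hz
    energy: float               # placeholder energy level
    initial_ms: int             # duration of the syllable initial
    final_ms: int               # duration of the syllable final
    pause_ms: int               # pause before the next syllable


def text_analysis(text: str) -> List[Tuple[str, str]]:
    """TA stand-in: split on whitespace and tag every token 'X'.

    The described system uses a statistical tagger; this stub does not.
    """
    return [(token, "X") for token in text.split()]


def prosody_generation(tagged: List[Tuple[str, str]]) -> List[Prosody]:
    """PIG stand-in: return constant prosody instead of running an RNN."""
    return [Prosody([120.0, 110.0], 1.0, 50, 150, 20) for _ in tagged]


def psola_modify(prosody: List[Prosody]) -> List[float]:
    """PSOLA stand-in: emit silence whose length matches the target timing.

    A real PSOLA stage would reshape stored waveform units; this stub only
    honours the durations so the stage interface is visible.
    """
    waveform: List[float] = []
    for p in prosody:
        total_ms = p.initial_ms + p.final_ms + p.pause_ms
        waveform.extend([0.0] * (total_ms * SAMPLE_RATE // 1000))
    return waveform


def synthesize(text: str) -> List[float]:
    """Chain the three stages: text -> tags -> prosody -> samples."""
    tagged = text_analysis(text)
    prosody = prosody_generation(tagged)
    return psola_modify(prosody)
```

Calling `synthesize("ni hao")` runs two tokens through all three stubs and returns a list of samples whose length is set by the placeholder durations alone.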