Type: Journal article
Title: A hybrid model for text-to-speech synthesis
Author: Violaro, F
Boeffard, O
Abstract: This paper describes a hybrid model developed for high-quality, concatenation-based text-to-speech synthesis. The speech signal undergoes pitch-synchronous analysis and is decomposed into a harmonic component, with a variable maximum frequency, plus a noise component. The harmonic component is modeled as a sum of sinusoids whose frequencies are multiples of the pitch frequency. The noise component is modeled as a random excitation applied to an LPC filter. In unvoiced segments, the harmonic component is set to zero. In the presence of pitch modifications, a new set of harmonic parameters is evaluated by resampling the spectral envelope at the new harmonic frequencies. For the synthesis of the harmonic component in the presence of duration and/or pitch modifications, a phase correction is introduced into the harmonic parameters. The sinusoidal synthesis model is used for the harmonic component, and the LPC model combined with an overlap-add procedure is used for the noise synthesis. This hybrid model enables independent and continuous control of the duration and pitch of the synthesized speech. Comparative evaluation tests made in a text-to-speech environment have shown that the hybrid model performs better than the time-domain pitch-synchronous overlap-add (TD-PSOLA) model.
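Illustrative sketch (not from the paper): the abstract's three building blocks can be approximated in a few lines of plain Python. Everything here is an assumption made for illustration: the function names, the linear interpolation used to resample the envelope, the first-order LPC filter, and the toy parameter values. The paper's phase correction, pitch-synchronous analysis, and overlap-add steps are not reproduced.

```python
import math
import random


def resample_envelope(freqs, amps, new_f0, max_freq):
    """Sample the amplitude envelope at multiples of new_f0.

    Assumption: linear interpolation between the original harmonics;
    the paper's actual envelope estimate may differ.
    """
    new_freqs, new_amps = [], []
    k = 1
    while k * new_f0 <= max_freq:
        f = k * new_f0
        if f <= freqs[0]:
            a = amps[0]
        elif f >= freqs[-1]:
            a = amps[-1]
        else:
            i = 1
            while freqs[i] < f:  # find the bracketing original harmonics
                i += 1
            t = (f - freqs[i - 1]) / (freqs[i] - freqs[i - 1])
            a = (1 - t) * amps[i - 1] + t * amps[i]
        new_freqs.append(f)
        new_amps.append(a)
        k += 1
    return new_freqs, new_amps


def synth_harmonic(freqs, amps, phases, n_samples, fs):
    """Harmonic component: a sum of sinusoids at the given frequencies."""
    return [
        sum(a * math.cos(2 * math.pi * f * n / fs + p)
            for f, a, p in zip(freqs, amps, phases))
        for n in range(n_samples)
    ]


def synth_noise(lpc, gain, n_samples, rng):
    """Noise component: white Gaussian excitation through the all-pole
    filter 1/A(z), with A(z) = 1 + lpc[0] z^-1 + lpc[1] z^-2 + ...
    """
    out = []
    for n in range(n_samples):
        y = gain * rng.gauss(0.0, 1.0)
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                y -= a * out[n - k]
        out.append(y)
    return out


# Toy voiced frame (hypothetical values): 100 Hz pitch raised to 125 Hz,
# harmonics limited to 4 kHz, 10 ms at a 16 kHz sampling rate.
fs = 16000
freqs = [100.0 * k for k in range(1, 41)]
amps = [1.0 / k for k in range(1, 41)]
new_freqs, new_amps = resample_envelope(freqs, amps, 125.0, 4000.0)
harm = synth_harmonic(new_freqs, new_amps, [0.0] * len(new_freqs), 160, fs)
noise = synth_noise([-0.9], 0.01, 160, random.Random(0))
frame = [h + v for h, v in zip(harm, noise)]
```

For an unvoiced frame, the same sketch would skip `synth_harmonic` and use the noise output alone, matching the abstract's statement that the harmonic component is zero in unvoiced segments.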
Subject: prosodic modifications
speech synthesis
Country: USA
Publisher: IEEE-Inst Electrical Electronics Engineers Inc
Rights: closed
Identifier DOI: 10.1109/89.709668
Date Issue: 1998
Appears in Collections: Artigos e Materiais de Revistas Científicas - Unicamp

Files in This Item:
File: WOS000075543300002.pdf
Size: 250.42 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.