A comparative study between MFCC and LSF coefficients in automatic recognition of isolated digits pronounced in Portuguese and English - doi: 10.4025/actascitechnol.v35i4.19825

Diego Furtado Silva, Vinícius Mourão Alves de Souza, Gustavo Enrique Almeida Prado Alves Batista

Abstract


Recognition of isolated spoken digits is the core procedure for many applications that rely solely on speech for data exchange, such as telephone-based services for dialing, airline reservations, bank transactions, and price quotations. Spoken digit recognition is generally challenging because the signals are short and some digits are acoustically very similar to others. The objective of this paper is to investigate the use of machine learning algorithms for spoken digit recognition and to announce the free availability, to the scientific community, of a database of digits pronounced in English and Portuguese. Since machine learning algorithms depend entirely on predictive attributes to build accurate classifiers, we believe that feature extraction is the most important task for successfully recognizing spoken digits. In this work, we show that Line Spectral Frequencies (LSF) provide a set of highly predictive coefficients. We evaluated our classifiers in different settings by altering the sampling rate to simulate low-quality channels and by varying the number of coefficients.
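
The abstract names LSF as the feature set without describing how the coefficients are obtained. The following is a minimal sketch of the standard textbook construction, not the authors' implementation: linear prediction coefficients are estimated from one frame by the autocorrelation method, and the LSFs are the unit-circle root angles of the sum and difference polynomials P(z) and Q(z). All function names, the default order of 10, and the grid-plus-bisection root search are assumptions made for illustration. The sketch also assumes a non-silent frame (nonzero energy).

```python
import math


def autocorrelation(frame, max_lag):
    """Unnormalised autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Levinson-Durbin recursion; returns [1, a1, ..., ap] for the
    prediction-error filter A(z) = 1 + sum_k a_k z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error energy; assumed nonzero
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def _roots_in_open_interval(f, n_grid=1024, n_bisect=50):
    """Zeros of f strictly inside (0, pi): coarse grid scan for sign
    changes, then bisection. Very close root pairs could be missed."""
    roots = []
    w_prev = math.pi / n_grid
    f_prev = f(w_prev)
    for i in range(2, n_grid):
        w = math.pi * i / n_grid
        f_cur = f(w)
        if f_prev * f_cur < 0.0:
            lo, hi, f_lo = w_prev, w, f_prev
            for _ in range(n_bisect):
                mid = 0.5 * (lo + hi)
                f_mid = f(mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
        w_prev, f_prev = w, f_cur
    return roots


def lpc_to_lsf(a):
    """LSFs in radians, ascending, for A(z) given as [1, a1, ..., ap]."""
    ext = list(a) + [0.0]
    p_poly = [x + y for x, y in zip(ext, reversed(ext))]  # symmetric P(z)
    q_poly = [x - y for x, y in zip(ext, reversed(ext))]  # antisymmetric Q(z)
    m = (len(ext) - 1) / 2.0

    # On the unit circle, P reduces to a real cosine sum and Q to a sine
    # sum (up to a common phase factor), so real root finding suffices.
    def f_p(w):
        return sum(c * math.cos(w * (m - k)) for k, c in enumerate(p_poly))

    def f_q(w):
        return sum(c * math.sin(w * (m - k)) for k, c in enumerate(q_poly))

    return sorted(_roots_in_open_interval(f_p) + _roots_in_open_interval(f_q))


def frame_to_lsf(frame, order=10):
    """Convenience wrapper: one frame of samples to `order` LSFs."""
    r = autocorrelation(frame, order)
    return lpc_to_lsf(levinson_durbin(r, order))
```

As a sanity check, `lpc_to_lsf([1.0, -0.5, 0.0])` returns two values, approximately 0.7227 and 1.8235 rad, which match the analytic roots arccos(0.75) and arccos(-0.25) of the corresponding P(z) and Q(z). In a recognition pipeline, `frame_to_lsf` would be applied to each windowed frame, and the resulting vectors passed to the classifier.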

 


Keywords


spoken digit recognition; mel-frequency cepstrum coefficients; line spectral frequencies.



DOI: http://dx.doi.org/10.4025/actascitechnol.v35i4.19825





ISSN 1806-2563 (print) and ISSN 1807-8664 (online). E-mail: actatech@uem.br

  

License: CC BY