Psycho-acoustics inspired automatic speech recognition

Article
Publication date:
2021
Abstract:
Understanding how humans recognise spoken language remains a distant scientific goal. Commercial automatic speech recognisers (ASRs) now achieve high performance on clean speech, but their approaches bear little relation to human speech recognition. They commonly process the phonetic structure of speech while neglecting the supra-segmental and syllabic traits that are integral to human speech recognition. As a result, these ASRs perform poorly on spontaneous speech and require enormous costs to build phonetic and pronunciation models that capture the large variability of human speech. This paper presents a novel ASR that addresses these issues and questions conventional ASR approaches. It uses alternative acoustic models and an exhaustive decoding algorithm to process speech at a syllabic temporal scale (100-250 ms) through a multi-temporal approach inspired by psycho-acoustic studies. A performance comparison on the recognition of spoken Italian numbers (from 0 to 1 million) demonstrates that our approach is cost-effective, outperforms standard phonetic models, and reaches state-of-the-art performance.
CRIS type:
01.01 Journal article
Keywords:
Automatic speech recognition; Deep learning; Long short term memory; Convolutional neural networks; Factorial hidden Markov models; Hidden Markov models; Speech; Psycho-acoustics; Syllables
Author list:
Massoli, Fabio Valerio; Coro, Gianpaolo
Institutional authors:
CORO GIANPAOLO
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/400480
Link to the full text:
https://iris.cnr.it//retrieve/handle/20.500.14243/400480/136156/prod_454447-doc_175186.pdf
https://iris.cnr.it//retrieve/handle/20.500.14243/400480/136160/prod_454447-doc_175187.pdf
Published in:
COMPUTERS & ELECTRICAL ENGINEERING
Journal
General data

URL

https://www.sciencedirect.com/science/article/pii/S0045790621002251