- Title:
- Building compact language models for medical speech recognition in mobile devices with limited amount of memory
- Authors:
- Sas, J.
- Links:
- https://bibliotekanauki.pl/articles/332971.pdf
- Publication date:
- 2012
- Publisher:
- Uniwersytet Śląski. Wydział Informatyki i Nauki o Materiałach. Instytut Informatyki. Zakład Systemów Komputerowych
- Subjects:
-
automatic speech recognition
medical information systems
language modeling
- Description:
- The article presents a method of building a compact language model for speech recognition in devices with a limited amount of memory. The most commonly used word-based bigram language models allow for highly accurate speech recognition but need a large amount of memory to store, mainly because of the large number of word bigrams. The proposed method ranks bigrams according to their importance in speech recognition and replaces the explicitly estimated probabilities of less important bigrams with probabilities derived from a class-based model. The class-based model is created by assigning the words appearing in the corpus to classes that correspond to their syntactic properties. The classes represent various combinations of part of speech and inflectional features such as number, case, tense, and person. To minimize the memory needed to store the class-based model, the number of part-of-speech classes is reduced by merging classes that appear in stochastically similar contexts in the corpus. Experiments carried out on selected domains of medical speech show that the method allows a 75% reduction in model size without significant loss of speech recognition accuracy.
- Source:
-
Journal of Medical Informatics & Technologies; 2012, 20; 111-119
1642-6037
- Appears in:
- Journal of Medical Informatics & Technologies
- Content provider:
- Biblioteka Nauki
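
The idea summarized in the description (keep explicit probabilities only for the most important word bigrams and derive the rest from a class-based model) might be sketched as follows. This is a minimal illustration, not the article's implementation: the record does not state the ranking criterion, so raw bigram frequency is used as a stand-in, and the fallback uses the standard class-based bigram factorization P(w2 | w1) ≈ P(c2 | c1) · P(w2 | c2). The names `CompactBigramModel`, `keep`, and `word_class` are hypothetical, and the class-merging step is not shown.

```python
from collections import Counter


class CompactBigramModel:
    """Word bigram model that stores only the top-ranked bigrams explicitly."""

    def __init__(self, corpus, word_class, keep):
        # corpus: list of sentences, each a list of words
        # word_class: dict mapping every word to its class label (assumed given)
        # keep: number of word bigrams whose probabilities are stored explicitly
        self.word_class = word_class
        self.word_count = Counter()     # count(w)
        self.class_count = Counter()    # count(c)
        self.class_bigram = Counter()   # count(c1, c2)
        self.class_history = Counter()  # count(c1 as bigram history)
        word_bigram = Counter()
        word_history = Counter()

        for sentence in corpus:
            for w in sentence:
                self.word_count[w] += 1
                self.class_count[word_class[w]] += 1
            for w1, w2 in zip(sentence, sentence[1:]):
                word_bigram[(w1, w2)] += 1
                word_history[w1] += 1
                c1, c2 = word_class[w1], word_class[w2]
                self.class_bigram[(c1, c2)] += 1
                self.class_history[c1] += 1

        # Assumption: "importance" is approximated by bigram frequency.
        ranked = sorted(word_bigram, key=lambda b: (-word_bigram[b], b))
        self.explicit = {
            b: word_bigram[b] / word_history[b[0]] for b in ranked[:keep]
        }

    def prob(self, w1, w2):
        """Return an estimate of P(w2 | w1)."""
        if (w1, w2) in self.explicit:
            return self.explicit[(w1, w2)]
        # Fall back to the class-based estimate P(c2 | c1) * P(w2 | c2).
        c1 = self.word_class.get(w1)
        c2 = self.word_class.get(w2)
        if self.class_history[c1] == 0 or self.class_count[c2] == 0:
            return 0.0
        p_class = self.class_bigram[(c1, c2)] / self.class_history[c1]
        p_word = self.word_count[w2] / self.class_count[c2]
        return p_class * p_word


corpus = [
    ["patient", "has", "fever"],
    ["patient", "has", "cough"],
    ["doctor", "sees", "patient"],
]
word_class = {
    "patient": "N", "doctor": "N", "fever": "N", "cough": "N",
    "has": "V", "sees": "V",
}
model = CompactBigramModel(corpus, word_class, keep=1)
print(model.prob("patient", "has"))  # stored explicitly
print(model.prob("has", "fever"))    # derived from the class-based model
```

Because the explicit and class-derived estimates are mixed without renormalization, the values for a given history need not sum to one; a practical model would redistribute the probability mass, for example with back-off weights.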