
MODELING NON-LINGUISTIC CONTEXTUAL SIGNALS IN LSTM LANGUAGE MODELS VIA DOMAIN ADAPTATION

Our latest machine learning paper, "Modeling Non-Linguistic Contextual Signals in LSTM Language Models via Domain Adaptation," has been published.

Abstract: When it comes to speech recognition for voice search, it would be advantageous to take into account application information associated with speech queries. In practice, however, the vast majority of queries lack such annotations, which makes it challenging to train domain-specific language models (LMs). To obtain robust domain LMs, an LM pre-trained on general data is typically adapted to specific domains. We propose four adaptation schemes that improve the domain performance of long short-term memory (LSTM) language models by incorporating application-based contextual signals of voice search queries. Most adaptation strategies are shown to be effective, giving up to a 21% relative reduction in perplexity over a fine-tuned baseline on a held-out domain-specific development set. Initial exp
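The paper itself describes the four adaptation schemes; as a rough illustration only, here is a minimal sketch of one generic way to condition an LSTM LM on a non-linguistic signal: embed an application/context id and concatenate it with the word embedding at every timestep before fine-tuning on the annotated domain data. This is not the paper's exact architecture, and all class names, dimensions, and hyperparameters below are illustrative assumptions.

```python
# Minimal sketch (assumed architecture, not the paper's): an LSTM LM whose
# input at each timestep is the word embedding concatenated with a learned
# embedding of a non-linguistic context signal (e.g. the source application).

import torch
import torch.nn as nn


class ContextConditionedLSTMLM(nn.Module):
    def __init__(self, vocab_size, num_contexts,
                 word_dim=256, ctx_dim=32, hidden_dim=512):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.ctx_emb = nn.Embedding(num_contexts, ctx_dim)  # application signal
        self.lstm = nn.LSTM(word_dim + ctx_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, context_ids):
        # tokens:      (batch, seq_len) token ids
        # context_ids: (batch,) one application/context id per query
        w = self.word_emb(tokens)                          # (B, T, word_dim)
        c = self.ctx_emb(context_ids)                      # (B, ctx_dim)
        c = c.unsqueeze(1).expand(-1, tokens.size(1), -1)  # repeat over time
        h, _ = self.lstm(torch.cat([w, c], dim=-1))
        return self.proj(h)                                # next-token logits


# Toy usage: in the adaptation setting one would start from weights
# pre-trained on general queries and fine-tune on the annotated domain data;
# here the model is randomly initialized for brevity.
model = ContextConditionedLSTMLM(vocab_size=10000, num_contexts=8)
tokens = torch.randint(0, 10000, (4, 12))
ctx = torch.randint(0, 8, (4,))
logits = model(tokens, ctx)
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, 10000), tokens[:, 1:].reshape(-1))
loss.backward()
```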