Domain-Specific Intent Classification for Sinhala Speech Data


The most important intents (commands) in the banking domain were selected with the help of domain experts. A crowdsourcing method was used to create a set of inflections for each intent: an online survey containing the predefined set of intents was conducted, and participants were asked to provide an alternative way of saying each provided intent. Using the Voicer tool, around 8,460 speech clips were recorded for six intents in the banking domain. In total, 10 hours of speech data was gathered from 215 speakers, of whom 152 were male and 63 were female.
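As a rough sanity check on the figures above, the following minimal Python sketch derives a few summary statistics from them. The variable and function names are illustrative and not part of the dataset or the Voicer tool. The clip count is quoted as approximate, so the derived values are estimates only. The per-intent figure assumes clips are spread evenly across the six intents, which the description does not state.

```python
# Figures quoted in the dataset description (clip count is approximate).
TOTAL_CLIPS = 8460
TOTAL_HOURS = 10
NUM_INTENTS = 6
SPEAKERS = {"male": 152, "female": 63}


def dataset_summary(total_clips, total_hours, num_intents, speakers):
    """Derive rough per-clip, per-speaker and per-intent averages."""
    num_speakers = sum(speakers.values())
    return {
        "num_speakers": num_speakers,
        # Average clip length in seconds.
        "avg_clip_seconds": total_hours * 3600 / total_clips,
        # Average number of clips contributed by each speaker.
        "clips_per_speaker": total_clips / num_speakers,
        # Assumes an even split across intents (not stated in the source).
        "clips_per_intent": total_clips / num_intents,
        # Fraction of speakers in each gender group.
        "speaker_share": {g: n / num_speakers for g, n in speakers.items()},
    }


summary = dataset_summary(TOTAL_CLIPS, TOTAL_HOURS, NUM_INTENTS, SPEAKERS)
for key, value in summary.items():
    print(key, value)
```

Under these assumptions the average clip comes out at a little over four seconds, which is plausible for short spoken banking commands.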

Associated Publication I: - 

Paper Title: Voicer: A Crowd Sourcing Tool for Speech Data Collection
Published in: 2018 18th International Conference on Advances in ICT for Emerging Regions (ICTer), Colombo, Sri Lanka
Date of Conference: 26-29 Sept. 2018
DOI: 10.1109/ICTER.2018.8615521

Citation for "Voicer: A Crowd Sourcing Tool for Speech Data Collection": -

D. Buddhika, R. Liyadipita, S. Nadeeshan, H. Witharana, S. Jayasena and U. Thayasivam, "Voicer: A Crowd Sourcing Tool for Speech Data Collection," 2018 18th International Conference on Advances in ICT for Emerging Regions (ICTer), 2018, pp. 174-181, doi: 10.1109/ICTER.2018.8615521.

Associated Publication II: -

Paper Title: Domain Specific Intent Classification of Sinhala Speech Data
Published in: 2018 International Conference on Asian Language Processing (IALP), Bandung, Indonesia
Date of Conference: 15-17 Nov. 2018
DOI: 10.1109/IALP.2018.8629103

Citation for "Domain Specific Intent Classification of Sinhala Speech Data": -

D. Buddhika, R. Liyadipita, S. Nadeeshan, H. Witharana, S. Jayasena and U. Thayasivam, "Domain Specific Intent Classification of Sinhala Speech Data," 2018 International Conference on Asian Language Processing (IALP), 2018, pp. 197-202, doi: 10.1109/IALP.2018.8629103.