Statistical Machine Translation System for Sinhala and Tamil Languages

Rajpirathap, S; Sheeyam, S; Umasuthan, K; Chelvarajah, A

dc.contributor.author	Rajpirathap, S
dc.contributor.author	Sheeyam, S
dc.contributor.author	Umasuthan, K
dc.contributor.author	Chelvarajah, A
dc.date.accessioned	2017-04-03T03:44:23Z
dc.date.available	2017-04-03T03:44:23Z
dc.identifier.uri	http://dl.lib.mrt.ac.lk/handle/123/12629
dc.description.abstract	Statistical machine translation is one of the most promising and efficient methods for translating between Sri Lankan languages such as Sinhala and Tamil. The statistical approach is more suitable for structurally dissimilar language pairs and is an efficient solution for translating large volumes of text. Sri Lanka has a rising need for translation between Sinhala and Tamil, and the statistical machine translation approach is well suited to these languages. Sinhala and Tamil are similar in grammar, and the statistical approach helps to obtain more accurate results. For this research we developed a bi-directional translation system covering both Tamil to Sinhala and Sinhala to Tamil. We used the Sri Lankan parliament corpus to train the language model. We critically evaluated both systems with parameter optimization and obtained the most accurate and efficient system. We also used the BLEU [2, 8] and NIST [2] scoring techniques for system evaluation and integrated the MERT technique to tune the decoder.	en_US
dc.language.iso	en	en_US
dc.title	Statistical Machine Translation System for Sinhala and Tamil Languages	en_US
dc.type	Conference-Full-text	en_US
dc.identifier.faculty	IT	en_US
dc.identifier.department	Department of Information Technology	en_US
dc.identifier.year	2014	en_US
dc.identifier.conference	ITRU RESEARCH SYMPOSIUM	en_US
dc.identifier.place	UNIVERSITY OF MORATUWA	en_US
dc.identifier.pgnos	38-43	en_US