    A Parallel Corpus Based Translation Using Sentence Similarity

    Date
    2014
    Author
    Ruoro, Simon Wachira
    Type
    Thesis; en_US
    Language
    en

    Abstract
    When large quantities of technical text are translated manually, it is very difficult to produce consistent translations of recurrent stretches of text such as paragraphs, sentences and phrases. Reusing old translations, stored as translation memories from previous versions of handbooks, reduces the chance of producing variant translations of the same source sentence and gives users a better understanding of word usage in sentences. We developed an English-Swahili example-based machine translation (EBMT) system that exploits a bilingual corpus to find examples matching the source-language input. The translation examples were extracted from a collection of parallel, sentence-aligned English-Swahili texts. We split phrases or paragraphs into sentences using N-grams. Many methods in previous research used N-gram clues to split sentences; to supplement these N-gram based splitting methods, this project introduced a further clue, sentence similarity based on edit distance. In our splitting method, candidate sentences are generated by splitting a paragraph based on N-grams, and the best candidate is selected by measuring sentence similarity. We conducted experiments using two EBMT systems, one using the word and the other using the sentence as the translation unit. The experiments showed that the system performs slightly better when using sentence similarity: a considerable success rate (above 95% at the sentence level) was achieved in constructing a database of truly corresponding sentence units. Using words as the translation unit also performed well, at above 65%. Classifying texts by domain or topic also yielded some improvement.
    A translation memory (TM) is a repository in which the user stores previous translations, which helps to improve translator productivity and consistency. A TM system functions as an information retrieval system: it tries to retrieve one or more suggestions from a TM database that would assist the translator in the current translation task, or in learning how a sentence can be used in different contexts or domains.
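
The abstract describes sentence similarity based on edit distance and retrieval of suggestions from a TM database, but gives no implementation details. The following is a minimal sketch of how such a lookup could work, not the thesis's actual code. It assumes word-level Levenshtein distance, whitespace tokenization, and normalization by the length of the longer sentence; the function names (`edit_distance`, `similarity`, `best_match`) and the three-entry translation memory are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(s1: str, s2: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical token sequences."""
    a, b = s1.lower().split(), s2.lower().split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty sentences are treated as identical
    return 1.0 - edit_distance(a, b) / longest


def best_match(source: str,
               memory: List[Tuple[str, str]]) -> Tuple[float, str, str]:
    """Return (score, stored source, stored target) for the closest entry.

    Raises ValueError if the memory is empty.
    """
    return max(((similarity(source, src), src, tgt) for src, tgt in memory),
               key=lambda entry: entry[0])


# Toy English-Swahili translation memory (illustrative data only).
MEMORY = [
    ("good morning", "habari ya asubuhi"),
    ("thank you very much", "asante sana"),
    ("how are you", "habari yako"),
]

if __name__ == "__main__":
    score, src, tgt = best_match("thank you so much", MEMORY)
    print(f"{score:.2f} | {src} -> {tgt}")
```

In this sketch the input "thank you so much" differs from the stored "thank you very much" by one substituted word out of four, so the score is 0.75 and the stored Swahili translation is returned as the suggestion. A real system would likely apply a minimum score threshold before offering a match, and the same `similarity` function could be used to rank candidate sentence splits against stored sentences.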
    URI
    http://hdl.handle.net/11295/90185
    Citation
    Masters of Science in Computer Science
    Publisher
    University of Nairobi
    Collections
    • Faculty of Science & Technology (FST) [4206]

    Copyright © 2022 
    University of Nairobi Library