    Word Sense Disambiguation of Swahili: Extending Swahili Language Technology with Machine Learning

    Date
    2005
    Author
    Nganga, W.
    Type
    Article
    Language
    en

    Abstract
    This thesis addresses the problem of word sense disambiguation within the context of Swahili-English machine translation. In this setup, the goal of disambiguation is to choose the correct translation of an ambiguous Swahili noun in context. A corpus-based approach to disambiguation is taken, in which machine learning techniques are applied to a corpus of Swahili to acquire disambiguation information automatically. In particular, the Self-Organizing Map algorithm is used to obtain a semantic categorization of Swahili nouns from data. The resulting classes form the basis of a class-based solution, where disambiguation is recast as a classification problem. The thesis exploits these semantic classes to automatically obtain annotated training data, addressing a key problem facing supervised word sense disambiguation. The semantic and linguistic characteristics of these classes are modelled as Bayesian belief networks, using the Bayesian Modelling Toolbox. Disambiguation is achieved via probabilistic inference. The thesis develops a disambiguation solution that does not impose extensive resource requirements, but rather capitalizes on freely available lexical and computational resources for English as a source of additional disambiguation information. A semantic tagger for Swahili is created by altering the configuration of the Bayesian classifiers. The disambiguation solution is tested on a subset of unambiguous nouns and a manually created gold standard of sixteen ambiguous nouns, using standard performance evaluation metrics.
    URI
    http://erepository.uonbi.ac.ke:8080/xmlui/handle/123456789/35166
    Citation
    Nganga, W. 2005. Word Sense Disambiguation of Swahili: Extending Swahili Language Technology with Machine Learning. Helsinki University Press.
    Publisher
    University of Nairobi
    College of Biological and Physical Science
     
    Collections
    • Faculty of Science & Technology (FST) [4284]

    Copyright © 2022 
    University of Nairobi Library