
    A comparative evaluation of sentiment analysis techniques on Facebook data using three machine learning algorithms: Naïve Bayes, maximum entropy and support vector machines

    Date
    2014
    Author
    Anyim, Julianne A
    Type
    Thesis
    Language
    en

    Abstract
    The rapid growth and popularity of social networks have led to the creation of vast amounts of textual data, often in an unstructured, fragmented and informal form. Huge volumes of electronic data in the form of reviews, customer feedback, elicited surveys, unsolicited comments, suggestions and criticisms are generated daily. Institutions, government bodies, companies and prospective organizations lack the capacity to handle these volumes and therefore struggle to react to feedback quickly. While recent NLP-based sentiment analysis has centered on Twitter and on product or service reviews, we believe the emotion in Facebook status messages can be classified more accurately because of their nature. Facebook status messages are more concise than reviews but allow more characters than tweets, which permits better writing and a more accurate portrayal of emotions.

    In this study, we perform sentiment analysis on Facebook by fetching posts and extracting their content. We then tokenize the data to extract keyword combinations (n-grams) and perform feature selection to keep only the n-grams that are important for the classification problem. Finally, we train a classifier to identify the polarity of each post, i.e. whether it is positive, negative or neutral. We assess the suitability of various approaches to NLP sentiment analysis by comparing the performance of the Naïve Bayes classifier, the Maximum Entropy classifier and Support Vector Machines.

    We observe that the feature selection technique has a significant impact on the performance of the algorithm. Bigram and trigram information produced better results than unigrams with all three algorithms. We attribute this to bigrams and trigrams being better at capturing sentiment patterns, whereas unigrams merely provide good coverage of the data. Trigrams achieved the highest overall performance in all instances, with an accuracy of 82.6%, while unigrams achieved the lowest accuracy, 73.8%. However, as statements became long-winded and contained contradictory phrases, the classifiers performed poorly. Feature selection alone is therefore not enough to determine the performance of an algorithm, and more advanced NLP techniques might be required to deal with this shortcoming.
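The pipeline described in the abstract (tokenize, extract n-grams, train a polarity classifier) can be illustrated with a minimal sketch. It is not the thesis implementation: it covers only a Naïve Bayes classifier with add-one smoothing, omits the feature selection step and the Maximum Entropy and SVM classifiers, and uses whitespace tokenization. The function and class names (`ngrams`, `features`, `NaiveBayes`) and the six training sentences are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n):
    """Lowercase, split on whitespace, and return the word n-grams of `text`."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text, max_n=3):
    """Return all n-grams from unigrams up to `max_n`-grams (trigrams by default)."""
    feats = []
    for n in range(1, max_n + 1):
        feats.extend(ngrams(text, n))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def __init__(self, max_n=3):
        self.max_n = max_n
        self.class_counts = Counter()               # documents seen per label
        self.feature_counts = defaultdict(Counter)  # n-gram counts per label
        self.vocab = set()                          # every n-gram seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for feat in features(text, self.max_n):
                self.feature_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the log likelihood of each known n-gram.
            score = math.log(doc_count / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in features(text, self.max_n):
                if feat in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, not taken from the thesis.
train_texts = [
    "i love this great day",
    "what a wonderful happy moment",
    "i hate this terrible day",
    "what an awful sad moment",
    "the meeting is on monday",
    "the bus leaves at noon",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = NaiveBayes(max_n=3).fit(train_texts, train_labels)
print(model.predict("i love this wonderful day"))
```

Setting `max_n=1` restricts the model to unigrams, which is one way to reproduce the kind of unigram versus bigram versus trigram comparison the abstract reports. A toy corpus of this size says nothing about the reported accuracies.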
    URI
    http://hdl.handle.net/11295/75456
    Collections
    • Faculty of Science & Technology (FST) [4213]

    Copyright © 2022 
    University of Nairobi Library
    Contact Us | Send Feedback

     

     
