A Comparative Study of Transformer-based Models for Hate-Speech Detection in English-Kiswahili Code-Switched Social Media Text

Ng’ang’a, Njung’e Fredrick; Oirere, Aaron M.; Ndung'u, Rachel N.

dc.contributor.author	Ng’ang’a, Njung’e Fredrick.
dc.contributor.author	Oirere, Aaron M.
dc.contributor.author	Ndung'u, Rachel N.
dc.date.accessioned	2024-11-08T09:05:46Z
dc.date.available	2024-11-08T09:05:46Z
dc.date.issued	2024
dc.identifier.issn	2278-3091
dc.identifier.other	https://doi.org/10.30534/ijatcse/2024/011352024
dc.identifier.uri	http://repository.mut.ac.ke:8080/xmlui/handle/123456789/6483
dc.description.abstract	The transformer architecture, first introduced in 2017 by researchers at Google, has revolutionized natural language processing across many tasks, including text classification. It forms the basis of later models, including those used for hate speech detection in code-switched text. In this research, we conduct a comparative study of transformer-based models for hate speech detection in English-Kiswahili code-switched text. The models were compared first as feature extractors feeding a traditional classifier and then as end-to-end classifiers. The three multilingual transformer-based models compared are mBERT, mDistilBERT and XLM-RoBERTa, with SVM as the traditional classifier for the extracted features. The HateSpeech_Kenya dataset, sourced from Kaggle, was used in this study. As a feature extractor, mBERT produced the hidden states that trained the highest-performing SVM, with an accuracy of 0.5461 and a macro F1 score of 0.40. As end-to-end classifiers, XLM-RoBERTa achieved the highest accuracy of 0.6069 and a macro F1 score of 0.49 on a balanced dataset, whereas mBERT achieved the highest accuracy of 0.7820 and a macro F1 score of 0.53 on an imbalanced dataset. The comparative study establishes that transformer-based models used as end-to-end classifiers generally perform better than the same models used as feature extractors with traditional classifiers. This is because directly training the models allows them to learn more task-specific features. Furthermore, the varying performance across balanced and imbalanced datasets highlights the need for careful model selection based on the dataset characteristics and specific task requirements.	en_US
dc.language.iso	en	en_US
dc.publisher	International Journal of Advanced Trends in Computer Science and Engineering	en_US
dc.subject	Code-Switching, English-Kiswahili, Hate Speech, Multilingual Language Understanding, Text Classification, Transformers.	en_US
dc.title	A Comparative Study of Transformer-based Models for Hate-Speech Detection in English-Kiswahili Code-Switched Social Media Text	en_US
dc.type	Article	en_US
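
The abstract reports both accuracy and macro F1, and the gap between them on the imbalanced dataset (0.7820 versus 0.53) is easier to interpret with the two definitions side by side. The following is a minimal sketch in plain Python, not the authors' evaluation code: the labels and predictions are made-up toy data, and the function names `accuracy` and `macro_f1` are illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up imbalanced data (not from HateSpeech_Kenya):
# 8 majority-class posts and 2 minority-class posts.
y_true = ["neutral"] * 8 + ["hate"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["neutral"] * 10

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} macro_f1={mf1:.4f}")
```

In this toy case the classifier never detects the minority class, yet accuracy stays high because the majority class dominates the count. Macro F1 weights each class equally, so the missed class pulls the score down, which is why the two metrics can diverge on an imbalanced dataset.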

Files in this item

Name: ijatcse011352024.pdf
Size: 367.7Kb
Format: PDF

This item appears in the following Collection(s)

Journal Articles (CI) [132]
