Please use this identifier to cite or link to this item:
https://idr.l2.nitk.ac.in/jspui/handle/123456789/7124
Title: | A Bag-of-Phonetic-Codes Model for Cyber-Bullying Detection in Twitter |
Authors: | Shekhar, A.; Venkatesan, M. |
Issue Date: | 2018 |
Citation: | Proceedings of the 2018 International Conference on Current Trends towards Converging Technologies, ICCTCT 2018, 2018 |
Abstract: | Social networking sites such as Twitter, Facebook, MySpace, and Instagram have become a strong medium of communication and a part of daily life. People can share their thoughts and activities with their social circle, which brings them closer to their community. However, this freedom of expression has its drawbacks. People sometimes show aggression on social media, which hurts the targeted victims. Certain forms of cyber-bullying are sexual, racial, or based on physical disability. Hence, proper surveillance is necessary to tackle such situations. Twitter, as a micro-blogging site, sees cyber abuse on a daily basis. However, tweets are raw text containing many misspelled and censored words. This paper proposes a novel method to detect cyber-bullying, a Bag-of-Phonetic-Codes model. Using the pronunciation of words as features can rectify misspelled words and identify censored words. Correctly identifying duplicate words can lead to a smaller vocabulary, thereby reducing the feature space. The inspiration for the proposed work is drawn from the well-known Bag-of-Words model for extracting textual features. Phonetic codes are generated using the Soundex algorithm. Besides the proposed model, experiments were carried out with both supervised and unsupervised machine learning approaches on multiple datasets to understand the approaches and challenges in the domain of cyber-bullying detection. © 2018 IEEE. |
URI: | http://idr.nitk.ac.in/jspui/handle/123456789/7124 |
Appears in Collections: | 2. Conference Papers |
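
The idea in the abstract, replacing each word with a Soundex code before counting, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `soundex` and `bag_of_phonetic_codes` are hypothetical, the whitespace tokenization and the dropping of non-letter characters (such as the `*` in a censored word) are assumptions, and the coding rules follow the standard American Soundex variant, which the paper may or may not use.

```python
from collections import Counter

# Standard American Soundex digit groups; vowels, y, h, and w carry no digit.
_CODES = {}
for _digit, _letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                         "4": "l", "5": "mn", "6": "r"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code of a word, or "" if it has no letters."""
    # Assumption: non-letter characters (e.g. '*' in censored words) are dropped.
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same digit
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated digit is kept after it
    return (code + "000")[:4]  # pad with zeros or truncate to 4 characters


def bag_of_phonetic_codes(text: str) -> Counter:
    """Count Soundex codes instead of raw tokens (whitespace tokenization assumed)."""
    codes = (soundex(token) for token in text.split())
    return Counter(code for code in codes if code)


if __name__ == "__main__":
    # A correct spelling, a misspelling, and a censored form share one code.
    print(bag_of_phonetic_codes("idiot idoit id*ot"))
```

In this sketch the three spellings collapse into a single feature, which is the vocabulary reduction the abstract describes. The resulting counts could be fed to any classifier in place of ordinary Bag-of-Words counts.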