Detecting Spammers in Twitter Network
DOI: https://doi.org/10.18100/ijamec.2017436078
Keywords: Spammer Detection, Spam Tweets, Classification Algorithms
Abstract
The goal of Twitter is to allow friends to communicate and stay connected through the exchange of short messages. However, spammers also use Twitter as a platform to post malicious links, send unsolicited messages to legitimate users, and hijack trending topics. They do so by exploiting two weaknesses of Twitter: users automatically receive updates from the accounts they follow, and users can write on their followers' profile pages. As a result, spam is a growing problem on Twitter, as it is on other online social networking sites. In this article, we present several methods to detect spam tweets on Twitter. For this purpose, we use the Naive Bayes, Random Forest, J48, and IBK algorithms. Experiments conducted on real Twitter accounts show that the Random Forest algorithm gives the best results for detecting spammers on Twitter.
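To illustrate two of the four classifier families named in the abstract, the following minimal sketch implements a k-nearest-neighbour learner (the idea behind Weka's IBk) and a Gaussian Naive Bayes classifier in plain Python. The two features, the toy data, and all function names are assumptions made for illustration; they are not the features, data, or implementation used in the article. Random Forest and J48 are omitted for brevity.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data, invented for illustration only:
# (followers/following ratio, fraction of tweets containing a URL) -> label.
TRAIN = [
    ((0.05, 0.90), "spammer"),
    ((0.10, 0.80), "spammer"),
    ((0.08, 0.95), "spammer"),
    ((0.20, 0.70), "spammer"),
    ((1.20, 0.10), "legitimate"),
    ((0.90, 0.20), "legitimate"),
    ((1.50, 0.05), "legitimate"),
    ((0.80, 0.30), "legitimate"),
]


def knn_predict(train, x, k=3):
    """Instance-based learner: majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fit_gaussian_nb(train):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for features, label in train:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-6))  # small constant avoids zero variance
        model[label] = (len(rows) / len(train), stats)
    return model


def nb_predict(model, x):
    """Return the class with the highest log-posterior, treating features as independent."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


if __name__ == "__main__":
    model = fit_gaussian_nb(TRAIN)
    for account in [(0.07, 0.85), (1.10, 0.15)]:
        print(account, knn_predict(TRAIN, account), nb_predict(model, account))
```

On real data, each classifier would be evaluated on held-out accounts (for example with cross-validation) rather than on a handful of hand-picked points as here.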
License
Copyright (c) 2017 International Journal of Applied Methods in Electronics and Computers
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.