Arabic Crime Tweet Filtering and Prediction Using Machine Learning




Keywords: Cybercrime, Machine Learning, Twitter Analysis, Natural Language Processing (NLP), Random Forest, Logistic Regression


Crime is rising and negatively affects countries’ economies. Despite several efforts to use crime prediction to reduce crime rates, few studies take the timeline factor into account when extracting crime-related tweets. This study aims to predict crime from Arabic tweets on Twitter/X. It first analyzes social sentiment (whether a tweet raises positive, negative, or neutral feelings) and then filters the tweets by crime behavior using an intelligent dictionary built with a genetic algorithm. The study uses several machine learning (ML) models, namely random forest, logistic regression, and decision trees, and assesses them by accuracy, precision, recall, and F1 score to evaluate robustness and dependability in crime prediction. After filtering with the intelligent dictionary, accuracy is 97% for decision tree, 97% for random forest, and 94.43% for logistic regression. This research provides insight into potential crime attitudes and public opinion toward safety and law enforcement.
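The dictionary-based filtering and the four evaluation metrics named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three-word lexicon `CRIME_TERMS`, the helper names `mentions_crime` and `binary_metrics`, and the toy tweets are hypothetical and are not the study's genetic-algorithm dictionary, classifiers, or data. The lexicon match stands in for a classifier only so that there are predictions to score.

```python
from typing import Dict, Sequence, Set

# Hypothetical mini-lexicon for illustration only; the study's dictionary
# is built by a genetic algorithm and is not reproduced here.
CRIME_TERMS: Set[str] = {"سرقة", "قتل", "خطف"}  # theft, killing, kidnapping


def mentions_crime(tweet: str, lexicon: Set[str] = CRIME_TERMS) -> bool:
    """Return True if any whitespace-separated token is in the lexicon."""
    return any(token in lexicon for token in tweet.split())


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labelled tweets (1 = crime-related, 0 = not); invented for this sketch.
SAMPLES = [
    ("حادثة سرقة في السوق", 1),  # "a theft incident in the market"
    ("الطقس جميل اليوم", 0),      # "the weather is nice today"
    ("محاولة خطف طفل", 1),        # "an attempt to kidnap a child"
    ("اعتداء على موظف", 1),       # "an assault on an employee" (no lexicon term)
    ("فيلم عن قتل الوقت", 0),     # "a film about killing time" (idiom)
]

y_true = [label for _, label in SAMPLES]
# The lexicon match is used as a stand-in predictor, not as the study's models.
y_pred = [int(mentions_crime(text)) for text, _ in SAMPLES]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On this toy data the exact-match filter misses one crime tweet and flags one idiom, which hints at why a fixed word list alone may be insufficient and why the filtered tweets are passed on to trained classifiers.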



Author Biographies

Zainab Khyioon Abdalrdha, Iraqi Commission for Computers and Informatics / Informatics Institute of Postgraduate Studies

Informatics Institute of Postgraduate Studies

Prof. Dr. Alaa K. Farhan, University of Technology

Department of Computer Sciences