PAAD: POLITICAL ARABIC ARTICLES DATASET FOR AUTOMATIC TEXT CATEGORIZATION

Dhafar Hamed Abd; Ahmed T. Sadiq; Ayad R. Abbas

doi:10.25195/ijci.v46i1.246

Authors

Dhafar Hamed Abd, University of Technology, Al-Maarif University College
Ahmed T. Sadiq, University of Technology
Ayad R. Abbas, University of Technology

DOI:

https://doi.org/10.25195/ijci.v46i1.246

Keywords:

Arabic Political Article, Orientation, Sentiment Analysis, Natural Language Processing, Opinion Mining

Abstract

Text classification and sentiment analysis are now among the most popular Natural Language Processing (NLP) tasks. These techniques play a significant role in human activities and influence daily behaviour. Articles in fields such as politics and business express different opinions according to the writer's tendency, and this variety produces a large amount of data. This motivates the capability to determine the political orientation of an online article automatically. However, no corpus for political categorization has been directed towards this task in Arabic, owing to the lack of rich, representative resources for training an Arabic text classifier. To address this, we introduce the Political Arabic Articles Dataset (PAAD), a collection of textual data gathered from newspapers, social networks, general forums and ideology websites. The dataset contains 206 articles distributed across three categories (Reform, Conservative and Revolutionary), and we offer it to the research community working on Arabic computational linguistics. We anticipate that it will aid a variety of NLP tasks on Modern Standard Arabic, in particular political text classification. We provide the data in raw form and as Excel files in four versions: V1 raw data, V2 preprocessed, V3 root stemming and V4 light stemming.
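
As a rough illustration of how the labelled articles described above might be handled in practice, the following Python sketch normalizes Arabic text and tallies articles per category. Only the three category names come from the abstract. The function names `normalize` and `category_counts`, the `(text, label)` record layout, and the specific normalization steps (removing diacritics and tatweel, unifying alef variants) are assumptions made for illustration; the abstract does not state which steps the V2 preprocessing actually applies.

```python
import re
from collections import Counter

# Category names as listed in the abstract.
CATEGORIES = ("Reform", "Conservative", "Revolutionary")

# Assumed normalization rules (not taken from the paper):
# Arabic diacritics U+064B..U+0652 and the tatweel U+0640 are removed.
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")
# Alef with madda / hamza above / hamza below are mapped to bare alef.
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")


def normalize(text):
    """Return text without diacritics, with unified alef and single spaces."""
    text = _DIACRITICS.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)
    return " ".join(text.split())


def category_counts(records):
    """Count (text, label) records per category, rejecting unknown labels."""
    counts = Counter(label for _, label in records)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return {name: counts.get(name, 0) for name in CATEGORIES}
```

A check of this kind could be run after loading any of the four versions, to confirm that the labels match the three published categories before training a classifier.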

Author Biographies

Dhafar Hamed Abd, University of Technology, Al-Maarif University College

Department of Computer Science

Ahmed T. Sadiq, University of Technology

Department of Computer Science

Ayad R. Abbas, University of Technology

Department of Computer Science

Published

2020-06-30 — Updated on 2020-06-30

Versions

2020-06-30 (2)
2020-06-30 (1)

How to Cite

Abd, D. H., Sadiq, A. T., & Abbas, A. R. (2020). PAAD: POLITICAL ARABIC ARTICLES DATASET FOR AUTOMATIC TEXT CATEGORIZATION. Iraqi Journal for Computers and Informatics, 46(1), 1–11. https://doi.org/10.25195/ijci.v46i1.246

Issue

Vol. 46 No. 1 (2020): Iraqi Journal for Computers and Informatics

Section

Articles

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

IJCI applies the Creative Commons Attribution (CC BY) license to articles. The author of a paper submitted for publication in IJCI holds the CC BY license. Under this Open Access license, the author agrees that anyone may reuse the article in whole or in part for any purpose, including commercial purposes. Anyone may copy, distribute, or reuse the content as long as the author and source are properly cited. This facilitates reuse and ensures that journal content is available for the needs of research.
If the manuscript contains photos, images, figures, tables, audio files, videos, etc., that the author or the co-authors do not own, IJCI will require the author to provide the journal with proof that the owner of that content has given the author written permission to use it and has approved the application of the CC BY license to that content. IJCI provides a form that the author can use to ask for permission from the owner. If the author does not have the owner's permission, IJCI will ask the author to remove that content and/or replace it with other content that the author owns or has permission to use.
Many authors assume that if they previously published a paper through another publisher, they have the right to reuse that content in their IJCI paper, but that is not necessarily the case; it depends on the license that covers the other paper. The author must ascertain what rights a specific license grants (that is, whether it permits the author to use the content). The author must obtain written permission from the publisher to use the content in the IJCI paper. The author should not include any content in an IJCI paper without having the right to use it, and should always give proper attribution.
Any data submitted with the paper should state its licensing policy, which should be no more restrictive than CC BY.
IJCI has the right to remove photos, captures, images, figures, tables, illustrations, audio, and video files from a paper before or after publication if this content was included in the author's paper without permission from its owner.
