Hierarchical multi-label news article classification with distributed semantic model based features

International Journal of Advances in Intelligent Informatics

Field Value
Title Hierarchical multi-label news article classification with distributed semantic model based features
Creator Irsan, Ivana Clairine
Khodra, Masayu Leylia
Subject Multi-label classification; Hierarchical multi-label classification; CNN; Word embedding; News
Description Automatic news categorization is essential for handling the classification of multi-label news articles in online portals. This research evaluates several methods that could improve the performance of a hierarchical multi-label classifier for Indonesian news articles. The first method uses a Convolutional Neural Network (CNN) to build the top-level classifier. The second method represents each document by the average of its word vectors obtained from a distributed semantic model. The third method combines lexical and semantic features by multiplying word term frequency (lexical) with the word vector average (semantic). The model built using Calibrated Label Ranking as the multi-label classification method and trained with the Naïve Bayes algorithm achieved the best F1-measure, 0.7531. This classifier also used the multiplication of word term frequency and the average of word vectors as features. This configuration improved multi-label classification performance by 4.25% compared to the baseline. The distributed semantic model that gave the best performance in this experiment was a 300-dimension word2vec model trained on Wikipedia articles. The performance of the multi-label classification model is also influenced by the news release date: a gap between the periods covered by the training and testing data decreases the model's performance.
Publisher Universitas Ahmad Dahlan
Date 2019-03-20
Type info:eu-repo/semantics/article

Format application/pdf
Identifier http://ijain.org/index.php/IJAIN/article/view/168
Source International Journal of Advances in Intelligent Informatics; Vol 5, No 1 (2019): March 2019; 40-47
Language eng
Relation http://ijain.org/index.php/IJAIN/article/view/168/ijain_v5i1_p40-47
Rights https://creativecommons.org/licenses/by-sa/4.0
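
The Description field above mentions two document features: the average of word vectors, and a combination that multiplies term frequency with that average. The record does not give the exact formula, so the following is only a minimal sketch of one plausible reading, in which each distinct word's vector is scaled by its raw term frequency before averaging. The function names, the toy two-dimensional embeddings, and the choice to skip out-of-vocabulary tokens are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def average_vector(tokens, embeddings, dim):
    """Semantic feature: mean vector of the distinct in-vocabulary words."""
    words = sorted({t for t in tokens if t in embeddings})
    if not words:
        return [0.0] * dim
    return [sum(embeddings[w][i] for w in words) / len(words) for i in range(dim)]


def tf_weighted_vector(tokens, embeddings, dim):
    """Lexical x semantic feature (assumed form): scale each distinct
    word's vector by its raw term frequency, then average over the
    distinct in-vocabulary words."""
    counts = Counter(t for t in tokens if t in embeddings)
    if not counts:
        return [0.0] * dim
    return [
        sum(n * embeddings[w][i] for w, n in counts.items()) / len(counts)
        for i in range(dim)
    ]


# Hypothetical toy embeddings; a real setup would load 300-dimension
# word2vec vectors instead.
toy_embeddings = {"banjir": [1.0, 0.0], "jakarta": [0.0, 1.0]}
tokens = ["banjir", "banjir", "jakarta", "kemarin"]  # "kemarin" is out of vocabulary

avg = average_vector(tokens, toy_embeddings, dim=2)
weighted = tf_weighted_vector(tokens, toy_embeddings, dim=2)
```

In this toy case the repeated word pulls the weighted vector toward its own direction, which is the intuition behind mixing a lexical signal into the semantic average. Either vector could then be passed to a multi-label classifier as the document's features.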
