Type: Article
Title: MDLText: An Efficient and Lightweight Text Classifier
Author: Silva, Renato M.; Almeida, Tiago A.; Yamakami
Abstract: In many areas, the volume of text information is increasing rapidly, thereby demanding efficient text classification approaches. Several methods are available at present, but most exhibit declining performance as the dimensionality of the problem increases, or they incur high computational costs for training, which limit their application in real scenarios. Thus, it is necessary to develop a method that can process high dimensional data in a rapid manner. In this study, we propose the MDLText, an efficient, lightweight, scalable, and fast multinomial text classifier, which is based on the minimum description length principle. MDLText exhibits fast incremental learning as well as being sufficiently robust to prevent overfitting, which are desirable features in real-world applications, large-scale problems, and online scenarios. Our experiments were carefully designed to ensure that we obtained statistically sound results, which demonstrated that the proposed approach achieves a good balance between predictive power and computational efficiency. (C) 2016 Elsevier B.V. All rights reserved.
Subject: Text Categorization; Minimum Description Length; Machine Learning; Natural Language Processing
Editor: Elsevier Science BV
Rights: closed
Identifier DOI: 10.1016/j.knosys.2016.11.018
Date Issue: 2017
Appears in Collections: Unicamp - Artigos e Outros Documentos

Files in This Item:
File                   Size      Format
000393009800014.pdf    1.17 MB   Adobe PDF
