Please use this identifier to cite or link to this item: http://repositorio.unicamp.br/jspui/handle/REPOSIP/341518
Type: Article
Title: Open-set web genre identification using distributional features and nearest neighbors distance ratio
Author: Pritsos, Dimitrios
Rocha, Anderson
Stamatatos, Efstathios
Abstract: Web genre identification can boost information retrieval systems by providing rich descriptions of documents and enabling more specialized queries. The open-set scenario is more realistic for this task, as web genres evolve over time and it is not feasible to define a universally agreed genre palette. In this work, we bring to bear a novel approach to web genre identification underpinned by distributional features acquired by doc2vec and a recently proposed open-set classification algorithm, the nearest neighbors distance ratio classifier. We present experimental results using a benchmark corpus and a strong baseline and demonstrate that the proposed approach is highly competitive, especially when emphasis is placed on precision.
Subject: Genre
Algorithms
Country: Germany
Editor: Springer
Rights: Closed
Identifier DOI: 10.1007/978-3-030-15719-7_1
Address: https://link.springer.com/chapter/10.1007/978-3-030-15719-7_1
Date Issue: 2019
Appears in Collections: IC - Artigos e Outros Documentos
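
The abstract names the nearest neighbors distance ratio classifier as the open-set component. A minimal sketch of that decision rule, as it is commonly described, is given here for orientation only; it is not the authors' implementation. The function name `nndr_predict`, the threshold value 0.8, the genre labels, and the toy 2-D vectors (standing in for doc2vec embeddings) are all assumptions made for illustration.

```python
import math


def nndr_predict(query, train_vectors, train_labels, threshold=0.8):
    """Open-set prediction by nearest neighbors distance ratio (sketch).

    Returns the label of the nearest training sample, or None ("unknown")
    when the query is not clearly closer to one class than to another.
    The threshold of 0.8 is an illustrative assumption, not a tuned value.
    """
    # Distance from the query to every training sample, nearest first.
    ranked = sorted(
        (math.dist(query, vec), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    d_nearest, label_nearest = ranked[0]

    # Nearest sample whose class differs from that of the nearest neighbor.
    d_other = next((d for d, label in ranked if label != label_nearest), None)
    if d_other is None:
        raise ValueError("at least two known classes are required")
    if d_other == 0:
        return None  # query coincides with samples of two classes

    ratio = d_nearest / d_other
    return label_nearest if ratio <= threshold else None


# Toy stand-ins for document embeddings (hypothetical data).
TRAIN = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
LABELS = ["blog", "blog", "eshop", "eshop"]

near_blog = nndr_predict((0.2, 0.1), TRAIN, LABELS)    # close to the "blog" samples
in_between = nndr_predict((2.5, 3.0), TRAIN, LABELS)   # equidistant from both classes
```

In this sketch a lower threshold rejects more documents as unknown, which is one way an open-set classifier can trade recall for precision.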


