Please use this identifier to cite or link to this item:
http://repositorio.unicamp.br/jspui/handle/REPOSIP/341518
Type: | Article |
Title: | Open-set web genre identification using distributional features and nearest neighbors distance ratio |
Author: | Pritsos, Dimitrios; Rocha, Anderson; Stamatatos, Efstathios |
Abstract: | Web genre identification can boost information retrieval systems by providing rich descriptions of documents and enabling more specialized queries. The open-set scenario is more realistic for this task, as web genres evolve over time and it is not feasible to define a universally agreed genre palette. In this work, we present a novel approach to web genre identification based on distributional features acquired by doc2vec and a recently proposed open-set classification algorithm, the nearest neighbors distance ratio classifier. We present experimental results using a benchmark corpus and a strong baseline, and demonstrate that the proposed approach is highly competitive, especially when emphasis is placed on precision. |
Subject: | Genre; Algorithms |
Country: | Germany |
Publisher: | Springer |
Rights: | Closed |
Identifier DOI: | 10.1007/978-3-030-15719-7_1 |
Address: | https://link.springer.com/chapter/10.1007/978-3-030-15719-7_1 |
Date Issue: | 2019 |
Appears in Collections: | IC - Artigos e Outros Documentos |
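
The abstract names the nearest neighbors distance ratio (NNDR) classifier without describing its decision rule. As a rough illustration only, and not the authors' implementation, the sketch below shows the rule as it is commonly described: find the nearest training sample, find the nearest training sample that carries a different label, and accept the first label only if the ratio of the two distances is at or below a threshold; otherwise the document is treated as belonging to an unknown genre. The function name `nndr_predict`, the threshold value 0.8, the genre labels, and the two-dimensional toy vectors (standing in for doc2vec embeddings) are all assumptions made for this example; in practice the threshold would normally be tuned on held-out data.

```python
import math

UNKNOWN = "unknown"  # label returned when the query is rejected (assumed name)


def nndr_predict(train_vectors, train_labels, query, threshold=0.8):
    """Open-set prediction with a nearest neighbors distance ratio rule.

    threshold=0.8 is an illustrative assumption, not a value from the paper.
    """
    # Euclidean distance from the query to every training vector.
    dists = [math.dist(query, v) for v in train_vectors]

    # t: index of the nearest training sample overall.
    t = min(range(len(dists)), key=dists.__getitem__)

    # u: index of the nearest training sample whose label differs from t's.
    others = [i for i, lab in enumerate(train_labels) if lab != train_labels[t]]
    if not others:
        # Only one known class: the ratio is undefined, so fall back to it.
        return train_labels[t]
    u = min(others, key=dists.__getitem__)

    if dists[u] == 0.0:
        # Query coincides with samples of two classes: ambiguous, so reject.
        return UNKNOWN

    ratio = dists[t] / dists[u]
    return train_labels[t] if ratio <= threshold else UNKNOWN


# Toy 2-D vectors standing in for doc2vec document embeddings (hypothetical).
train_vectors = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
train_labels = ["blog", "blog", "eshop", "eshop"]

near_blog = nndr_predict(train_vectors, train_labels, (0.2, 0.1))
in_between = nndr_predict(train_vectors, train_labels, (2.5, 2.8))
print(near_blog, in_between)
```

A query close to one known genre is accepted, while a query roughly equidistant from two genres yields a ratio near 1 and is rejected. Lowering the threshold rejects more documents, which tends to favor precision over recall.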
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2-s2.0-85064856679.pdf | | 361.31 kB | Adobe PDF | View/Open |