Type: Conference paper
Title: Empath: A Framework For Evaluating Entity-level Sentiment Analysis
Author: Ward C.B.
Choi Y.
Skiena S.
Xavier E.C.
Abstract: Sentiment analysis is the fundamental component of text-driven monitoring and forecasting systems, in which the general sentiment towards real-world entities (e.g., people, products, organizations) is analyzed based on the sentiment signals embedded in the myriad of web text available today. Building such systems involves several practically important problems, from data cleansing (e.g., boilerplate removal, web-spam detection) and sentiment analysis at the individual mention level (e.g., phrase-, sentence-, document-level) to the aggregation of sentiment for entity-level (e.g., person, company) analysis. Most previous research in sentiment analysis, however, has focused only on individual mention-level analysis, and relatively little work has addressed the other practically important problems involved in enabling a large-scale sentiment monitoring system. In this paper, we propose Empath, a new framework for evaluating entity-level sentiment analysis. Empath leverages objective measurements of entities in various domains, such as people, companies, countries, movies, and sports, to facilitate entity-level sentiment analysis and tracking. We demonstrate the utility of Empath for the evaluation of a large-scale sentiment system by applying it to various lexicons using Lydia, our own large-scale text-analytics tool, over a corpus consisting of more than a terabyte of newspaper data. We expect that Empath will encourage research that encompasses end-to-end pipelines to enable large-scale text-driven monitoring and forecasting systems. © 2011 IEEE.
Rights: closed
Identifier DOI: 10.1109/CEWIT.2011.6135866
Date Issued: 2011
Appears in Collections: Unicamp - Artigos e Outros Documentos

Files in This Item:
File: 2-s2.0-84857221210.pdf
Size: 426.57 kB
Format: Adobe PDF
