Friday, March 15, 2019

Towards Building a High-Quality Microblog-Specific Chinese Sentiment Lexicon

.
Due to the huge popularity of microblogging services, microblogs have become important sources of customer opinions. Sentiment analysis systems can provide useful knowledge to decision support systems and decision makers by aggregating and summarizing the opinions in massive microblogs automatically. The most important component of sentiment analysis systems is sentiment lexicon. However, the performance of traditional sentiment lexicons on microblog sentiment analysis is far from satisfactory, especially for Chinese. In this paper, we propose a data-driven approach to build a high-quality microblog-specific sentiment lexicon for Chinese microblog sentiment analysis system. The core of our method is a unified framework that incorporates three kinds of sentiment knowledge for sentiment lexicon construction, i.e., the word-sentiment knowledge extracted from microblogs with emoticons, the sentiment similarity knowledge extracted from words' associations among all the messages, and the prior sentiment knowledge extracted from existing sentiment lexicons. In addition, in order to improve the coverage of our sentiment lexicon, we propose an effective method to detect popular new words in microblogs, which considers not only words' distributions over texts, but also their distributions over users. The detected new words with strong sentiment are incorporated in our sentiment lexicon. We built a microblog-specific Chinese sentiment lexicon on a large microblog dataset with more than 17 million messages. Experimental results on two microblog sentiment datasets show that our microblog-specific sentiment lexicon can significantly improve the performance of microblog sentiment analysis.
.
https://www.researchgate.net/publication/301902793_Towards_Building_a_High-Quality_Microblog-Specific_Chinese_Sentiment_Lexicon
.

No comments: