Saturday, July 24, 2021

Sentiment Analysis Concept

Liu, B. (2010). Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing.
Sentiment Analysis, also called Opinion Mining, is one of the most recent research topics within the field of Information Processing. Textual information retrieval techniques are mainly focused on processing, searching or mining factual information. Facts have an objective component; however, there are other textual elements which express subjective characteristics. These elements are mainly opinions, sentiments, appraisals, attitudes, and emotions, which are the focus of Sentiment Analysis.
Serrano-Guerrero, J., Olivas, J.A., Romero, F.P., & Herrera-Viedma, E. (2015). Sentiment analysis: A review and comparative analysis of web services. Inf. Sci., 311, 18-38.
Sentiment analysis (SA) is a process of studying public opinion about an entity.
Sharma, D., Sabharwal, M., Goyal, V., & Vij, M. (2020). Sentiment Analysis Techniques for Social Media Data: A Review.
Sentiment analysis is the computational examination of end user’s opinion, attitudes and emotions towards a particular topic or product.
Singh, N., Tomar, D., & Sangaiah, A.K. (2020). Sentiment analysis: a review and comparative analysis over social media. Journal of Ambient Intelligence and Humanized Computing, 1-21.
Sentiment analysis, also called opinion mining, is the field of study that analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes.
Liu, B. (2012). Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies.
Sentiment analysis or opinion mining is the computational study of opinions, sentiments and emotions expressed in text.
Liu, B. (2010). Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing.
Cambria, E., Das, D., Bandyopadhyay, S., & Feraco, A. (2017). A Practical Guide to Sentiment Analysis.
Sentiments and opinions have also been used interchangeably, perhaps because most NLP research on opinions has focused on detecting their subjective part, which has been referred to as sentiment.
Kim, S., & Hovy, E. (2004). Determining the Sentiment of Opinions. COLING.
Nandal, N., Tanwar, R., & Pruthi, J. (2020). Machine learning based aspect level sentiment analysis for Amazon products. Spatial Information Research, 1-7.
For decades, the research area was mostly ignored until massive amounts of opinions available on the Web gave birth to modern sentiment analysis.
Mäntylä, M., Graziotin, D., & Kuutila, M. (2018). The evolution of sentiment analysis - A review of research topics, venues, and top cited papers. ArXiv, abs/1612.01556.
Since the year 2002, research in sentiment analysis has been very active.
Zhang, L., & Liu, B. (2017). Sentiment Analysis and Opinion Mining. Encyclopedia of Machine Learning and Data Mining.
Tsytsarau, M., & Palpanas, T. (2011). Survey on mining subjective data on the web. Data Mining and Knowledge Discovery, 24, 478-514.
Dave, K., Lawrence, S., & Pennock, D. (2003). Mining the peanut gallery: opinion extraction and semantic classification of product reviews. WWW '03.
The aim of sentiment analysis is to determine the attitudes of a writer or a speaker for a given topic.
Kaur, H., Mangat, V., & Nidhi (2017). A survey of sentiment analysis techniques. 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 921-925.
The aim of sentiment analysis is to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document.
Kidd, T., & Morris, L.R. (2017). Handbook of Research on Instructional Systems and Educational Technology. p.339
Generally speaking, sentiment analysis aims to determine the attitude of a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgment or evaluation, affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).
The attitude may be his or her judgment or evaluation, affective state, i.e. the emotional state of the author when writing, or the intended emotional communication, i.e. the emotional effect the author wishes to have on the reader.
1) Affect is a predecessor to feelings and emotions. 2) Feelings are person-centered, conscious phenomena. 3) Emotions are preconscious social expressions of feelings and affect influenced by culture. 4) Sentiments are partly social constructs of emotions that develop over time and are enduring. 5) Opinions are personal interpretations of information that may or may not be emotionally charged.
Munezero, M., Montero, C., Sutinen, E., & Pajunen, J. (2014). Are They Different? Affect, Feeling, Emotion, Sentiment, and Opinion Detection in Text. IEEE Transactions on Affective Computing, 5, 101-111.
In the past years, sentiment analysis has been mainly considered as a classification problem in the setting of machine learning, e.g., polarity classification of sentiments to one of two categories, namely, positive and negative.
Liu, H., & Haig, E. (2017). Fuzzy information granulation towards interpretable sentiment analysis. Granular Computing, 2, 289-302.
In the early days, a simple classification according to the semantic polarity (positiveness, negativeness or neutralness) of a document was predominant, whereas in the meantime, research activities have shifted towards a more sophisticated modeling of sentiments.
Buechel, S., & Hahn, U. (2017). Readers vs. Writers vs. Texts: Coping with Different Perspectives of Text Understanding in Emotion Annotation. LAW@ACL.
Sentiment Analysis can be considered a classification process as illustrated in Fig. 1.
Medhat, W., Hassan, A., & Korashy, H. (2014). Sentiment analysis algorithms and applications: A survey. Ain Shams Engineering Journal, 5, 1093-1113.
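To make the idea of polarity classification concrete, here is a minimal sketch that assigns one of three labels by counting hits in a tiny word list. The word lists and the function name `classify_polarity` are illustrative assumptions and are not taken from any of the cited works; the surveys quoted here mostly describe classifiers learned from labeled data, which this toy version does not attempt.

```python
import re

# Toy lexicon: hypothetical word lists, for illustration only.
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "sad"}


def classify_polarity(text: str) -> str:
    """Label text as 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify_polarity("The battery life is great and I love the screen"))
print(classify_polarity("Terrible service"))
print(classify_polarity("The parcel arrived on Monday"))
```

A counting rule like this ignores context entirely, which is one reason the literature above moves on to more sophisticated modeling.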
Research activities have shifted towards a more sophisticated modeling of sentiments. This includes the extension from only a few basic to more varied emotional classes, sometimes even assigning real-valued scores (Strapparava and Mihalcea, 2007), the aggregation of multiple aspects of an opinion item into a composite opinion statement for the whole item (Schouten and Frasincar, 2016), and sentiment compositionality on sentence level (Socher et al., 2013).
Buechel, S., & Hahn, U. (2017). Readers vs. Writers vs. Texts: Coping with Different Perspectives of Text Understanding in Emotion Annotation. LAW@ACL.
Branching from the field of SA, whose core intent is to analyze human language by extracting opinions, ideas, and thoughts through the assignment of polarities (negative, positive, or neutral), is the subfield of emotion detection (ED), which seeks to extract finer-grained emotions such as happy, sad, angry, and so on, from human languages rather than coarse-grained and general polarity assignments in SA.
Acheampong, F.A., Wenyu, C., & Nunoo-Mensah, H. (2020). Text‐based emotion detection: Advances, challenges, and opportunities.
Wang, Y., Luo, J., Niemi, R., Li, Y., & Hu, T. (2016). Catching Fire via "Likes": Inferring Topic Preferences of Trump Followers on Twitter. ICWSM.
Wolny, W. (2016). Emotion Analysis of Twitter Data That Use Emoticons and Emoji Ideograms. ISD.
Opinion mining is possible on four different levels, namely document level, sentence level, aspect level, and concept level. Document level (Moraes et al. 2013) of opinion mining is the most abstract level of sentiment analysis and so is not appropriate for precise evaluations. The result of this level of analysis is usually general information about the document's polarity, which cannot be very accurate. Sentence level opinion mining (Marcheggiani et al. 2014) is a fine-grain analysis that could be more accurate. Since the polarity of the sentences of an opinion does not necessarily imply the same polarity for the whole opinion, aspect level of opinion mining (Xia et al. 2015) has been considered by researchers as the third level of opinion mining and sentiment analysis. Concept level opinion mining is the fourth level of sentiment analysis, which focuses on the semantic analysis of the text and analyzes the concepts which do not explicitly express any emotion (Poria et al. 2014). Several recent surveys and reviews on sentiment analysis consider these levels of opinion mining from this point of view (Medhat et al. 2014; Ravi and Ravi 2015; Balazs and Velasquez 2016; Yan et al. 2017; Sun et al. 2017; Lo et al. 2017).
Hemmatian, F., & Sohrabi, M. (2017). A survey on classification techniques for opinion mining and sentiment analysis. Artificial Intelligence Review, 1-51.
In general, sentiment analysis has been investigated mainly at three levels [1]. In document level the main task is to classify whether a whole opinion document expresses a positive or negative sentiment. This level of analysis assumes that each document expresses opinions on a single entity. In sentence level the main task is to check whether each sentence expressed a positive, negative, or neutral opinion. This level of analysis is closely related to subjectivity classification, which distinguishes objective sentences that express factual information from subjective sentences that express subjective views and opinion.
Devika, M., Sunitha, C., & Ganesh, A. (2016). Sentiment Analysis: A Comparative Study on Different Approaches. Procedia Computer Science, 87, 44-49.
There is no fundamental difference between document and sentence level classifications because sentences are just short documents.
Liu, B. (2012). Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies.
Document level and the sentence level analyses do not discover what exactly people liked and did not like. Aspect level performs finer-grained analysis. Instead of looking at language constructs (documents, paragraphs, sentences, clauses or phrases), aspect level directly looks at the opinion itself.
Devika, M., Sunitha, C., & Ganesh, A. (2016). Sentiment Analysis: A Comparative Study on Different Approaches. Procedia Computer Science, 87, 44-49.
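The difference between the document and sentence levels can be illustrated with a small sketch. It scores the same review once as a whole and once sentence by sentence, using an invented toy lexicon; the names `LEXICON`, `document_level`, and `sentence_level` are assumptions made for this example. The aspect level is not implemented here, since it would also need to identify what each opinion is about.

```python
import re

# Hypothetical toy lexicon mapping words to polarity scores.
LEXICON = {"great": 1, "love": 1, "good": 1, "bad": -1, "slow": -1, "awful": -1}


def score(text: str) -> int:
    """Sum the lexicon scores of the words in text."""
    return sum(LEXICON.get(tok, 0) for tok in re.findall(r"[a-z]+", text.lower()))


def label(value: int) -> str:
    """Map a numeric score to a polarity label."""
    return "positive" if value > 0 else "negative" if value < 0 else "neutral"


def sentence_level(document: str) -> list:
    """Classify each sentence separately; returns (sentence, label) pairs."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]
    return [(s, label(score(s))) for s in sentences]


def document_level(document: str) -> str:
    """Classify the whole document as a single unit."""
    return label(score(document))


review = "The camera is great. The battery is awful. I love the screen."
print(document_level(review))
for sentence, polarity in sentence_level(review):
    print(polarity, "-", sentence)
```

The document-level label comes out positive, which hides the complaint about the battery that the sentence-level pass still reports.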
Ito, T., Tsubouchi, K., Sakaji, H., Yamashita, T., & Izumi, K. (2020). Word-Level Contextual Sentiment Analysis with Interpretability. AAAI.
There are many factors that make sentiment analysis difficult compared to traditional text classification. (1) Domain dependency, (2) Spam, (3) Limitation of classification filtering, (4) Asymmetry in availability of opinion mining software, (5) Incorporation of opinion with implicit and behavior data, (6) Natural language processing overheads.
Mukkarapu, C.S., & Vemula, R.K. (2014). Opinion Mining and Sentiment Analysis: A Survey.
Since opinion mining is a relatively new field, there are several challenges to be faced. According to Reference [4], current techniques for identifying and extracting opinions and comparisons are still primitive. Mainly these challenges are related to the authenticity of the extracted data and the methods used in it.
Seerat, B., & Azam, F. (2012). Opinion Mining: Issues and Challenges (A survey). International Journal of Computer Applications, 49, 42-51.
Besides the typical challenges known from natural language processing and text processing, many challenges for opinion mining in social media sources make the detection and processing of opinions a complicated task: (1) Noisy texts, (2) Language variations, (3) Relevance and boilerplate, (4) Target identification.
Petz, G., Karpowicz, M.A., Fürschuß, H., Auinger, A., Stríteský, V., & Holzinger, A. (2013). Opinion Mining on the Web 2.0 - Characteristics of User Generated Content and Their Impacts. CHI-KDD.
There are several challenges in the field of sentiment analysis. The most common challenges are given here. Firstly, Word Sense Disambiguation (WSD), a classical NLP problem, is often encountered. Secondly, there is the problem of sudden deviation from positive to negative polarity. Thirdly, negations, unless handled properly, can completely mislead. Fourthly, keeping the target in focus can be a challenge [7].
Chandrakala, S., & Sindhu, C. (2012). Opinion Mining and Sentiment Classification: A Survey. SOCO 2012.
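The negation problem mentioned above is easy to reproduce. The following sketch flips the polarity of a word when it directly follows a negation cue; the lexicon, the cue list, and the one-word negation window are all simplifying assumptions for illustration, and contractions such as "isn't" are not handled.

```python
import re

# Hypothetical toy lexicon and negation cues, for illustration only.
LEXICON = {"good": 1, "great": 1, "bad": -1, "terrible": -1}
NEGATIONS = {"not", "no", "never"}


def score_with_negation(text: str) -> int:
    """Sum word polarities, flipping the sign of a word that directly follows a negation cue."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        total += polarity
    return total


print(score_with_negation("The food was good"))      # 1
print(score_with_negation("The food was not good"))  # -1
```

Without the flip, both sentences would receive the same positive score, which is exactly the kind of misleading result the quoted survey warns about.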
Although most works approach it as a simple categorization problem, sentiment analysis is actually a suitcase research problem that requires tackling many NLP tasks.
Cambria, E., Poria, S., Gelbukh, A., & Thelwall, M. (2017). Sentiment Analysis Is a Big Suitcase. IEEE Intelligent Systems, 32, 74-80.
Such NLP problems are organized into three layers: syntactics, semantics, and pragmatics.


Tuesday, December 29, 2020

Speech and Language Processing An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition

Speech and Language Processing An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition Third Edition draft Daniel Jurafsky Stanford University James H. Martin University of Colorado at Boulder Copyright ©2020. All rights reserved. Draft of December 30, 2020. Comments and typos welcome!
Summary of Contents
  1. Introduction
  2. Regular Expressions, Text Normalization, Edit Distance
  3. N-gram Language Models
  4. Naive Bayes and Sentiment Classification
  5. Logistic Regression
  6. Vector Semantics and Embeddings
  7. Neural Networks and Neural Language Models
  8. Sequence Labeling for Parts of Speech and Named Entities
  9. Deep Learning Architectures for Sequence Processing
  10. Contextual Embeddings
  11. Machine Translation and Encoder-Decoder Models
  12. Constituency Grammars
  13. Constituency Parsing
  14. Dependency Parsing
  15. Logical Representations of Sentence Meaning
  16. Computational Semantics and Semantic Parsing
  17. Information Extraction
  18. Word Senses and WordNet
  19. Semantic Role Labeling
  20. Lexicons for Sentiment, Affect, and Connotation
  21. Coreference Resolution
  22. Discourse Coherence
  23. Question Answering
  24. Chatbots & Dialogue Systems
  25. Phonetics
  26. Automatic Speech Recognition and Text-to-Speech
  Bibliography
  Subject Index

Tuesday, October 13, 2020

6 practices for efficient Selenium Web Browser Automation

 

Given that Selenium is one of the most widely used frameworks for running automated tests on browsers, it is also one of the commonly discussed topics in testing circles. Selenium’s powerful open-source features and adoption across multiple browsers make it an exceptionally useful tool for browser automation.


Selenium allows the running of automated tests in multiple programming languages against browsers and devices by using a cloud Selenium Grid, similar to what BrowserStack provides. And since Selenium tests have become an indispensable part of the software testing pipeline, it makes sense to outline a few best practices.


This article will discuss six practices that contribute to efficient Selenium web browser automation, allowing thorough verification of websites within challenging timelines.


1. Correct Usage of Locators

The purpose of Selenium is to automate user actions, thus interacting with the browser in order to navigate, click, type, and run multiple operations to check objects within the DOM. Interacting with each web element on a website requires identifying those elements, using Locators in Selenium.


Selenium Locators are of numerous types:


Class

ID

Link Text

Tag Name

XPath

Selecting the right locator can make a test script flexible and offer a greater chance of success. Conversely, a badly picked locator can result in a brittle test that breaks at the first sign of a UI change. Use unique Classes or IDs as locators, since it is unlikely that they will change without someone on the QA or dev team being informed. However, locators like link text can change quite often – for example, a dynamic button that appears only when a user is logged in.


2. Use Data-Driven Testing

Data-Driven testing allows testers to use the same test and the same code for different input types and values. When combined with Selenium, it allows devs and QAs to insert modifications. Consequently, this can be used for system functional testing and browser compatibility testing.
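As a rough illustration of the data-driven idea, the sketch below runs one check against a table of inputs and expected results. To stay runnable without a browser it does not use Selenium; `is_valid_username` is a hypothetical stand-in for whatever behaviour the test would exercise through the browser, and the test data is invented.

```python
# Each row is one test case: (input value, expected result).
# The data and the function under test are hypothetical stand-ins.
TEST_DATA = [
    ("alice", True),
    ("", False),
    ("bob@example", False),
    ("carol_99", True),
]


def is_valid_username(value: str) -> bool:
    """Stand-in for the behaviour under test (e.g. a form field validated in the browser)."""
    return bool(value) and all(ch.isalnum() or ch == "_" for ch in value)


def run_data_driven(check, rows):
    """Run the same check against every row and collect the failing rows."""
    failures = []
    for value, expected in rows:
        actual = check(value)
        if actual != expected:
            failures.append((value, expected, actual))
    return failures


failures = run_data_driven(is_valid_username, TEST_DATA)
print(f"{len(TEST_DATA) - len(failures)} passed, {len(failures)} failed")
```

Adding a new scenario then means adding a row to the table, not writing another test. In a real suite the rows might be loaded from a CSV file or spreadsheet instead of being hard-coded.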


Leverage Selenium’s proprietary test accelerator for automation in order to reduce the time required for each automation cycle to complete. Selenium also comes with more than 90 functional libraries for clients to use for the purpose of initiating the automation process.


3. Choose the Selector Order

Selectors like XPath and CSS are based on element locations. They operate slower when compared to locators like ID, Name, and Link Text. Name and ID are especially effective because they operate in a straightforward and direct way. CSS acts mostly as a combination of ID and Name. XPath, because of its complexity, should be used only as a last resort.


In order of preference, Selenium locators rank as follows: XPath < CSS < Link Text < Name < ID. Start with ID in the test code and make XPath the last selector.
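The ordering above can be captured in a small helper that picks the most preferred locator available for an element. This is only a sketch: `pick_locator` and the `login_button` description are hypothetical, and the code runs without Selenium. The strategy strings are chosen to match the values of the `By` constants in Selenium's Python bindings, so the returned pair could be unpacked into `driver.find_element`.

```python
# Preference order from the text: try ID first and fall back to XPath last.
LOCATOR_PRIORITY = ["id", "name", "link text", "css selector", "xpath"]


def pick_locator(available: dict) -> tuple:
    """Return the (strategy, value) pair with the highest priority among those available."""
    for strategy in LOCATOR_PRIORITY:
        value = available.get(strategy)
        if value:
            return strategy, value
    raise ValueError("no usable locator for this element")


# Hypothetical description of a login button with three possible locators.
login_button = {
    "xpath": "//form/div[2]/button",
    "css selector": "form button.primary",
    "id": "login-submit",
}
print(pick_locator(login_button))
```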


4. Use Page Objects

Page Objects enhance test maintenance and reduce code duplication. A Page Object is an object-oriented class that serves as an interface to a page of the application under test. In other words, Page Object is an object-oriented design pattern in which web pages are defined as classes. The different web elements on each page become the variables of the class. User interactions (which are automated during testing) are implemented as methods.


Page Objects help create robust frameworks by resisting minor UI tweaks. They also help to separate test code and page code.

They ensure that services are not spread throughout the test but rather that there is a single repository for all services offered by the page.

They are reliable, and easier to maintain.

They keep the script readable and code reusable.

They eliminate code duplication.
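A minimal sketch of the pattern might look like the following. So that it runs without a browser, it uses a `FakeDriver` stand-in instead of a real Selenium driver; the class names, method names, and element IDs are all hypothetical.

```python
class FakeDriver:
    """Stand-in for a browser driver so the sketch runs without Selenium."""

    def __init__(self):
        self.fields = {}
        self.clicked = []

    def type_into(self, element_id, text):
        self.fields[element_id] = text

    def click(self, element_id):
        self.clicked.append(element_id)


class LoginPage:
    """Page Object: element locators are attributes, user actions are methods."""

    USERNAME_ID = "username"  # hypothetical element IDs
    PASSWORD_ID = "password"
    SUBMIT_ID = "login-submit"

    def __init__(self, driver):
        self.driver = driver

    def log_in(self, username, password):
        self.driver.type_into(self.USERNAME_ID, username)
        self.driver.type_into(self.PASSWORD_ID, password)
        self.driver.click(self.SUBMIT_ID)


# The test talks to the page object, never to element IDs directly.
driver = FakeDriver()
LoginPage(driver).log_in("alice", "s3cret")
print(driver.fields, driver.clicked)
```

If the login form's markup changes, only the constants in `LoginPage` need updating; every test that calls `log_in` stays as it is.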



5. Use Selenium Waits. Avoid Thread.sleep()

Instead of sleep, use Selenium Wait commands. When Thread.sleep() is used, the code will pause for the specified period of time, no matter what. With the Implicit Wait command, however, Selenium polls the DOM until the element is found or the timeout expires. Its timeout is, by default, set to zero. An Explicit Wait similarly polls until a given condition is fulfilled.


Obviously, using Implicit Wait is more effective than using Thread.sleep(). Why force the code to wait any amount of time it doesn’t have to? Every second counts within tight deadlines and waits are an excellent tool to establish better time management for Selenium tests.
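The difference between a fixed sleep and a polling wait can be shown without a browser. The sketch below implements a generic `wait_until` helper (a hypothetical name, not a Selenium API) that returns as soon as a condition holds. In real tests, Selenium's Python bindings provide `WebDriverWait(...).until(...)` for this purpose.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.1):
    """Poll condition() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated element that becomes available about 0.3 seconds from now.
ready_at = time.monotonic() + 0.3
start = time.monotonic()
wait_until(lambda: time.monotonic() >= ready_at, timeout=5.0)
elapsed = time.monotonic() - start
print(f"waited {elapsed:.2f}s instead of a fixed 5s sleep")
```

The timeout is only an upper bound here: the call returns shortly after the condition becomes true, whereas a fixed sleep always costs the full duration.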


6. Use Java Runtime Environment (JRE) 1.6 or Later

If, at the beginning of an integration test, the following error shows up:


java.lang.NoSuchFieldError: java/util/concurrent/TimeUnit.HOURS


it means that a newer version of Java (1.6 or later) is required to proceed with testing. Since the Selenium server is written in Java, this runtime error indicates that the installed Java runtime needs an upgrade. Download the latest Selenium server version from the official Selenium website.


With the java command present in the PATH, use the command java -jar selenium-server-standalone-2.x.x.jar to start the Selenium server, replacing 2.x.x with the actual Selenium server version.


Automation testing with Selenium is a good way to create a stable, true, and reliable UI automation process. However, always pair Selenium tests with a multitude of real browsers and devices. Without being run in real user conditions, the result of any tests run will be inconclusive at best.


Simply use BrowserStack’s real device cloud to access thousands of real browsers and devices for testing purposes. Sign up, choose the required device-browser-OS combination, and start testing websites for free.

https://www.browserstack.com/guide/selenium-web-browser-automation


Modern Web Automation With Python and Selenium

 

TOPICS:

Motivation: Tracking Listening Habits

Setup

Test Driving a Headless Browser

Groovin’ on Tunes

Exploring the Catalogue

Building a Class

Collecting Structured Data

What’s Next and What Have You Learned?


Motivation: Tracking Listening Habits

Suppose that you have been listening to music on bandcamp for a while now, and you find yourself wishing you could remember a song you heard a few months back.


Sure, you could dig through your browser history and check each song, but that might be a pain… All you remember is that you heard the song a few months ago and that it was in the electronic genre.


“Wouldn’t it be great,” you think to yourself, “if I had a record of my listening history? I could just look up the electronic songs from two months ago, and I’d surely find it.”


Today, you will build a basic Python class, called BandLeader, that connects to bandcamp.com, streams music from the “discovery” section of the front page, and keeps track of your listening history.


The listening history will be saved to disk in a CSV file. You can then explore that CSV file in your favorite spreadsheet application or even with Python.
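To give a feel for the CSV part, here is a small sketch of writing a listening history with Python's standard `csv` module. It is not the code from the original tutorial: the `TrackRec` record, its field names, and the sample tracks are assumptions made for this example, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io
from collections import namedtuple

# Hypothetical record layout; the field names are assumptions.
TrackRec = namedtuple("TrackRec", ["title", "artist", "genre", "played_at"])


def save_history(tracks, fileobj):
    """Write the listening history as CSV rows with a header line."""
    writer = csv.writer(fileobj)
    writer.writerow(TrackRec._fields)
    writer.writerows(tracks)


history = [
    TrackRec("Example Song", "Example Artist", "electronic", "2019-08-01T20:15:00"),
    TrackRec("Another Song", "Other Artist", "ambient", "2019-08-01T20:19:30"),
]

# An in-memory buffer keeps the sketch self-contained; in practice you would
# pass a file opened with open(path, "w", newline="").
buffer = io.StringIO()
save_history(history, buffer)
print(buffer.getvalue())
```

With a genre and a timestamp on every row, finding "the electronic songs from two months ago" becomes a simple filter in a spreadsheet or a few lines of Python.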


If you have had some experience with web scraping in Python, you are familiar with making HTTP requests and using Pythonic APIs to navigate the DOM. You will do more of the same today, except with one difference.


Today you will use a full-fledged browser running in headless mode to do the HTTP requests for you.


A headless browser is just a regular web browser, except that it contains no visible UI element. Just like you’d expect, it can do more than make requests: it can also render HTML (though you cannot see it), keep session information, and even perform asynchronous network communications by running JavaScript code.


If you want to automate the modern web, headless browsers are essential.


https://realpython.com/modern-web-automation-with-python-and-selenium/


Tuesday, October 22, 2019

A Survey of Sentiment Lexicons


A Survey of Sentiment Lexicons
Sagar Ahire Published 2015

Abstract: This is a survey paper that introduces sentiment lexicons and explains the state of the art in the field of sentiment lexicons. Different kinds of lexicons are covered, varying in aspects such as coverage, methods of creation, lexical unit and granularity. It aims at giving a representative sampling of the field of sentiment lexicons.
https://pdfs.semanticscholar.org/2522/de6022acf2bc7d5c12a9467d4c41f6358920.pdf

Sentiment Analysis: Concept, Analysis and Applications


Sentiment analysis is contextual mining of text which identifies and extracts subjective information in source material, helping a business to understand the social sentiment of its brand, product or service while monitoring online conversations. However, analysis of social media streams is usually restricted to just basic sentiment analysis and count based metrics. This is akin to just scratching the surface and missing out on those high value insights that are waiting to be discovered. So what should a brand do to capture that low hanging fruit?
With the recent advances in deep learning, the ability of algorithms to analyse text has improved considerably. Creative use of advanced artificial intelligence techniques can be an effective tool for doing in-depth research. We believe it is important to classify incoming customer conversation about a brand along the following lines:
  1. Key aspects of a brand’s product and service that customers care about.
  2. Users’ underlying intentions and reactions concerning those aspects.
These basic concepts, when used in combination, become a very important tool for analyzing millions of brand conversations with human level accuracy. In the post, we take the example of Uber and demonstrate how this works. Read On!

Text Classifier — The basic building blocks

Sentiment Analysis
Sentiment Analysis is the most common text classification tool that analyses an incoming message and tells whether the underlying sentiment is positive, negative or neutral. You can input a sentence of your choice and gauge the underlying sentiment by playing with the demo here.
Intent Analysis
Intent analysis steps up the game by analyzing the user’s intention behind a message and identifying whether it relates to an opinion, news, marketing, complaint, suggestion, appreciation or query.
Analyzing intent of textual data
Contextual Semantic Search(CSS)
Now this is where things get really interesting. To derive actionable insights, it is important to understand which aspect of the brand a user is discussing. For example, Amazon would want to segregate messages that relate to late deliveries, billing issues, promotion related queries, product reviews etc. On the other hand, Starbucks would want to classify messages based on whether they relate to staff behavior, new coffee flavors, hygiene feedback, online orders, store name and location etc. But how can one do that?
We introduce an intelligent search algorithm called Contextual Semantic Search (a.k.a. CSS). CSS takes thousands of messages and a concept (like Price) as input and filters all the messages that closely match the given concept. The graphic shown below demonstrates how CSS represents a major improvement over existing methods used by the industry.
Existing approach vs Contextual Semantic Search
A conventional approach for filtering all Price related messages is to do a keyword search on Price and other closely related words like (pricing, charge, $, paid). This method however is not very effective, as it is almost impossible to think of all the relevant keywords and their variants that represent a particular concept. CSS on the other hand just takes the name of the concept (Price) as input and filters all the contextually similar messages, even where the obvious variants of the concept keyword are not mentioned.
For the curious people, we would like to give a glimpse of how this works. An AI technique is used to convert every word into a specific point in the hyperspace and the distance between these points is used to identify messages where the context is similar to the concept we are exploring. A visualization of how this looks under the hood can be seen below:
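The general idea of comparing points in a vector space can be sketched in a few lines. This is not the CSS algorithm itself, whose details are not given here; it is a simplified stand-in that averages hand-made word vectors per message and keeps messages whose cosine similarity to the concept vector exceeds a threshold. The vectors, the threshold of 0.8, and the function names are all invented for illustration.

```python
import math

# Hand-made 3-dimensional "embeddings"; real systems learn vectors with
# hundreds of dimensions. All values here are invented for illustration.
VECTORS = {
    "price": (0.9, 0.1, 0.0),
    "fare": (0.8, 0.2, 0.1),
    "charged": (0.7, 0.3, 0.0),
    "driver": (0.1, 0.9, 0.1),
    "rude": (0.0, 0.8, 0.3),
    "app": (0.1, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def message_vector(message):
    """Average the vectors of the known words in a message (None if there are none)."""
    known = [VECTORS[w] for w in message.lower().split() if w in VECTORS]
    if not known:
        return None
    return tuple(sum(dim) / len(known) for dim in zip(*known))


def related_to(concept, messages, threshold=0.8):
    """Keep messages whose averaged vector lies close to the concept vector."""
    target = VECTORS[concept]
    kept = []
    for msg in messages:
        vec = message_vector(msg)
        if vec is not None and cosine(vec, target) >= threshold:
            kept.append(msg)
    return kept


messages = ["they charged me double fare", "the driver was rude", "app keeps crashing"]
print(related_to("price", messages))
```

Only the first message is kept, even though the word "price" never appears in it, which is the behaviour a plain keyword search on "price" would miss.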
Visualizing contextually related Tweets
Time to see CSS in action and how it works on comments related to Uber in the examples below:
Similarly, have a look at this tweet:
In both the cases above, the algorithm classifies these messages as being contextually related to the concept called Price even though the word Price is not mentioned in these messages.

Uber: A deep dive analysis

Uber, the highest valued start-up in the world, has been a pioneer in the sharing economy. Being operational in more than 500 cities worldwide and serving a gigantic user base, Uber gets a lot of feedback, suggestions, and complaints from users. Often, social media is the most preferred medium to register such issues. The huge amount of incoming data makes analyzing, categorizing, and generating insights a challenging undertaking.
We analyzed the online conversations happening on digital media about a few product themes: Cancel, Payment, Price, Safety and Service.
For a wide coverage of data sources, we took data from latest comments on Uber’s official Facebook page, Tweets mentioning Uber and latest news articles around Uber. Here’s a distribution of data points across all the channels:
  1. Facebook: 34,173 Comments
  2. Twitter: 21,603 Tweets
  3. News: 4,245 Articles
Analyzing sentiments of user conversations can give you an idea about overall brand perceptions. But, to dig deeper, it is important to further classify the data with the help of Contextual Semantic Search.
We ran the Contextual Semantic Search algorithm on the same dataset, taking the aforementioned categories into account (Cancel, Payment, Price, Safety, and Service).

FACEBOOK

Sentiment Analysis
Breakdown of Sentiment for Categories
Noticeably, comments related to all the categories but one have a predominantly negative sentiment. The number of positive comments related to Price outnumbers the negative ones. To dig deeper, we analyzed the intent of these comments. Facebook being a social platform, the comments are crowded with random content, news shares, marketing and promotional content, and spam/junk/unrelated content. Have a look at the intent analysis on the Facebook comments:
Intent analysis of Facebook comments
Thus, we removed all such irrelevant intent categories and reproduced the result:
Filtered Sentiment Analysis
There is noticeable change in the sentiment attached to each category. Especially in Price related comments, where the number of positive comments has dropped from 46% to 29%.
This gives us a glimpse of how CSS can generate in-depth insights from digital media. A brand can thus analyze such Tweets and build upon the positive points from them or get feedback from the negative ones.

TWITTER

Sentiment Analysis
A similar analysis was done for crawled Tweets. In the initial analysis Payment and Safety related Tweets had a mixed sentiment.
Category wise sentiment analysis
To understand real user opinions, complaints and suggestions, we have to again filter the unrelated Tweets (spam, junk, marketing, news and random):
Filtered sentiment
There is a remarkable reduction in the number of positive Payment related Tweets. Also, there is a significant drop in the number of positive Tweets for the category Safety (and related keywords).
Additionally, Cancel, Payment and Service (and related words) are the most talked about topics in the comments on Twitter. It seems that people talked most about drivers cancelling their ride and the cancellation fee charged to them. Have a look at this Tweet:
A brand like Uber can rely on such insights and act upon the most critical topics. For example, Service related Tweets carried the lowest percentage of positive Tweets and the highest percentage of negative ones. Uber can thus analyze such Tweets and act upon them to improve the service quality.

NEWS

Sentiment Analysis for News headlines
Understandably so, Safety has been the most talked about topic in the news. Interestingly, news sentiment is positive overall and individually in each category as well.
We also classified news articles by their popularity score. The popularity score is based on the article’s share count across different social media channels. Here’s a list of the top news articles:
  1. Uber C.E.O. to Leave Trump Advisory Council After Criticism
  2. #DeleteUber: Users angry at Trump Muslim ban scrap app
  3. Uber Employees Hate Their Own Corporate Culture, Too
  4. Every time we take an Uber we’re spreading its social poison
  5. Furious customers are deleting the Uber app after drivers went to JFK airport during a protest and strike
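
A popularity ranking of this kind can be sketched as follows. This is a minimal illustration that assumes the score is simply the sum of share counts across channels; the article does not give the exact formula, and the article titles, channel names, and counts below are made up.

```python
def popularity_score(share_counts):
    """Sum an article's share counts over all social media channels.

    Assumption: plain summation. The original formula is not stated.
    """
    return sum(share_counts.values())


# Hypothetical articles with made-up share counts per channel.
articles = [
    {"title": "Article A", "shares": {"facebook": 120, "twitter": 40}},
    {"title": "Article B", "shares": {"facebook": 30, "twitter": 300, "linkedin": 15}},
    {"title": "Article C", "shares": {"twitter": 10}},
]

# Rank articles from most to least shared.
ranked = sorted(articles, key=lambda a: popularity_score(a["shares"]), reverse=True)
top_titles = [a["title"] for a in ranked]
```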

Conclusion

With advances in technology, the age of getting meaningful insights from social media data has arrived. The Uber case study gives you a glimpse of the power of Contextual Semantic Search. It’s time for your organization to move beyond overall sentiment and count-based metrics. Companies have been leveraging the power of data lately, but to get the deepest insights, you have to leverage the power of AI, deep learning, and intelligent classifiers such as Contextual Semantic Search and Sentiment Analysis. At Karna, you can contact us to license our technology or get a customized dashboard for generating meaningful insights from digital media. You can check the demo here.

ParallelDots AI APIs is a deep learning powered web service by ParallelDots Inc. that can comprehend a huge amount of unstructured text and visual content to empower your products. You can check out some of our text analysis APIs and reach out to us by filling in this form here, or write to us at apis@paralleldots.com.
.
https://towardsdatascience.com/sentiment-analysis-concept-analysis-and-applications-6c94d6f58c17

Monday, October 21, 2019

Gavagai Sentiment Analysis and Opinion Mining


.
Sentiment Analysis and Opinion Mining

Sentiment analysis, also known as opinion mining, is the practice of gauging the sentiment expressed in a text, such as a social media post or a Google review. Analysts typically either code a solution (for example, in Python) or use a pre-built analytics solution such as Gavagai Explorer.
What is Sentiment Analysis?
Sentiment analysis or opinion mining is a notoriously difficult sub-field of Natural Language Processing and Data Science. At the most fundamental level, the task is to take a piece of text and automatically score it for the opinions and sentiments contained within.

“I had the most wonderful stay” (= positive/satisfaction).
“I’m really disappointed with the battery life of my device” (= negative/dissatisfaction).
These examples are relatively easy to deal with. However, we soon run into problematic cases.

The phone was well packaged but I had to wait a whole week for delivery.
It is obvious to a human that the customer is dissatisfied with the delivery speed. But, taking a step back, where is it actually stated that waiting a week for delivery is bad? There are no overtly negative words.
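
To make the difficulty concrete, here is a minimal sketch of a naive word-counting scorer, assuming a tiny hand-made lexicon. The word lists and function names are hypothetical and this is not how Gavagai or any production system works. It handles the two easy examples but mislabels the delivery sentence, because no word in that sentence is overtly negative.

```python
import re

# Tiny hypothetical lexicons, chosen only for this illustration.
POSITIVE = {"wonderful", "well", "delicious", "great"}
NEGATIVE = {"disappointed", "appalling", "terrible", "bad"}


def lexicon_score(text):
    """Count positive words minus negative words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


def label(score):
    """Map a numeric score to a ternary sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


easy_pos = label(lexicon_score("I had the most wonderful stay"))
easy_neg = label(lexicon_score(
    "I'm really disappointed with the battery life of my device"))
# "well" is the only lexicon hit, so the complaint about delivery is missed.
hard = label(lexicon_score(
    "The phone was well packaged but I had to wait a whole week for delivery."))
```

Here `hard` comes out as "positive", even though a human reader would see the sentence as at best mixed.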

It is also important to separate the satisfaction with the packaging from the dissatisfaction with the delivery. These are different, unrelated aspects of the product.

There is an abundance of other difficulties with automatic sentiment analysis, including, but not limited to: lexical ambiguity, domain dependent model overfitting, lack of training data, lack of sufficiently-varied training data.

Why is Sentiment Analysis important?
Automated Sentiment Analysis is essential for properly understanding and quantifying the opinions expressed in the text. With large amounts of data, understanding the feedback in any meaningful way becomes time-consuming and expensive. On an Internet-wide scale, resorting to manual categorisation is impossible.

For online data, the insight lies in how people online are talking about your brand. For proprietary data, such as customer satisfaction or employee satisfaction reviews, the key business insight is in properly gauging the satisfaction level of respondents.

How does Gavagai handle Sentiment Analysis?
The most common sentiment analysis solutions in the industry use a machine learning (or deep learning) approach. An algorithm makes generalisations from large, annotated sets of data, which are then applied to customer texts. These models function as a ‘black box’ with no possibility of explanation or interpretation. Such an approach also does not transfer well to unseen data from other domains or industries.

Most services offer a binary classification (positive/negative) or a ternary classification (positive/negative/neutral). At Gavagai, we offer a wide spectrum of eight different sentiments: positivity, negativity, scepticism, love, hate, fear, desire and violence. This provides a more nuanced understanding of texts and comments.

We rely on a heuristic-based method which is explainable, interpretable and scalable. It has also proven to work well on gold standard benchmarks from academia. In experiments for customers, the method performs well across a range of different data types, freeing us from the classic Machine Learning problem of overfitting. (This is where a model learns patterns that are too specific to the data it was trained on, at the expense of generalising well to unseen data. Dealing with new data is extremely important for commercial sentiment analysis.)

A more advanced task is to identify how expressed opinions actually relate to the different entities in the text.

The food was delicious but the service was appalling.
In this last example, it is helpful if we can attach the sentiment of ‘delicious’ to ‘food’ and the sentiment of ‘appalling’ to ‘service’. We use a topical sentiment detection algorithm to attach sentiments in the text to the topics they describe. This is sometimes called aspect-based sentiment analysis.
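
The idea of attaching sentiments to topics can be illustrated with a very rough sketch. This is not Gavagai's algorithm, whose details are not described here. It assumes hypothetical aspect and opinion word lists and simply pairs each aspect with the opinion word found in the same clause.

```python
import re

# Hypothetical word lists for this illustration only.
ASPECTS = {"food", "service", "packaging", "delivery"}
OPINIONS = {"delicious": "positive", "appalling": "negative"}


def aspect_sentiments(sentence):
    """Pair each aspect with the first opinion word in the same clause."""
    pairs = {}
    # Split on simple clause boundaries: "but", "and", and punctuation.
    for clause in re.split(r"\bbut\b|\band\b|[,;.]", sentence.lower()):
        tokens = re.findall(r"[a-z]+", clause)
        aspects = [t for t in tokens if t in ASPECTS]
        polarities = [OPINIONS[t] for t in tokens if t in OPINIONS]
        if aspects and polarities:
            for aspect in aspects:
                pairs[aspect] = polarities[0]
    return pairs


result = aspect_sentiments(
    "The food was delicious but the service was appalling.")
```

For the example sentence, `result` maps "food" to "positive" and "service" to "negative". Real text needs far more than clause splitting (negation, coreference, implicit aspects), so treat this only as a picture of the goal.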

Gavagai Explorer works with sentiment analysis in Azerbaijani, Albanian, Arabic, Bengali, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Farsi, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Korean, Latvian, Lithuanian, Malay, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Thai, Turkish, Ukrainian, Urdu, and Vietnamese.
.
https://www.gavagai.io/text-analytics/sentiment-analysis-opinion-mining/