
Sentiment Analysis for Customer Review: Case Study of GO-JEK Expansion

Abstract

Background: Market prediction is an important task that requires in-depth analysis. Business intelligence has become an important procedure for analyzing market demand and customer satisfaction. Because business intelligence requires in-depth analysis, sentiment analysis is a powerful technique for analyzing customer reviews in support of it.

Objective: In this study, we perform sentiment analysis to support business intelligence analysis of GO-JEK.

Methods: We use 3111 Twitter posts collected with the Twint library. Since the dataset does not provide a ground truth, we use Microsoft Text Analytics to label each tweet as positive, neutral, or negative. Before applying Microsoft Text Analytics, we conduct a pre-processing step to remove unwanted data such as duplicate tweets, images, website addresses, etc.

Results: According to Microsoft Text Analytics, the dataset contains 666 positive, 2055 neutral, and 127 negative tweets.

Conclusion: According to these results, we conclude that most GO-JEK customers are satisfied with the GO-JEK services. In this research, we also develop a classification model to predict the sentiment of new data. We use several classifier algorithms: Decision Tree, Naive Bayes, Support Vector Machine and Neural Network. The results show that the decision tree provides the best performance.

I. INTRODUCTION

Predicting market demand and customer satisfaction requires extensive market research. Agile software development (ASD), originally introduced to help engineers with business management problems, has been adapted to the business intelligence field to produce fast analytics in data science. Recently, business intelligence (BI) has become an important analytical procedure for analyzing market demand and satisfaction. BI refers to technology used to analyze business data in support of decision making based on predictive analysis; it provides a data enhancement process followed by analytical technology, and it is widely used in areas such as market intelligence, e-commerce and e-health [1, 2]. Researchers have concluded that BI comprises two things: process and product. Process is defined as the procedure used to produce important information about the business process. Product is defined as the information that empowers organizations to predict services, customer satisfaction, competitors, the marketplace, technology, etc. Both are necessary parts of market research and yield powerful analysis results for predicting market demand and customer satisfaction [1, 3].

Despite the enthusiasm for the advantages of BI, conducting a BI analysis requires in-depth analysis. Sentiment analysis is one approach that is widely used to support the BI analysis process. It is defined as the analysis of text data to identify sentiments, opinions and even expressed emotions, and it allows us to predict opportunities based on the analysis results.

Sentiment analysis is widely used, particularly in review analytics, for example in the analysis of praise and complaint sentences. According to a previous study, praise sentences form a positive subset consisting of adjectives, intensifiers, etc., while complaint sentences form a negative subset. In the real world, however, praise and complaints are expressed in complex forms, so an in-depth technique such as sentiment analysis may help resolve this problem [5]. In other cases, sentiment analysis has produced effective recommendations from customer reviews of airport services [6, 7, 8] and has performed strongly in recommending travel destinations [9, 10, 11, 12, 13, 14]. Considering these facts, sentiment analysis is suitable for giving recommendations and may also give suitable results in predictive analytics for business intelligence. Despite this strong performance, the complexity of the data should also be considered. Twitter posts are a highly complex form of sentiment data that is widely used for sentiment analysis. Twitter is a social media platform often used to share information, news and opinions. A previous study reported that the number of Twitter users has grown rapidly over the last decade, making Twitter a popular medium for expressing public opinion.

In this study, business intelligence based on customer review analytics is proposed to investigate and analyze business solutions in the case of the GO-JEK application. We want to analyze the performance of sentiment analysis in evaluating customer satisfaction with GO-JEK services. We also want to know how customers review GO-JEK services, and to evaluate which services are satisfactory and which need to be improved. In the business process, these analyses may serve as suggestions for improving the services. Hence, conducting sentiment analysis to evaluate GO-JEK is important for the business process.

GO-JEK was chosen because several countries, such as Malaysia, opposed allowing GO-JEK to operate there. This case has prompted a range of opinions, reactions and perceptions. Hence, we performed predictive analytics for business intelligence based on customer reviews in order to analyze how GO-JEK customers respond to its services and to investigate GO-JEK's chances of operating in Malaysia. For the dataset, we use Twitter posts, considering the trend of using Twitter to express public opinion.

To address these research gaps, we present a sentiment analysis of customer reviews collected from Twitter posts. To handle the sentiment classification task, commercially available cloud services can be used. Cloud services employed by other researchers include Amazon Web Services [16, 17], IBM Cloud [16], Google Cloud Platform [16] and Microsoft Text Analytics [16, 18, 19]. Arijit and William [18] found that Microsoft Text Analytics was 10-20% better at identifying positive and negative sentiment when evaluated on various tweet datasets. Furthermore, Qaisi and Aljarah wrote that Microsoft could benefit from its Twitter page by increasing promotional offers to customers to enhance their loyalty and satisfaction [19]. Therefore, based on the evidence from previous work, we used Microsoft Text Analytics to create the ground truth data. Moreover, machine learning classifiers were applied to predict customer satisfaction. We also analyzed the business process according to the sentiment analysis results to support market decisions and improve the services.

II. METHODS

This research is conducted in four processes as depicted in Fig. 1. All of these steps are described in the following subsections.

A. Dataset

In this study, we used Twitter posts collected with the Twint library. The data were mined using the following rules:

1. Twitter posts are filtered to the countries that allow GO-JEK to operate, such as Singapore, Thailand and Vietnam.

2. Twitter posts are collected from January 1, 2019 onward.

3. All Twitter posts that contain "GOJEK" and were tweeted by Twitter users in Singapore, Thailand and Vietnam.

4. All Twitter posts that contain both the "GOJEK" and "Singapore" keywords.

5. All Twitter posts that contain both the "GOJEK" and "Thailand" keywords.

6. All Twitter posts that contain both the "GOJEK" and "Vietnam" keywords.

According to the rules above, we obtained the data shown in Table 1.
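The date and keyword rules above (rules 2 and 4 to 6) can be sketched as a simple filter over already-downloaded tweets. This is only an illustration under stated assumptions: the names `SINCE`, `COUNTRIES` and `matches_rules` are hypothetical, the matching is case-insensitive and treats "GO-JEK" like "GOJEK", and the location-based rules 1 and 3 are not covered because they depend on user location data. It is not the Twint query the authors ran.

```python
from datetime import date

# Hypothetical constants mirroring collection rules 2 and 4-6.
SINCE = date(2019, 1, 1)
COUNTRIES = ("singapore", "thailand", "vietnam")


def matches_rules(text: str, posted: date) -> bool:
    """Return True if a tweet is recent enough and mentions GOJEK plus a target country."""
    lowered = text.lower()
    # Assumption: "GO-JEK" and "GOJEK" are treated as the same keyword.
    mentions_gojek = "gojek" in lowered.replace("-", "")
    mentions_country = any(country in lowered for country in COUNTRIES)
    return posted >= SINCE and mentions_gojek and mentions_country


# Made-up sample tweets, used only to exercise the filter.
sample = [
    ("GOJEK is now in Singapore", date(2019, 3, 1)),
    ("GOJEK launch in Thailand", date(2018, 12, 31)),  # posted before rule 2's start date
    ("Traffic in Vietnam today", date(2019, 5, 5)),    # no GOJEK keyword
]
kept = [text for text, posted in sample if matches_rules(text, posted)]
```

Substring matching is deliberately loose here; a stricter version could match whole words only.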

TABLE 1. DATASET

Country | Number of tweets
Singapore | 1948
Thailand | 704
Vietnam | 459

Fig. 1. Overall stages in this work: Data Collection (tweet crawling result); Pre-processing Step (removing unwanted data); Sentiment Analysis Step (using Microsoft Text Analytics); Top Words Extraction; Development of Classification Model (neural network, SVM, Naive Bayes, decision tree).

TABLE 2. DATA DISTRIBUTION FOR CLASSIFICATION PROCESS

Data | Total
Negative | 127
Neutral | 127
Positive | 127

III. RESULTS

A. Pre-processing and Sentiment Analysis

We collected 3111 tweets posted since January 1, 2019 that contain the keywords "Gojek" and "Singapore", "Gojek" and "Vietnam", or "Gojek" and "Thailand". After collecting the data, we performed the pre-processing step by removing any unwanted data. The pre-processing step removed 163 duplicated tweets. In the next step, we used the Microsoft Text Analytics API to label each tweet. Based on the sentiment scores provided by the Microsoft Text Analytics API, this process yielded 2055 neutral, 666 positive and 127 negative tweets. These results are illustrated in Fig. 2. Examples of labelled tweets are shown in Table 3. The score range for each label is shown in Table 4.

TABLE 3. EXAMPLE OF LABELLED TWEET

Tweet | Score | Label
Fintech Viral GOJEK C: Claimed Singapore Unicom, BKPM Boss Apologize 30 July 2019 22:20 - CNBC Indonesia | 0.3 |
Honestly I hate GOJEK drivers. They ALWAYS cancel on me. Little shits | 0.014 | Negative
In Indonesia grab is cheaper. While in Singapore GOJEK is cheaper. Interesting- | 0.818 | Positive

TABLE 4. SENTIMENT SCORE FOR LABELLING PROCESS

Label | Score
Negative | 0 to less than 0.5
Neutral | Exactly 0.5
Positive | More than 0.5 to 1

Fig. 2. Sentiment population: Neutral 2055, Positive 666, Negative 127.

B. Top Words Extraction

In this stage, we extracted the top positive and negative sentiment keywords. The extraction was done by determining the keywords with the highest scores in both the positive and the negative sentiment groups. Negative sentiment was extracted to determine which services customers complained about most often. Positive sentiment was extracted to determine which services should be maintained. The results of this step are shown in Table 5 and Table 6. Table 5 shows the top three keywords indicating negative sentiment. However, we found some ambiguous keywords in the results, for example "via YouTube", "like" and "di", and therefore dropped them.

TABLE 5. TOP NEGATIVE SENTIMENT

Keyword | Score
Cancel | 9
Hate driver | 3
"tidak bisa" (Indonesian: "cannot") | 3

TABLE 6. TOP POSITIVE SENTIMENT

Keyword | Score
Ride hailing | 50
Available consumer | 37
Consumer commodities | 33

C. Development of Classification Model

After labelling and analyzing the data, we used the labelled data to develop the classification model. This model was used to predict the sentiment of new data. We applied a machine learning approach to find the best classification model. In this work, we applied four machine learning algorithms: neural network, support vector machine, Naive Bayes and decision tree. To evaluate the performance of each model, we used precision, recall and F1-score. These metrics are often used to validate classification models and perform well in text classification cases. The performance evaluation and the comparison of the algorithms are shown in Fig. 3. The neural network achieves a precision of 0.52, a recall of 0.51 and an F1-score of 0.51. The support vector machine obtains a precision of 0.54, a recall of 0.53 and an F1-score of 0.53. Naive Bayes obtains a precision of 0.52, a recall of 0.52 and an F1-score of 0.52. Finally, the decision tree obtains a precision of 0.55, a recall of 0.55 and an F1-score of 0.55.

Fig. 3. Classification result of each model (precision, recall and F1-score for neural network, Naive Bayes, decision tree and SVM).
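As a rough illustration of the labelling thresholds and the evaluation metrics described above, the sketch below maps a sentiment score to a label and computes precision, recall and F1-score for a three-class problem. The function names `label_from_score` and `macro_scores` are hypothetical, and macro-averaging over the three classes is an assumption, since the text does not state how the per-class values were averaged. It is not the authors' implementation.

```python
from collections import Counter

LABELS = ("negative", "neutral", "positive")


def label_from_score(score: float) -> str:
    """Map a sentiment score in [0, 1] to a label using the thresholds described above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score < 0.5:
        return "negative"
    if score == 0.5:
        return "neutral"
    return "positive"


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over LABELS.

    Assumption: macro-averaging; the averaging scheme is not given in the text.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    precisions, recalls, f1s = [], [], []
    for label in LABELS:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    n = len(LABELS)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Made-up scores and predictions, used only to exercise the functions.
gold = [label_from_score(s) for s in (0.014, 0.5, 0.818, 0.9)]
predicted = ["negative", "neutral", "positive", "negative"]
precision, recall, f1 = macro_scores(gold, predicted)
```

With balanced classes, as in the 127/127/127 split used for classification, macro- and micro-averaged scores tend to be close, so the choice of averaging matters less than it would on the unbalanced labelled set.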
