This article shows how to use the VADER library for sentiment analysis.
Sentiment analysis measures how positive, negative or neutral a piece of text is. It is performed on textual data to help businesses monitor brand and product sentiment in customer feedback, and understand customer needs. It is a time-efficient, cost-effective way to analyse large volumes of data.
Python offers good support for sentiment analysis. A few of the libraries available for this purpose are NLTK, TextBlob and VADER.
To do sentiment analysis of text in Indic languages such as Hindi, we need to carry out the following tasks.
1. Read the text file, which is in Hindi.
2. Translate the Hindi sentences into English, because these Python libraries support sentiment analysis of English text. (If you pass Hindi sentences directly to these functions, the ‘compound score’, which is the metric of a sentence's sentiment, is not calculated correctly. So it is appropriate to convert each sentence to its English equivalent before computing this metric.) Google Translate helps with this task.
3. Do sentiment analysis of the translated text using any of the libraries mentioned above.
Step 1: Import the necessary libraries / packages.
# codecs provides access to the internal Python codec registry
import codecs
# This is to translate the text from Hindi to English
from deep_translator import GoogleTranslator
# This is to analyse the sentiment of text
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
Step 2: Read the file data. The ‘codecs’ module provides access to the internal Python codec registry. Most standard codecs are text encodings, which encode text to bytes. Custom codecs may encode and decode between arbitrary types. Here it is used to open the file as UTF-8 so that the Hindi text is decoded correctly.
# Read the Hindi text into 'sentences'
with codecs.open('SampleHindiText.txt', encoding='utf-8') as f:
    sentences = f.readlines()
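As a possible alternative, Python 3's built-in open() also accepts an encoding argument, so the same read can be done without codecs. The sketch below writes a small temporary file purely so that it runs on its own, and it also strips newlines and drops blank lines. The helper name read_sentences and the variable sample_sentences are assumptions made for illustration.

```python
import os
import tempfile


def read_sentences(path):
    """Return the non-empty lines of a UTF-8 text file, without newlines.

    `read_sentences` is a hypothetical helper, not part of any library.
    """
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Write a throwaway sample file so the sketch is self-contained
sample = "गोवा की यात्रा बहुत अच्छी रही।\n\nसमुद्र तट बहुत गर्म थे।\n"
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt",
                                 delete=False) as tmp:
    tmp.write(sample)

sample_sentences = read_sentences(tmp.name)
os.remove(tmp.name)  # clean up the temporary file
print(len(sample_sentences))
```

Dropping blank lines here means that empty strings are never passed on to the translation step.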
Step 3: Translate the sentences into English so that the VADER library can process the translated text for sentiment analysis. The polarity_scores() method returns a sentiment dictionary for the text, which includes the 'compound' score. This score indicates the sentiment of the sentence as given below.
* Positive sentiment: compound score >= 0.05
* Neutral sentiment: compound score > -0.05 and compound score < 0.05
* Negative sentiment: compound score <= -0.05
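The thresholds above can be wrapped in a small helper so that the classification rule lives in one place. This is a minimal sketch: the function name label_from_compound and its threshold parameter are illustrative and are not part of the VADER library, and the example scores are chosen by hand.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score to a sentiment label.

    `label_from_compound` and `threshold` are hypothetical names used
    for illustration; they are not part of the vaderSentiment API.
    """
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


# Hand-picked scores, not produced by VADER
for score in (0.6, 0.05, 0.0, -0.05, -0.4):
    print(score, label_from_compound(score))
```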
# Create the analyzer once and reuse it for every sentence
analyzer = SentimentIntensityAnalyzer()
for sentence in sentences:
    # Translate the sentence to English; the source language is auto-detected
    translated_text = GoogleTranslator(source='auto', target='en').translate(sentence)
    #print(translated_text)
    sentiment_dict = analyzer.polarity_scores(translated_text)
    print("\nTranslated Sentence=", translated_text, "\nDictionary=", sentiment_dict)
    if sentiment_dict['compound'] >= 0.05:
        print("It is a Positive Sentence")
    elif sentiment_dict['compound'] <= -0.05:
        print("It is a Negative Sentence")
    else:
        print("It is a Neutral Sentence")
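The same loop can also be arranged as a function that receives the translation and scoring steps as arguments, which makes it possible to try the control flow without network access. This is only a sketch: analyse_sentences, fake_translate and fake_score are hypothetical names, and the stand-in functions return made-up values rather than real Google Translate or VADER output.

```python
def analyse_sentences(sentences, translate, score):
    """Translate each non-blank sentence, score it, and label it.

    `translate` maps a string to its English translation; `score` maps
    English text to a dict that contains a 'compound' key.
    """
    results = []
    for sentence in sentences:
        text = sentence.strip()
        if not text:
            continue  # skip blank lines instead of sending them for translation
        english = translate(text)
        compound = score(english)["compound"]
        if compound >= 0.05:
            label = "Positive"
        elif compound <= -0.05:
            label = "Negative"
        else:
            label = "Neutral"
        results.append((english, compound, label))
    return results


# Stand-ins with made-up values so the sketch runs offline
def fake_translate(text):
    return "english:" + text


def fake_score(text):
    return {"compound": -0.5 if "bad" in text else 0.5}


demo = analyse_sentences(["good\n", "\n", "bad\n"], fake_translate, fake_score)
for row in demo:
    print(row)
```

In real use, one would pass GoogleTranslator(source='auto', target='en').translate as the translate argument and SentimentIntensityAnalyzer().polarity_scores as the score argument.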
The source file 'SampleHindiText.txt' is as given below.
गोवा की यात्रा बहुत अच्छी रही।
समुद्र तट बहुत गर्म थे।
मुझे समुद्र तट पर खेलने में बहुत मजा आया।
मेरी बेटी बहुत गुस्से में थी।
The output of the code is shown below.