
A Model for Context-based Similarity Measurement of Opinions by Using Dynamic Weighting Scheme


1Dr. Dharmendra Sharma

Associate Professor, SDBCE

Department of Computer Science and Engineering

Indore, Madhya Pradesh, India

[email protected]

2Ashish Sharma

Professor, WPC

Department of Computer Science and Engineering

Indore, Madhya Pradesh, India

[email protected]

Abstract—
In this paper, we present a model for context-based similarity measurement of opinions using a dynamic weighting scheme. The semantics of an opinion are decided by a context at run time. Weighting schemes generally use a static weight for opinion representation, but it is important to weight each element of each opinion by a context. We therefore present a model that evaluates the similarity between opinions by using a context at run time. We test our model with reviews from Jio users and find that the relevance of opinions varies with context.

Keywords—
Semantic computing; dynamic weighting; relative semantics; context; opinion mining.

Introduction

Opinion mining is crucial for both individuals and companies. Individuals may want to see the opinions of other customers about a product before buying it. Companies want to analyze customer feedback about their products to make future decisions. Analyzing customers’ opinions and responses is therefore important. Mining is applied to product reviews available on blogs, web forums, and product review sites to evaluate the opinions of customers. New customers can then find the views of others about a product and decide which product to buy with the help of the opinions of customers already using it. In addition, the same feature can be compared across products from different vendors, so companies can focus on improving the features of their products that are not popular among customers. This helps to meet the requirements of marketing intelligence and product benchmarking in the production industry.

Marketing places emphasis on identifying and subsequently satisfying consumer needs. To examine consumer needs and to implement effective marketing strategies aimed at satisfying them, marketing managers need relevant, current information about consumers, competitors and other forces in the marketplace. Opinions received little study in the past, because a significant part of consumer information, which is present in companies’ own databases, was ignored and because little opinionated text was available before the World Wide Web existed.

Opinion mining involves Text Mining, Natural Language Processing (NLP) and Text Classification. Text Mining has great potential to overcome the current deficiencies. Unfortunately, NLP encounters a range of difficulties due to the sophisticated nature of human language. Moreover, opinion mining involves a Text Classification problem that is quite different from the usual one. In usual Text Classification the focus is on identifying the topic, whereas opinion mining performs Sentiment Classification, which focuses on assessing the writer’s sentiment toward the topic. Emotions are not satisfactorily analyzed with keyword-based methods.

In this paper, we present a model for context-based similarity measurement of opinions using a dynamic weighting scheme. The semantics of an opinion are decided by a context at run time.

Related Work

Researchers focus on extracting the affective content of a textual document by detecting expressions from a “bag of sentiment words” at different levels of granularity. The challenge here is to correctly classify a document’s viewpoint (or polarity) as positive, negative or somewhere in between.

In [1], Bing Liu et al. described a model in which feature-based opinion summarization is performed by first mining the product features that customers have commented on using an association mining technique, then identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative using a set of seed adjectives, together with their orientations, that is later expanded using WordNet, and finally summarizing the results.

T. Ahmad et al. [2] developed an opinion mining system in which the features and opinions are extracted using semantic and linguistic analysis of text documents, the polarity of the opinion sentences is determined using polarity scores given by SentiWordNet, and the generated summary is presented by a visualization module in a comprehensible way.

L. Zhao et al. in [3] introduced a fine-grained approach for opinion mining, which uses an ontology structure as an essential part of the feature extraction process by taking into account the relations between concepts. The approach involves data processing, which includes POS tagging and word segmentation. Feature extraction is then performed with an integrated ontology that boosts the accuracy of the process. Polarity identification is then performed with the help of SentiWordNet, and finally sentiment analysis is done to deal with negation and other semantics.

In [4], W. Zhang et al. developed a system called Weakness Finder that helps manufacturers find their product weaknesses from Chinese reviews by using aspect-based sentiment analysis. The system extracts and groups explicit features by using a morpheme-based method and a HowNet-based similarity measure. Next, it identifies and groups implicit features with a collocation selection method for each aspect. Finally, the polarity is determined by a sentence-based sentiment analysis method. The authors assume that the weakness of any product can be found easily because the weakness would probably be the most unsatisfying aspect discussed in the customers’ reviews.

A. Dengel et al. in [5] presented an extractive aspect-based sentiment summarization system which consists of an aspect detector for extracting frequently occurring features, a clustering module that groups all documents containing the same aspect word, a hybrid polarity detection system along with its generated feature set for determining opinion orientation, and a textual and graphical summary generator module which uses an unsupervised polarity detection and ranking algorithm developed by the authors for summary generation.

In [6], A. Bagheri et al. proposed a novel unsupervised and domain-independent model for detecting explicit and implicit aspects in reviews for sentiment analysis. First, a generalized method is proposed to learn multi-word aspects, and a set of heuristic rules is then employed to take into account the influence of an opinion word on the detected aspect. Second, a new metric based on mutual information and aspect frequency is proposed to score aspects with a new bootstrapping iterative algorithm which works with an unsupervised seed set. Third, two pruning methods based on the relations between aspects in reviews are presented to remove incorrect aspects. Finally, the model employs an approach which uses explicit aspects and opinion words to identify implicit aspects.

R. Kumar et al. in [7] provided a method to mine product features and opinion words from customer opinions expressed in reviews, using a semantic approach based on typed dependency relations. They tried to identify frequent and infrequent features from a given customer review using the typed dependency relations between the words in a sentence and an opinion lexicon consisting of a list of subjective positive and negative words. They also addressed pronoun resolution by replacing each pronoun with the appropriate product feature. The authors considered adjectives, verbs and even nouns as opinion words during identification. Finally, a summary consisting of positive and negative opinion sentences related to each product feature is generated.

K. Bafna et al. in [8] proposed a dynamic, domain-dependent system for feature-based opinion summarization of customers’ opinions for online products. First, the features of a product are identified from customers’ opinions using association rule mining and a probabilistic power equation.
Next, for each feature, the corresponding opinions are extracted and their orientation (positive/negative) is detected after forming feature-opinion pairs by assigning each opinion word to the nearest feature. Finally, feature-based summaries of the reviews are generated.

In [9], M. Dalal et al. presented a semi-supervised approach for mining online user reviews to generate comparative feature-based statistical summaries. It includes phases such as preprocessing, feature extraction, sentiment classification and summarization. They performed basic cleaning tasks such as sentence boundary detection and spelling-error correction in the preprocessing phase. Then, after POS tagging with the Link Grammar Parser, frequently occurring nouns (N) and noun phrases (NP) are considered as possible opinion features based on a multiword approach, and they are extracted along with the associated adjectives describing them, as indicators of their opinion orientation. Once features and opinion words are extracted, the sentiment polarity of the opinions is determined using SentiWordNet.

In [10], D. Wang et al. developed SumView, a Web-based review summarization system, to automatically extract the most representative expressions and customer opinions in the reviews on various product features. The system focuses on delivering the majority of information contained in the review documents by selecting the most representative review sentences for each extracted product feature. Once the product reviews are crawled, POS tagging and stop word removal are performed. Then the term-sentence matrix is constructed, where each row represents a term and each column represents a sentence. Product features are extracted using association rule mining, and users can select among them as they wish. Once features are selected, the proposed feature-based weighted non-negative matrix factorization algorithm is applied to group the sentences into feature-relevant clusters. Finally, the sentence with the highest probability in each cluster is selected to be presented in the summary for each feature.

D. Toshniwal et al. in [11] proposed an Aspect Base Sentiment Analysis System (ASAS), which handles context-dependent opinion words. The authors used an online dictionary for classifying the context-dependent opinion words and then used natural linguistic rules to assign polarity to them; the result later becomes a training data set. Next, for classification of the remaining opinion words, they used opinion words and features together, because the same opinion word can have different polarity in the same domain for different features. Finally, the system generated a short summary for a particular product based on each feature.

M. Zaveri et al. in [12] proposed a technique that extends the feature-based classification approach to incorporate the effect of various linguistic hedges by using fuzzy functions to emulate the effect of modifiers, concentrators, and dilators. The authors presented a Fuzzy Opinion Classification technique in which user reviews are classified as very positive, positive, neutral, negative, or very negative. For this classification, they first extract the features, associated descriptors, and hedges, then identify the polarity and initial value of the feature descriptors based on the SentiWordNet score, and finally calculate the overall sentiment score using fuzzy functions to incorporate the effect of linguistic hedges.

S. Joseph et al. in [13] proposed a new syntactic approach for aspect-level opinion mining which uses syntactic dependency, the aggregate score of opinion words, SentiWordNet, and an aspect table together for opinion mining of restaurant reviews. The core tasks involved are aspect identification, aspect-based opinion word identification and orientation detection. In the proposed method, aspects and the associated opinion words are extracted using dependency parsing, the polarity of opinions is determined using SentiWordNet, and finally adjective, adverb-adjective and adverb-verb combinations are produced which show the positiveness or negativeness of each aspect.

In [14], Z. Hai et al. employed a corpus-statistics association measure to identify features, including explicit and implicit features, and opinion words from reviews. The authors first extract explicit features and opinion words via an association-based bootstrapping method (ABOOT), which starts with a small list of annotated feature seeds and then iteratively recognizes a large number of domain-specific features and opinion words by discovering corpus-statistics associations between each pair of words in a given review domain. Next, they provided a natural extension to identify implicit features by employing the recognized semantic correlations between features and opinion words.

However, our method clearly differs in purpose from the other methods. The purpose of our method is to evaluate the relevancy of opinions by using dynamic weights.

Methodology

In this section, we present the method for calculating the weight of an opinion based on context and the procedure for evaluating the relation between opinions. For the data set, we aggregate data from the Facebook page of Jio users. Our weighting scheme consists of four steps.

A. Data Filtration

Before creating the document term matrix, we use the following steps to filter the data:

1) Convert the text into lower case.

2) Remove stop words, numbers and punctuation characters from the text.

3) Calculate the inverse document frequency of the terms.

4) Create the opinion term matrix.

5) Remove sparse terms from the matrix created in step 4.
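
The five filtering steps can be sketched in a few lines. This is a minimal illustration in Python, not the R/tm implementation used in the experiment. The stop-word list, the `min_docs` threshold that defines a "sparse" term, and the choice of term count times IDF as the cell value are all assumptions made for the example, since the text does not specify them.

```python
import math
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "is", "it", "of", "the", "to"}


def filter_opinion(text):
    """Steps 1-2: lower-case the text, then drop numbers, punctuation and stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # assumes plain English letters
    return [token for token in text.split() if token not in STOP_WORDS]


def document_frequency(docs):
    """Count, for every term, the number of opinions that contain it."""
    df = {}
    for tokens in docs:
        for term in set(tokens):
            df[term] = df.get(term, 0) + 1
    return df


def opinion_term_matrix(docs, min_docs=2):
    """Steps 3-5: IDF, opinion-term matrix, and removal of sparse terms."""
    df = document_frequency(docs)
    idf = {term: math.log(len(docs) / count) for term, count in df.items()}
    # Step 5 (assumed rule): a term is sparse if fewer than min_docs opinions use it.
    terms = sorted(term for term, count in df.items() if count >= min_docs)
    # Assumed cell value: term count in the opinion times the term's IDF.
    matrix = [[tokens.count(term) * idf[term] for term in terms] for tokens in docs]
    return terms, matrix


# Made-up opinions, not taken from the collected data set.
opinions = [
    "Jio WiFi speed is great!",
    "The speed drops after 10 pm.",
    "WiFi keeps disconnecting, 3 times today",
]
docs = [filter_opinion(text) for text in opinions]
terms, matrix = opinion_term_matrix(docs)
print(terms)
for row in matrix:
    print([round(value, 3) for value in row])
```

In this toy run only "speed" and "wifi" survive, because every other term occurs in a single opinion. The sketch drops sparse columns while building the matrix rather than afterwards, which gives the same result as steps 4 and 5 performed in sequence.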

B. Opinion Matrix Creation

1) Create a document term vector for each opinion. Each opinion is represented by such a vector.

2) Merge all opinion vectors into a single document term matrix T.

3) Eliminate the features that appear in all opinions. (The features that appear in every opinion are eliminated because they are not features that represent an opinion explicitly.)
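
A small Python sketch of these three steps follows. It assumes raw term counts as vector entries and uses made-up token lists; the function names are hypothetical and are not part of the original R implementation.

```python
def build_opinion_matrix(token_lists):
    """Steps 1-2: one term vector per opinion, merged row by row into T."""
    vocab = sorted({term for tokens in token_lists for term in tokens})
    T = [[tokens.count(term) for term in vocab] for tokens in token_lists]
    return vocab, T


def drop_universal_features(vocab, T):
    """Step 3: remove every feature (column) that appears in all opinions."""
    keep = [j for j in range(len(vocab)) if not all(row[j] > 0 for row in T)]
    return [vocab[j] for j in keep], [[row[j] for j in keep] for row in T]


# Made-up, already filtered opinions.
token_lists = [
    ["jio", "wifi", "fast"],
    ["jio", "speed", "slow"],
    ["jio", "wifi", "slow"],
]
vocab, T = build_opinion_matrix(token_lists)
vocab, T = drop_universal_features(vocab, T)
print(vocab)
for row in T:
    print(row)
```

Here "jio" occurs in every opinion, so its column is removed: it cannot help to tell one opinion from another.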

C. Dynamic Weight Calculation

The product of the context and the opinion matrix T is calculated element by element. This changes matrix T into matrix T' according to the context:

$$T' = T \odot \mathrm{Context}$$

where $\odot$ denotes the multiplication of corresponding elements.
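
The element-wise weighting can be illustrated as follows. The sketch assumes that the context is given as one weight per term and is applied to every row of T; how the context weights are derived is not described in the text, so the values below are purely hypothetical.

```python
def apply_context(T, context):
    """Return T' by multiplying each element of T with the matching context weight."""
    for row in T:
        if len(row) != len(context):
            raise ValueError("context must have one weight per term in T")
    return [[value * weight for value, weight in zip(row, context)] for row in T]


vocab = ["fast", "slow", "speed", "wifi"]
T = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
]
# Hypothetical context weights, one per term in vocab.
context_wifi = [0.5, 0.5, 0.0, 1.0]
context_speed = [0.5, 0.5, 1.0, 0.0]

T_wifi = apply_context(T, context_wifi)
T_speed = apply_context(T, context_speed)
for row in T_wifi:
    print(row)
```

The same matrix T yields a different T' for each context, which is what makes the weighting dynamic: no weight is fixed before the context is known.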

D. Evaluate Similarity between Opinions

To evaluate the similarity between opinions on matrix T', any method is acceptable, such as inner product, distance, norm, or cosine similarity. In this paper, we use the inner product.
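
As an illustration, the pairwise inner products of the rows of T' can be computed as below. This is a sketch in Python with a made-up, already weighted matrix; the function names are assumptions.

```python
def inner_product(u, v):
    """Sum of the products of corresponding elements of two opinion vectors."""
    return sum(a * b for a, b in zip(u, v))


def similarity_matrix(T_prime):
    """Entry (i, j) is the inner product of opinions i and j in T'."""
    return [[inner_product(u, v) for v in T_prime] for u in T_prime]


# Made-up context-weighted matrix T' with three opinions and four terms.
T_prime = [
    [0.5, 0.0, 0.0, 1.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 1.0],
]
S = similarity_matrix(T_prime)
for row in S:
    print(row)
```

The resulting matrix is symmetric, since the inner product of opinions i and j equals that of j and i. A larger value means that two opinions share more of the terms emphasized by the context.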

Experimental System Setting

We use a Facebook page for the collection of opinions and aggregate 1233 opinions. The experimental system is implemented in R with the tm package.

In this experiment, we use “wifi” and “speed” as the contexts to evaluate the similarity between opinions. We show the difference in the inner product between the contexts “WiFi” and “Speed”.

Experimental Result

We show the result for the context “wifi” in TABLE I. From the table we can observe that opinion 1 is closest to opinions 2 and 3, and opinion 2 is closest to opinions 1, 6 and 7.

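TABLE I: INNER PRODUCT OF OPINION WITH CONTEXT “WiFi”

| Opinion | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 35.01 | 16.25 | 10.30 | 8.28 | 2.76 | 2.61 | 2.58 |
| 2 | 16.25 | 4.78 | 3.23 | 1.07 | 2.56 | 7.23 | 6.24 |
| 3 | 10.30 | 3.23 | 4.56 | 7.36 | 3.12 | 1.67 | 0.34 |
| 4 | 8.28 | 1.07 | 7.36 | 3.67 | 7.45 | 6.23 | 12.3 |
| 5 | 2.76 | 2.56 | 3.12 | 7.45 | 1.36 | 7.42 | 2.23 |
| 6 | 2.61 | 7.23 | 1.67 | 6.23 | 7.42 | 5.24 | 6.03 |
| 7 | 2.58 | 6.24 | 0.34 | 12.3 | 2.23 | 6.03 | 3.43 |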

Next, we show the result for the context “Speed” in TABLE II. From TABLE II we can observe that the similarities between opinions vary with context.

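TABLE II: INNER PRODUCT OF OPINION WITH CONTEXT “Speed”

| Opinion | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 5.34 | 7.81 | 3.45 | 8.56 | 6.59 | 2.31 | 3.32 |
| 2 | 7.81 | 4.23 | 2.31 | 0.24 | 1.73 | 2.74 | 6.89 |
| 3 | 3.45 | 2.31 | 6.31 | 5.26 | 3.33 | 0.27 | 4.56 |
| 4 | 8.56 | 0.24 | 5.26 | 8.23 | 9.26 | 5.23 | 3.01 |
| 5 | 6.59 | 1.73 | 3.33 | 9.26 | 8.23 | 6.51 | 3.42 |
| 6 | 2.31 | 2.74 | 0.27 | 5.23 | 6.51 | 4.28 | 5.78 |
| 7 | 3.32 | 6.89 | 4.56 | 3.01 | 3.42 | 5.78 | 8.31 |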

If the context is “speed”, opinion 1 is closest to opinions 4, 2 and 5, as their inner product values are 8.56, 7.81 and 6.59.
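
The way the closest opinions are read from the tables can be reproduced with a short ranking sketch. It uses the rows for opinion 1 exactly as printed in the two tables; the helper name and the choice to exclude an opinion's inner product with itself are assumptions made for the example.

```python
def nearest_opinions(row, self_index, k):
    """Rank the other opinions by inner product, largest first, and keep k of them."""
    others = [(value, i + 1) for i, value in enumerate(row) if i != self_index]
    others.sort(key=lambda pair: pair[0], reverse=True)
    return [number for _, number in others[:k]]


# Row for opinion 1 as printed in the "WiFi" and "Speed" tables.
wifi_row_1 = [35.01, 16.25, 10.30, 8.28, 2.76, 2.61, 2.58]
speed_row_1 = [5.34, 7.81, 3.45, 8.56, 6.59, 2.31, 3.32]

print(nearest_opinions(wifi_row_1, 0, 3))   # [2, 3, 4]
print(nearest_opinions(speed_row_1, 0, 3))  # [4, 2, 5]
```

The neighbours of the same opinion change from 2, 3 and 4 under “wifi” to 4, 2 and 5 under “speed”, which is the context dependence the experiment sets out to show.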

Conclusion

In this paper, we presented a model for context-based similarity measurement of opinions using a dynamic weighting scheme. The semantics of an opinion are decided by a context at run time. We used a Facebook page for the collection of opinions and aggregated 1233 opinions. In the experiment, we used “wifi” and “speed” as the contexts to evaluate the similarity between opinions. From the results we found that 431 opinions are similar to each other with the context “wifi”. On the other hand, when we change the context to “speed”, we find 736 opinions similar to each other.

References

[1] M. Hu and B. Liu, “Mining and summarizing customer reviews,” Proc. 2004 ACM SIGKDD Int. Conf. Knowl. Discov. Data Min. KDD 04, vol. 04, pp. 168–177, 2004.

[2] M. Abulaish, Jahiruddin, M. N. Doja, and T. Ahmad, “Feature and opinion mining for customer review summarization,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), pp. 219–224, 2009.

[3] L. Zhao and C. Li, “Ontology based opinion mining for movie reviews,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), pp. 204–214, 2009.

[4] W. Zhang, H. Xu, and W. Wan, “Weakness Finder: Find product weakness from Chinese reviews by using aspects based sentiment analysis,” Expert Syst. Appl., vol. 39, no. 11, pp. 10283–10291, 2012.

[5] S. A. Bahrainian and A. Dengel, “Sentiment Analysis and Summarization of Twitter Data,” Comput. Sci. Eng. (CSE), 2013 IEEE 16th Int. Conf., pp. 227–234, 2013.

[6] A. Bagheri, M. Saraee, and F. De Jong, “Care more about customers: Unsupervised domain-independent aspect detection for sentiment analysis of customer reviews,” Knowledge-Based Syst., vol. 52, August, pp. 201–213, 2013.

[7] R. K. V and K. Raghuveer, “Dependency driven semantic approach to Product Features Extraction and Summarization Using Customer Reviews,” pp. 225–238, 2013.

[8] K. Bafna and D. Toshniwal, “Feature based Summarization of Customers’ Reviews of Online Products,” Procedia Comput. Sci., vol. 22, pp. 142–151, 2013.

[9] M. K. Dalal and M. A. Zaveri, “Semisupervised Learning Based Opinion Summarization and Classification for Online Product Reviews,” Appl. Comput. Intell. Soft Comput., vol. 2013, pp. 1–8, 2013.

[10] D. Wang, S. Zhu, and T. Li, “SumView: A Web-based engine for summarizing product reviews and customer opinions,” Expert Syst. Appl., vol. 40, no. 1, pp. 27–33, 2013.

[11] H. Kansal and D. Toshniwal, “Aspect based Summarization of Context Dependent Opinion Words,” Procedia Comput. Sci., vol. 35, pp. 166–175, 2014.

[12] M. K. Dalal and M. A. Zaveri, “Opinion Mining from Online User Reviews Using Fuzzy Linguistic Hedges,” Appl. Comput. Intell. Soft Comput., vol. 2014, no. 1, pp. 1–9, 2014.

[13] T. Chinsha and S. Joseph, “A syntactic approach for aspect based opinion mining,” Semant. Comput. (ICSC), 2015 IEEE Int. Conf., pp. 24–31, 2015.

[14] Z. Hai, K. Chang, G. Cong, and C. C. Yang, “An Association-Based Unified Framework for Mining Features,” ACM TIST, vol. 6, no. 2, 2015.

[15] K. Khan, B. Baharudin, A. Khan, and A. Ullah, “Mining opinion components from unstructured reviews: A review,” J. King Saud Univ. – Comput. Inf. Sci., vol. 26, no. 3, pp. 258–275, 2014.

[16] H. Kim and K. Ganesan, “Comprehensive review of opinion summarization,” Illinois Environ. pp. 1–30, 2014.