
An Ontology-Based Approach to Opinion Mining Tools Selection


Academic year: 2021


AN ONTOLOGY-BASED APPROACH TO OPINION MINING TOOLS SELECTION

JAROSŁAW WĄTRÓBSKI

Summary

Ontologies aim to become successful, modern tools for knowledge management and conceptualization. In contrast to databases, ontologies represent the idea of an open world, which makes it possible to build knowledge-based models that are ready to use and reuse. The development of information tools has led to practical implementations of ontologies, making it possible to develop a knowledge-based model that is not only open but also machine-readable, while retaining the capability of semantic tagging, which creates great potential for using this knowledge on the Internet. This article presents an attempt at an ontology-based approach to the selection of opinion mining methods and tools. The elaborated taxonomy and ontology explicitly emphasize the practical opportunities both in the conceptualization of domain knowledge and in the search for and access to domain knowledge.

Keywords: ontology, opinion mining, methods and tools selection, sentiment analysis, ontology-based approach

Introduction

Nowadays, anyone who wants to buy a consumer product frequently asks for opinions about it, using, apart from traditional forms, e-channels and public forums on the Web. Opinions are central to almost all human activities because they are key influencers of our behaviour. While users were merely information consumers in the traditional Web, they play a much more active role in the Social Web, since they are now also data providers [15,17]. That is why both organizations and individuals want to find consumer or public opinions about their products and services [6,9,11].

The mass involvement in creating Web content has led many public and private organizations and enterprises to focus their attention on analysing this content in order to determine the general public’s opinions on a number of topics [1,18]. Given the current size and growth rate of the Web, automated and semi-automated techniques supporting opinion mining processes are crucial if practical and scalable solutions are to be obtained [13,16].

Opinion mining is a highly active research field that comprises natural language processing, computational linguistics and text analysis techniques with the aim of extracting various kinds of added-value and informational elements from users’ opinions [3,5,12]. However, current opinion mining approaches rely on a number of mathematical, statistical, and artificial intelligence methods, each pursuing different goals and requiring domain knowledge [2,15]. To take advantage of the knowledge collected about various opinion mining approaches, an ontology-based approach to opinion mining tool selection is proposed. The aim is to provide an experimental ontology containing a set of selected opinion mining methods and tools as well as a specified set of attributes.


This paper is structured as follows: Section 1 reviews current and future research trends in opinion mining. Section 2 presents the background of ontologies and their practical applicability. Section 3 presents a taxonomic view of opinion mining solutions, and Section 4 builds on that work to describe an ontology-based approach. The Conclusions present the final results of this work.

1. Literature review – current state-of-the-art and future research trends

An analysis of the literature provides many definitions and descriptions of the term opinion mining. This topic represents a large problem space. Some works refer to opinion mining as sentiment analysis, which is based on analysing people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes [4, 7, 8]. The two terms denote slightly different tasks, but both are often treated as opinion mining practices. Basically, opinion mining and sentiment analysis represent the same field of study. The term sentiment analysis perhaps first appeared in [11, 14], and the term opinion mining first appeared in [4, 12, 19]. However, research on opinion mining started with identifying opinion (or sentiment) bearing words, e.g., great, amazing, wonderful, bad, and poor. Many researchers have worked on mining such words and identifying their semantic orientations (i.e., positive or negative) [20, 22, 25].
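The idea of opinion-bearing words with a semantic orientation can be illustrated with a minimal lexicon-based sketch. The five words are the examples given above; the numeric scores, the function name semantic_orientation and the "neutral" fallback are assumptions made for illustration and are not taken from the cited works.

```python
# Hypothetical mini-lexicon: the words come from the text, the +1/-1 scores are assumed.
LEXICON = {"great": 1, "amazing": 1, "wonderful": 1, "bad": -1, "poor": -1}


def semantic_orientation(text: str) -> str:
    """Sum the polarity of known opinion words and map the total to a label."""
    # Naive whitespace tokenisation with surrounding punctuation stripped.
    tokens = [token.strip(".,!?;:").lower() for token in text.split()]
    score = sum(LEXICON.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no known opinion words, or positives and negatives cancel out


if __name__ == "__main__":
    print(semantic_orientation("An amazing, wonderful phone!"))
```

A real system would need negation handling, domain-specific lexicons and proper tokenisation; the sketch only shows the word-level polarity lookup.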

Current research in the opinion mining field focuses on improving the accuracy of algorithms for opinion detection [21] as well as reducing the human effort needed to analyze content [23]. Another important aspect is the identification of highly rated experts [21, 25, 30] and the visual mapping of bipolar opinion [24, 31]. Some works address computer-generated reference corpora in the political/governance field [4, 8, 24, 26, 28]. The problem of semantic analysis through a lexicon/corpus of words with known sentiment for sentiment classification was taken up by [2, 10, 27]. There are also hopeful signs of adopting tools and other mechanisms to provide real-time opinion mining [12], cross-platform opinion mining [1, 4, 29], and bipolar assessment of opinions [5]. Some activities aim to enhance the discoverability of content through Linked Data [35] and to elaborate comment and opinion recommendation algorithms [10,33]. Multilingual reference corpora [31,33] and collaborative sharing of annotating/labelling resources [3, 34] seem to be promising approaches. Moreover, the applications of autonomous machine learning and artificial intelligence [8, 14, 32], usable peer-to-peer opinion mining tools for citizens [6], and non-bipolar assessment of opinion [35] are crucial tasks for further development. Finally, according to [26], ontologies will act as a semantic domain for information systems and will be very useful in e-commerce.

Current solutions for opinion mining and sentiment analysis are evolving rapidly, typically by reducing the amount of human effort needed to classify comments [9]. These ideas open up many promising challenges and expected outcomes in the opinion mining field. Given the number of existing approaches, an ontology-based approach to opinion mining is proposed. The main aim is to systematize, classify and manage existing knowledge about the various approaches dedicated to opinion mining. Due to information overload, manually browsing a large number of research papers and other data from Web resources may not be feasible. The proposed ontology-based approach will gather and summarize the knowledge about various methodologies, tools and other applications in one place. The practical implication of this work is an effective way to handle knowledge about opinion mining solutions.


2. Applicability of an ontology for opinion mining

Knowledge representation and ontologies have gained importance in the last two decades. Ontologies play a major role in supporting information exchange and sharing by extending the syntactic interoperability of the Web to semantic interoperability. From a formal point of view, an ontology can be regarded as a vocabulary of terms and relationships between those terms in a given domain [34]. The general aim of an ontology is to provide knowledge about a specific domain that is understandable by both developers and computers and that is necessary for knowledge representation and knowledge exchange. In other words, ontologies are meta-data schemas, providing a controlled vocabulary of concepts, each with explicitly defined and machine-processable semantics. By defining shared and common domain theories, ontologies help both people and machines to communicate and support the exchange of semantics and not only syntax [19]. To sum up, ontologies offer instruments to model and share knowledge among various applications in a specific domain.

The opinion mining domain is evolving fast, producing new methodologies, approaches and tools or adapting existing ones to current research problems. These solutions have different features that serve their different goals. According to their specifications, the approaches vary in aim, algorithm or classifier used, learning time, speed of learning, tolerance for missing values, resistance to noise, overfitting, and level of explanation. Despite all the work done in the field of opinion mining, there is a visible lack of a comprehensive approach to comparing the existing solutions. Therefore, applying an ontology-based approach helps to avoid a knowledge acquisition bottleneck, allowing both people and machines to communicate concisely and supporting the exchange of semantics.

3. A taxonomic multi-dimensional view of selected approaches

The comparative analysis covers a set of 10 selected methods and tools supporting opinion mining and a set of 7 attributes with assigned possible values [1, 5, 13, 10, 12, 19, 23, 33, 34, 35]. Based on this analysis, a taxonomic multi-dimensional view of the selected approaches was elaborated. The resulting class hierarchy contains 7 attributes and 44 sub-attributes and is presented in Figure 1.


Apart from the predefined class hierarchy, a set of 10 opinion mining approaches was selected (Figure 2). Each of them is specified by its own set of attributes.

Figure 2. The set of selected approaches

4. An attempt at an ontology-based approach to opinion mining tool selection

The taxonomic elaboration was a preliminary study before the ontology construction. The construction requires defining a set of object properties that join the attributes and the relations between them. The set of object properties contains 2 elements: has Attribute and is Attribute of. The object property has Attribute has the domain Opinion Mining approach and the range Attributes, whereas the object property is Attribute of has the domain Attributes and the range Opinion Mining approach. Moreover, these object properties are required to be disjoint.
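The domain and range declarations described above can be sketched in plain Python. This is only an illustration under stated assumptions: the camel-case identifiers (hasAttribute, isAttributeOf, OpinionMiningApproach), the ObjectProperty dataclass and the is_valid helper are hypothetical names, not part of the published ontology or of any ontology library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectProperty:
    """A named relation restricted to one subject class and one object class."""
    name: str
    domain: str   # class of the subject
    range_: str   # class of the object (trailing underscore avoids shadowing range)


# Assumed identifiers for the two object properties named in the text.
HAS_ATTRIBUTE = ObjectProperty("hasAttribute", "OpinionMiningApproach", "Attributes")
IS_ATTRIBUTE_OF = ObjectProperty("isAttributeOf", "Attributes", "OpinionMiningApproach")


def is_valid(prop: ObjectProperty, subject_class: str, object_class: str) -> bool:
    """Check that a statement respects the property's declared domain and range."""
    return prop.domain == subject_class and prop.range_ == object_class


if __name__ == "__main__":
    # An approach may point to an attribute, but not the other way round.
    print(is_valid(HAS_ATTRIBUTE, "OpinionMiningApproach", "Attributes"))
    print(is_valid(HAS_ATTRIBUTE, "Attributes", "OpinionMiningApproach"))
```

In an OWL editor such as Protégé the same constraints are declared on the properties themselves; the sketch only mirrors their effect on which statements are admissible.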

The ontology contains the main class Attributes, which covers the following classes and subclasses: Aim (General aim: Distribution, General aim: Location, General aim: Polarization, General aim: Classification, General aim: Converting, General aim: Creating, General aim: Determination, General aim: Calculation, General aim: Reasoning), Learning time (Learning time: Small, Learning time: Very big, Learning time: Medium, Learning time: Big), Overfitting (Overfitting: Big, Overfitting: Not applicable, Overfitting: Medium, Overfitting: Small), Tolerance for missing values (Tolerance: Big, Tolerance: Medium, Tolerance: Very big, Tolerance: Small), Algorithm Classifier (NLP, Feature-based classifier, Perceptron with specific weights, List of user rules, Bayes theorem, WordNet, Machine of supporting vectors, C4.5, Algorithm kNN), Explanation (Explanation: Big, Explanation: Not applicable, Explanation: Medium, Explanation: Small), Speed of learning (Speed of learning: Long, Speed of learning: Short, Speed of learning: Very short, Speed of learning: Not applicable, Speed of learning: Medium), Resistance for noise (Resistance: Big, Resistance: Small, Resistance: Medium, Resistance: Very big, Resistance: Not applicable) [1, 5, 13, 10, 12, 19, 23, 33, 34, 35]. The previously elaborated taxonomy was transformed into an ontology using the Protégé software. The class hierarchy is presented in detail in Figure 3, and its visualisation produced with the OWLViz tool is provided in Figure 4.
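The class hierarchy enumerated above can be written down as a nested mapping, which makes it easy to check the stated total of 44 sub-attributes. The sketch below copies the class and subclass names from the listing (with the repeated prefixes dropped); the dictionary layout and the helper names are assumptions for illustration, not the structure of the Protégé file.

```python
# Attribute classes and their subclasses, as enumerated in the text.
ATTRIBUTES = {
    "Aim": ["Distribution", "Location", "Polarization", "Classification",
            "Converting", "Creating", "Determination", "Calculation", "Reasoning"],
    "Learning time": ["Small", "Very big", "Medium", "Big"],
    "Overfitting": ["Big", "Not applicable", "Medium", "Small"],
    "Tolerance for missing values": ["Big", "Medium", "Very big", "Small"],
    "Algorithm Classifier": ["NLP", "Feature-based classifier",
                             "Perceptron with specific weights", "List of user rules",
                             "Bayes theorem", "WordNet", "Machine of supporting vectors",
                             "C4.5", "Algorithm kNN"],
    "Explanation": ["Big", "Not applicable", "Medium", "Small"],
    "Speed of learning": ["Long", "Short", "Very short", "Not applicable", "Medium"],
    "Resistance for noise": ["Big", "Small", "Medium", "Very big", "Not applicable"],
}


def count_subclasses(hierarchy: dict[str, list[str]]) -> int:
    """Total number of leaf subclasses across all attribute classes."""
    return sum(len(values) for values in hierarchy.values())


def allowed_values(attribute: str) -> list[str]:
    """Subclasses that an approach may take for the given attribute class."""
    return ATTRIBUTES[attribute]


if __name__ == "__main__":
    print(count_subclasses(ATTRIBUTES))   # total number of sub-attributes
    print(allowed_values("Learning time"))
```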


Moreover, the ontology contains the class Opinion Mining Approach, which is disjoint with the class Attribute. This class is a superclass of the class Name of OM Approach, which covers the 10 methods and tools dedicated to opinion mining: SVM, Bayesian network, Naive Bayes classifier, Neuron network, Rule-based classifier, Decision trees, Ontology-based approach, kNN, Classifier of maximum entropy, Lexicon-based approach. The visualization of this part of the ontology, produced with the OWLViz tool, is shown in Figure 5.
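How such an ontology might support tool selection can be sketched as a simple filter over attribute assignments. The ten approach names are taken from the text; the attribute values assigned to the three approaches below are purely hypothetical placeholders (the actual assignments are given only in the figures), and the function select_approaches is an illustrative name.

```python
# The ten approaches named in the text.
APPROACHES = [
    "SVM", "Bayesian network", "Naive Bayes classifier", "Neuron network",
    "Rule-based classifier", "Decision trees", "Ontology-based approach",
    "kNN", "Classifier of maximum entropy", "Lexicon-based approach",
]

# HYPOTHETICAL attribute assignments, for illustration only; they are not
# the values recorded in the paper's ontology.
ASSIGNED = {
    "SVM": {"Learning time": "Big", "Overfitting": "Small"},
    "Naive Bayes classifier": {"Learning time": "Small", "Overfitting": "Small"},
    "kNN": {"Learning time": "Small", "Overfitting": "Medium"},
}


def select_approaches(requirements: dict[str, str]) -> list[str]:
    """Return the approaches whose assigned attributes satisfy every requirement."""
    matches = []
    for name in APPROACHES:
        attributes = ASSIGNED.get(name, {})
        if all(attributes.get(key) == value for key, value in requirements.items()):
            matches.append(name)
    return matches


if __name__ == "__main__":
    print(select_approaches({"Learning time": "Small", "Overfitting": "Small"}))
```

With a reasoner-backed ontology the same question would be posed as a class expression or query; the filter above only conveys the selection logic.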

Figure 5. A part of class hierarchy

5. Conclusions

Ontologies are modern tools developed to provide knowledge about specific domains that is understandable by both computers and developers. Furthermore, ontologies improve information retrieval and reasoning, which makes data interoperable between different applications. In addition, ontologies enable knowledge reuse, which is crucial for the dynamically developing opinion mining domain.

This research presents an attempt at an ontology-based approach to opinion mining tool selection. The proposed ontology has been elaborated for 10 opinion mining methods and tools. It is based on attributes and features that have been discussed and presented in [9]. The proposed ontology contains a set of 10 approaches and 44 attributes, and it can be refined by adding further approaches and attributes. Due to the rapid expansion of the Internet and e-commerce, and the consequent growth in the number of approaches supporting this process, the ontology-based approach seems to be the most appropriate of the available options for gathering and handling knowledge about opinion mining approaches. The results of this work can be disseminated by making the elaborated solution publicly available, enabling its refinement and reuse. This research should also make decision making more efficient not only for researchers but also for consumers and organizations.


Bibliography

[1] Abbasi A., Chen H., Salem A., Sentiment analysis in multiple languages: Feature selection for opinion classification in Web forums, ACM Transactions on Information Systems, 26(3), June 2008, Article 12.

[2] Binali H., Potdar V., Wu C., A state of the art opinion mining and its application domains, IEEE International Conference on Industrial Technology, Gippsland, VIC, 2009, pp. 1–6.

[3] Cadilhac A., Benamara F., Aussenac-Gilles N., Ontolexical resources for feature-based opinion mining: a case-study, 2010.

[4] Carenini G., Raymond T. Ng, Zwart E., Extracting Knowledge from Evaluative Text, In Proceedings of the 3rd international conference on Knowledge capture, 2005.

[5] Castellanos M., Wang D.U., Processing and DW2.0 in Operational Business Intelligence Information Systems, Lecture Notes in Computer Science, pp.33–45.

[6] Cheng X., Xu F., Fine-grained Opinion Topic and Polarity Identification, In Proceedings of the Sixth International Language Resources and Evaluation (LREC' 08), Marrakech, Morocco 2008.

[7] Choi Y., Cardie C., Riloff E., Patwardhan S., Identifying sources of opinions with conditional random fields and extraction patterns, In Proceedings of HLT/EMNLP 2005.

[8] Ding, X. and Liu, B. "The Utility of Linguistic Rules in Opinion Mining", in Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, Netherlands, 23–27 July 2007, pp. 811–812.

[9] Dittenbach D., Berger H., Merkl D., Improving domain ontologies by mining semantics from text, in Proceedings of the First Asia-Pacific Conference on Conceptual Modelling (APCCM2004), 2004, pp. 91–100.

[10] Eirinaki P., Singh J., Feature-based opinion mining and ranking, Journal of Computer and System Sciences, 78(4), 2011, pp.1175–1184.

[11] Felden C., Chamoni P., Execution towards a Business Process Intelligence Processing, 47(6), pp.195–206.

[12] Gamon M., Aue A., Corston O.S., Ringger E., Pulse: Mining Customer Opinions from Free Text, In Proceedings of International symposium on intelligent data analysis N°6, Madrid 2005.

[13] Hu M., Liu B., Mining and summarizing customer reviews, in Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, New York, USA, 2004, pp. 168–177.

[14] Kantardzic M., Data Mining: Concepts, Models, Methods, and Algorithms, ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics. Wiley-IEEE Press, 2002.

[15] Konys A., Wątróbski J., Różewski P., Approach to Practical Ontology Design for Supporting COTS Component Selection Processes, in: A. Selamat et al. (Eds.): ACIIDS 2013, Part II, LNAI 7803, Springer, Heidelberg, 2013, pp. 245–255.

[16] Konys A., A Framework for Analysis of Ontology-Based Data Access, in: Computational Collective Intelligence, 8th International Conference, ICCCI 2016, Part II, Nguyen, N.-T., Iliadis, L., Manolopoulos, Y., Trawiński, B. (Eds.), Lecture Notes in Computer Science, Springer International Publishing, 2016, pp. 397–408.


[17] Konys A., A Tool Supporting Mining Based Approach Selection to Automatic Ontology Construction, IADIS Journal on Computer Science and Information Systems, 2015, pp. 3–10, ISSN: 1646-3692.

[18] Konys, A., An Ontology-Based Knowledge Modelling for a Sustainability Assessment Domain, Sustainability 2018, 10, 300.

[19] Lau R.Y.K. et al., Automatic Domain Ontology Extraction for Context-Sensitive Opinion Mining, ICIS 2009 Proceedings. Paper 35. 2009.

[20] Lejeune M. A. M., Measuring the Impact of Data Mining on Churn Management, Internet Research, ABI/INFORM Global, vol. 11, no. 5. Bradford, 2001, pp. 375–388.

[21] Negash, S., Gray, P., Business intelligence (Chapter 45). In: F. Burstein & C., Holsapple (eds.) Handbook of decision support systems 2. Springer Link 2008, 175–193.

[22] Popescu A.M., Etzioni O., Extracting Product Features and Opinions from Reviews, In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing 2005.

[23] Read J., Hope D., Carroll J., Annotating Expressions of Appraisal in English, The Linguistic Annotation Workshop, ACL 2007.

[24] Soo-Min K., Hovy E., Extracting Opinions, Opinion Holders, and Topics Expressed in Online News Media Text, In Proceedings of ACL/COLING Workshop on Sentiment and Subjectivity in Text, Sydney, Australia 2006.

[25] Strapparava C., Valitutti A., WordNet-Affect: an Affective Extension of WordNet, Proceedings of LREC 04, 2004.

[26] Sukumaran S., Sureka S., Integrating Structured and Unstructured Data Using Text Tagging and Annotation, Business Intelligence Journal 2006, 11(2), pp. 8–16.

[27] Tho Q.T., Hui S.C., Fong A., Cao T.H., Automatic Fuzzy Ontology Generation for Semantic Web, IEEE Transactions on Knowledge and Data Engineering, 18(6), June 2006, pp. 842–856.

[28] Turney P.D., Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews, Proceedings of 2006 International Conference on Intelligent User Interfaces (IUI06).

[29] Wei W., Gulla J.A., Sentiment Learning on Product Reviews via Sentiment Ontology Tree, Proceedings of the Association for Computational Linguistics (ACL), pp. 404–413, 2010.

[30] Wiebe J., Wilson T., Cardie C., Language Res Eval, 2005, 39: 165.

[31] Xu R. et al., Learning Knowledge from Relevant Webpage for Opinion Analysis, in Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Sydney, Australia, pp. 307–313. 2008.

[32] Yaakub R. M., Li Y., Feng Y., Integration of Opinion into Customer Analysis Model, in Proceedings of Eighth IEEE International Conference on e-Business Engineering 2011, pp. 90–95.

[33] Zhang Q., Segall R. S., Web Mining: a Survey of Current Research, Techniques, and Software, 2008.

[34] Wątróbski J., Jankowski J., Knowledge management in MCDA domain, in 2015 Federated Conference on Computer Science and Information Systems (FedCSIS). IEEE, 2015, pp. 1445–1450.

[35] Zhou L., Chaovalit P., Ontology-supported polarity mining, J. Am. Soc. Inf. Sci. Technol. 59, 1, January 2008, 98–110.


ONTOLOGICAL KNOWLEDGE REPRESENTATION IN THE OPINION MINING DOMAIN

Summary

Ontologies have proven to be an effective tool for the management and conceptualization of domain knowledge. In contrast to databases, ontologies represent the idea of an “open world” and make it possible to build knowledge models that are widely accessible and ready to be reused and combined. The development of information technology tools means that a practical implementation of an ontology makes it possible to develop a knowledge model that is not only open but also understood by computer software (machine-readable), while retaining the capability of so-called semantic tagging, which creates great potential for using this knowledge on the Internet. The article presents an attempt to build an ontology for the area of opinion mining techniques and methods. The elaborated taxonomy and the presented ontology clearly show the practical opportunities both in the conceptualization of domain knowledge itself and in searching for and accessing the collected domain knowledge.

Keywords: opinion mining, knowledge management, ontologies

Jarosław Wątróbski

University of Szczecin

Faculty of Economics and Management

ul. Mickiewicza 64, 71-101 Szczecin, Poland

e-mail: jwatrobski@wneiz.pl
