
Studia Ekonomiczne. Zeszyty Naukowe Uniwersytetu Ekonomicznego w Katowicach ISSN 2083-8611 Nr 296 · 2016 Informatyka i Ekonometria 6

Wiesław Wolny

University of Economics in Katowice, Faculty of Informatics and Communication, Department of Informatics

wieslaw.wolny@ue.katowice.pl

SENTIMENT ANALYSIS OF TWITTER DATA USING EMOTICONS AND EMOJI IDEOGRAMS

Summary: Twitter is an online social networking service where users worldwide publish their opinions on a variety of topics, discuss current issues, complain, and express positive or negative sentiment about products they use in daily life. Twitter is therefore a rich source of data for opinion mining and sentiment analysis. However, sentiment analysis of Twitter messages (tweets) is regarded as a challenging problem because tweets are short and informal. This paper addresses the problem by analyzing symbols called emotion tokens, including emotion symbols (e.g. emoticons and emoji ideograms). Observation shows that these emotion tokens are commonly used. They directly express one’s emotions regardless of language, hence they have become a useful signal for sentiment analysis of multilingual tweets. The paper describes an approach to sentiment analysis that is able to determine positive, negative and neutral sentiments for a tested topic.

Keywords: Twitter, sentiment analysis, symbol analysis, SAS.

Introduction

Microblogging websites such as Twitter (www.twitter.com) have evolved to become a great source of various kinds of information. This is due to the nature of microblogs, on which people post real-time messages giving their opinions on a variety of topics, discuss current issues, complain, and express positive or negative sentiment about products they use in daily life.

As the audience of microblogging platforms and social networks grows every day, data from these sources can be used in opinion mining and sentiment analysis tasks. For example, manufacturing companies may be interested in the following questions:

• What do people think about our product (service, company, etc.)?

• How positive (or negative) are people about our product?

• What would people prefer our product to be like?

Political parties may be interested to know if people support their program or not. Social organizations may ask people’s opinion on current debates. All this information can be obtained from social networks, as their users post every day what they like or dislike, and their opinions on many aspects of their life.

Opinions and related concepts such as sentiments and emotions are the subjects of study of sentiment analysis and opinion mining. The inception and rapid growth of the field coincide with those of the social media on the Web, e.g., reviews, forum discussions, blogs, microblogs, Twitter, and social networks. Most NLP-based methods perform without particular success on social media. Almost all forms of social media are very noisy and full of all kinds of spelling, grammatical, and punctuation errors.

This article proposes a method of providing sentiment analysis using such data as Twitter hashtags (e.g. #happy, #fail), emoticons (e.g. :-), :-(, :-|) and emoji characters to identify positive, negative and neutral tweets.

1. Related work

Sentiment analysis is a growing area of Natural Language Processing, studied at many levels of granularity. Starting from being a document-level classification task [1], [2], it has been handled at the sentence level [3], [4] and more recently at the phrase level [5], [6] or even at the level of the polarity of words and phrases (e.g., [7], [8]).

However, the informal and specialized language that is used in tweets, as well as the nature of the microblogging domain, make sentiment analysis in Twitter a very different task. With the growing population of blogs and social networks, opinion mining and sentiment analysis have become a field of interest for many researchers. A very broad overview of the existing work was presented in [9].

J. Read in [10] used emoticons such as “:-)” and “:-(” to form a training set for sentiment classification. For this purpose, the author collected texts containing emoticons from Usenet newsgroups. The dataset was divided into “positive” (texts with happy emoticons) and “negative” (texts with sad or angry emoticons) samples.

Researchers have also begun to investigate various ways of automatically collecting training data. Several researchers have relied on emoticons to define training data [11], [12]. Barbosa and Feng [13] exploited existing Twitter sentiment sites for collecting training data. Davidov, Tsur, and Rappoport [14] also used hashtags to create training data, but they limited their experiments to sentiment/non-sentiment classification, rather than 3-way polarity classification, as in this article.

2. Data description and collection

Twitter has its own conventions that render it distinct from other textual data. Twitter messages are called tweets. There are some particular features that can be used to compose a tweet (Figure 1).

Figure 1. Example of a tweet

The first piece of information, “UE in Katowice retweeted”, means that the tweet was forwarded from a previous post. “@RadioKatowice” and “@UE_Katowice” are the Twitter names of Radio Katowice and the University of Economics in Katowice. Using Twitter names in a tweet sends the named users information that they have been mentioned. #Katowice, #studenci and #koncert are tags provided by the user for this message, so-called hashtags, and “bit.ly/1zPnJmc” is a link to some external source. The length of tweets is limited, therefore long links are often shortened using special websites like bitly.com.

Users of Twitter use the “@” symbol to refer to other users. Referring to other users in this manner automatically alerts them. Users usually use hashtags to mark topics. This is primarily done to increase the visibility of their tweets. These symbols give an easy way to identify Twitter user names and topics and thus allow searching and filtering of information on any subject.

Twitter messages have many unique attributes, which differentiate Twitter analysis from other fields of research. The first is length. The maximum length of a Twitter message is 140 characters, and the average length of a tweet is 14 words [15]. This is different from the domains of other research, which mostly focused on reviews consisting of multiple sentences. The second attribute is availability of data. With the Twitter API or other tools, it is much easier to collect millions of tweets for training.
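The tweet conventions described in this section (@names, #hashtags, shortened links and the 140-character limit) can be picked out of the raw text with a few regular expressions. The following is only a minimal sketch: the patterns, the function name parse_tweet and the example text are assumptions made for illustration (the example reuses the names from Figure 1, but its wording is invented), and real tweets would need more careful handling.

```python
import re

# Hypothetical patterns for the conventions described in the text.
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")   # "@" followed by a Twitter name
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")   # "#" followed by a topic tag
URL_RE = re.compile(r"https?://\S+|\bbit\.ly/\S+")  # full or shortened links

MAX_TWEET_LENGTH = 140  # limit stated in the text


def parse_tweet(text):
    """Return the mentions, hashtags, links and size of a single tweet."""
    return {
        "mentions": MENTION_RE.findall(text),
        "hashtags": HASHTAG_RE.findall(text),
        "links": URL_RE.findall(text),
        "characters": len(text),
        "words": len(text.split()),
    }


# Invented example text that reuses the names shown in Figure 1.
example = ("@RadioKatowice and @UE_Katowice invite you "
           "#Katowice #studenci #koncert bit.ly/1zPnJmc")
info = parse_tweet(example)
print(info["mentions"], info["hashtags"], info["links"])
```

The extracted names and tags can then serve as filters when collecting tweets on a given subject.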


2.1. Emoticons

There are two fundamental data mining tasks that can be considered in conjunction with Twitter data: text analysis and symbol analysis. Due to the nature of this microblogging service (quick and short messages), people use acronyms, make spelling mistakes, and use emoticons and other characters that express special meanings. Emoticons are metacommunicative pictorial representations of a facial expression, formed from punctuation marks and letters or shown as pictures; they express the user’s mood.

The use of emoticons can be traced back to the 19th century. The first documented person to have used the emoticons :-) and :-( on the Internet was Scott Fahlman from Carnegie Mellon University, in a message dated 19 September 1982.

Some emoticons are included as characters in the Unicode standard: three in the Miscellaneous Symbols block, and over sixty in the Emoticons block [16].

Emoticons can be categorized as:

• Happy emoticons: :-) :) :D :o) :] :3 :c) :> =] 8), etc.

• Sad emoticons: >:[ :-( :( :-c :c :-< :っC :< :-[ :[ :{, etc.

• Neutral emoticons: >:\ >:/ :-/ :-. :/ :\ =/ =\ :L =L :S >.<, etc.

More symbols with meanings such as angry, crying or surprised can be found on Wikipedia [17] and can be used to determine the emotional state of the author. The top 20 emoticons collected from 96 269 892 tweets are presented in [18].
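The three emoticon categories listed above can be stored as a simple lookup table. The sketch below is illustrative only: it maps the happy, sad and neutral lists to the labels positive, negative and neutral, covers only the ASCII emoticons named in the lists, and assumes (hypothetically) that emoticons are separated from words by whitespace. The names EMOTICONS, LEXICON and find_emoticons are introduced here for the example.

```python
# Illustrative lexicon built from the ASCII emoticons listed above (not exhaustive).
EMOTICONS = {
    "positive": [":-)", ":)", ":D", ":o)", ":]", ":3", ":c)", ":>", "=]", "8)"],
    "negative": [">:[", ":-(", ":(", ":-c", ":c", ":-<", ":<", ":-[", ":[", ":{"],
    "neutral": [">:\\", ">:/", ":-/", ":-.", ":/", ":\\", "=/", "=\\",
                ":L", "=L", ":S", ">.<"],
}

# Invert the table: emoticon -> category.
LEXICON = {tok: label for label, toks in EMOTICONS.items() for tok in toks}


def find_emoticons(text):
    """Return (emoticon, category) pairs for whitespace-separated tokens."""
    return [(tok, LEXICON[tok]) for tok in text.split() if tok in LEXICON]


print(find_emoticons("Great concert :-) but a long queue :-("))
```

Matching whole tokens rather than substrings is a deliberate simplification: it keeps ":/" inside a link such as http://example.com from being counted as a neutral emoticon, at the cost of missing emoticons glued to a word.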

2.2. Emoji ideograms

Emoji were originally used in Japanese electronic messages and have since spread outside Japan. The characters are used much like emoticons, although a wider range is provided. The rise in popularity of emoji is due to their being incorporated into the character sets available on mobile phones. Apple’s iOS, Android and other mobile operating systems include some emoji character sets. Emoji characters are also included in the Unicode standard [19]. Emoji can be categorized into the same categories as emoticons. Emoji can even be translated to English using http://emojitranslate.com/.
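Because emoji are ordinary Unicode characters, they can be found in a tweet by checking code points. The sketch below assumes that three Unicode blocks are enough for illustration (Miscellaneous Symbols, Miscellaneous Symbols and Pictographs, and Emoticons); many emoji lie outside these blocks, so the ranges and the helper names is_emoji and extract_emoji are assumptions rather than a complete detector.

```python
import unicodedata

# Assumed, incomplete set of Unicode blocks that contain emoji.
EMOJI_RANGES = [
    (0x2600, 0x26FF),    # Miscellaneous Symbols
    (0x1F300, 0x1F5FF),  # Miscellaneous Symbols and Pictographs
    (0x1F600, 0x1F64F),  # Emoticons
]


def is_emoji(ch):
    """True if the character falls in one of the assumed emoji blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def extract_emoji(text):
    """Return (character, Unicode name) pairs for every emoji found."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in text if is_emoji(ch)]


# U+1F600 is a face from the Emoticons block, U+2600 a Miscellaneous Symbol.
for ch, name in extract_emoji("Nice day \U0001F600 \u2600"):
    print(hex(ord(ch)), name)
```

The Unicode character names give a language-independent description of each symbol, which can help when assigning emoji to the same categories as emoticons.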

2.3. Data collection

The main problem is how to extract the rich information that is available on Twitter and how it can be used to draw meaningful insights. To achieve this, we first need to build an accurate sentiment analyzer for tweets, which is what this solution aims to achieve. SAS Text Miner, SAS Visual Analytics or other tools can be used as software for the data analysis. The challenge remains to fetch customized tweets and clean the data before any text or symbol mining. SAS Visual Analytics allows direct import of Twitter data, but to use SAS Text Miner and other tools, the data have to be downloaded and converted.

Twitter allows developers to collect data via the Twitter REST API [20] and the Streaming API [21]. Twitter imposes numerous regulations and rate limits on its API, and for this reason it requires all users to register an account and provide authentication details when they query the API. Registration requires users to provide an email address and telephone number for verification; once the user account is verified, the user is issued the authentication details that allow access to the API.

Unfortunately, the Twitter API exports data only in JSON format, which needs to be translated into a format readable by databases or analytical software. A combination of the Twitter API, scripts for converting JSON to CSV [22], a SAS macro [23] or an Excel macro [24] can be used to extract information from Twitter and create an input dataset for the analysis. The entire process of data acquisition can be fully automated by scheduling the run of Visual Basic for Applications (VBA) or SAS macros. Since opinions have targets, further pre-processing and filtering of the collected data can be done using @twitter_names and #hashtags as targets in the way described in [20]. This method is more precise and provides better results than other text mining approaches.
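The JSON-to-CSV conversion step can be sketched in a few lines. This is not the script from [22] or the macros from [23], [24]; it is a hypothetical illustration that assumes the tweets have already been downloaded as a JSON list of tweet objects carrying the fields id_str, text and user.screen_name, and it uses an invented two-tweet sample in place of real API output.

```python
import csv
import io
import json


def tweets_json_to_csv(json_text, out_file):
    """Write one CSV row per tweet and return the number of tweets written."""
    tweets = json.loads(json_text)
    writer = csv.writer(out_file)
    writer.writerow(["id", "user", "text"])
    for tweet in tweets:
        user = tweet.get("user") or {}          # tolerate a missing user object
        text = tweet.get("text", "").replace("\n", " ")  # keep one line per tweet
        writer.writerow([tweet.get("id_str", ""), user.get("screen_name", ""), text])
    return len(tweets)


# Invented sample standing in for downloaded API output.
sample = json.dumps([
    {"id_str": "1", "user": {"screen_name": "UE_Katowice"},
     "text": "Conference today :-)\n#Katowice"},
    {"id_str": "2", "text": "No user field here"},
])

buffer = io.StringIO()
count = tweets_json_to_csv(sample, buffer)
print(count)
print(buffer.getvalue())
```

The resulting CSV can be imported into analytical software; running such a script on a schedule is one way to automate the acquisition step.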

3. Sentiment analysis

Sentiment analysis, which is also known as opinion mining, focuses on discovering patterns in text that can be analyzed to classify the sentiment in that text. The term sentiment analysis probably first appeared in [25], and the term opinion mining first appeared in [26]. However, research on sentiments and opinions appeared earlier.

Liu stated that “Sentiment analysis is the field of study that analyses people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, and their attributes. It represents a large problem space. There are also many names and slightly different tasks, e.g., sentiment analysis, opinion mining, opinion extraction, sentiment mining, subjectivity analysis, affect analysis, emotion analysis, review mining, etc.” [27, p. 7]. Sentiment analysis has grown to be one of the most active research fields in natural language processing. It is also widely studied in data mining, Web mining, and text mining. In fact, it has spread from computer science to management sciences and social sciences due to its importance to business and society.

Sentiment analysis is predominantly implemented in software which can autonomously extract emotions and opinions from a text. It has many real-world applications: it allows companies to analyze how their products or brand are being perceived by consumers, politicians may be interested in knowing how people will vote in elections, etc. It is difficult to classify sentiment analysis as one specific field of study, as it incorporates many different areas such as linguistics, Natural Language Processing (NLP), and Machine Learning or Artificial Intelligence. As the majority of the sentiment that is uploaded to the Internet is of an unstructured nature, it is a difficult task for computers to process it and extract meaningful information from it. Some of the most effective machine learning algorithms, e.g., support vector machines, naïve Bayes and conditional random fields, produce results that are not human-understandable.

Emotions are closely related to sentiments. Emotions can be defined as subjective feelings and thoughts. People’s emotions have been categorized into some distinct categories. However, there is still no agreed set of basic emotions among researchers. Based on [28], people have six primary emotions, i.e., love, joy, surprise, anger, sadness, and fear, which can be subdivided into many secondary and tertiary emotions. Each emotion can also have different intensities. Emotions in virtual communication differ in a variety of ways from those in face-to-face interactions due to the characteristics of computer-mediated communication. Computer-mediated communication may lack many of the auditory and visual cues normally associated with the emotional aspects of interactions.

While text-based communication eliminates audio and visual cues, there are other methods for adding emotion. Emoticons, or emotional icons, can be used to display various types of emotions.

For the purposes of this work, sentiment can be defined as a personal positive, neutral or negative opinion. Classification is done by supervised learning using a lexicon-based approach. The sentiment lexicon contains a list of sentiment emoticons and emoji ideograms. Opinions can be gathered by searching Twitter posts using the Twitter API. Each tweet can be labelled, using emoticons and emoji icons, as positive, negative, neutral or junk. The “junk” label means that the tweet cannot be understood. In order to use this method, an assumption must be made: that the emoticon in the tweet represents the overall sentiment contained in that tweet. This assumption is quite reasonable, as the maximum length of a tweet is 140 characters, so in the majority of cases the emoticon will correctly represent the overall sentiment of that tweet. This kind of evaluation is commonly known as document-level sentiment classification, because it considers the whole document as a basic information unit.
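The labelling rule can be illustrated with a short sketch. The paper does not spell out how a tweet ends up as “junk”, so the rule used here is an assumption: a tweet is labelled junk when it contains no known symbol or when its symbols point to more than one category. The three small symbol sets and the function name label_tweet are likewise hypothetical, and symbols are assumed to be separated from words by whitespace.

```python
# Small hypothetical lexicon: a few emoticons plus three emoji code points.
POSITIVE = {":-)", ":)", ":D", "\U0001F600"}   # U+1F600 GRINNING FACE
NEGATIVE = {":-(", ":(", ":-[", "\U0001F622"}  # U+1F622 CRYING FACE
NEUTRAL = {":-|", ":-/", ":/", "\U0001F610"}   # U+1F610 NEUTRAL FACE

CATEGORIES = {"positive": POSITIVE, "negative": NEGATIVE, "neutral": NEUTRAL}


def label_tweet(text):
    """Label a whole tweet from the symbols it contains (document level)."""
    tokens = text.split()
    present = [label for label, symbols in CATEGORIES.items()
               if any(tok in symbols for tok in tokens)]
    # Assumed rule: exactly one category present gives that label, otherwise junk.
    return present[0] if len(present) == 1 else "junk"


for tweet in ["Loved the concert :-)", "Missed my train :(",
              "Great :-) but tiring :-(", "No symbols here"]:
    print(tweet, "->", label_tweet(tweet))
```

Treating mixed signals as junk keeps the automatically labelled sample clean, in line with the assumption that a single emoticon stands for the sentiment of the whole tweet.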

A model can be developed on a sample of the data and then used to classify the sentiment of tweets. Manual classification will be done on a sample of tweets. The accuracy of the model can be tested against a validation sample: the manually classified tweets will be divided into two parts, with 80% of the data used as the model sample and 20% as the validation sample. The results obtained will be compared with the manually assigned classification.
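The 80/20 division and the comparison with the manual classification can be sketched as follows. The helper names split_sample and accuracy, the fixed random seed and the toy labels are assumptions made for this illustration; accuracy is taken here simply as the share of tweets whose predicted label equals the manually assigned one.

```python
import random


def split_sample(items, model_share=0.8, seed=0):
    """Shuffle the manually labelled items and split them into model and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(round(len(shuffled) * model_share))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, manual):
    """Share of tweets whose predicted label matches the manual label."""
    if len(predicted) != len(manual) or not manual:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(p == m for p, m in zip(predicted, manual))
    return hits / len(manual)


model_part, validation_part = split_sample(range(10))
print(len(model_part), len(validation_part))

# Toy labels, invented for the example.
predicted = ["positive", "negative", "neutral", "junk"]
manual = ["positive", "negative", "negative", "junk"]
print(accuracy(predicted, manual))
```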

Conclusions

Microblogging services like Twitter have become one of the major types of communication. The large amount of information contained in these websites makes them an attractive source of data for opinion mining and sentiment analysis. Most text-based methods of analysis may not be useful for sentiment analysis in these domains. To make significant progress, we still need novel ideas. Using Twitter names and hashtags to collect training data can provide better results. Adding symbol analysis using emoticons and emoji characters can also significantly increase the precision of recognizing emotions. The most successful algorithms will probably be an integration of natural language processing methods and symbol analysis.

References

[1] S. Das, M. Chen, Yahoo! for Amazon: Extracting Market Sentiment from Stock Message Boards, “Proceedings of the Asia Pacific Finance Association Annual Conference (APFA)” 2001, Vol. 35.

[2] P.D. Turney, Thumbs up or Thumbs down?: Semantic Orientation Applied to Unsupervised Classification of Reviews [in:] Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, PA 2002, pp. 417-424.


[3] M. Hu, B. Liu, Mining and Summarizing Customer Reviews [in:] Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’04, ACM, New York, NY 2004, pp. 168-177.

[4] S. Kim, E. Hovy, Determining the Sentiment of Opinions [in:] COLING '04 Proceedings of the 20th international conference on Computational Linguistics, Geneva 2004.

[5] A. Agarwal, F. Biadsy, K. McKeown, Contextual Phrase-Level Polarity Analysis Using Lexical Affect Scoring and Syntactic n-Grams, Proceedings of the 12th Conference of the European Chapter of the ACL, Athens 2009, pp. 24-32.

[6] T. Wilson, J. Wiebe, P. Hoffmann, Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis [in:] Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, Association for Computational Linguistics, MIT Press, Cambridge, MA 2005, pp. 399-433.

[7] A. Esuli, F. Sebastiani, Sentiwordnet: A Publicly Available Lexical Resource for Opinion Mining, “Proceedings of LREC” 2006, Vol. 6.

[8] V. Hatzivassiloglou, K.R. McKeown, Predicting the Semantic Orientation of Adjectives [in:] Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, Madrid 1997, pp. 174-181.

[9] B. Pang, L. Lee, Opinion Mining and Sentiment Analysis, “Foundations and Trends in Information Retrieval” 2008, Vol. 2(1-2), pp. 1-135.

[10] J. Read, Using Emoticons to Reduce Dependency in Machine Learning Techniques for Sentiment Classification [in:] Proceedings of the ACL Student Research Workshop (ACLstudent '05), Association for Computational Linguistics, Stroudsburg, PA 2005, pp. 43-48.

[11] A. Pak, P. Paroubek, Twitter as a Corpus for Sentiment Analysis and Opinion Mining, “LREC” 2010, Vol. 10.

[12] A. Bifet, E. Frank, Sentiment Knowledge Discovery in Twitter Streaming Data, Discovery Science, Springer, Berlin-Heidelberg 2010.

[13] L. Barbosa, J. Feng, Robust Sentiment Detection on Twitter from Biased and Noisy Data [in:] Proceedings of the 23rd International Conference on Computational Linguistics: Posters, Association for Computational Linguistics, Beijing 2010, pp. 36-44.

[14] D. Davidov, O. Tsur, A. Rappoport, Enhanced Sentiment Learning Using Twitter Hashtags and Smileys [in:] Proceedings of the 23rd International Conference on Computational Linguistics: Posters, Association for Computational Linguistics, Beijing 2010, pp. 241-249.

[15] A. Go, R. Bhayani, L. Huang, Twitter Sentiment Classification Using Distant Supervision, CS224N Project Report, Stanford 2009, pp. 1-12.

[16] Unicode Miscellaneous Symbols, http://www.unicode.org/charts/PDF/U2600.pdf (accessed: May 2015).

[17] List of emoticons, Wikipedia, http://en.wikipedia.org/wiki/List_of_emoticons (accessed: May 2015).

[18] N. Berry, DataGenetics, http://www.datagenetics.com/blog/october52012/index.html (accessed: May 2015).

[19] Unicode Emoji, Draft Unicode Technical Report #51, http://www.unicode.org/reports/tr51/ (accessed: May 2015).

[20] Twitter REST API, The Search API, https://dev.twitter.com/rest/public/search (accessed: May 2015).

[21] Twitter, The Streaming APIs, https://dev.twitter.com/streaming/overview (accessed: May 2015).

[22] S. Falko, How to Import Twitter Tweets in SAS DATA Step Using OAuth 2 Authentication Style, http://blogs.sas.com/content/sascom/2013/12/12/how-to-import-twitter-tweets-in-sas-data-step-using-oauth-2-authentication-style (accessed: May 2015).

[23] S. Garla, G. Chakraborty, %GetTweet: A New SAS® Macro to Fetch and Summarize Tweets, Paper 324-2011, Oklahoma State University, Stillwater, OK 2011.

[24] Twitter Text Mining Using SAS, Social Media Analytics, http://www.analytics-tools.com/2012/06/social-media-analytics-twitter-text.html (accessed: May 2015).

[25] T. Nasukawa, J. Yi, Sentiment Analysis: Capturing Favorability Using Natural Language Processing [in:] Proceedings of the K-CAP-03, 2nd International Conference on Knowledge Capture, Sanibel Island, FL 2003, pp. 70-77.

[26] K. Dave, S. Lawrence, D.M. Pennock, Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews [in:] Proceedings of International Conference on World Wide Web, Budapest 2003, pp. 519-528.

[27] B. Liu, Sentiment Analysis and Opinion Mining, Morgan & Claypool Publishers, Williston, VT 2012.

[28] W.G. Parrott (ed.), Emotions in Social Psychology: Essential Readings, Key Reading in Social Psychology, Psychology Press, Philadelphia, PA 2001.

SENTIMENT ANALYSIS OF TWITTER DATA USING EMOTICONS AND EMOJI

Summary: Twitter is a worldwide service in which users publish their opinions on various topics, discuss current events, and express positive or negative opinions about products they use in everyday life. For this reason Twitter is a powerful source of data for opinion research and sentiment analysis. However, sentiment analysis of Twitter messages (tweets) is considered a challenging problem because of the small volume of tweet text and the often informal character of its language. The article focuses on the analysis of symbols known as emoticons and emoji. According to the research carried out, these symbols are commonly used in communication via Twitter. They express emotions directly, regardless of language, and can therefore be used in multilingual texts. The article presents an approach to sentiment analysis that makes it possible to determine the positive, negative or neutral sentiment of the analyzed texts.

Keywords: Twitter, sentiment analysis, symbol analysis, SAS.
