Social Media Mining: An Introduction


CUUS2079-05 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 19:23


Data Mining Essentials

Table 5.2. Naive Bayes Classifier (NBC) Toy Dataset

No.  Outlook (O)  Temperature (T)  Humidity (H)  Play Golf (PG)
1    sunny        hot              high          N
2    sunny        mild             high          N
3    overcast     hot              high          Y
4    rain         mild             high          Y
5    sunny        cool             normal        Y
6    rain         cool             normal        N
7    overcast     cool             normal        Y
8    sunny        mild             high          ?

Example 5.4. Consider the dataset in Table 5.2.
We predict the label for instance 8 ($i_8$) using the naive Bayes classifier
and the given dataset. We have

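\[
\begin{aligned}
P(PG=Y \mid i_8) &= \frac{P(i_8 \mid PG=Y)\,P(PG=Y)}{P(i_8)} \\
&= P(O=\text{sunny}, T=\text{mild}, H=\text{high} \mid PG=Y) \times \frac{P(PG=Y)}{P(i_8)} \\
&= P(O=\text{sunny} \mid PG=Y) \times P(T=\text{mild} \mid PG=Y) \times P(H=\text{high} \mid PG=Y) \times \frac{P(PG=Y)}{P(i_8)} \\
&= \frac{1}{4} \times \frac{1}{4} \times \frac{2}{4} \times \frac{4/7}{P(i_8)} = \frac{1}{56\,P(i_8)}.
\end{aligned}
\tag{5.22}
\]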


Similarly,

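\[
\begin{aligned}
P(PG=N \mid i_8) &= \frac{P(i_8 \mid PG=N)\,P(PG=N)}{P(i_8)} \\
&= P(O=\text{sunny}, T=\text{mild}, H=\text{high} \mid PG=N) \times \frac{P(PG=N)}{P(i_8)} \\
&= P(O=\text{sunny} \mid PG=N) \times P(T=\text{mild} \mid PG=N) \times P(H=\text{high} \mid PG=N) \times \frac{P(PG=N)}{P(i_8)} \\
&= \frac{2}{3} \times \frac{1}{3} \times \frac{2}{3} \times \frac{3/7}{P(i_8)} = \frac{4}{63\,P(i_8)}.
\end{aligned}
\tag{5.23}
\]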
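The two derivations can be checked mechanically. The following Python snippet is a minimal sketch, not part of the original example: it recomputes both numerators, P(i_8 | PG) × P(PG), directly from the seven labeled rows of Table 5.2 using exact fractions. The names `ROWS`, `I8`, and `numerator` are illustrative assumptions, and no smoothing is applied, so a feature value never seen within a class would give a zero probability.

```python
from fractions import Fraction

# Labeled rows of Table 5.2: (outlook, temperature, humidity, play_golf)
ROWS = [
    ("sunny", "hot", "high", "N"),
    ("sunny", "mild", "high", "N"),
    ("overcast", "hot", "high", "Y"),
    ("rain", "mild", "high", "Y"),
    ("sunny", "cool", "normal", "Y"),
    ("rain", "cool", "normal", "N"),
    ("overcast", "cool", "normal", "Y"),
]
I8 = ("sunny", "mild", "high")  # instance 8, label unknown


def numerator(rows, instance, label):
    """Return P(instance | label) * P(label) under the naive independence assumption."""
    in_class = [r for r in rows if r[-1] == label]
    score = Fraction(len(in_class), len(rows))  # prior P(PG = label)
    for k, value in enumerate(instance):
        matches = sum(1 for r in in_class if r[k] == value)
        score *= Fraction(matches, len(in_class))  # P(feature_k = value | label)
    return score


scores = {label: numerator(ROWS, I8, label) for label in ("Y", "N")}

# Assumption: P(i_8) is taken as the sum of the two numerators,
# so the normalized posteriors add up to 1.
evidence = sum(scores.values())
posteriors = {label: s / evidence for label, s in scores.items()}
prediction = max(scores, key=scores.get)

for label in ("Y", "N"):
    print(label, scores[label], posteriors[label])
print("predicted:", prediction)
```

Because P(i_8) appears in both denominators, comparing the numerators alone is enough to pick a label. Under the assumptions above, the sketch selects N for instance 8.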
