CUUS2079-05 CUUS2079-Zafarani 978 1 107 01885 3 January 13, 2014 19:23
118 Data Mining Essentials
Table 5.2. Naive Bayes Classifier (NBC) Toy Dataset

No.  Outlook (O)  Temperature (T)  Humidity (H)  Play Golf (PG)
1    sunny        hot              high          N
2    sunny        mild             high          N
3    overcast     hot              high          Y
4    rain         mild             high          Y
5    sunny        cool             normal        Y
6    rain         cool             normal        N
7    overcast     cool             normal        Y
8    sunny        mild             high          ?

Example 5.4. Consider the dataset in Table 5.2.
We predict the label for instance 8 ($i_8$) using the naive Bayes classifier
and the given dataset. We have
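\begin{align*}
P(PG=Y \mid i_8) &= \frac{P(i_8 \mid PG=Y)\,P(PG=Y)}{P(i_8)} \\
&= P(O=\text{sunny}, T=\text{mild}, H=\text{high} \mid PG=Y) \times \frac{P(PG=Y)}{P(i_8)} \\
&= P(O=\text{sunny} \mid PG=Y) \times P(T=\text{mild} \mid PG=Y) \\
&\quad \times P(H=\text{high} \mid PG=Y) \times \frac{P(PG=Y)}{P(i_8)} \\
&= \frac{\frac{1}{4} \times \frac{1}{4} \times \frac{2}{4} \times \frac{4}{7}}{P(i_8)} = \frac{1}{56\,P(i_8)}. \tag{5.22}
\end{align*}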
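
The arithmetic in this derivation can be checked by counting directly from the seven labeled rows of Table 5.2. The following is a minimal sketch, not part of the original example: the names `ROWS` and `nb_numerator` are hypothetical, probabilities are plain relative frequencies with no smoothing, and exact fractions are used so the result can be compared with the hand calculation. It computes only the numerator P(i_8 | PG) P(PG); the shared denominator P(i_8) is left out.

```python
from fractions import Fraction

# Rows 1-7 of Table 5.2: (outlook, temperature, humidity, play_golf)
ROWS = [
    ("sunny", "hot", "high", "N"),
    ("sunny", "mild", "high", "N"),
    ("overcast", "hot", "high", "Y"),
    ("rain", "mild", "high", "Y"),
    ("sunny", "cool", "normal", "Y"),
    ("rain", "cool", "normal", "N"),
    ("overcast", "cool", "normal", "Y"),
]


def nb_numerator(features, label, rows=ROWS):
    """Return P(features | label) * P(label), assuming the features are
    conditionally independent given the label (the naive Bayes assumption).
    Assumes at least one row carries the given label."""
    in_class = [r for r in rows if r[-1] == label]
    score = Fraction(len(in_class), len(rows))  # prior P(PG = label)
    for k, value in enumerate(features):
        matches = sum(1 for r in in_class if r[k] == value)
        score *= Fraction(matches, len(in_class))  # P(feature_k = value | label)
    return score


i8 = ("sunny", "mild", "high")  # instance 8, label unknown
print(nb_numerator(i8, "Y"))  # 1/56
```

Calling the same function with the label "N" gives the corresponding numerator for the other class, so the two scores can be compared without ever computing P(i_8).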
Similarly,
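\begin{align*}
P(PG=N \mid i_8) &= \frac{P(i_8 \mid PG=N)\,P(PG=N)}{P(i_8)} \\
&= P(O=\text{sunny}, T=\text{mild}, H=\text{high} \mid PG=N) \times \frac{P(PG=N)}{P(i_8)} \\
&= P(O=\text{sunny} \mid PG=N) \times P(T=\text{mild} \mid PG=N) \\
&\quad \times P(H=\text{high} \mid PG=N) \times \frac{P(PG=N)}{P(i_8)} \\
&= \frac{\frac{2}{3} \times \frac{1}{3} \times \frac{2}{3} \times \frac{3}{7}}{P(i_8)} = \frac{4}{63\,P(i_8)}
\end{align*}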