The Internet Encyclopedia (Volume 3)




Braynov2 WL040/Bidgoli-Vol III-Ch-05 July 11, 2003 11:43 Char Count= 0


PERSONALIZATION AND CUSTOMIZATION TECHNOLOGIES

Figure 1: A customized screen of MyYahoo.

two major components of a user profile: behavioral and
factual. The factual component contains demographic
and transactional information such as age, income, edu-
cational level, and favorite brands. Engage Technologies
(http://www.engage.com), for example, sells software that
helps companies gather and use factual profiles. The
behavioral component contains information about the
online activities of the customer. It is usually stored in
different forms such as logic-based descriptions, classifi-
cation rules, and attribute–value pairs. The most common
representation of behavioral information is association
rules. Here is an example of an association rule: “When
shopping on weekends, consumer X usually spends more
than $100 on groceries” (Adomavicius & Tuzhilin, 1999).
The rules can be defined by a human expert or extracted
from transactional data using data mining methods.
BroadVision (http://www.broadvision.com) and Art
Technology Group (http://www.atg.com), among others,
sell software that helps users build and use rule-based
profiles.
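
As a rough illustration of the two profile components described above, the following Python sketch stores the factual part as attribute-value pairs and the behavioral part as a list of rules. The class names, field names, and sample values are hypothetical and are not taken from any of the products mentioned; the rule mirrors the weekend-grocery example, and its confidence is simply the fraction of matching transactions in which the consequent also holds.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Transaction = Dict[str, Any]


@dataclass
class BehavioralRule:
    """Hypothetical rule: when `condition` holds, `consequent` usually holds."""
    description: str
    condition: Callable[[Transaction], bool]
    consequent: Callable[[Transaction], bool]

    def confidence(self, transactions: List[Transaction]) -> float:
        # Fraction of condition-matching transactions that satisfy the consequent.
        matching = [t for t in transactions if self.condition(t)]
        if not matching:
            return 0.0
        return sum(self.consequent(t) for t in matching) / len(matching)


@dataclass
class UserProfile:
    # Factual component: demographic and transactional attribute-value pairs.
    factual: Dict[str, Any]
    # Behavioral component: rules describing online activity.
    behavioral: List[BehavioralRule] = field(default_factory=list)


weekend_groceries = BehavioralRule(
    description="weekend shopping -> spends more than $100 on groceries",
    condition=lambda t: t["day"] in ("Sat", "Sun"),
    consequent=lambda t: t["category"] == "groceries" and t["amount"] > 100,
)

# Illustrative values only.
profile = UserProfile(
    factual={"age": 34, "income": 52000, "education": "college"},
    behavioral=[weekend_groceries],
)

transactions = [
    {"day": "Sat", "category": "groceries", "amount": 120.0},
    {"day": "Sun", "category": "groceries", "amount": 135.5},
    {"day": "Sat", "category": "groceries", "amount": 80.0},
    {"day": "Tue", "category": "groceries", "amount": 40.0},
]

rule_confidence = weekend_groceries.confidence(transactions)
```

In this sample data, two of the three weekend transactions exceed $100, so the rule holds with a confidence of two thirds. A real system would more likely store rules declaratively than as Python callables.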
The rule-based profile-building process usually con-
sists of two main steps: rule discovery and rule validation.
Various data mining algorithms such as Apriori (Agrawal
& Srikant, 1994) and FP-Growth (Han, Pei, Yin, & Mao, in
press) can be used for rule discovery. A special type of asso-
ciation rules, profile association rules, has been proposed
by Agrawal, Sun, and Yu (1998). A profile association rule
is one in which the left-hand side consists of customer
profile information (age, salary, education, etc.), and the
right-hand side of customer behavior information (buying
beer, using coupons, etc.). Agrawal et al. (1998) proposed
a multidimensional indexing structure and an algorithm
for mining profile association rules.
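
The rule discovery step can be sketched with a small, unoptimized level-wise search in the spirit of Apriori, followed by a filter that keeps only rules whose left-hand side is profile information and whose right-hand side is behavior. This is a minimal sketch under stated assumptions: it is not the multidimensional indexing algorithm of Agrawal et al. (1998), and the item encoding (strings such as "age=30-39"), the function names, the thresholds, and the sample data are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    rows = [frozenset(t) for t in transactions]
    n = len(rows)

    def support(itemset):
        return sum(itemset <= row for row in rows) / n

    # Level 1: frequent single items.
    current = {}
    for item in {i for row in rows for i in row}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        # Join: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        nxt = {}
        for cand in candidates:
            # Prune: every (k-1)-subset of a candidate must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(cand, k - 1)):
                s = support(cand)
                if s >= min_support:
                    nxt[cand] = s
        frequent.update(nxt)
        current = nxt
        k += 1
    return frequent


def profile_rules(frequent, profile_items, min_confidence):
    """Keep rules of the form: profile attributes -> behavior items."""
    rules = []
    for itemset, supp in frequent.items():
        lhs = itemset & profile_items
        rhs = itemset - profile_items
        if lhs and rhs:
            # lhs is frequent too, because it is a subset of a frequent itemset.
            conf = supp / frequent[lhs]
            if conf >= min_confidence:
                rules.append((lhs, rhs, supp, conf))
    return rules


# Hypothetical customers: profile attributes and behaviors as items.
customers = [
    {"age=30-39", "edu=college", "buys_beer", "uses_coupons"},
    {"age=30-39", "edu=college", "buys_beer"},
    {"age=30-39", "edu=highschool", "buys_beer"},
    {"age=20-29", "edu=college", "uses_coupons"},
    {"age=30-39", "edu=college", "buys_beer"},
]
profile_items = frozenset(
    {"age=20-29", "age=30-39", "edu=college", "edu=highschool"}
)

frequent = apriori(customers, min_support=0.4)
rules = profile_rules(frequent, profile_items, min_confidence=0.8)
for lhs, rhs, supp, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), f"support={supp:.2f} confidence={conf:.2f}")
```

For simplicity, the sketch builds only one rule per frequent itemset, placing all of its profile items on the left and all of its behavior items on the right. A fuller implementation would also consider other ways of splitting each itemset.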
One of the problems with many rule discovery methods
is the large number of rules generated, many of which, al-
though statistically acceptable, are spurious, irrelevant, or
trivial. Post-analysis is usually used to filter out irrelevant
and spurious rules. Several data mining systems perform
rule validation by letting a domain expert inspect the rules
on a one-by-one basis and reject unacceptable rules. Such
an approach is not scalable to large numbers of rules and
customer profiles. To solve the problem, Adomavicius and
Tuzhilin (2001) proposed collective rule validation. Rules
are collected in a single set to which several rule valida-
tion operators are applied iteratively. Because many users
share identical or similar rules, those can be validated