The Internet Encyclopedia (Volume 3)



PERSONALIZATION AND CUSTOMIZATION TECHNOLOGIES

Figure 1: A customized screen of MyYahoo.

two major components of a user profile: behavioral and factual. The factual component contains demographic and transactional information such as age, income, educational level, and favorite brands. Engage Technologies (http://www.engage.com), for example, sells software that helps companies gather and use factual profiles. The behavioral component contains information about the customer's online activities. It can be stored in several forms, such as logic-based descriptions, classification rules, and attribute–value pairs. The most common representation of behavioral information is the association rule. Here is an example of an association rule: “When shopping on weekends, consumer X usually spends more than $100 on groceries” (Adomavicius & Tuzhilin, 1999). Rules can be defined by a human expert or extracted from transactional data using data mining methods. BroadVision (http://www.broadvision.com) and Art Technology Group (http://www.atg.com), among others, sell software that helps users build and use rule-based profiles.

The rule-based profile-building process usually consists of two main steps: rule discovery and rule validation. Various data mining algorithms, such as Apriori (Agrawal & Srikant, 1994) and FP-Growth (Han, Pei, Yin, & Mao, in press), can be used for rule discovery. A special type of association rule, the profile association rule, was proposed by Agrawal, Sun, and Yu (1998). A profile association rule is one in which the left-hand side consists of customer profile information (age, salary, education, etc.) and the right-hand side consists of customer behavior information (buying beer, using coupons, etc.). Agrawal et al. (1998) proposed a multidimensional indexing structure and an algorithm for mining profile association rules.

One of the problems with many rule discovery methods is the large number of rules generated, many of which, although statistically acceptable, are spurious, irrelevant, or trivial. Post-analysis is usually used to filter out such rules. Several data mining systems perform rule validation by letting a domain expert inspect the rules one by one and reject the unacceptable ones. Such an approach does not scale to large numbers of rules and customer profiles. To solve this problem, Adomavicius and Tuzhilin (2001) proposed collective rule validation: rules are collected in a single set, to which several rule validation operators are applied iteratively. Because many users share identical or similar rules, those can be validated
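To make the rule discovery step more concrete, the following is a minimal sketch of a level-wise, Apriori-style search for frequent itemsets, followed by rule generation and a filter that keeps only profile association rules. It is an illustration under stated assumptions, not the algorithm of Agrawal and Srikant (1994) or the indexing method of Agrawal et al. (1998) as published. The customer records, the "profile:" and "behavior:" item prefixes, the thresholds, and all function names are hypothetical and chosen only for this example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    n = len(transactions)
    support = {}
    # Level 1 candidates are the individual items.
    candidates = {frozenset([item]) for t in transactions for item in t}
    k = 1
    while candidates:
        # Keep candidates contained in a large enough share of transactions.
        level = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                level[c] = s
        support.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate that has an infrequent k-subset.
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            u = a | b
            if len(u) == k and all(
                frozenset(sub) in level for sub in combinations(u, k - 1)
            ):
                candidates.add(u)
    return support

def association_rules(support, min_confidence):
    """Split each frequent itemset into (lhs, rhs, support, confidence)."""
    rules = []
    for itemset, s in support.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                conf = s / support[lhs]  # confidence = P(rhs | lhs)
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, s, conf))
    return rules

def is_profile_rule(lhs, rhs):
    """Profile association rule: profile items on the left, behavior on the right."""
    return (all(i.startswith("profile:") for i in lhs)
            and all(i.startswith("behavior:") for i in rhs))

# Hypothetical customer records mixing profile and behavior items.
transactions = [
    {"profile:age=30s", "profile:income=high", "behavior:buys_beer", "behavior:uses_coupons"},
    {"profile:age=30s", "profile:income=high", "behavior:buys_beer"},
    {"profile:age=30s", "profile:income=low", "behavior:uses_coupons"},
    {"profile:age=20s", "profile:income=low", "behavior:uses_coupons"},
    {"profile:age=20s", "profile:income=high", "behavior:buys_beer"},
]

support = frequent_itemsets(transactions, min_support=0.3)
rules = association_rules(support, min_confidence=0.8)
profile_rules = [rule for rule in rules if is_profile_rule(rule[0], rule[1])]

for lhs, rhs, s, conf in profile_rules:
    print(sorted(lhs), "->", sorted(rhs), f"support={s:.2f} confidence={conf:.2f}")
```

Even on five records, the unfiltered rule list contains many entries that relate one profile attribute to another or one behavior to another. This is a small-scale view of the rule explosion that motivates post-analysis and rule validation.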
