40 Data Mining
Mirza I. Rahman and Omar H. Dabbous
Data mining has been defined as ‘The nontrivial
extraction of implicit, previously unknown and
potentially useful information from data’ (Frawley
et al., 1992). However, an easier way to grasp the
concept of data mining is to think of it as a process
that uses automated, analytic tools to search large
databases, in order to discern useful information.
40.1 Introduction
The goal of data mining is to simplify the process
of sorting through vast amounts of data to generate
valuable and actionable information in support of a
business proposition. Given the large volume of
data that is collected in a variety of industries and
the speed with which it is being accumulated, digging
through those databases to get to the kernels of
knowledge may be impossible if done manually.
The development of powerful computers, along
with software that contains data-mining algorithms,
provides the individual with an additional
tool to better do his/her job.
Some common uses of data mining are in the
marketing of specific products to customers who
show a propensity for purchasing a particular product.
Although it may be intuitive that customers
who buy a particular product may be more apt to
purchase a similar type of product if it is marketed
to them, searching for hidden correlations between
disparate products (e.g. soap and tea purchases)
may generate new avenues to co-market or at
least place products in close proximity to one
another. Additionally, high-value customers can
be segmented from the general customer population
to allow for a more focused marketing
approach to these customers.
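
The search for hidden correlations between disparate products can be illustrated with a small calculation. The following is a minimal sketch, not a method taken from this chapter: it assumes purchases are available as a list of baskets, and it uses 'lift' (how much more often two products are bought together than would be expected if they were bought independently) as one common measure of association. The function name pair_lift and the sample baskets are hypothetical and purely illustrative.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair of items bought together.

    lift = P(a and b) / (P(a) * P(b)); values above 1 suggest the two
    products are purchased together more often than chance would predict.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore repeat purchases within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for (a, b), together in pair_counts.items():
        expected = (item_counts[a] / n) * (item_counts[b] / n)
        lifts[(a, b)] = (together / n) / expected
    return lifts


# Hypothetical purchase data, invented for illustration only.
baskets = [
    ["soap", "tea"],
    ["soap", "tea", "bread"],
    ["bread", "milk"],
    ["soap", "tea"],
    ["milk"],
    ["bread", "milk"],
]

lifts = pair_lift(baskets)
for pair, lift in sorted(lifts.items(), key=lambda kv: -kv[1]):
    print(pair, round(lift, 2))
```

In this invented data set, soap and tea have a lift above 1, which is the kind of signal that might prompt co-marketing or placing the two products near each other. A real analysis would also need minimum support thresholds and a check for chance findings.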
Data mining has been used in several industries
and in many different ways, including by banks to
detect credit card fraud, by retailers in direct mail
marketing campaigns and in the sale of goods
from wholesalers to retailers. In the pharmaceutical
industry, data mining has been used in sales and
marketing to identify the types of customers the
company wants to target, in reviewing sales
force performance, in examining clinical and nonclinical
toxicology data for potential claims to
pursue and now in the review and assessment of
post-marketing safety surveillance data. This
last topic will be discussed in depth later in this
chapter.
As important as it is to define what data mining
is, it is equally important to state what it is not. Data
mining is not a panacea for business problems. It is
simply another tool to be used in seeking solutions
to business problems. There will still be a need for
Principles and Practice of Pharmaceutical Medicine, 2nd Edition Edited by L. D. Edwards, A. J. Fletcher, A. W. Fox and P. D. Stonier
© 2007 John Wiley & Sons, Ltd ISBN: 978-0-470-09313-9