The primary task of association mining is to detect frequently co-occurring groups
of items in transactional databases. The intention is to use this knowledge for
prediction purposes: if bread, butter, and milk often appear in the same
transactions, then the presence of butter and milk in a shopping cart suggests
that the customer may also buy bread. More generally, knowing which items a
shopping cart contains, we want to predict other items that the customer is
likely to add before proceeding to the checkout counter. This paradigm can be
exploited in diverse applications. For example, in one previously studied
domain, each “shopping cart” contained a set of hyperlinks pointing to a Web
page; in medical applications, the shopping cart may contain a patient’s symptoms, results of
lab tests, and diagnoses; in a financial domain, the cart may contain companies
held in the same portfolio; and Bollmann-Sdorra et al. proposed a framework
that employs frequent itemsets in the field of information retrieval.
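The bread, butter, and milk example can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the transactions are invented toy data, and the function name `predict_missing` and its `min_confidence` threshold are hypothetical. It ranks candidate items by the fraction of past transactions containing the whole cart that also contain the item, which is a plain confidence estimate and not necessarily the prediction technique developed in this paper.

```python
from collections import Counter

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter", "milk", "eggs"},
    {"butter", "milk", "bread"},
    {"butter", "milk"},
    {"bread", "eggs"},
]


def predict_missing(cart, transactions, min_confidence=0.5):
    """Rank items absent from `cart` by the fraction of transactions
    that contain the whole cart and also contain the item."""
    cart = frozenset(cart)
    # Keep only the transactions that include every item already in the cart.
    matching = [t for t in transactions if cart <= t]
    if not matching:
        return []
    # Count how often each item outside the cart co-occurs with it.
    counts = Counter(item for t in matching for item in t - cart)
    scored = [(item, n / len(matching)) for item, n in counts.items()]
    # Keep confident candidates, highest confidence first, ties by name.
    return sorted(
        [(item, conf) for item, conf in scored if conf >= min_confidence],
        key=lambda pair: (-pair[1], pair[0]),
    )


predictions = predict_missing({"butter", "milk"}, transactions)
print(predictions)
```

With this toy data, four transactions contain both butter and milk, and three of them also contain bread, so bread is the only item that clears the assumed 0.5 threshold.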
In all these databases, prediction of unknown items can play a very important
role. For instance, a patient’s symptoms are rarely due to a single cause; two
or more diseases usually conspire to make the person sick. Having identified
one, the physician tends to focus on treating this single disorder,
ignoring others that may meanwhile worsen the patient’s condition. Such
unintentional neglect can be prevented by subjecting the patient to all
possible lab tests. However, the number of tests one can undergo is limited by
such practical factors as time, costs, and the patient’s discomfort.