New Probabilistic Interest Measures for Association Rules

Hahsler, Michael and Hornik, Kurt ORCID: https://orcid.org/0000-0003-4198-9911 (2006) New Probabilistic Interest Measures for Association Rules. Research Report Series / Department of Statistics and Mathematics, 38. Department of Statistics and Mathematics, WU Vienna University of Economics and Business, Vienna.

Abstract

Mining association rules is an important technique for discovering meaningful patterns in transaction databases. Many different measures of interestingness have been proposed for association rules. However, these measures fail to take the probabilistic properties of the mined data into account. In this paper, we start by presenting a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. We use such data and a real-world database from a grocery outlet to explore the behavior of confidence and lift, two popular interest measures used for rule mining. The results show that confidence is systematically influenced by the frequency of the items in the left-hand side of rules and that lift performs poorly at filtering random noise in transaction data. Based on the probabilistic framework, we develop two new interest measures, hyper-lift and hyper-confidence, which can be used to filter or order mined association rules. The new measures show significantly better performance than lift for applications where spurious rules are problematic.
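
To make the two baseline measures concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the simplest possible null model, in which each item enters a transaction independently with a fixed probability, and it uses the standard textbook definitions of confidence and lift. The function names, item names, and probabilities are all hypothetical and chosen for illustration only. Hyper-lift and hyper-confidence are not sketched here because the abstract does not give their definitions.

```python
import random


def simulate_independent(n_transactions, item_probs, seed=0):
    """Simulate transactions in which items occur independently.

    Assumption: each item appears in a transaction with its own fixed
    probability, so no true associations exist in the generated data.
    """
    rng = random.Random(seed)
    return [
        {item for item, p in item_probs.items() if rng.random() < p}
        for _ in range(n_transactions)
    ]


def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(transactions, lhs, rhs):
    """conf(X -> Y) = count(X and Y) / count(X)."""
    c_lhs = count(transactions, lhs)
    if c_lhs == 0:
        return float("nan")
    return count(transactions, lhs | rhs) / c_lhs


def lift(transactions, lhs, rhs):
    """lift(X -> Y) = count(X and Y) * m / (count(X) * count(Y))."""
    m = len(transactions)
    c_lhs = count(transactions, lhs)
    c_rhs = count(transactions, rhs)
    if c_lhs == 0 or c_rhs == 0:
        return float("nan")
    return count(transactions, lhs | rhs) * m / (c_lhs * c_rhs)


# Hypothetical item probabilities: one frequent item and one rare item.
probs = {"bread": 0.30, "saffron": 0.01}
data = simulate_independent(1000, probs, seed=42)

# The items are independent by construction, so any deviation of lift
# from 1 in this output is sampling noise, not a real association.
print("confidence:", confidence(data, {"saffron"}, {"bread"}))
print("lift:", lift(data, {"saffron"}, {"bread"}))
```

Rerunning the simulation with different seeds gives a rough feel for how much lift can fluctuate for rare items even when no association is present, which is the kind of spurious-rule problem the abstract refers to.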

Item Type: Paper
Keywords: data mining / association rules / measures of interestingness / probabilistic data modeling
Divisions: Departments > Finance, Accounting and Statistics > Statistics and Mathematics
Depositing User: Repository Administrator
Date Deposited: 21 Aug 2006 11:29
Last Modified: 24 Oct 2019 13:41
URI: https://epub.wu.ac.at/id/eprint/1286
