Wednesday, May 6, 2015

Mining Contrast Sets

1. Introduction

  • Motivation
    • understanding differences between groups
  • Task
    • provide an efficient algorithm for mining contrast sets, with pruning rules to reduce complexity
    • provide post-processing techniques to present the subsets that are surprising
    • control the false positives
    • be statistically sound
  • Goal
    • To find contrast-sets whose support differs meaningfully (statistically) across groups
      • $\exists\, i,j:\ P(cset = true \mid G_i) \neq P(cset = true \mid G_j)$ and $\max_{i,j} |sup(cset, G_i) - sup(cset, G_j)| \geq \mathrm{mindev}$
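
A minimal sketch of the second condition in the goal, the minimum-deviation check. The data, the function names, and the threshold value are all invented for illustration, and the statistical significance test from the first condition is left out.

```python
from itertools import combinations

def support(cset, rows):
    """Fraction of rows in which every attribute-value pair of cset holds."""
    if not rows:
        return 0.0
    hits = sum(all(row.get(a) == v for a, v in cset.items()) for row in rows)
    return hits / len(rows)

def max_support_difference(cset, groups):
    """Largest absolute support difference over all pairs of groups."""
    sups = {name: support(cset, rows) for name, rows in groups.items()}
    return max(abs(sups[a] - sups[b]) for a, b in combinations(sups, 2))

# Toy data, invented for illustration only.
groups = {
    "phd": [{"sex": "m", "job": "research"}, {"sex": "f", "job": "research"},
            {"sex": "m", "job": "research"}, {"sex": "f", "job": "teaching"}],
    "bachelor": [{"sex": "m", "job": "sales"}, {"sex": "f", "job": "research"},
                 {"sex": "m", "job": "sales"}, {"sex": "f", "job": "sales"}],
}

cset = {"job": "research"}   # candidate contrast set
min_dev = 0.3                # hypothetical threshold, not taken from the notes

diff = max_support_difference(cset, groups)
is_large = diff >= min_dev
print(diff, is_large)
```

Here the support of `job = research` is 3/4 in one group and 1/4 in the other, so the difference of 0.5 clears the assumed threshold. A full check would also test whether that difference is statistically significant.
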
2. Naive Approach
  • Add the group label as an extra attribute of each record and use Association Rule Mining to find the differences
  • Problems
    • the mined rules do not directly state how the groups differ
    • the results will be difficult to interpret
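
A rough sketch of the naive encoding, to show why its output does not read as a group comparison. The toy data and helper names are made up for this example, and confidence is used here as one common rule measure, which is an assumption about how the rules would be scored.

```python
def add_group_item(groups):
    """Flatten grouped rows into transactions that carry the group as an item."""
    return [frozenset(row.items()) | {("group", name)}
            for name, rows in groups.items() for row in rows]

def itemset_support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    ante = itemset_support(antecedent, transactions)
    both = itemset_support(antecedent | consequent, transactions)
    return both / ante if ante else 0.0

# Toy data, invented for illustration: a small group and a large one.
groups = {
    "phd": [{"job": "research"}] * 2,
    "bachelor": [{"job": "research"}] * 2 + [{"job": "sales"}] * 4,
}
transactions = add_group_item(groups)

research = frozenset({("job", "research")})
conf_phd = rule_confidence(research, frozenset({("group", "phd")}), transactions)
conf_bachelor = rule_confidence(research, frozenset({("group", "bachelor")}), transactions)
print(conf_phd, conf_bachelor)
```

In this toy case both rules have confidence 0.5, even though `job = research` holds for every row of the small group and only a third of the large one. Each rule also mentions a single group, so a reader has to pair rules up by hand to compare groups.
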

