Experiences with the learning approach
nToo much manual search in preparing training data
nHard to spot challenging and covering sets of duplicates in large lists
nEven harder to find close non-duplicates  that will capture the nuances
Active learning is a generalization of this!
examine instances that are similar on one attribute but dissimilar on another