Analyzing selected instances
• Fraction of duplicates in selected instances: 44%, starting with only 0.5%
• Is the gain due to the increased fraction of duplicates?
• Replaced the non-duplicates in the selected set with random non-duplicates
• Result → only 40% accuracy!!!
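The control experiment on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the names `duplicate_fraction`, `control_set`, `pool`, and `is_dup` are hypothetical, and it assumes the replacement non-duplicates are drawn uniformly without replacement from all non-duplicates in the pool. The control set keeps the selected duplicates, so the duplicate fraction stays the same and only the choice of non-duplicates changes.

```python
import random


def duplicate_fraction(instances, is_dup):
    """Fraction of the given instances that are labeled as duplicates."""
    if not instances:
        return 0.0
    return sum(1 for x in instances if is_dup[x]) / len(instances)


def control_set(selected, pool, is_dup, seed=0):
    """Keep the selected duplicates; swap every selected non-duplicate
    for a random non-duplicate from the pool (assumed sampling scheme)."""
    rng = random.Random(seed)
    kept_dups = [x for x in selected if is_dup[x]]
    n_non_dups = len(selected) - len(kept_dups)
    # Candidates are all non-duplicates in the pool (assumption: the
    # originally selected non-duplicates are not excluded).
    candidates = [x for x in pool if not is_dup[x]]
    return kept_dups + rng.sample(candidates, n_non_dups)
```

Training on the output of `control_set` and comparing accuracy against the originally selected set separates the effect of the duplicate fraction from the effect of which non-duplicates were chosen.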