Collective entity disambiguation
- Comparisons between annotations made by CSAW, Cucerzan's algorithm,
and Milne and Witten's algorithm.
 
- Web documents crawled for the CSAW evaluation in the SIGKDD 2009 paper.
 
- Ashish Kulkarni released a revision with some ground truth mistakes cleaned up.
 
- Ground truth annotations on the above documents, collected from
volunteers.
 
- Minor errata with corrections (thanks to the CS728 class for
pointing these out):
- In section 1.2, we wrote about Wikify!: "even random
disambiguation results in an F1 score of 0.82". This is incorrect:
choosing the most frequent sense/entity gave the F1 score of 0.82,
whereas random selection gave an F1 closer to 0.5.
 
- The unnumbered display equation in section 2.4.2 (just before
section 2.5 begins) claims to express a non-negative relatedness
as per Milne and Witten, but its numerator is clearly negative.
The M&W 2008 paper gives a formula above Figure 2 that is
non-negative, but although the lhs is called relatedness, the
rhs decreases with increasing relatedness. In fact, their earlier
AAAI paper displays the same formula on page 3, called sr. A
plausible formula can be found here, in section 4.3, called "mw_coh".
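
The last erratum can be illustrated with a minimal Python sketch. It assumes the M&W quantity has the form (log max(|A|,|B|) - log |A ∩ B|) / (log |W| - log min(|A|,|B|)), where A and B are the sets of pages linking to the two entities and |W| is the total number of pages. The function names, the handling of empty overlap, and the "one minus the distance, clipped at zero" conversion are illustrative assumptions, not necessarily the exact definition of "mw_coh".

```python
import math


def mw_distance(inlinks_a, inlinks_b, num_pages):
    """Link-based M&W quantity (assumed form): non-negative, and
    SMALLER when the two entities share more in-links."""
    a, b = set(inlinks_a), set(inlinks_b)
    common = len(a & b)
    if common == 0:
        # No shared in-links: treat as infinitely far apart (assumption).
        return math.inf
    larger, smaller = max(len(a), len(b)), min(len(a), len(b))
    denom = math.log(num_pages) - math.log(smaller)
    if denom <= 0:
        # Degenerate case: every page links to both entities.
        return 0.0
    return (math.log(larger) - math.log(common)) / denom


def mw_relatedness(inlinks_a, inlinks_b, num_pages):
    """Hypothetical conversion to a score that INCREASES with
    relatedness: one minus the distance, clipped at zero."""
    return max(0.0, 1.0 - mw_distance(inlinks_a, inlinks_b, num_pages))


if __name__ == "__main__":
    a, b = {1, 2, 3, 4}, {3, 4, 5, 6}
    print(mw_distance(a, b, 100), mw_relatedness(a, b, 100))
```

With identical in-link sets the distance is 0 and the converted score is 1; with disjoint sets the converted score is 0, which is the direction one would expect from a quantity called relatedness.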