Organization of Web Information
CS728, Spring 2011
Please report any dead links to me.
- 2011/01/04 Course intro
- 2011/01/07
Postponed by order on account of TechFest.
- 2011/01/11
- 2011/01/14
- 2011/01/18
- CRFs: details of feature encoding and dynamic programs.
- Max-margin training of CRFs.
- Need for supporting more general graphs.
- 2011/01/21
- Loss-augmented inference examples.
- Modeling non-local effects:
context-free grammars, distant shared tokens, long segments.
- Segment CRFs, dictionary features.
- PCFGs.
- 2011/01/25
- 2011/01/28
- Collins's use of the voted perceptron for CRF training.
- Multiple sequence alignment: hardness, heuristics.
- Tree alignment.
- 2011/02/01
- Guest lecture by TA, on GATE and UIMA.
- Additional reference: Case study in wrapper generation ---
- 2011/02/04
- Guest lecture by Vinayak Borkar on Hyracks.
- 2011/02/08
- Forms of type and entity catalogs.
- Overview of open-domain vs. closed-domain entity recognition.
- Hearst patterns.
- KnowItAll (also see CPANKOW).
- Named relation recognition.
- DIPRE.
- 2011/02/11
- Snowball.
- Relation clues: surface tokens, POS tags, parse tree, dependency graph.
- Feature-based, rule-based, and (path) kernel-based methods.
- String kernel tutorial.
- Open-domain binary relation harvesting: TextRunner.
- 2011/02/15
- 2011/02/18
- Tree kernels
- Cucerzan's leave-one-out
- Milne and Witten's anchor method
- 2011/02/26
Midterm exam, 5:30--8pm.
Check the solutions for bugs.
- 2011/03/01
- Embedding entities in category/type space.
- CSAW.
- Annotating tables with relations, types, entities.
- Overview of coreference resolution.
- 2011/03/04
- Pairwise decisions and correlation clustering
- Active learning committees for coreference resolution.
- Attribute-mediated dependences.
- 2011/03/08
- Attribute-mediated dependences, continued.
- Lecture slides.
- Overview of bridging the gap between structured
and unstructured search systems.
- 2011/03/11
- The gap between structured and unstructured search systems.
- Data representation, user's knowledge of schema,
query language choices, response unit.
- BANKS: group Steiner trees, spreading activation.
- 2011/03/15
- WHIRL.
- XQuery, XIRQL and ELIXIR.
- 2011/03/18
Lecture postponed.
- 2011/03/22
- 2011/03/23
- NetRank --- learning weights for edge types.
- 2011/03/25
- NetRank --- learning circulations.
- Random walk with restarts --- low-rank decomposition.
- HubRank --- indexing for spreading activation.
- 2011/03/29
- Centerpiece subgraphs.
- Descendant of answer type in proximity to keyword matches: IR4QA.
- Extending to multiple answer types:
- Other graph query mechanisms: SPARQL, NAGA.
- 2011/04/01
Lecture postponed (WWW 2011).
- 2011/04/05
- Guest lecture by Ashwin Machanavajjhala from Yahoo!: A holistic approach to Web-scale information extraction.
- 2011/04/08
- Learning lexical proximity: IR4QA.
- Additional reading:
- 2011/04/12
- NAGA's corpus, query and response models
- NAGA's ranking semantics based on probabilistic IR
- 2011/04/15
- Evidence aggregation in entity search
- Laplacian smoothing of snippet scores (QinLZWXL2008RankingRelationalObjects).
- Graphical model and mincut
- Ranking considerations in EntityRank
- Cascade of two logistic regression models
- Ranking quantity responses: QCQ.
- 2011/04/18
- Indexing considerations in entity and relation search
- Nextword and SIP index
- Contextual indices: SSQ.
- Detour to set expansion: Bayesian sets, WangC2007SEAL.
- 2011/04/19
Office hours for clarification and discussion.
- 2011/05/01
Final exam, 9:30am--1pm.
Check the solutions for bugs.