Hypertext retrieval and mining
(Graduate elective)
CS610, Spring 2002

Staff

Timing

Slot 6 (M11:30, Th8:30, F11:30); M11:30a-1p, W9p-10:30p, F12.
No designated office hours; appointments by email.

First round of projects due 2002-03-15.

Evaluation

Prereq quiz ...........................  2
Midterm ............................... 30
Hw2 ...................................  4
Proj1   Understanding   3
        Plans and goals 3
        Progress        4 ............. 10
Proj2   Progress        4
        Results         6
        Future work     4 ............. 14
Final exam ............................ 40

Prerequisites

Calendar

2002-01-03 Course intro and prereq quiz
2002-01-04 Crawling: system block diagram
2002-01-07 Crawling: event queue, data structures

2002-01-10 Building an inverted index
2002-01-11 Relevance ranking, recall/precision, vector space
2002-01-16 Index compression, spamming, find-similar
2002-01-17 Shingles, probabilistic models

2002-01-23 Clustering: HAC, k-means, SOM
2002-01-24 Multidimensional scaling and FastMap
2002-01-28 Random projections and LSI
2002-01-30 Mixture models and EM
2002-01-31 EM, Multiple-cause mixture model
2002-02-04 MCMM, aspect models and PLSI
2002-02-06 PLSI, feature selection, MDL principle
2002-02-11 MDL principle, application to clustering
2002-02-13 Collaborative filtering

2002-02-18 Supervised learning, nearest neighbor classifiers
2002-02-20 Feature selection: mutual information
2002-02-25 Midterm week: no lecture
2002-02-27 Midterm week: no lecture
2002-03-04 Feature selection: Markov blankets
2002-03-06 Bayesian classification: naive Bayes, Bayesian networks
2002-03-11 Shrinkage and multi-topic classification
2002-03-13 CS610 midterm exam
2002-03-18 Support vector machines
2002-03-20 Maximum entropy classifiers
2002-03-25 Holiday
2002-03-27 Rule based learning, FOIL

2002-04-01 Semi-supervised learning using EM
2002-04-03 Relaxation labeling, co-training

Links