Project Ideas for CS 632

  1. Optimization Related
    1. Modify Prasan's Volcano optimizer code to: introduce attribute equivalence classes and use them for join size estimation
    2. Modify Prasan's Volcano optimizer code to: infer functional dependencies (implement ideas from: David E. Simmen, Eugene J. Shekita, Timothy Malkemus: Fundamental Techniques for Order Optimization. SIGMOD Conference 1996)
    3. Modify Prasan's Volcano optimizer code to: Implement histograms and use them for selection/join size estimation and costing. Also implement techniques for constructing histograms by sampling.
    4. Implement a prototype of the Cascades pattern matching framework:
      • each rule registers a pattern for inputs on which it applies e.g. join with left input as join, or selection with join underneath.
      • whenever the DAG changes and gives a new opportunity to match a registered pattern, a "binding" is created and passed to the rule which gives back a rewritten pattern
      • the rewritten pattern is then merged back into the DAG, potentially triggering other pattern matches.
      • Since a change may trigger multiple pattern matches, a queue is maintained for pattern matches that should trigger rule invocations.
      Use Prasan's optimizers DAG data structures and rules as the infrastructure when building this. (Could lead to an MTP that implements the Cascades framework fully.)
    5. Adaptive join techniques: switch on the fly from INL to INL with sorting, etc. Implement on PostgreSQL.
  2. Access Methods
    1. Write optimized B-Trees. Standard B-trees fare quite poorly on bulk inserts/updates, and the "standard" solution is to drop indices and rebuild them, which is out of the question for very large databases and for databases that cannot have downtime. Solutions have been proposed in the past [Incremental Organization for Data Recording and Warehousing, H.V. Jagadish, P.P.S. Narayan, S. Seshadri, S. Sudarshan and Rama Kanneganti, VLDB 1997] and [Goetz Graefe, Write-Optimized B-Trees. VLDB 2004]. A partial implementation in PostgreSQL was done by Kriti Puniyani in Autumn 2005, the goal here is to complete it and make it really solid, and to work on and test policies for tuple migration.
  3. Delite based projects
  4. Planet lab based projects
  5. Novelty detection
Under construction ....