CS631 Homework 2

DUE: Fri Aug 18, 2006 in class

NOTE: All assignments must be handwritten in your own handwriting, on large sized paper (please avoid notebook size paper). Assignments must be done individually. If you are stuck on a problem after trying hard, you can consult a friend, but in that case please mention against that problem who you discussed it with. There are negative points for doing this, but there will be negative points for copying without attribution, so please be honest.

Write your name and rollnumber on every sheet.

----------------------

1. Suppose we have a relation student(rollno, name).
   (a) Suppose I wish to find all names that start with a given
	prefix (example: all names that start with Su).  Can I do this 
	with a B+-tree on the name attribute? Explain.
   (b) Can I use the same B+-tree to find all names that contain 
	a given substring somewhere inside the name?  Explain.
   (c) A suffix tree creates multiple index entries for each indexed
	string, to allow efficient retrieval of all strings containing 
        a given substring (e.g. all names containing the substring
        "kumar").  Suggest what all entries should be associated with a 
        given string to allow such substring matching.
	Note: Your solution must contain modifications to existing B+ tree structure.

2. Consider an empty B+-tree index where internal node contains 
	3-6 pointers and each leaf node contains 3-5 keys.
   (a) Suppose that data entries with keys 32,31,..,1 are inserted in the
   order shown (descending). What would the space utilization of the 
   tree be?
   (b) To avoid the problems above, B+-trees can be built bottom up,
        level by level.  Give pseudocode to build the leaf level 
        from a sorted sequence of values, as well as pseudocode to build
        the parent of a level, from the nodes at the lower level.

3. Database System Concepts (5th Edition)  Exercise 12.25: (Note: easy question, don't get confused :-)
 
      Show how to compute existence bitmaps from other bitmaps.  Make sure
      that your technique works even in the presence of null values, by
      using a bitmap for the value null.

4. Database System Concepts (5th Edition)  Exercise 12.26:

      Our description of static hashing assumes that a large
      contiguous stretch of disk blocks can be allocated to a static
      hash table.  Suppose you can allocate only C contiguous
      blocks.  Suggest how to implement the hash table, if it
      can be much larger than C blocks.  Access to a block should
      still be efficient.