Solutions for assignment-2
---------------------------
8 marks
*******
                                                                                                                             
1) a) 2 Marks
Yes. It can be done using a B+ tree on name attribute.
- Start traversing B+ tree with search key = prefix
- The leaf node you end with will be the node with name lexicographically lesser than prefix string or name containing
prefix.
- Start scanning sequentially all leaf nodes from this node till we get a node with name value lexicographically greater
than prefix string.
                                                                                                                             
                                                                                                                             
1) b) 1 Mark 
No. Names containing a given substring may be scattered at any location in the B+-tree, so there is no benefit of using the
tree over a full relation scan.                                                                                                                              
                                                                                                                             
1) c) 5 Marks
B+ tree structure can be modified to make it capable of finding all names containing a given substring.
                                                                                                                             
Concept: Substring can be prefix of a suffix
                                                                                                                             
Construction:
     For each name in the set
   - Find all suffix strings of the given string
   - Index them independently in the B+ tree.
   - So they will occupy independent leaf nodes.
   - But the record pointer of all these leaves will point to same record as that of original string.
                                                                                                                             
Search:
-------
   - Given a string we can locate all strings which have this string as its prefix using the part a).
   - Pointers in the buckets associated with these leaf nodes will give us all record ids which have name containing this
string as a substring.
                                                                                                                             
---------------------------------------------------------------------------------------
                                                                                                                             
6 marks
********
                                                                                                                             
2) a)   2 Marks
Utilization will be approximately 3/5 = 60%
                                                                                                                             
                                   15,24
                                     |
       6,9,12                      18,21                   27,30
         |                           |                       |
                                                                                                                             
1,2,3,4,5  ... 12,13,14    15,16,17 ... 21,22,23      24,25,26 ...30,31,32

2) b)  4 Marks
We have 'n' data values to be inserted already sorted in ascending order.
To make maximum utilization of node space:
                                                                                                                             
   Build leaf level:
                                                                                                                             
   If each leaf node can contain maximum of 'm' data entries then divide
   the input in slices of size 'm'
   Fill up nodes with their maximum capacity
   Total number of leaf nodes needed will be ceiling(n/m)
                                                                                                                             
   Build parent level from nodes at lower level
                                                                                                                             
   1. Propagate first entries of all leaf nodes starting from second node to
      second level.
   2. Group these entries into nodes of size 'm+1'
   3. Delete and send 'm+1'st entry from each node to one level up.
   4. Repeat steps 3 & 4 till we get only one node at the current level as a root of B+ tree
                                                                                                                             
(Database implementations often leave some free space in each node to better handle future insertions)
---------------------------------------------------------------------------------------
                                                                                                                             
2 Marks
*******


3)
- To calculate existence bitmap it is sufficient to have all bitmaps associated with any one column in the relation.
- For attributes which can have null value, there will be a separate null bitmap associated with that column.
- Existence bitmap will just be OR of all bitmaps associated with a given column including null bitmap.
                                                                                                                             
----------------------------------------------------------------------------------
                                                                                                                             
4 Marks
*******
                                                                                                                             
4)
                                                                                                                             
Alterative (1)
--------------
Here we can allocate 'C' contiguous blocks at a time.
So we can use same hash function 'h' as that of, 'static hashing without any constraint on  contiguous memory allocation.
For that we will have to keep an additional B+ tree to keep track of hash
function value and corresponding actual block address.
If for a key 'k' we get hash value 'h(k)', then 'h(k)' will act as key in the
B+ tree. The address of block corresponding to actual key 'k' will be stored
there.  Alternatively, we can use a  file in the file system to store blocks; file systems
also use tree structures to record and look up blocks in a file.
                                                                                                                             
                                                                                                                             
Alternative (2)
---------------
As we can allocate 'C' contiguous blocks at a time. So if the hash table
requires 'k*C' number of blocks then we can split the value of hash function
into 2 parts. First part of 'log k' bits will decide the set of 'C' blocks to
which the record belongs and the second half of 'log C' bits will decide to which
of block from that set the record belongs. We will have to keep mapping of set
number and the starting physical address of that set of blocks.
                                                                                                                             

Alternative (2a)
---------------
We can apply two different hash functions h1 & h2.
Given a key first  hash function will decide the set to which it belongs: h1(k)
           second hash function will decide the block to which it belongs: h2(h1(k))