Query Optimization at IITB

Pyro Optimizer

The Pyro query optimization project, which is based on the Volcano/Cascades framework, has run for many years. Early work focused on multi-query optimization (circa 2000-2002), parametric query optimization (circa 2001-2003), and optimization of nested queries (circa 2005).

PyroJ Optimizer

More recently we have been working on query optimization for parallel databases. Since much of the new-generation software in this area was based on Java, one of the first steps was to create PyroJ, a Java version of Pyro. A key decision in the architecture of PyroJ was to decouple the optimizer from the underlying algebra, so that the optimizer could easily be coupled with any desired evaluation system. The design and implementation of this architecture were driven almost entirely by Sapna Jain, although the initial PyroJ code was written by Aishwarya Ganesan and Subhasish Saha.
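To illustrate what decoupling the optimizer from the algebra can look like, here is a minimal sketch. All names (`LogicalOp`, `PhysicalOp`, `Algebra`, `OptimizerCore`, `ToyAlgebra`, `FixedCostOp`) and the cost figures are hypothetical and are not taken from the PyroJ code base; the point is only that the optimizer core talks to an interface, and each evaluation system supplies its own implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interfaces for illustration; not PyroJ's actual API.
interface LogicalOp {
    String name();
}

interface PhysicalOp {
    String name();
    double cost(double inputRows);
}

// Supplied by the evaluation system. The optimizer core never
// refers to concrete operators, only to this interface.
interface Algebra {
    List<PhysicalOp> implementations(LogicalOp op);
}

final class OptimizerCore {
    private final Algebra algebra;

    OptimizerCore(Algebra algebra) {
        this.algebra = algebra;
    }

    // Returns the cheapest implementation of one logical operator,
    // or null if the algebra offers none.
    PhysicalOp cheapest(LogicalOp op, double inputRows) {
        PhysicalOp best = null;
        for (PhysicalOp p : algebra.implementations(op)) {
            if (best == null || p.cost(inputRows) < best.cost(inputRows)) {
                best = p;
            }
        }
        return best;
    }
}

// A toy physical operator whose cost is linear in the input size.
final class FixedCostOp implements PhysicalOp {
    private final String name;
    private final double costPerRow;

    FixedCostOp(String name, double costPerRow) {
        this.name = name;
        this.costPerRow = costPerRow;
    }

    public String name() { return name; }

    public double cost(double inputRows) { return costPerRow * inputRows; }
}

// A toy algebra; the per-row costs are made up for the example.
final class ToyAlgebra implements Algebra {
    public List<PhysicalOp> implementations(LogicalOp op) {
        List<PhysicalOp> out = new ArrayList<>();
        if (op.name().equals("join")) {
            out.add(new FixedCostOp("sortMergeJoin", 3.0));
            out.add(new FixedCostOp("hashJoin", 1.5));
        }
        return out;
    }
}
```

Under these assumptions, targeting a different evaluation system means writing another `Algebra` implementation while `OptimizerCore` stays unchanged.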

Parallel Query Optimization

The first target was the Hyracks parallel query processing project, and in particular the Hivesterix subsystem, which evaluates Hive SQL queries. We now have a prototype optimizer for this system, which we are in the process of making more robust (as of mid 2015). This optimizer supports partitioning properties, which are key to finding optimal parallel execution plans. Sapna Jain implemented the partitioning properties, while Abhishek Gupta and Pushkar Khadilkar contributed to the integration of the PyroJ optimizer with Hyracks and to the implementation of multi-query optimization extensions.
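As a rough illustration of why partitioning properties matter, the sketch below models a delivered hash partitioning and checks whether it already co-locates the rows needed by a GROUP BY, in which case no repartitioning step is required. The class and method names (`Partitioning`, `colocates`, `EnforcerDemo`) are hypothetical and do not come from PyroJ or Hyracks. The rule used is the standard one that data hash-partitioned on a non-empty subset of the grouping columns keeps each group on a single node.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a physical partitioning property.
final class Partitioning {
    final boolean hashed;        // false = arbitrary placement (e.g. round-robin)
    final Set<String> columns;   // hash columns; empty when not hashed

    private Partitioning(boolean hashed, Set<String> columns) {
        this.hashed = hashed;
        this.columns = columns;
    }

    static Partitioning random() {
        return new Partitioning(false, Collections.<String>emptySet());
    }

    static Partitioning hashOn(String... cols) {
        return new Partitioning(true, new HashSet<>(Arrays.asList(cols)));
    }

    // True if rows that agree on all of groupCols are guaranteed to be on
    // one node: the hash columns must be a non-empty subset of groupCols.
    boolean colocates(Set<String> groupCols) {
        return hashed && !columns.isEmpty() && groupCols.containsAll(columns);
    }
}

final class EnforcerDemo {
    // Decides what must be placed below a GROUP BY on groupCols.
    static String planGroupByInput(Partitioning delivered, Set<String> groupCols) {
        return delivered.colocates(groupCols) ? "no-op" : "repartition";
    }
}
```

For example, with grouping columns `{a, b}`, an input delivered as `Partitioning.hashOn("a")` needs no exchange, while `Partitioning.hashOn("a", "c")` or `Partitioning.random()` does. An optimizer that tracks such properties can avoid redundant data movement, which is often the dominant cost in a parallel plan.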

More recently we were motivated to address the problem of optimizing response time. This is a special case of multi-objective query optimization, and both add non-trivial complexity to the optimization problem because they violate the principle of optimality: the best plan for a query is not necessarily built from the best plans for its subexpressions. We have extended algorithms proposed earlier for System R style optimizers to work on the Volcano framework.
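One common way to cope with a violated principle of optimality in multi-objective settings is to keep, for each subexpression, every plan that is not dominated on all objectives, rather than a single cheapest plan. The sketch below shows such pruning for two objectives. It is only an illustration of the general idea, not the algorithm used in PyroJ; the choice of objectives (response time and total work) and the names `PlanCost` and `ParetoSet` are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical two-objective cost of a candidate plan.
final class PlanCost {
    final String plan;
    final double responseTime;
    final double totalWork;

    PlanCost(String plan, double responseTime, double totalWork) {
        this.plan = plan;
        this.responseTime = responseTime;
        this.totalWork = totalWork;
    }

    // No worse on both objectives and strictly better on at least one.
    boolean dominates(PlanCost o) {
        return responseTime <= o.responseTime && totalWork <= o.totalWork
                && (responseTime < o.responseTime || totalWork < o.totalWork);
    }

    boolean sameCost(PlanCost o) {
        return responseTime == o.responseTime && totalWork == o.totalWork;
    }
}

// The set of non-dominated plans retained for one subexpression.
final class ParetoSet {
    private final List<PlanCost> plans = new ArrayList<>();

    // Adds the candidate unless an existing plan dominates or equals it;
    // removes any existing plans that the candidate dominates.
    boolean offer(PlanCost c) {
        for (PlanCost p : plans) {
            if (p.dominates(c) || p.sameCost(c)) {
                return false;
            }
        }
        plans.removeIf(p -> c.dominates(p));
        plans.add(c);
        return true;
    }

    List<PlanCost> retained() {
        return plans;
    }
}
```

A plan with response time 10 and work 100 and a plan with response time 6 and work 150 are incomparable, so both are retained; a single-cost optimizer would discard one of them and could miss the best overall plan. The price is a larger search space, since each subexpression may carry several plans.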

Join Order Enumeration

Software

Publications

See http://www.cse.iitb.ac.in/~sudarsha/pubs.html for details of publications on the above topics. For some topics there are no publications yet, but there are MTech theses describing our work.