Our two main research goals are: (a) improving machine-independent optimizations in GCC, and (b) improving retargetability mechanisms.
Improving machine-independent optimizations in GCC
It is interesting to note that although Partial Redundancy Elimination was proposed in 1979, and was recognized by the compiler community as a very powerful optimization around 1990, it was implemented in the GCC framework much later. We believe that the complexity of the intermediate representation used by GCC, coupled with the lack of automatic construction of optimizers, may have played a role in several cutting-edge optimizations not finding a place in GCC.
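To make the optimization concrete, here is a minimal sketch (written in Python purely for readability; in GCC the transformation operates on the IR, not on source code) of a partially redundant expression and the effect of PRE on it:

```python
# Before PRE: 'a + b' is computed on one path and again after the join,
# so the second computation is partially redundant (redundant only when
# cond is True).
def before(a, b, cond):
    if cond:
        x = a + b        # 'a + b' available only on this path
    else:
        x = 0
    y = a + b            # partially redundant: recomputed when cond is True
    return x + y

# After PRE: a copy of the expression is inserted on the other path, making
# the later computation fully redundant so it can be replaced by a reuse.
def after(a, b, cond):
    if cond:
        x = t = a + b
    else:
        x = 0
        t = a + b        # inserted copy: 'a + b' now available on all paths
    y = t                # reuse instead of recompute
    return x + y
```

The inserted copy does not increase the number of evaluations on any path, which is the safety criterion PRE enforces.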
We propose to facilitate:
Automatic construction of machine-independent optimizers from high-level specifications: The proposed optimizer generator will employ systematic analyses at all levels: local, global, and interprocedural. It will use simple and generic, yet efficient and precise, analysis algorithms. This technology would make it much easier to add optimizations to GCC. Generation of optimizers will allow many more people to experiment with optimizations. This is useful because many modern optimizations are derived from domain knowledge of the applications rather than from expertise in compiler writing. Specifying optimizations will allow reasoning at a higher level. The guarantee of semantic equivalence between a specification and its implementation would follow from the automation itself: it would depend only on the correctness of the generation mechanism and would hold uniformly across all generated optimizers.
Currently, specification-driven optimization for bit-vector frameworks at the intraprocedural level is supported through gdfa.
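The actual gdfa specification interface is not reproduced here; as a hedged illustration of the underlying idea, the following Python sketch shows a generic round-robin solver driven by a declarative specification (direction, confluence, and per-node gen/kill sets), instantiated for live-variables analysis. All names are invented for the sketch, and the empty initial value assumes a union-based (may) analysis:

```python
def solve(cfg, spec):
    """Round-robin solver for a union-based bit-vector framework.

    cfg  : dict mapping each node to its list of successors.
    spec : declarative description of the analysis -- direction
           ('forward' or 'backward'), a confluence function, and
           per-node gen/kill sets.
    Returns the flow value at each node (entry value for a backward
    analysis, exit value for a forward one).
    """
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)
    # Neighbours whose values flow into a node, per the direction.
    inward = preds if spec['direction'] == 'forward' else cfg
    out = {n: set() for n in cfg}
    changed = True
    while changed:
        changed = False
        for n in cfg:
            vals = [out[m] for m in inward[n]]
            x = spec['confluence'](vals) if vals else set()
            new = (x - spec['kill'].get(n, set())) | spec['gen'].get(n, set())
            if new != out[n]:
                out[n], changed = new, True
    return out

# Live-variables analysis: backward direction, union confluence,
# gen = variables used, kill = variables defined.
# B1: a = 1;  B2: b = a;  B3: return b
cfg = {'B1': ['B2'], 'B2': ['B3'], 'B3': []}
spec = {'direction': 'backward',
        'confluence': lambda vs: set().union(*vs),
        'gen':  {'B2': {'a'}, 'B3': {'b'}},
        'kill': {'B1': {'a'}, 'B2': {'b'}}}
live_in = solve(cfg, spec)   # {'B1': set(), 'B2': {'a'}, 'B3': {'b'}}
```

The point of the sketch is that only the last few lines change from one analysis to the next; the solver itself is generated once, which is what makes specification-driven construction attractive.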
Flow- and context-sensitive interprocedural data flow analysis: Currently, GCC does not have flow- and context-sensitive interprocedural analysis capability, and the GCC community recognizes this as one of the main drawbacks of GCC compared to proprietary production compilers. This project aims to provide technology that is substantially better than the corresponding technology in those compilers.
We have already devised a flow- and context-sensitive analysis method based on call strings. Our variant of the call-strings method is orders of magnitude better in terms of time and space requirements than the classical method. This method was presented at the International Conference on Compiler Construction 2008, held in Budapest, Hungary, as a part of ETAPS 2008. Efforts are underway to include this method in gdfa and to test its scalability for pointer analysis of large programs.
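The variant itself is beyond the scope of a short sketch, but the classical call-strings idea it builds on can be illustrated as follows: data flow values are tagged with the string of unreturned call sites, a call edge appends its site, and a return edge propagates a value only into the matching caller, which is what makes the analysis context sensitive. (Our variant additionally bounds the call strings by merging those that carry equal data flow values, which is the source of the time and space savings; that refinement is omitted here.) All names in this Python sketch are illustrative:

```python
# Hypothetical sketch of the classical call-strings mechanism: each data
# flow value is tagged with the sequence of unreturned call sites
# that led to it.

def propagate(edge, tagged_values):
    """Propagate (call-string, value) pairs across one supergraph edge.

    edge is ('call', site), ('return', site), or ('intra', None).
    """
    kind, site = edge
    result = set()
    for cs, val in tagged_values:
        if kind == 'call':
            result.add((cs + (site,), val))    # entering: append the call site
        elif kind == 'return':
            if cs and cs[-1] == site:          # only the matching context returns
                result.add((cs[:-1], val))
        else:
            result.add((cs, val))              # intraprocedural edge: unchanged
    return result

# A value that entered through call site c1 returns only along c1's
# return edge; the value tagged with c2 is filtered out there.
at_exit = {(('c1',), 'v1'), (('c2',), 'v2')}
back_in_caller = propagate(('return', 'c1'), at_exit)   # {((), 'v1')}
```

The filtering at return edges prevents data flow information from one caller leaking into another, which is exactly the imprecision a context-insensitive analysis suffers from.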
Improving retargetability mechanisms
The main limitation of GCC is its naive model of retargetability. This model uses simplistic full-tree-matching algorithms for instruction selection and hence expects a compiler developer to visualize and specify meaningful and efficient combinations of machine instructions to compute the operations that may appear in an intermediate representation of any program. To complicate matters further, the mechanism for specifying machine descriptions is quite ad hoc.
As a consequence, the machine descriptions are difficult to construct, understand, and modify because of their verbosity, the amount of detail, and the repetitiveness. The sheer size of machine descriptions is formidable: for example, the directory gcc/config/i386 in gcc-4.3.1 contains 80,103 lines! Two main reasons for the verbosity are (i) the use of similar RTL expressions as parts of other RTL expressions across instruction patterns, and (ii) the constrained use of mode iterators and code iterators.
Unfortunately, not much discussion seems to be centered around these concerns. Most publicly available explanations of GCC (including those presented through special workshops and tutorials) describe the build and install process of GCC, the GCC front end, the IRs used by GCC and their manipulation by the optimization phases, and the structure of the machine descriptions required by GCC. Although several dozen actual machine descriptions are readily available, one does not come across much information on the insights behind them. As a consequence, the following simple questions seem to have remained unaddressed:
Our research addresses these concerns. In particular, we are exploring the following three possibilities for improving the retargetability mechanism and simplifying machine descriptions.
We have devised a systematic methodology based on incremental construction of machine descriptions, where the increments are driven by adding features to the source language, starting from a language of empty functions. This methodology was presented at the GREPS 2007 workshop held in Brasov, Romania, on September 15, 2007.
We believe that this methodology is very useful because the mechanism for providing machine descriptions to the GCC framework is quite ad hoc in spite of being very successful. As a consequence, the machine descriptions are difficult to construct, understand, maintain, and enhance because of their verbosity, the amount of detail, and the repetitiveness. The publicly available material fails to bring out the exact abstractions captured by the machine descriptions. There is no systematic way of constructing machine descriptions, and there are no clear guidelines on where to begin developing them or how to construct them systematically.
We are in the process of adding a few constructs that enable specification of composable rules and allow compact machine descriptions. In particular, these constructs will allow
The advantages of the proposed constructs are
Significant reduction in the size of MD files.
We are trying to incorporate tiling-based algorithms for instruction selection to allow cleaner and simpler machine descriptions. Since tiling-based algorithms can match larger trees in terms of smaller trees, a compiler developer does not have to specify combinations of instructions, which results in much smaller specifications. Additionally, these algorithms are known to produce good-quality code. Our preliminary experiments with iburg for i386 show that with these algorithms we need around 200 rules as against over 1000 in the existing i386 machine description. This technology would simplify retargeting GCC significantly, since the machine descriptions would become much simpler and more direct. The quality of the generated code would become largely independent of the machine descriptions, as it would depend on the algorithms rather than on the amount of detail provided in the descriptions.
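As a hedged illustration of why larger trees need not be specified by hand, the following Python sketch shows a bottom-up, dynamic-programming matcher in the spirit of iburg/BURS that covers an expression tree with a minimum-cost set of rules. The rules and costs are invented for illustration and are not taken from any real machine description:

```python
# Toy rule set: (result nonterminal, pattern, cost). Patterns are trees
# whose leaves are either terminals (operators) or the nonterminal 'reg'.
RULES = [
    ('reg', ('CONST',), 1),                # load immediate into a register
    ('reg', ('MEM', 'reg'), 2),            # load from an address in a register
    ('reg', ('PLUS', 'reg', 'reg'), 1),    # register-register add
    ('reg', ('PLUS', 'reg', 'CONST'), 1),  # add-immediate: one larger tile
]

def select(tree):
    """Return (cost, rules) for the cheapest cover of `tree`, or None."""
    op, kids = tree[0], tree[1:]
    best = None
    for nt, pat, cost in RULES:
        if pat[0] != op or len(pat) - 1 != len(kids):
            continue
        total, chosen, ok = cost, [(nt, pat)], True
        for p, k in zip(pat[1:], kids):
            if p == 'reg':                 # nonterminal: cover subtree recursively
                sub = select(k)
                if sub is None:
                    ok = False
                    break
                total += sub[0]
                chosen += sub[1]
            elif p != k[0]:                # terminal must match the operator
                ok = False
                break
        if ok and (best is None or total < best[0]):
            best = (total, chosen)
    return best

# The add-immediate tile covers PLUS and its constant operand together,
# so this tree is covered at cost 2 (load immediate + add-immediate)
# rather than cost 3 via two loads and a register-register add.
cost, rules = select(('PLUS', ('CONST',), ('CONST',)))
```

A production matcher such as iburg labels each node exactly once bottom-up rather than recomputing as this naive recursion does, but the cost calculation is the same in spirit: the algorithm, not the rule writer, discovers which combinations of small rules cover a large tree cheaply.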