|
GCC Code Carnival
GCC Code Carnival is
a programming contest targeted at GCC users. We have floated various
projects based on our research and development. These have been divided
into open projects with high
level specifications and projects with concrete and
detailed specifications. This will give you hands on experience
with GCC and will open your doors to many opportunities. We have
provided preliminary descriptions of the projects here. We will
provide more detailed descriptions in the workshop (and also update
these pages). We will create a process of registration for the contest,
will decide on the mode of submission and associated timelines and
announce them during the workshop. The submissions will be judged for
the efficiency and clarity of code. Good quality submissions are likely
to be rewarded by cash awards.
We may add more projects and update project specifications. It is in
general not possible to create rules to foresee all possible situations
and we reserve the right to take decisions based on the situations and
alter the rules in the best interest of the spirit of the contest.
Open
Projects with High Level Specification
- Simplifying
Machine Descriptions by Eliminating C Code
This project aims to simplify machine description of GCC back-end. A
machine description file (*.md file) for a target machine contains
patterns describing the instruction of the target machine. Some of
these patterns contain large portions of C code to customize the
instructions. To reduce the size and improve the readability of the
machine description files, we propose to specify the customizations by
adding patterns that customize instructions using predicates and
constraints, instead of specifying in terms of C code. This conversion
can be made more effective if we are able to handle constraints at the
time of expansion.
We will provide the changes to GCC source code (if any) and sample
changes in i386 and spim machine descriptions. You are expected to
rewrite some machine descriptions; our choice is i386 but you are free
to choose some other machine descriptions too. The rewritten machine
descriptions should be validated by comparing the generating assembly
programs for the original and re-written machine descriptions.
- Dynamic Loading of Functions for
LTO
Link Time Optimization (LTO) gives GCC the capability of dumping its
internal representation (GIMPLE) to disk, so that all the different
compilation units that make up a single executable can be optimized as
a single module. This expands the scope of inter-procedural
optimizations to encompass the whole program (or, rather, everything
that is visible at link time).
A full functional LTO was included in GCC 4.6.0. When using LTO, gcc
writes the intermediate representation into the object file. At link
time, gcc reads it in and uses it to implement cross-module
optimizations. Currently, GCC supports LTO mode for inlining. It is
possible for a developer to write an interprocedural pass that uses LTO
facility of reading a large call graph consisting of all functions
whose GIMPLE IR is available at link time. It is possible to read the
call graph with summary information for the functions and avoid loading
the entire function bodies. However, if an analysis requires examining
function bodies then LTO framework loads bodies of ALL functions. This
increases the memory consumption significantly for large programs.
This project aims at providing the facility of dynamic loading and
unloading of function bodies. Then it would be possible for an analysis
to avoid reading the entire program all at the same time and bring the
function bodies in memory on demand. This would have to be done
efficiently.
We will provide a report describing our study and explain the
requirements. You are expected to implement APIs for dynamic loading
and unloading of the control flow graph of a given function. You must
show that this reduces the memory foot print compared to the situation
when all function bodies are loaded together.
- Improving Liveness-based
Pointer Analysis
Precise flow- and context- sensitive pointer analysis is generally
considered prohibitively expensive for large programs; most tools relax
one or both of the requirements for scalability. We have implemented liveness-based
flow- and context- sensitive pointer analysis (L-FCPA) interprocedural
points-to analysis in GCC-4.6.0. It computes points-to information
for only live pointers and limits the propagation of pointer
information to their live ranges. Interprocedural context-sensitivity
is achieved through a value-based termination of call strings.
We are in the process of changing the data structures by hiding them
behind well defined APIs (see the corresponding project with
detailed specifications). In this project we expect you to study
the code and find other sources of improvement. Some of these could be
towards improving the reach of the analysis by enriching it (eg. better
handling of field sensitivity, pointers to heap, function pointers and
pointer arithmetic) while some other could improve the time or space
efficiency of the method.
Projects
with Concrete and Detailed Specifications
- Improving Data Structures of
Liveness-based Pointer Analysis
Our liveness based pointer
analysis currently uses linked lists for storing pointer
information. Searching over large linked lists is
restricted to linear search. This proves to be a bottleneck while
compiling large benchmarks. We are in the process of changing the data
structures
by hiding them behind well defined APIs. We have a couple of naive
implementations of these APIs.
In this project we expect you to provide efficient implementation of
the APIs (eg. using HashMaps or some other efficient data structures).
We will provide the
refactored code with APIs.
- Rewriting Machine Descriptions
using SpecRTL
The current machine descriptions used by GCC for target specifications
are very complex, non-composable and repetitive in nature. We have
devised a specification language for RTL templates (SpecRTL)
occurring in machine descriptions. specRTL provides a mechanism
for combining RTL patterns to form a new pattern; thereby reducing the
repetition. Machine descriptions written using specRTL are simpler and
hence easy to read, construct, and maintain. For integrating specRTL
with GCC, we have a specRTL compiler to convert .md files written in
specRTL to generate conventional .md files. The most convenient feature
of this conversion is that it is
incremental and non-disruptive. It is possible to re-write a small
group of patterns in specRTL retaining the rest of the .md file in the
conventional form. This can be compiled using specRTL and gcc can be
built and tested. Once the results seem satisfactory, one can rewrite
another group of patterns.
This project aims at converting existing machine description files to
SpecRTL language. We have already done this for a significant part of
i386 machine description file. You are required to rewrite the
remaining part. You are also free to choose other machine descriptions.
We will provide the specRTL compiler.
|
|