Assignment on Parallelization and Vectorization in GCC


Problem Statement (Level 1)

  1. File 1.c contains a simple C program. Identify the kind of dependence that exists in the statement inside the loop, then verify your answer by analyzing the dump generated by the data-dependence pass of GCC 4.5.0.

  2. File 2.c contains a C program. What kind of dependence exists in the statement? Can this code be vectorized and parallelized? Verify your answers by inspecting the *.vect and *.parloops dump files.

  3. File 3.c contains a C program. What kind of dependence exists in the statement? Can this code be vectorized and parallelized? Verify your answers by inspecting the *.vect and *.parloops dump files. How does the dependence you observed relate to the parallelization and vectorization decisions?

  4. File 4.c contains a C program. What kind of dependence exists in the statement? Can this code be vectorized and parallelized? Verify your answers by inspecting the *.vect and *.parloops dump files. Can you explain the parallelization and vectorization decisions in this case in terms of the dependence?

  5. File 5.c contains a simple C program. Identify the kind of dependence that exists in the statement inside the loop. Can a loop with such a dependence be parallelized in general? Can this particular code be parallelized? Why?
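
  Note on terminology: the questions above ask for the "kind of dependence". As a reminder, the three classic kinds can be illustrated with the minimal sketch below. These loops are illustrative assumptions only; they are not the contents of files 1.c to 5.c, and the function names are hypothetical.

```c
#include <stddef.h>

/* Flow (true) dependence, read-after-write:
   iteration i reads a[i-1], which iteration i-1 has just written. */
void flow_dep(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + 1;
}

/* Anti dependence, write-after-read:
   iteration i reads a[i+1] before iteration i+1 overwrites it. */
void anti_dep(int *a, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        a[i] = a[i + 1] + 1;
}

/* Output dependence, write-after-write:
   every iteration writes the same location a[0]. */
void output_dep(int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[0] = b[i];
}
```

  With a GCC 4.5-era toolchain, dumps like the ones named above are typically requested with options such as -ftree-vectorize -fdump-tree-vect-details and -ftree-parallelize-loops=4 -fdump-tree-parloops-details. Treat the exact option set as an assumption and check it against the manual for your GCC build.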


Problem Statement (Level 2)

  1. File 6.c contains a simple C program. What kind of dependence exists in the statement inside the loop? Can this loop be vectorized? How could it be vectorized? Can you relate your observation to the 'vectorization factor' and its influence on vectorization? Modify the code to change the vectorization decision.

  2. In file 7.c, what dependence exists in the statements inside the loop? Can this loop be vectorized? Does the second statement introduce any dependence? Can you think of a better way to speed up the execution of this code without modifying the code?

  3. File 8.c contains a C program. The induction variable of the inner loop is j. Is this code efficient given how C lays out arrays in memory? How can you speed up the execution of the code without making any changes to the program?

  4. Try to parallelize the C code in file 9.c with both the Graphite framework and the Lambda framework. Which of the frameworks do you think gave the correct result?

  5. Parallelize the code in file 10.c with both the Graphite framework and the Lambda framework. Which of the frameworks do you think gave the correct result? What accounts for this difference?

  6. File 11.c contains a simple C program. Can this code be parallelized in the Lambda framework? Why? Does the Graphite framework perform any better?

  7. The code in file 12.c is similar to the code in file 8.c, with a minor difference in the array subscript. Can you transform this code in the same way as you did for file 8.c (Question 3)? Explain your observation.
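
  Note on array storage: Questions 3 and 7 concern how loop order interacts with C's row-major array layout. The sketch below shows the general idea under stated assumptions: the array size N, the function names, and the loop bodies are hypothetical and are not taken from files 8.c or 12.c.

```c
#include <stddef.h>

#define N 64 /* hypothetical array size, chosen for illustration */

/* Inner index j selects the row, so consecutive inner iterations
   touch elements N ints apart (strided access in row-major storage). */
long sum_strided(int m[N][N])
{
    long s = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            s += m[j][i];
    return s;
}

/* Interchanged loops: inner index j now selects the column, so
   consecutive inner iterations touch adjacent elements in memory. */
long sum_contiguous(int m[N][N])
{
    long s = 0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            s += m[j][i];
    return s;
}
```

  Both functions compute the same sum; only the memory access pattern differs. Interchanging loops is legal here because the iterations do not depend on each other's array accesses. Whether the same holds for a given loop nest depends on its subscripts. On a GCC build with Graphite support, -floop-interchange is one option that aims to apply this kind of transformation without source changes.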