Solution
--------

1. a. S1 : A[i+1] = A[i+2];
      Instantiating the statement instances for S1,
      when i is 0, A[1] = A[2];
      when i is 1, A[2] = A[3];

      Each element is read in one iteration and written in the next:
      A[2] is read when i is 0 and written when i is 1. The dependence
      is therefore Write After Read (anti dependence). It is loop
      carried, with a dependence distance of 1. Since the array A is
      of type int (4 bytes), the vectorization factor is 4, assuming
      16-byte vector registers.

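   b. The code can be vectorized because the read of A[i+2] is the
      source of the dependence and is performed before the write to
      A[i+1] within the statement. A vector load followed by a vector
      store preserves this read-before-write order. In the dump file,
      the vectorization decision is given as: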

      test.c:5: note: LOOP VECTORIZED.
      test.c:2: note: vectorized 1 loops in function.
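
      The legality argument can be checked with a small sketch that
      emulates vector execution by loading 4 elements and then storing
      4 elements. The names s1_scalar, s1_vector_emulated and
      s1_results_match, the trip count N, and the 16-byte vector width
      are assumptions made for illustration; they do not come from
      test.c.

```c
#include <string.h>

#define N  16   /* hypothetical trip count; a multiple of VF in this sketch */
#define VF 4    /* four 4-byte ints per vector, assuming 16-byte vectors */

/* Scalar reference: S1 executed one iteration at a time. */
void s1_scalar(int *A)
{
    for (int i = 0; i < N; i++)
        A[i + 1] = A[i + 2];
}

/* Emulated vector execution: load VF elements, then store VF elements. */
void s1_vector_emulated(int *A)
{
    for (int i = 0; i < N; i += VF) {
        int tmp[VF];
        memcpy(tmp, &A[i + 2], sizeof tmp);   /* vector load  A[i+2 .. i+5] */
        memcpy(&A[i + 1], tmp, sizeof tmp);   /* vector store A[i+1 .. i+4] */
    }
}

/* Returns 1 if both versions leave identical contents in the array. */
int s1_results_match(void)
{
    int a[N + 2], b[N + 2];
    for (int k = 0; k < N + 2; k++)
        a[k] = b[k] = 10 * k;
    s1_scalar(a);
    s1_vector_emulated(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

      Every load in the emulated version reads elements that no
      earlier store has touched, so s1_results_match() returns 1.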

      The dependence is  loop-carried, and therefore the  code cannot be
      parallelized. This is reported in the dump file as:

      distance_vector: 1
      direction_vector: +
      FAILED: data dependencies exist across iterations
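
      One way to see why the loop-carried dependence blocks
      parallelization: a parallel loop does not fix the order in which
      iterations run. The sketch below runs the iterations of S1 in
      reverse, which is one order a parallel schedule would be free to
      choose, and compares the result with the sequential order. The
      function names and the trip count N are hypothetical.

```c
#include <string.h>

#define N 16   /* hypothetical trip count */

/* S1 in the original sequential order. */
void s1_in_order(int *A)
{
    for (int i = 0; i < N; i++)
        A[i + 1] = A[i + 2];
}

/* S1 with iterations run last-to-first: one order that a parallel
   schedule would be free to choose. */
void s1_reverse_order(int *A)
{
    for (int i = N - 1; i >= 0; i--)
        A[i + 1] = A[i + 2];
}

/* Returns 1 if the two orders leave different contents in the array. */
int s1_order_matters(void)
{
    int a[N + 2], b[N + 2];
    for (int k = 0; k < N + 2; k++)
        a[k] = b[k] = 10 * k;
    s1_in_order(a);
    s1_reverse_order(b);
    return memcmp(a, b, sizeof a) != 0;
}
```

      In reverse order each iteration reads a value that a later
      iteration has already overwritten, so the last element is copied
      into every position and s1_order_matters() returns 1.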

3. Initially, the write access is misaligned by 4 bytes, and the read
   access is misaligned by 8 bytes. This information is present in the
   dump file as:

   test.c:5: note: vect_compute_data_ref_alignment:
   test.c:5: note: misalign = 8 bytes of ref a[D.1702_4]
   test.c:5: note: vect_compute_data_ref_alignment:
   test.c:5: note: misalign = 4 bytes of ref a[i_3]
 
   At least one compile-time misalignment can be corrected by peeling.
   To align the reads (a[D.1702_4]), we need to peel the loop 2 times.
   To align the writes (a[i_3]), we need to peel the loop 3 times. The
   peeling decision is given in the dump file as:

   test.c:5: note: Try peeling by 2
   test.c:5: note: Alignment of access forced using peeling.
   test.c:5: note: Peeling for alignment will be applied.
   test.c:5: note: known peeling = 2.
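
   The peel counts above follow from simple arithmetic, sketched here
   under the assumption of 4-byte ints and 16-byte vector alignment.
   The helpers peel_count and misalign_after_peel are hypothetical and
   are not part of GCC.

```c
/* Iterations to peel so that an access misaligned by `misalign` bytes
   reaches a `vec_bytes` boundary, when each iteration advances the
   address by `elem_bytes`. Assumes misalign is a multiple of
   elem_bytes and smaller than vec_bytes. */
int peel_count(int misalign, int elem_bytes, int vec_bytes)
{
    return ((vec_bytes - misalign) % vec_bytes) / elem_bytes;
}

/* Misalignment in bytes of the same access after `peel` iterations
   have been peeled off the front of the loop. */
int misalign_after_peel(int misalign, int peel, int elem_bytes, int vec_bytes)
{
    return (misalign + peel * elem_bytes) % vec_bytes;
}
```

   Here peel_count(8, 4, 16) gives 2 for the reads and
   peel_count(4, 4, 16) gives 3 for the writes. Since the two counts
   differ, a single peel amount cannot align both accesses:
   misalign_after_peel(4, 2, 4, 16) gives 12, so after peeling by 2
   the write is still off a 16-byte boundary.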