Solution
--------

1. When i = 1, a[1] = a[2]; when i = 2, a[2] = a[4]. The dependence
   distance is therefore not constant: it grows with i. Hence GCC fails
   to compute the dependence vector for the code and marks the loop for a
   run-time alias check. The relevant part of the dump file highlights it:

   test.c:5: note: versioning for alias required: bad dist vector for a[D.1701_3] and a[i_12]
   test.c:5: note: mark for run-time aliasing test between a[D.1701_3] and a[i_12]

2. The statement S1 is a[i] = a[i*2]. In the vectorized code, we will have
   a[0,1,2,3] = a[0,2,4,6]. The reads are the even-index elements, but a
   load into a vector register is contiguous. Therefore, to form the vector
   a[0,2,4,6], we need two vector loads: a[0,1,2,3] and a[4,5,6,7]. From
   these two vectors, the 4 even-index elements are extracted and put in a
   vector register. This is achieved through the VEC_EXTRACTEVEN_EXPR and
   VEC_EXTRACTODD_EXPR operations, implemented with SSE instructions, which
   operate on two vectors and combine their even-index and odd-index
   elements, respectively, into a new vector. The odd_expr vector is
   discarded, and the even_expr vector is retained to be used as the RHS.
   The relevant CFG code achieving this is:

   vect_perm_even.19_40 = VEC_EXTRACTEVEN_EXPR ;
   vect_perm_odd.20_41 = VEC_EXTRACTODD_EXPR ;
   D.1702_4 = a[D.1701_3];
   MEM[(int[245] *)vect_pa.21_43] = vect_perm_even.19_40;

   Notice that for strided access, we need more than one vector load to
   form the correct vector. In general, for (i*n), we need n register loads
   to form the correct vector. This is why strided accesses are expensive,
   and why they are restricted to reads.
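
The two-loads-then-extract-even step described in part 2 can be sketched in plain C. This is only a scalar emulation for illustration, not the code GCC generates. The names VF, extract_even and strided_copy_step are hypothetical, and a vector width of 4 ints (one 128-bit register) is assumed.

```c
#include <assert.h>
#include <string.h>

#define VF 4 /* assumed vector width: 4 ints per 128-bit register */

/* Hypothetical helper emulating an extract-even operation: takes two
 * VF-wide vectors and gathers the even-index elements of their
 * concatenation (lo[0], lo[2], hi[0], hi[2]) into out. */
static void extract_even(const int lo[VF], const int hi[VF], int out[VF])
{
    for (int k = 0; k < VF / 2; k++) {
        out[k] = lo[2 * k];
        out[k + VF / 2] = hi[2 * k];
    }
}

/* Emulates one vectorized iteration of a[i] = a[i*2] for
 * i = base .. base+VF-1. Two contiguous "vector loads" cover
 * a[2*base .. 2*base+2*VF-1]; only the even-index elements are kept. */
static void strided_copy_step(int *a, int n, int base)
{
    int lo[VF], hi[VF], even[VF];

    /* Both loads must stay inside the array of n elements. */
    assert(base >= 0 && 2 * base + 2 * VF <= n);

    memcpy(lo, &a[2 * base], sizeof lo);      /* first contiguous load  */
    memcpy(hi, &a[2 * base + VF], sizeof hi); /* second contiguous load */
    extract_even(lo, hi, even);               /* odd elements discarded */
    memcpy(&a[base], even, sizeof even);      /* one contiguous store   */
}
```

Calling strided_copy_step(a, n, 0) on an array of at least 8 elements performs a[0,1,2,3] = a[0,2,4,6]: two loads and one extract for a single store, which is where the extra cost of the stride-2 read comes from.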