Starting from:

$30

CS314 Assignment 8

CS314 
Assignment 8
Problem 1 - Dependencies
Suppose we have the following program (a sequence of instructions), with each instruction labeled as
S<num>:
S1 : a := 4;
S2 : b := 2;
S3 : c := 5;
S4 : read ( d );
S5 : a := b + 3;
S6 : b := a - 3;
S7 : c := d * b ;
S8 : e := a + 6;
S9 : print ( c );
S10 : print ( e );
(a) Give the statement-level dependence graph for the above program. A node in the statement-level
dependence graph represents a statement, an edge represents dependence between the statements
(i.e. the nodes). Label each edge as a true data dependence, an anti data dependence, or an
output data dependence.
(b) Assume that each statement takes 1 cycle to execute. What is the execution time of the sequential code? What is the fastest parallel execution time of the program (i.e. the critical path)?
You may assume that I/O operations (read and print) can be done in parallel.
Problem 2 - Dependence Analysis
Give the source and sink references, the type (whether a dependence is true, anti, or output), and
the distance vectors for all dependences in the following loops.
(a) do i = 3, 100
a(i) = a(i - 1) + a(i + 1) - a(i - 2)
enddo
(b) do i = 2, 100
a(3 * i) = a(3 * i - 3) + a(3 * i + 3)
enddo
(c) do i = 1, 10
a(i) = a(5) + a(i)
enddo
Use aW (i) and aR(i) to annotate the write access to a(i) and the read access to a(i) respectively.
Problem 3 - Loop Parallelization
Given the following nested loop:
1
do i = 2 , 100
do j = 2 , 100
S1 : a (i , j ) = b ( i - 1 , j - 1) + 2
S2 : b (i , j ) = i + j - 1
enddo
enddo
(a) Give the statement-level dependence graph. Show the dependence graph for statement instances
in a part of the iteration space: i = 2 ... 5, j = 2 ... 5.
(b) In its current form, can any loop be parallelized? If so, which loop(s)? If not, justify your
answer.
(c) Provide one valid affine schedule for statements S1 and S2 such that p(S1)= C11*i + C12*j +
d1 and p(S2)= C21*i + C22*j + d2 in order to achieve synchronization-free parallelism. There
could be many possible solutions for {C11, C12, C21, C22, d1, d2}. You only need to provide one
feasible solution. (Hint: You can let d1 = d2 = 0.)
(d) Generate two-level loop code for the affine schedule you provided. Please use p as outermost
loop and i as innermost loop in the transformed loop. Calculate the loop bounds for p and i
using Fourier-Motzkin elimination. You might need to calculate the overlapping polyhedron for
S1 and S2 in order to eliminate the j loop. Please refer to the techniques for code generation in
lecture 20 and lecture 21.
2