International Journal of Parallel Programming
vol. 28, no. 1, pp. 1--46, February 2000
Instruction scheduling methods which use the concepts developed by the classical pipeline theory have been proposed for architectures involving deeply pipelined function units. These methods rely on the construction of state diagrams (or automatons) to (i) efficiently represent the complex resource usage pattern, and (ii) analyze legal initiation sequences, i.e., those which do not cause a structural hazard. In this paper, we propose a state-diagram based approach for modulo scheduling or software pipelining, an instruction scheduling method for loops. Our approach adapts the classical pipeline theory for modulo scheduling, and, hence, the resulting theory is called Modulo-Scheduled pipeline (MS-pipeline) theory. The state diagram, called the Modulo-Scheduled (MS) state diagram is helpful in identifying legal initiation or latency sequences, that improve the number of instructions initiated in a pipeline. An efficient method, called Co-scheduling, which uses the legal initiation sequences as guidelines for constructing software pipelined schedules has been proposed in this paper. However, the complexity of the constructed MS-state diagram limits the usefulness of our Co-scheduling method.
Further analysis of the MS-pipeline theory, reveals that the space complexity of the MS-state diagram can be significantly reduced by identifying primary paths We develop the underlying theory to establish that the reduced MS-state diagram consisting only of primary paths is complete; i.e., it retains all the useful information represented by the original state diagram as far as scheduling of operations is concerned. Our experiments show that the number of paths in the reduced state diagram is significantly lower --- by 1 to 3 orders of magnitude --- compared to the number of paths in the original state diagram.
The reduction in the state diagram facilitate the Co-scheduling method to consider multiple initiations sequences, and hence obtain more efficient schedules. We call the resulting method, enhanced Co-scheduling. The enhanced Co-scheduling method produced efficient schedules when tested on a set of 1153 benchmark loops.
Further the schedules produced by this method are significantly better than those produced by Huff's Slack Scheduling method, a competitive software pipelining method, in terms of both the initiation interval of the schedules and the time taken to construct them.