John Kalamatianos

Systematic Synthesis of Data Parallel Algorithms and its Application to Higher Order Moments Estimation

Thursday, November 12, 1998
3:30 PM
306 Egan

Abstract

Signal and image processing applications often require performing the same operation across the elements of very large data sequences in a data-parallel fashion. The available computing power, memory capacity and communication bandwidth of uniprocessor computer systems may not satisfy the high performance requirements of such applications as the sequence size increases. In such cases, Single Instruction Multiple Data (SIMD) parallel processing can be an attractive computational framework since it can reduce the time and memory requirements by matching closely the dependence characteristics of such algorithms. A systematic way of synthesizing data-parallel numerical algorithms, starting from a high-level specification of the problem to be solved, can be instrumental in producing fast and efficient implementations and in hiding the architectural details of the underlying machine from the programmer.

In this thesis, we establish the basis of a systematic methodology for synthesizing minimum latency, constant memory per processor, data-parallel algorithms. Fairly general architectural constraints are used to formulate a set of properties that can drive the systematic construction of a Dependence Graph (DG) and the selection of a space and time mapping opearators pair that when applied to the DG produces a model of the optimal data parallel algorithm. The methodology was validated by systematically deriving data parallel algorithms for the estimation of third and fourth order moments. We studied the speedup and scalability characteristics of the resulting algorithms and compared them to those produced by other parallelizing techniques as well as serial implementations. Our findings, supported by experiments on the MasPar SIMD machines, show that the synthesized parallel algorithms exhibit significantly reduced execution time and can deal with problems of much larger size than their serial counterparts. We believe that this work establishes the basis for an automated tool that will be able to generate optimal data-parallel algorithms from high-level specifications.

Thesis Committee:
Prof. Elias S. Manolakos (advisor)
Prof. Waleed Meleis
Dr. Haris Stellakis (GTE Labs)