Lab for High Performance Computing SERC, Indian Institute of Science

PLASMA: Portable Programming for SIMD Heterogeneous Accelerators

Workshop on Language, Compiler, and Architecture Support for GPGPU, held in conjunction with HPCA/PPoPP 2010
Bangalore, India, January 9, 2010


  1. Sreepathi Pai, Supercomputer Education and Research Centre
  2. R. Govindarajan, Supercomputer Education and Research Centre; Department of Computer Science and Automation
  3. M. J. Thazhuthaveetil, Supercomputer Education and Research Centre; Department of Computer Science and Automation


Data-parallel accelerators have emerged as high-performance alternatives to general-purpose processors for many applications. The Cell BE, GPUs from NVIDIA and ATI, and similar devices can outperform conventional superscalar architectures, but only for applications that can take advantage of these accelerators' SIMD architectures, large numbers of cores, and local memories. Coupled with the SIMD extensions on general-purpose processors, these heterogeneous computing architectures provide a powerful platform for accelerating data-parallel programs. Unfortunately, each accelerator provides its own programming model, and programmers are often forced to confront issues of distributed memory, multithreading, load balancing, and computation scheduling. This necessitates a framework that can exploit different types of parallelism across heterogeneous functional units and support multiple kinds of high-level programming languages and models, including stream programming, traditional shared- or distributed-memory programming frameworks, and prototyping languages such as MATLAB.

Toward this goal, we present PLASMA, a programming framework that enables the writing of portable SIMD programs. The main component of PLASMA is an intermediate representation (IR) that provides succinct and clean abstractions, enabling programs to be compiled to different accelerators. With the assistance of a runtime, these programs can then be automatically multithreaded and run transparently on multiple heterogeneous accelerators, while remaining oblivious to distributed memory. We demonstrate a prototype compiler and runtime that target PLASMA programs to scalar processors, processors with SIMD extensions, and GPUs.
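
As a rough illustration of the general idea only (this is not PLASMA's IR, compiler, or runtime, none of which are specified on this page), the sketch below describes a data-parallel operation once and executes it through two interchangeable backends: a scalar loop and a chunked, multithreaded one. All names here (`run_scalar`, `run_chunked`, `saxpy`) are hypothetical and chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

# Hypothetical portable operation: an elementwise function written once,
# with no knowledge of the backend that will execute it.
ElementwiseOp = Callable[[float, float], float]


def run_scalar(op: ElementwiseOp, a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Scalar backend: apply op to one pair of elements at a time."""
    return [op(x, y) for x, y in zip(a, b)]


def run_chunked(op: ElementwiseOp, a: Sequence[float], b: Sequence[float],
                workers: int = 4, chunk: int = 4) -> List[float]:
    """Multithreaded backend: split the index space into chunks and let a
    thread pool (standing in for a runtime) schedule them."""
    n = min(len(a), len(b))
    spans = [(lo, min(lo + chunk, n)) for lo in range(0, n, chunk)]

    def work(span):
        lo, hi = span
        return [op(a[i], b[i]) for i in range(lo, hi)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, so the chunks
        # can be concatenated directly.
        parts = list(pool.map(work, spans))
    return [v for part in parts for v in part]


def saxpy(x: float, y: float) -> float:
    """Example kernel: 2*x + y for each element pair."""
    return 2.0 * x + y


if __name__ == "__main__":
    a = [float(i) for i in range(10)]
    b = [1.0] * 10
    # The same kernel produces the same result on both backends.
    print(run_scalar(saxpy, a, b) == run_chunked(saxpy, a, b))
```

The kernel itself never mentions threads or chunking; that decision lives entirely in the backend, which is the kind of separation the abstract attributes to the IR and runtime.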

