GPU going General Purpose
Thomas Menguy | February 5, 2009
[Here is a good recap/introduction about Graphical Processors, thanks Sylvain!]
In the desktop computing world, the first 3D graphics hardware accelerators handled the fixed functionality exposed by graphics libraries (either a hardware vendor's proprietary library or a standard one such as OpenGL or DirectX). The first generation of accelerators handled rasterization and texturing of geometric primitives (triangles, quads, lines, points). Newer generations added a hardware implementation of the complete graphics pipeline, including the geometric computations for 3D coordinate transformation and lighting.
Then, with the addition of support for shading languages, graphics accelerators offered programmable stages in the graphics pipeline. Where older accelerators entirely determined the color of each rendered pixel of a primitive, accelerators with programmable shaders make it possible to write shader code that is executed on the GPU and controls the rendering of each pixel of a primitive.
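The difference between a fixed pixel stage and a programmable one can be sketched in a few lines of Python. This is only a software model of the idea, not a real GPU or shading-language API: the names `render`, `fixed_function` and `gradient_shader` are hypothetical and chosen for illustration.

```python
# Software model of a programmable pixel stage (illustrative only, not a GPU API).

def fixed_function(x, y, width, height):
    """Fixed pipeline: every pixel of the primitive gets the same flat color."""
    return (255, 0, 0)

def gradient_shader(x, y, width, height):
    """A custom 'shader': the color depends on the pixel's position."""
    r = 255 * x // max(width - 1, 1)
    g = 255 * y // max(height - 1, 1)
    return (r, g, 0)

def render(width, height, shader):
    """Call the shader once per pixel, as a GPU would for each covered pixel."""
    return [[shader(x, y, width, height) for x in range(width)]
            for y in range(height)]

flat = render(4, 2, fixed_function)     # same color everywhere
shaded = render(4, 2, gradient_shader)  # color computed per pixel
```

Swapping the function passed to `render` changes the output without touching the rest of the pipeline, which is roughly what a programmable shader stage allows on real hardware.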
Support for shading languages is the feature that enabled general-purpose computing on the graphics processing unit (GP-GPU). Indeed, GPUs are designed to perform huge numbers of geometric computations on vectors, which are used to represent geometric coordinates as well as colors. GPUs also have very high memory bandwidth compared to the CPU, so they can fetch texture and geometric data very fast.
This computing power was made available on NVIDIA hardware with CUDA, which makes it possible to write C code (with some restrictions) that is compiled for the GPU. This page presents some applications that run on the GPU with CUDA: http://www.nvidia.com/object/cuda_home.html
Recently, AMD released its Stream SDK, a technology comparable to CUDA but for ATI hardware. This SDK includes a video converter application, Avivo, which is said to run 13 times faster on the GPU. At the same time, the Khronos Group released the first official specification of OpenCL, a programming interface for GP-GPU that is not tied to any GPU vendor. NVIDIA has already announced that it will provide an OpenCL implementation alongside its CUDA SDK. We can safely bet that AMD will also support OpenCL soon.
The OpenCL standard interface opens the door to significant optimization in a large range of applications by providing access to the GPU’s processing power. Not all applications will benefit, though, because general-purpose does not actually mean all-purpose. GPUs are efficient for specific kinds of tasks: they are most useful for problems that involve large amounts of data that can be processed in parallel. We will not see a compiler running on a GPU any time soon, but relevant applications could easily perform computations on the GPU thanks to OpenCL, in the same way they currently make use of CPU SIMD extensions such as SSE. Apple is integrating OpenCL into Mac OS X right now.
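To make the kind of problem that suits a GPU more concrete, here is a minimal Python sketch of the data-parallel pattern: one small function (a "kernel") is applied independently to every element of a large data set. The helper `launch` and the kernel `saxpy_kernel` are hypothetical names, and the thread pool is only a stand-in for GPU hardware; this is not OpenCL or CUDA code.

```python
from concurrent.futures import ThreadPoolExecutor

def saxpy_kernel(i, a, x, y):
    """Work for one element: uses only index i, never another element's result."""
    return a * x[i] + y[i]

def launch(kernel, n, *args, workers=4):
    """Stand-in for a GPU launch: run the kernel for every index 0..n-1.

    Because the work items are independent, they can run in any order
    or all at once; map() still returns results in index order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: kernel(i, *args), range(n)))

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
result = launch(saxpy_kernel, len(x), 2.0, x, y)  # computes 2*x + y per element
```

A compiler, by contrast, is dominated by branching and by steps that depend on earlier results, so it does not break down into thousands of independent work items like this.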
In the embedded computing world, fixed-functionality GPU accelerators are now present in most smartphones. For example, the iPhone includes a PowerVR MBX core licensed from Imagination Technologies, which supports OpenGL ES 1.1. In terms of graphics power, this chipset is where desktop computers were a few years ago. Chipsets that support OpenGL ES 2.0 already exist. These graphics chipsets do include programmable shaders and enable GP-GPU programming on embedded devices, just like GPUs in desktops.