Channel: Everyday I'm coding » parallel-processing

Why aren’t we programming on the GPU? [closed]


So I finally took the time to learn CUDA and get it installed and configured on my computer and I have to say, I’m quite impressed!

Here’s how it performs when rendering the Mandelbrot set at 1280 × 678 pixels on my home PC with a Q6600 and a GeForce 8800 GTS (max of 1000 iterations):

Maxing out all 4 CPU cores with OpenMP: 2.23 fps

Running the same algorithm on my GPU: 104.7 fps

And here’s how fast I got it to render the whole set at 8192 x 8192 with a max of 1000 iterations:

Serial implementation on my home PC: 81.2 seconds

All 4 CPU cores on my home PC (OpenMP): 24.5 seconds

32 processors on my school’s super computer (MPI with master-worker): 1.92 seconds

My home GPU (CUDA): 0.310 seconds

4 GPUs on my school’s super computer (CUDA with static output decomposition): 0.0547 seconds
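For reference, the per-pixel work behind all of these timings is the standard escape-time iteration. Here's a minimal CPU sketch (function and parameter names are my own, not taken from the original benchmark code), with an OpenMP pragma of the kind the 4-core numbers above rely on — each pixel is independent, which is exactly why the problem maps so well onto a GPU:

```cpp
#include <cstddef>
#include <vector>

// Escape-time iteration for one pixel. Returns the number of
// iterations before |z| exceeds 2, or max_iter if the point never
// escapes within the budget.
int mandel_iters(double cr, double ci, int max_iter) {
    double zr = 0.0, zi = 0.0;
    for (int i = 0; i < max_iter; ++i) {
        if (zr * zr + zi * zi > 4.0) return i;
        double tmp = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = tmp;
    }
    return max_iter;
}

// Render the region [x0,x1] x [y0,y1] into a w*h grid of iteration
// counts. The pragma is silently ignored when compiled without
// OpenMP, so this doubles as the serial version.
std::vector<int> render(int w, int h, double x0, double y0,
                        double x1, double y1, int max_iter) {
    std::vector<int> out(static_cast<std::size_t>(w) * h);
    #pragma omp parallel for schedule(dynamic)
    for (int py = 0; py < h; ++py) {
        double ci = y0 + (y1 - y0) * py / (h - 1);
        for (int px = 0; px < w; ++px) {
            double cr = x0 + (x1 - x0) * px / (w - 1);
            out[static_cast<std::size_t>(py) * w + px] =
                mandel_iters(cr, ci, max_iter);
        }
    }
    return out;
}
```

The CUDA version is essentially the same inner function run as one thread per pixel; there is no inter-pixel communication at all, which is the best case for a GPU.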

So here’s my question – if we can get such huge speedups by programming the GPU instead of the CPU, why is nobody doing it??? I can think of so many things we could speed up like this, and yet I don’t know of many commercial apps that are actually doing it.

Also, what kinds of other speedups have you seen by offloading your computations to the GPU?


Solution:

Compatibility and portability are an issue. Not everyone has a beefy GPU, whereas everyone can be counted on to have a GHz+ CPU, so you can rely on the CPU being able to do a decent amount of work. The variation in GPU performance is huge by comparison, ranging from anemic Intel integrated graphics to the latest SLI’d or Crossfire’d powerhouses from ATI and NVidia. Your performance improvement just won’t be an improvement on all computers; some systems will run the software slower, if at all, because they just don’t have the GPU power needed.

And of course, as others have mentioned, not every GPU vendor supports the same APIs. NVidia has done amazing things with CUDA from what I’ve seen, but no one else supports it. ATI and NVidia both support OpenCL, but Intel doesn’t, as far as I know.

There’s no API that everyone supports and which you can rely on being supported. So which API do you target? How do you make your app run on all your customers’ computers?
If you make GPU support an optional extra, it’s additional work for you. And if you require GPU support, you cut off a large number of your customers.

Finally, not all tasks are suited to running on a GPU. The GPU is highly specialized for parallelizable number-crunching. It doesn’t speed up I/O-bound programs (which account for most of the sluggishness we see in everyday computer usage), and since it doesn’t have direct access to system memory, you also pay additional latency transferring data between RAM and GPU memory. In some cases this is insignificant; in others, it can make the whole exercise pointless.
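To get a feel for how much that RAM↔GPU copy can cost, here's a back-of-envelope sketch. The ~4 GB/s effective PCIe bandwidth is an assumed ballpark figure for hardware of that era, not a measurement:

```cpp
// Back-of-envelope PCIe transfer estimate. The bandwidth figure
// passed in is an assumption, not a measured number.
double transfer_seconds(double num_bytes, double bytes_per_s) {
    return num_bytes / bytes_per_s;
}

// Example: an 8192 x 8192 render producing one 32-bit iteration
// count per pixel yields 256 MiB of results to copy back:
//   transfer_seconds(8192.0 * 8192.0 * 4, 4e9)  ->  ~0.067 s
// For a long-running kernel that's negligible; against a total GPU
// time of 0.310 s it's already a sizable fraction of the runtime.
```

The takeaway: the shorter the kernel relative to the data it moves, the more the transfer dominates, and for some workloads it eats the entire speedup.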

And then, of course, there’s inertia. Large established software can’t just be ripped up and ported to run on a GPU. There’s often a huge fear of breaking things when working with existing codebases, so dramatic rewrites like this tend to be approached very cautiously. And hey, if we’ve spent the last 10 years making our software run as well as possible on a CPU, it’s going to take some convincing before we believe it could run better on a GPU. Not because it isn’t true, but because people are basically conservative, believe their own way is best, and dislike change.

