Master's thesis by Jacob Jepsen (April 2014)
We present a framework that aids the programmer in developing GPU-executable code. We implement a catalogue of common optimizations specific to the GPU architecture. Through the framework, the programmer can semi-automatically apply these optimizations to a computationally intensive code section and generate an equivalent GPU-executable code section. In our experiments, the generated code can be up to one order of magnitude faster than both the code from equivalent frameworks and optimized CPU code, and it can attain close to 25% of the GPU's peak performance. We also found that many of the transformations can be performed automatically, which makes our framework usable for both novices and experts in GPU programming. Finally, we share our experiences in creating such frameworks.
The thesis is available for download: [pdf]
The source code is available as an archive: [zip]
The slides used in the defence: [pdf]