# Cache-Oblivious Searching and Sorting

M.Sc. Thesis by Frederik Rønn

## Abstract

Algorithms that use multi-layered memory hierarchies efficiently have
traditionally relied on detailed knowledge of the characteristics of
memory systems. The cache-oblivious approach changed this in 1999 by
making it possible to design memory-efficient algorithms for
hierarchical memory systems without such detailed knowledge. As a
consequence, one single implementation of a cache-oblivious algorithm
is efficient on any memory hierarchy. The purpose of the thesis is to
investigate the behavior of cache-oblivious searching and sorting
algorithms through constant-factors analysis and benchmarking.

Cache-oblivious algorithms are analyzed in the ideal-cache model, which
is an abstraction of real memory systems. We investigate the assumptions
of the model in order to determine the accuracy of cache-complexity
bounds derived by use of the model.

We derive the constant factors of the cache complexities of
cache-oblivious, cache-aware, and traditional searching and sorting
algorithms in the ideal-cache model. The constant factors of the work
complexities of the algorithms are derived in the pure-C cost
model. The analyses are verified through benchmarking of
implementations of all algorithms.

For the searching algorithms, our constant-factors analysis predicts
the benchmark results quite precisely --- considering both memory
performance and work complexity. For the more complex sorting
algorithms our results show the same pattern, though the similarities
between predicted and measured performance are not as significant.

Furthermore, we develop a new algorithm that lays out a
cache-oblivious static search tree in memory in linear time, which is
an improvement of the algorithms known so far.

We conclude that by combining the ideal-cache model and the pure-C model,
the relative performance of programs can be predicted quite precisely,
provided that the analysis is carefully done.

The following files are available for download:

- The thesis text in ps or pdf
format.
- The source code for the methods and algorithms presented in the thesis:
source.tgz.
- The benchmark results can be found here.
- PowerPoint presentation from oral defense (in Danish).