A Study of I/O-Efficient Maps and Sets

Master's thesis by Henrik Thorup Andersen (May 2016)


Maps and sets are ubiquitous in modern computing, and I/O-efficient implementations of these are of central interest to high performance applications. We present a novel variant of the B-tree, achieving the same asymptotical bounds on all operations for two different sets of parameters, in practice tuning for both disk and cache. We also discuss hopscotch hashing, presenting a new (partly empirical) analysis, to counter the faulty analysis presented in the original paper. We provide implementations and perform experiments to evaluate the performance of these data structures in comparison with the more established I/O-effective solutions to the map/set problem, the traditional B-tree and the linear-probing hash map.

The whole thesis is available for download: [pdf]