next up previous contents
Next: Programming considerations Up: Tuning the cache behavior Previous: Motivation.- Performance of the

Analysis

  When a siftdown moves down the tree, both childrens of the current element are accessed. However, if the cache line size is larger than two elements (the two childrens), there will be elements loaded into the cache which are not used at all. This wasted effort can be reduced by increasing the fanout of the heap so that more elements will be accessed on each level of the tree.

It is interesting to observe that increasing the fanout can improve the performance of heaps even when the effect of the cache is not taken into consideration. In [2] a detailed analysis shows that the optimal fanout is between 3 and 5 when the cost of an element swap is assumed to be from 0 to 3 comparisons.

Anyway, the advantage of increasing the fanout only goes so far because the the amount of work to be done at each level increases linearly with the fanout, while the saving due to reduced tree depth follows the base two logarithm of the fanout. In the extreme case, when the fanout is larger than N, the heap degenerates into a unordered list. Because of this phenomenon we cannot simply increase the fanout until the wasted load at the end of the cache line is amortized away.

Another trick that is investigated in [2] to keep the fanout down, is to align the heap in memory so that siblings do not span cache lines. Clearly this is possible only if the element size divides the cache line size.

Reducing the number of cache misses through alignment of the heap can for the heapsort application be considered cheating, because generally we cannot expect the program calling heapsort to align data optimally. Alternatively we must accept to loose the inplace property of heapsort. Of course this consideration does not apply to other priority queue applications because a priority queue is normally expected to handle memory allocation by it self anyway.


next up previous contents
Next: Programming considerations Up: Tuning the cache behavior Previous: Motivation.- Performance of the

Jesper Bojesen
Wed Nov 4 15:35:15 MET 1998