Change delta base cache eviction policy away from strict lru
When delta chains are decoded the objects in the chain are placed in
the cache consecutively. Strict LRU means that the oldest such
chain is dropped completely, then the next oldest and so on. However
caching an object is less valuable if you also cache nearby objects in
the chain: if you have nearby base objects cached then regenerating
this one is less costly, and in the other direction this one is less
likely to be needed as a base if the later objects are directly
available in the cache. Therefore strict LRU is not a good eviction
policy for the cache; instead of keeping every object from the most
recently used chains it's more efficient to keep objects from more
chains with fewer objects per chain.
Change the cache to drop only a percentage of the objects at the
oldest end of the age list, moving the remainder around to the newest
end again. This keeps fewer samples from chains the longer ago they
were used instead of suddenly dropping them completely at some point.
I tried a couple of variations of the OBJ_BLOB heuristic the old code
had, but in my benchmarks preferring to keep non-blob objects had a
slightly negative effect on performance, so I removed the heuristic.
I'm not certain that a case where it's beneficial could not exist
though.
Benchmark of 'git log -S' speed in a Linux-2.6.26.3 repository, most
of which was packed with "--window=250 --depth=250":
This revision with a large core.deltabasecachelimit config value:
$ time ~/src/git/git log -S'git is inefficient'
real 0m21.905s
user 0m21.309s
sys 0m0.583s
Stock git (before the 4 commits in this branch):
$ time ~/src/git/git log -S'git is inefficient'
real 6m28.919s
user 6m26.271s
sys 0m2.610s