[honey] Better per-term wdf upper bound
Previously we used min(cf(term), wdf_upper_bound(db)) for the per-term
upper bound - that's tight for any terms which attain that upper
bound, and for terms with termfreq == 1, which are common in the
database (e.g. 66% of terms for a database of Wikipedia), but probably
much less common in searches.
We now use max(first_wdf(term), cf(term) - first_wdf(term)) when
termfreq > 1. The wdfs of the entries after the first sum to
cf(term) - first_wdf(term), so none of them can exceed that. This
means terms with termfreq == 2 also attain their bound (another 11%
of terms for the same database), while terms with higher termfreq but
below the global bound get a tighter bound.
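
As a rough illustration of the calculation described above, here is a
minimal standalone sketch. It is not the actual honey backend code: the
function name, parameter names and types are hypothetical, and capping
the result at the database-wide bound is an assumption carried over from
the previous formula.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper illustrating the per-term wdf upper bound.
//   termfreq           - number of documents the term occurs in
//   cf                 - collection frequency (sum of wdf over all documents)
//   first_wdf          - wdf of the first entry in the term's posting list
//   db_wdf_upper_bound - database-wide wdf upper bound
std::uint64_t
per_term_wdf_upper_bound(std::uint64_t termfreq,
                         std::uint64_t cf,
                         std::uint64_t first_wdf,
                         std::uint64_t db_wdf_upper_bound)
{
    // The first entry's wdf is part of cf, so it can't exceed it.
    assert(first_wdf <= cf);

    // A term which isn't present has no wdf to bound.
    if (termfreq == 0) return 0;

    std::uint64_t bound;
    if (termfreq == 1) {
        // Only one document, so its wdf is exactly cf.
        bound = cf;
    } else {
        // Either the first entry has the largest wdf, or some later
        // entry does, and no later entry can exceed the total wdf of
        // all later entries, which is cf - first_wdf.
        bound = std::max(first_wdf, cf - first_wdf);
    }

    // Assumption: still cap at the global bound, as the old formula did.
    return std::min(bound, db_wdf_upper_bound);
}
```

For example, with termfreq == 2, cf == 10 and first_wdf == 3, the second
entry must have wdf 7, and the sketch returns exactly 7. With
termfreq == 3, cf == 10 and first_wdf == 6 it returns 6, where the
previous formula would have given 10 (assuming a global bound of at
least 10).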