Tweak mergeImpl loop for hoistable classes
This loop seems to be a significant source of LLC load misses. To some extent
this is unavoidable, since its working set (all the classes in the universe) is
way too big for cache.
However, we can avoid repeating the same work. The series of pointer chases
(*pre->namedEntity()->clsList()) was the major culprit. If all the Unit's
classes are unique and defined without failure, we now "cache" the results of
those pointer chases and mark the cached entries by setting a low-order bit.
We already do something very similar for the non-hoistable classes.
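A minimal sketch of the low-order-bit tagging idea, for illustration only. The
types below (Class, NamedEntity, PreClass, MergeSlot), the resolve() helper,
and the g_chases counter are hypothetical stand-ins, not the real definitions,
and the sketch caches unconditionally rather than checking that the Unit's
classes are unique and defined without failure.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins that mirror only the shape of the pointer chain
// *pre->namedEntity()->clsList(); they are not the real runtime types.
struct Class { int id; };

struct NamedEntity {
  Class* cls = nullptr;
  Class** clsList() { return &cls; }
};

struct PreClass {
  NamedEntity* ne = nullptr;
  NamedEntity* namedEntity() const { return ne; }
};

// Pointers to these types are at least 2-byte aligned, so bit 0 is free
// to use as a "this slot holds a cached result" marker.
constexpr std::uintptr_t kCachedBit = 0x1;

// Counts how many times the slow pointer chase actually ran.
int g_chases = 0;

// A slot holds either an untagged PreClass* or a tagged Class** (bit 0 set).
struct MergeSlot {
  std::uintptr_t bits;

  static MergeSlot fromPreClass(const PreClass* pre) {
    auto raw = reinterpret_cast<std::uintptr_t>(pre);
    assert((raw & kCachedBit) == 0);  // the tag bit must start out clear
    return MergeSlot{raw};
  }

  bool isCached() const { return (bits & kCachedBit) != 0; }
};

// Returns the Class for a slot. The first call chases the pointers and
// overwrites the slot with the tagged result; later calls skip the chase.
Class* resolve(MergeSlot& slot) {
  if (slot.isCached()) {
    auto list = reinterpret_cast<Class**>(slot.bits & ~kCachedBit);
    return *list;
  }
  auto pre = reinterpret_cast<const PreClass*>(slot.bits);
  ++g_chases;
  Class** list = pre->namedEntity()->clsList();
  auto raw = reinterpret_cast<std::uintptr_t>(list);
  assert((raw & kCachedBit) == 0);
  slot.bits = raw | kCachedBit;
  return *list;
}
```

In this sketch the second resolve() on the same slot touches one pointer
instead of three, which is the kind of saving the change is after.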
Also, add comments on a few subtle points I noticed along the way, including a
presumably deliberate (but safe) race condition.