Optimize unaligned vanilla vec iteration
Summary:
Previously, the assembly generated to iterate over a vec (e.g. `foreach ($x as $item)`) kept an index and used a `leaq` + `movq` pair to load each element's value: the `leaq` multiplies the index by 9, and the `movq` loads from the resulting offset.
Since the unaligned typed values are stored one after another, we can instead keep a pointer to the current element and advance it by 9 with an `add`. This should save 1 instruction and 5 bytes of instruction size (12 bytes previously vs. 7 bytes now). These counts ignore the load and check of the type byte, which are the same in both approaches.
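As a rough illustration of the two strategies (not HHVM's actual code), the C++ sketch below walks a packed buffer first by index and then by pointer. The 9-byte layout (8-byte value followed by a 1-byte type tag) is inferred from the offsets in the disassembly, and the names `kElemSize`, `packInts`, `sumByIndex`, and `sumByPointer` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout for illustration only: an 8-byte value followed by a
// 1-byte type tag, packed with no padding (9 bytes per element).
constexpr std::size_t kElemSize = 9;

// Build a packed buffer from plain integers; tag 0 is an arbitrary placeholder.
std::vector<unsigned char> packInts(const std::vector<std::int64_t>& vals) {
  std::vector<unsigned char> buf(vals.size() * kElemSize);
  for (std::size_t i = 0; i < vals.size(); ++i) {
    std::memcpy(buf.data() + i * kElemSize, &vals[i], sizeof(std::int64_t));
    buf[i * kElemSize + 8] = 0;
  }
  return buf;
}

// Index-based walk: every iteration recomputes base + i * 9 before loading,
// which corresponds to the old leaq + movq pattern.
std::int64_t sumByIndex(const unsigned char* base, std::size_t n) {
  assert(base != nullptr || n == 0);
  std::int64_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    std::int64_t v;
    std::memcpy(&v, base + i * kElemSize, sizeof(v));
    sum += v;
  }
  return sum;
}

// Pointer-based walk: load through the pointer, then advance it by 9 and
// compare against a precomputed end pointer (the new add + cmp pattern).
std::int64_t sumByPointer(const unsigned char* base, std::size_t n) {
  assert(base != nullptr || n == 0);
  std::int64_t sum = 0;
  const unsigned char* end = base + n * kElemSize;
  for (const unsigned char* p = base; p != end; p += kElemSize) {
    std::int64_t v;
    std::memcpy(&v, p, sizeof(v));
    sum += v;
  }
  return sum;
}
```

Both functions return the same sum; the difference is only in how the element address is derived on each iteration. An optimizing C++ compiler may well turn the first form into the second on its own, so the sketch shows the idea rather than reproducing the JIT's output.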
Given the following:
```
<?hh
<<__EntryPoint>>
function main() {
  $v = 0;
  $x = __hhvm_intrinsics\launder_value(vec[10, 12, 14]);
  foreach ($x as $item) {
    $v += $item;
  }
  var_dump($v);
}
```
Running this:
```
buck build --keep-going mode/dbgo-clang -c hhvm.use_lowptr=1 //hphp/hhvm:hhvm
hphp/tools/hhvm_wrapper.php -r 2 --trace-printir 1 --hdf Eval.EnableIntrinsicsExtension=1
```
Previously produced:
```
(76) t17:InitCell = LdVecElem t13:Vec, t15:Int
Main:
0x3160004b: leaq (%rcx,%rcx,8), %rdx // 4 bytes
0x3160004f: movb 0x18(%rax,%rdx,1), %bl
0x31600053: movq 0x10(%rax,%rdx,1), %r11 // 5 bytes
(77) t18:Int = CheckType<Int> t17:InitCell -> B30<Unlikely>
Main:
0x31600058: test $0xb9, %bl
0x3160005b: jnz 0x3160013c
-> B46
B46: [profCount=6] (preds B25)
--- bc main(id 1078027200)49, fp 0, fixupFP 0, spOff 0, [profTrans=6]
IterNext 0 NK V:2 -7 (42)
(126) t29:Int = AddInt t15:Int, 1
Main:
0x31600061: inc %rcx // 3 bytes
(127) t30:Bool = EqInt t29:Int, t14:Int
Main:
0x31600064: cmp %rsi, %rcx
```
Now produces:
```
B25: [profCount=6] (preds B51 B29)
t17:PtrToElem = phi t30:PtrToElem@B51, t16:PtrToElem@B29
t44:Int = phi t49:Int@B51, 0@B29
(76) t17:PtrToElem, t44:Int = DefLabel
(77) t18:InitCell = LdPtrIterVal<InitCell> t17:PtrToElem
Main:
0x31600058: movb 0x8(%rdx), %dil
0x3160005c: movq (%rdx), %r8 // 3 bytes
(78) t19:Int = CheckType<Int> t18:InitCell -> B30<Unlikely>
Main:
0x3160005f: test $0xb9, %dil
0x31600063: jnz 0x31600142
-> B46
B46: [profCount=6] (preds B25)
--- bc main(id 1078027200)49, fp 0, fixupFP 0, spOff 0, [profTrans=6]
IterNext 0 NK V:2 -7 (42)
(127) t30:PtrToElem = AdvancePackedPtrIter<1> t17:PtrToElem
Main:
0x31600069: add $0x9, %rdx // 4 bytes
(128) t31:Bool = EqPtrIter t30:PtrToElem, t15:PtrToElem
Main:
0x3160006d: cmp %rsi, %rdx
```
Reviewed By: mofarrell
Differential Revision: D34222743
fbshipit-source-id: 3fe35c6fedaf997ec417615db0d537915b34c071