From 558f5e9517a4b6acf915d5f2d8083722c327a487 Mon Sep 17 00:00:00 2001
From: Julian Seward
Date: Mon, 21 Oct 2019 11:19:59 +0200
Subject: [PATCH] Initial implementation of C-source-level &&-idiom recovery

This branch contains code which avoids Memcheck false positives resulting
from gcc and clang creating branches on uninitialised data.  For example:

   bool isClosed;
   if (src.isRect(..., &isClosed, ...) && isClosed) {

clang9 -O2 compiles this as:

   callq  7e7cdc0 <_ZNK6SkPath6isRectEP6SkRectPbPNS_9DirectionE>
   cmpb   $0x0,-0x60(%rbp)     // "if (isClosed) { .."
   je     7ed9e08              // "je after"
   test   %al,%al              // "if (return value of call is nonzero) { .."
   je     7ed9e08              // "je after"
   ..
 after:

That is, the && has been evaluated right-to-left.  This is a correct
transformation if the compiler can prove that the call to |isRect| returns
|false| along any path on which it does not write its out-parameter
|&isClosed|.

In general, for the lazy-semantics (L->R) C-source-level && operator, we have
|A && B| == |B && A| if you can prove that |B| is |false| whenever |A| is
undefined.  I assume that clang has some kind of interprocedural analysis that
tells it that.  The compiler is further obliged to show that |B| won't trap,
since it is now being evaluated speculatively, but that's no big deal to
prove.  A similar result holds, per de Morgan, for transformations involving
the C language ||.

Memcheck correctly handles bitwise &&/|| in the presence of undefined inputs.
It has done so since the beginning.  However, it assumes that every
conditional branch in the program is important -- any branch on uninitialised
data is an error.  This idiom demonstrates otherwise.  It defeats Memcheck's
existing &&/|| handling because the &&/|| is spread across two basic blocks,
rather than being bitwise.

This initial commit contains a complete initial implementation to fix that.
The basic idea is to detect the && condition spread across two blocks, and
transform it into a single block using bitwise &&.
Then Memcheck's existing accurate instrumentation of bitwise && will correctly
handle it.  The transformation is:

   C1 = ...
   if (!C1) goto after
   .. falls through to ..
   C2 = ...
   if (!C2) goto after
   .. falls through to ..
   after:

  ===>

   C1 = ...
   C2 = ...
   if (!(C1 && C2)) goto after
   .. falls through to ..
   after:

This assumes that the second block's statements can be conditionalised, at the
IR level, so that the guest state is not modified if C1 is |false|.  That's
not possible for all IRStmt kinds, but it is possible for a large enough
subset to make this transformation feasible.

There is no corresponding transformation that recovers an || condition,
because, per de Morgan, that merely corresponds to swapping the side exits vs
fallthroughs and inverting the sense of the tests, and the pattern-recogniser
as implemented checks all possible combinations already.

The analysis and block-building is performed on the IR returned by the
architecture-specific front ends, so they are almost unmodified.  In fact they
are simplified, because all logic related to chasing through unconditional and
conditional branches has been removed from them, redone at the IR level, and
centralised.

The only file with big changes is the IRSB constructor logic,
guest_generic_bb_to_IR.c (a.k.a. the "trace builder").  This is a complete
rewrite.  There is some additional work for the IR optimiser (ir_opt.c), since
that needs to do a quick initial simplification pass of the basic blocks, in
order to reduce the number of different IR variants that the trace builder has
to pattern-match on.  An important followup task is to further reduce this
cost.

There are two new IROps to support this: And1 and Or1, which both operate on
Ity_I1.  They are regarded as evaluating both arguments, consistent with AndXX
and OrXX for all other sizes.  It is possible to synthesise them at the IR
level by widening the value to Ity_I8 or above, doing bitwise And/Or, and
re-narrowing it, but this gives inefficient code, so I chose to represent them
directly.
The transformation appears to work for amd64-linux.  In principle -- because
it operates entirely at the IR level -- it should work for all targets,
provided the initial pre-simplification pass can normalise the block ends into
the required form.  That will no doubt require some tuning.  And1 and Or1 will
have to be implemented in all instruction selectors, but that's easy enough.

Remaining FIXMEs in the code:

* Rename `expr_is_speculatable` et al to `expr_is_conditionalisable`.  These
  functions merely conditionalise code; the speculation has already been done
  by gcc/clang.

* `expr_is_speculatable`: properly check that Iex_Unop/Binop don't contain
  operations that might trap (Div, Rem, etc).

* `analyse_block_end`: recognise all block ends, and abort on ones that can't
  be recognised.  Needed to ensure we don't miss any cases.

* maybe: guest_amd64_toIR.c: generate better code for And1/Or1.

* ir_opt.c, do_iropt_BB: remove the initial flattening pass, since presimp
  will already have done it.

* ir_opt.c, do_minimal_initial_iropt_BB (a.k.a. presimp): make this as cheap
  as possible.  In particular, calling `cprop_BB_wrk` is total overkill, since
  we only need copy propagation.

* ir_opt.c: once the above is done, remove the boolean parameter of
  `cprop_BB_wrk`.

* ir_opt.c: concatenate_irsbs: maybe de-dup w.r.t. maybe_unroll_loop_BB.

* remove option `guest_chase_cond` from VexControl (?).  It was never used.

* convert option `guest_chase_thresh` in VexControl (?) into a Bool, since the
  revised code here only cares about the zero-vs-nonzero distinction now.
---
 VEX/priv/guest_amd64_defs.h       |    3 -
 VEX/priv/guest_amd64_toIR.c       |  186 +---
 VEX/priv/guest_arm64_defs.h       |    3 -
 VEX/priv/guest_arm64_toIR.c       |   15 -
 VEX/priv/guest_arm_defs.h         |    3 -
 VEX/priv/guest_arm_toIR.c         |  108 +--
 VEX/priv/guest_generic_bb_to_IR.c | 1874 +++++++++++++++++++++++++------------
 VEX/priv/guest_generic_bb_to_IR.h |   49 +-
 VEX/priv/guest_mips_defs.h        |    3 -
 VEX/priv/guest_mips_toIR.c        |   85 +-
 VEX/priv/guest_nanomips_defs.h    |    3 -
 VEX/priv/guest_nanomips_toIR.c    |    5 -
 VEX/priv/guest_ppc_defs.h         |    3 -
 VEX/priv/guest_ppc_toIR.c         |   42 +-
 VEX/priv/guest_s390_defs.h        |    3 -
 VEX/priv/guest_s390_toIR.c        |   39 +-
 VEX/priv/guest_x86_defs.h         |    3 -
 VEX/priv/guest_x86_toIR.c         |  153 +--
 VEX/priv/host_amd64_isel.c        |   23 +-
 VEX/priv/ir_defs.c                |    9 +-
 VEX/priv/ir_opt.c                 |   99 +-
 VEX/priv/ir_opt.h                 |    6 +
 VEX/pub/libvex.h                  |    2 +
 VEX/pub/libvex_ir.h               |    2 +
 memcheck/mc_translate.c           |   46 +-
 memcheck/tests/vbit-test/binary.c |    6 +
 memcheck/tests/vbit-test/irops.c  |    4 +-
 memcheck/tests/vbit-test/vbits.c  |    2 +
 memcheck/tests/vbit-test/vbits.h  |    4 +-
 29 files changed, 1571 insertions(+), 1212 deletions(-)

diff --git a/VEX/priv/guest_amd64_defs.h b/VEX/priv/guest_amd64_defs.h
index a5de527d2..54672dc22 100644
--- a/VEX/priv/guest_amd64_defs.h
+++ b/VEX/priv/guest_amd64_defs.h
@@ -49,9 +49,6 @@ guest_generic_bb_to_IR.h.
*/ extern DisResult disInstr_AMD64 ( IRSB* irbb, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code, Long delta, Addr guest_IP, diff --git a/VEX/priv/guest_amd64_toIR.c b/VEX/priv/guest_amd64_toIR.c index 109241913..fadf47d41 100644 --- a/VEX/priv/guest_amd64_toIR.c +++ b/VEX/priv/guest_amd64_toIR.c @@ -2291,7 +2291,6 @@ static void jmp_lit( /*MOD*/DisResult* dres, { vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 0); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); dres->whatNext = Dis_StopHere; dres->jk_StopHere = kind; @@ -2303,7 +2302,6 @@ static void jmp_treg( /*MOD*/DisResult* dres, { vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 0); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); dres->whatNext = Dis_StopHere; dres->jk_StopHere = kind; @@ -2318,7 +2316,6 @@ void jcc_01 ( /*MOD*/DisResult* dres, AMD64Condcode condPos; vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 0); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); dres->whatNext = Dis_StopHere; dres->jk_StopHere = Ijk_Boring; @@ -19846,9 +19843,6 @@ static Long dis_ESC_NONE ( /*MB_OUT*/DisResult* dres, /*MB_OUT*/Bool* expect_CAS, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const VexArchInfo* archinfo, const VexAbiInfo* vbi, Prefix pfx, Int sz, Long deltaIN @@ -20258,53 +20252,10 @@ Long dis_ESC_NONE ( vassert(-128 <= jmpDelta && jmpDelta < 128); d64 = (guest_RIP_bbstart+delta+1) + jmpDelta; delta++; - if (resteerCisOk - && vex_control.guest_chase_cond - && (Addr64)d64 != (Addr64)guest_RIP_bbstart - && jmpDelta < 0 - && resteerOkFn( callback_opaque, (Addr64)d64) ) { - /* Speculation: assume this backward branch is taken. So we - need to emit a side-exit to the insn following this one, - on the negation of the condition, and continue at the - branch target address (d64). 
If we wind up back at the - first instruction of the trace, just stop; it's better to - let the IR loop unroller handle that case. */ - stmt( IRStmt_Exit( - mk_amd64g_calculate_condition( - (AMD64Condcode)(1 ^ (opc - 0x70))), - Ijk_Boring, - IRConst_U64(guest_RIP_bbstart+delta), - OFFB_RIP ) ); - dres->whatNext = Dis_ResteerC; - dres->continueAt = d64; - comment = "(assumed taken)"; - } - else - if (resteerCisOk - && vex_control.guest_chase_cond - && (Addr64)d64 != (Addr64)guest_RIP_bbstart - && jmpDelta >= 0 - && resteerOkFn( callback_opaque, guest_RIP_bbstart+delta ) ) { - /* Speculation: assume this forward branch is not taken. So - we need to emit a side-exit to d64 (the dest) and continue - disassembling at the insn immediately following this - one. */ - stmt( IRStmt_Exit( - mk_amd64g_calculate_condition((AMD64Condcode)(opc - 0x70)), - Ijk_Boring, - IRConst_U64(d64), - OFFB_RIP ) ); - dres->whatNext = Dis_ResteerC; - dres->continueAt = guest_RIP_bbstart+delta; - comment = "(assumed not taken)"; - } - else { - /* Conservative default translation - end the block at this - point. */ - jcc_01( dres, (AMD64Condcode)(opc - 0x70), - guest_RIP_bbstart+delta, d64 ); - vassert(dres->whatNext == Dis_StopHere); - } + /* End the block at this point. */ + jcc_01( dres, (AMD64Condcode)(opc - 0x70), + guest_RIP_bbstart+delta, d64 ); + vassert(dres->whatNext == Dis_StopHere); DIP("j%s-8 0x%llx %s\n", name_AMD64Condcode(opc - 0x70), (ULong)d64, comment); return delta; @@ -21434,14 +21385,8 @@ Long dis_ESC_NONE ( t2 = newTemp(Ity_I64); assign(t2, mkU64((Addr64)d64)); make_redzone_AbiHint(vbi, t1, t2/*nia*/, "call-d32"); - if (resteerOkFn( callback_opaque, (Addr64)d64) ) { - /* follow into the call target. 
*/ - dres->whatNext = Dis_ResteerU; - dres->continueAt = d64; - } else { - jmp_lit(dres, Ijk_Call, d64); - vassert(dres->whatNext == Dis_StopHere); - } + jmp_lit(dres, Ijk_Call, d64); + vassert(dres->whatNext == Dis_StopHere); DIP("call 0x%llx\n", (ULong)d64); return delta; @@ -21452,13 +21397,8 @@ Long dis_ESC_NONE ( if (haveF2(pfx)) DIP("bnd ; "); /* MPX bnd prefix. */ d64 = (guest_RIP_bbstart+delta+sz) + getSDisp(sz,delta); delta += sz; - if (resteerOkFn(callback_opaque, (Addr64)d64)) { - dres->whatNext = Dis_ResteerU; - dres->continueAt = d64; - } else { - jmp_lit(dres, Ijk_Boring, d64); - vassert(dres->whatNext == Dis_StopHere); - } + jmp_lit(dres, Ijk_Boring, d64); + vassert(dres->whatNext == Dis_StopHere); DIP("jmp 0x%llx\n", (ULong)d64); return delta; @@ -21469,13 +21409,8 @@ Long dis_ESC_NONE ( if (haveF2(pfx)) DIP("bnd ; "); /* MPX bnd prefix. */ d64 = (guest_RIP_bbstart+delta+1) + getSDisp8(delta); delta++; - if (resteerOkFn(callback_opaque, (Addr64)d64)) { - dres->whatNext = Dis_ResteerU; - dres->continueAt = d64; - } else { - jmp_lit(dres, Ijk_Boring, d64); - vassert(dres->whatNext == Dis_StopHere); - } + jmp_lit(dres, Ijk_Boring, d64); + vassert(dres->whatNext == Dis_StopHere); DIP("jmp-8 0x%llx\n", (ULong)d64); return delta; @@ -21658,9 +21593,6 @@ static Long dis_ESC_0F ( /*MB_OUT*/DisResult* dres, /*MB_OUT*/Bool* expect_CAS, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const VexArchInfo* archinfo, const VexAbiInfo* vbi, Prefix pfx, Int sz, Long deltaIN @@ -21910,56 +21842,10 @@ Long dis_ESC_0F ( jmpDelta = getSDisp32(delta); d64 = (guest_RIP_bbstart+delta+4) + jmpDelta; delta += 4; - if (resteerCisOk - && vex_control.guest_chase_cond - && (Addr64)d64 != (Addr64)guest_RIP_bbstart - && jmpDelta < 0 - && resteerOkFn( callback_opaque, (Addr64)d64) ) { - /* Speculation: assume this backward branch is taken. 
So - we need to emit a side-exit to the insn following this - one, on the negation of the condition, and continue at - the branch target address (d64). If we wind up back at - the first instruction of the trace, just stop; it's - better to let the IR loop unroller handle that case. */ - stmt( IRStmt_Exit( - mk_amd64g_calculate_condition( - (AMD64Condcode)(1 ^ (opc - 0x80))), - Ijk_Boring, - IRConst_U64(guest_RIP_bbstart+delta), - OFFB_RIP - )); - dres->whatNext = Dis_ResteerC; - dres->continueAt = d64; - comment = "(assumed taken)"; - } - else - if (resteerCisOk - && vex_control.guest_chase_cond - && (Addr64)d64 != (Addr64)guest_RIP_bbstart - && jmpDelta >= 0 - && resteerOkFn( callback_opaque, guest_RIP_bbstart+delta ) ) { - /* Speculation: assume this forward branch is not taken. - So we need to emit a side-exit to d64 (the dest) and - continue disassembling at the insn immediately - following this one. */ - stmt( IRStmt_Exit( - mk_amd64g_calculate_condition((AMD64Condcode) - (opc - 0x80)), - Ijk_Boring, - IRConst_U64(d64), - OFFB_RIP - )); - dres->whatNext = Dis_ResteerC; - dres->continueAt = guest_RIP_bbstart+delta; - comment = "(assumed not taken)"; - } - else { - /* Conservative default translation - end the block at - this point. */ - jcc_01( dres, (AMD64Condcode)(opc - 0x80), - guest_RIP_bbstart+delta, d64 ); - vassert(dres->whatNext == Dis_StopHere); - } + /* End the block at this point. 
*/ + jcc_01( dres, (AMD64Condcode)(opc - 0x80), + guest_RIP_bbstart+delta, d64 ); + vassert(dres->whatNext == Dis_StopHere); DIP("j%s-32 0x%llx %s\n", name_AMD64Condcode(opc - 0x80), (ULong)d64, comment); return delta; @@ -22727,9 +22613,6 @@ __attribute__((noinline)) static Long dis_ESC_0F38 ( /*MB_OUT*/DisResult* dres, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const VexArchInfo* archinfo, const VexAbiInfo* vbi, Prefix pfx, Int sz, Long deltaIN @@ -22845,9 +22728,6 @@ __attribute__((noinline)) static Long dis_ESC_0F3A ( /*MB_OUT*/DisResult* dres, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const VexArchInfo* archinfo, const VexAbiInfo* vbi, Prefix pfx, Int sz, Long deltaIN @@ -24187,9 +24067,6 @@ static Long dis_ESC_0F__VEX ( /*MB_OUT*/DisResult* dres, /*OUT*/ Bool* uses_vvvv, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const VexArchInfo* archinfo, const VexAbiInfo* vbi, Prefix pfx, Int sz, Long deltaIN @@ -28158,9 +28035,6 @@ static Long dis_ESC_0F38__VEX ( /*MB_OUT*/DisResult* dres, /*OUT*/ Bool* uses_vvvv, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const VexArchInfo* archinfo, const VexAbiInfo* vbi, Prefix pfx, Int sz, Long deltaIN @@ -30585,9 +30459,6 @@ static Long dis_ESC_0F3A__VEX ( /*MB_OUT*/DisResult* dres, /*OUT*/ Bool* uses_vvvv, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const VexArchInfo* archinfo, const VexAbiInfo* vbi, Prefix pfx, Int sz, Long deltaIN @@ -32206,9 +32077,6 @@ Long dis_ESC_0F3A__VEX ( static DisResult disInstr_AMD64_WRK ( /*OUT*/Bool* expect_CAS, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, Long delta64, const VexArchInfo* archinfo, const VexAbiInfo* vbi, @@ -32241,7 +32109,6 @@ DisResult disInstr_AMD64_WRK ( /* Set result 
defaults. */ dres.whatNext = Dis_Continue; dres.len = 0; - dres.continueAt = 0; dres.jk_StopHere = Ijk_INVALID; dres.hint = Dis_HintNone; *expect_CAS = False; @@ -32503,22 +32370,18 @@ DisResult disInstr_AMD64_WRK ( switch (esc) { case ESC_NONE: delta = dis_ESC_NONE( &dres, expect_CAS, - resteerOkFn, resteerCisOk, callback_opaque, archinfo, vbi, pfx, sz, delta ); break; case ESC_0F: delta = dis_ESC_0F ( &dres, expect_CAS, - resteerOkFn, resteerCisOk, callback_opaque, archinfo, vbi, pfx, sz, delta ); break; case ESC_0F38: delta = dis_ESC_0F38( &dres, - resteerOkFn, resteerCisOk, callback_opaque, archinfo, vbi, pfx, sz, delta ); break; case ESC_0F3A: delta = dis_ESC_0F3A( &dres, - resteerOkFn, resteerCisOk, callback_opaque, archinfo, vbi, pfx, sz, delta ); break; default: @@ -32533,20 +32396,14 @@ DisResult disInstr_AMD64_WRK ( switch (esc) { case ESC_0F: delta = dis_ESC_0F__VEX ( &dres, &uses_vvvv, - resteerOkFn, resteerCisOk, - callback_opaque, archinfo, vbi, pfx, sz, delta ); break; case ESC_0F38: delta = dis_ESC_0F38__VEX ( &dres, &uses_vvvv, - resteerOkFn, resteerCisOk, - callback_opaque, archinfo, vbi, pfx, sz, delta ); break; case ESC_0F3A: delta = dis_ESC_0F3A__VEX ( &dres, &uses_vvvv, - resteerOkFn, resteerCisOk, - callback_opaque, archinfo, vbi, pfx, sz, delta ); break; case ESC_NONE: @@ -32630,10 +32487,6 @@ DisResult disInstr_AMD64_WRK ( case Dis_Continue: stmt( IRStmt_Put( OFFB_RIP, mkU64(guest_RIP_bbstart + delta) ) ); break; - case Dis_ResteerU: - case Dis_ResteerC: - stmt( IRStmt_Put( OFFB_RIP, mkU64(dres.continueAt) ) ); - break; case Dis_StopHere: break; default: @@ -32657,9 +32510,6 @@ DisResult disInstr_AMD64_WRK ( is located in host memory at &guest_code[delta]. 
*/ DisResult disInstr_AMD64 ( IRSB* irsb_IN, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code_IN, Long delta, Addr guest_IP, @@ -32687,9 +32537,7 @@ DisResult disInstr_AMD64 ( IRSB* irsb_IN, x1 = irsb_IN->stmts_used; expect_CAS = False; - dres = disInstr_AMD64_WRK ( &expect_CAS, resteerOkFn, - resteerCisOk, - callback_opaque, + dres = disInstr_AMD64_WRK ( &expect_CAS, delta, archinfo, abiinfo, sigill_diag_IN ); x2 = irsb_IN->stmts_used; vassert(x2 >= x1); @@ -32720,9 +32568,7 @@ DisResult disInstr_AMD64 ( IRSB* irsb_IN, /* inconsistency detected. re-disassemble the instruction so as to generate a useful error message; then assert. */ vex_traceflags |= VEX_TRACE_FE; - dres = disInstr_AMD64_WRK ( &expect_CAS, resteerOkFn, - resteerCisOk, - callback_opaque, + dres = disInstr_AMD64_WRK ( &expect_CAS, delta, archinfo, abiinfo, sigill_diag_IN ); for (i = x1; i < x2; i++) { vex_printf("\t\t"); diff --git a/VEX/priv/guest_arm64_defs.h b/VEX/priv/guest_arm64_defs.h index 319d6010b..b2094d61f 100644 --- a/VEX/priv/guest_arm64_defs.h +++ b/VEX/priv/guest_arm64_defs.h @@ -39,9 +39,6 @@ guest_generic_bb_to_IR.h. 
*/ extern DisResult disInstr_ARM64 ( IRSB* irbb, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code, Long delta, Addr guest_IP, diff --git a/VEX/priv/guest_arm64_toIR.c b/VEX/priv/guest_arm64_toIR.c index 2589ddfb5..6eb896cdb 100644 --- a/VEX/priv/guest_arm64_toIR.c +++ b/VEX/priv/guest_arm64_toIR.c @@ -6885,7 +6885,6 @@ Bool dis_ARM64_branch_etc(/*MB_OUT*/DisResult* dres, UInt insn, Long simm64 = (Long)sx_to_64(uimm64, 21); vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 4); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); stmt( IRStmt_Exit(unop(Iop_64to1, mk_arm64g_calculate_condition(cond)), Ijk_Boring, @@ -14797,9 +14796,6 @@ Bool dis_ARM64_simd_and_fp(/*MB_OUT*/DisResult* dres, UInt insn) static Bool disInstr_ARM64_WRK ( /*MB_OUT*/DisResult* dres, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_instr, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo @@ -14823,7 +14819,6 @@ Bool disInstr_ARM64_WRK ( /* Set result defaults. */ dres->whatNext = Dis_Continue; dres->len = 4; - dres->continueAt = 0; dres->jk_StopHere = Ijk_INVALID; dres->hint = Dis_HintNone; @@ -14959,7 +14954,6 @@ Bool disInstr_ARM64_WRK ( if (!ok) { vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 4); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); } @@ -14977,9 +14971,6 @@ Bool disInstr_ARM64_WRK ( is located in host memory at &guest_code[delta]. 
*/ DisResult disInstr_ARM64 ( IRSB* irsb_IN, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code_IN, Long delta_IN, Addr guest_IP, @@ -15006,7 +14997,6 @@ DisResult disInstr_ARM64 ( IRSB* irsb_IN, /* Try to decode */ Bool ok = disInstr_ARM64_WRK( &dres, - resteerOkFn, resteerCisOk, callback_opaque, &guest_code_IN[delta_IN], archinfo, abiinfo ); if (ok) { @@ -15016,10 +15006,6 @@ DisResult disInstr_ARM64 ( IRSB* irsb_IN, case Dis_Continue: putPC( mkU64(dres.len + guest_PC_curr_instr) ); break; - case Dis_ResteerU: - case Dis_ResteerC: - putPC(mkU64(dres.continueAt)); - break; case Dis_StopHere: break; default: @@ -15054,7 +15040,6 @@ DisResult disInstr_ARM64 ( IRSB* irsb_IN, dres.len = 0; dres.whatNext = Dis_StopHere; dres.jk_StopHere = Ijk_NoDecode; - dres.continueAt = 0; } return dres; } diff --git a/VEX/priv/guest_arm_defs.h b/VEX/priv/guest_arm_defs.h index 58bbbd086..85521e770 100644 --- a/VEX/priv/guest_arm_defs.h +++ b/VEX/priv/guest_arm_defs.h @@ -41,9 +41,6 @@ geust_generic_ bb_to_IR.h. */ extern DisResult disInstr_ARM ( IRSB* irbb, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code, Long delta, Addr guest_IP, diff --git a/VEX/priv/guest_arm_toIR.c b/VEX/priv/guest_arm_toIR.c index 50c97e929..6027d477e 100644 --- a/VEX/priv/guest_arm_toIR.c +++ b/VEX/priv/guest_arm_toIR.c @@ -16077,9 +16077,6 @@ static Bool decode_NV_instruction_ARMv7_and_below static DisResult disInstr_ARM_WRK ( - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_instr, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, @@ -16099,7 +16096,6 @@ DisResult disInstr_ARM_WRK ( /* Set result defaults. 
*/ dres.whatNext = Dis_Continue; dres.len = 4; - dres.continueAt = 0; dres.jk_StopHere = Ijk_INVALID; dres.hint = Dis_HintNone; @@ -17034,75 +17030,19 @@ DisResult disInstr_ARM_WRK ( condT, Ijk_Boring); } if (condT == IRTemp_INVALID) { - /* unconditional transfer to 'dst'. See if we can simply - continue tracing at the destination. */ - if (resteerOkFn( callback_opaque, dst )) { - /* yes */ - dres.whatNext = Dis_ResteerU; - dres.continueAt = dst; - } else { - /* no; terminate the SB at this point. */ - llPutIReg(15, mkU32(dst)); - dres.jk_StopHere = jk; - dres.whatNext = Dis_StopHere; - } + /* Unconditional transfer to 'dst'. Terminate the SB at this point. */ + llPutIReg(15, mkU32(dst)); + dres.jk_StopHere = jk; + dres.whatNext = Dis_StopHere; DIP("b%s 0x%x\n", link ? "l" : "", dst); } else { - /* conditional transfer to 'dst' */ - const HChar* comment = ""; - - /* First see if we can do some speculative chasing into one - arm or the other. Be conservative and only chase if - !link, that is, this is a normal conditional branch to a - known destination. */ - if (!link - && resteerCisOk - && vex_control.guest_chase_cond - && dst < guest_R15_curr_instr_notENC - && resteerOkFn( callback_opaque, dst) ) { - /* Speculation: assume this backward branch is taken. So - we need to emit a side-exit to the insn following this - one, on the negation of the condition, and continue at - the branch target address (dst). */ - stmt( IRStmt_Exit( unop(Iop_Not1, - unop(Iop_32to1, mkexpr(condT))), - Ijk_Boring, - IRConst_U32(guest_R15_curr_instr_notENC+4), - OFFB_R15T )); - dres.whatNext = Dis_ResteerC; - dres.continueAt = (Addr32)dst; - comment = "(assumed taken)"; - } - else - if (!link - && resteerCisOk - && vex_control.guest_chase_cond - && dst >= guest_R15_curr_instr_notENC - && resteerOkFn( callback_opaque, - guest_R15_curr_instr_notENC+4) ) { - /* Speculation: assume this forward branch is not taken. 
- So we need to emit a side-exit to dst (the dest) and - continue disassembling at the insn immediately - following this one. */ - stmt( IRStmt_Exit( unop(Iop_32to1, mkexpr(condT)), - Ijk_Boring, - IRConst_U32(dst), - OFFB_R15T )); - dres.whatNext = Dis_ResteerC; - dres.continueAt = guest_R15_curr_instr_notENC+4; - comment = "(assumed not taken)"; - } - else { - /* Conservative default translation - end the block at - this point. */ - stmt( IRStmt_Exit( unop(Iop_32to1, mkexpr(condT)), - jk, IRConst_U32(dst), OFFB_R15T )); - llPutIReg(15, mkU32(guest_R15_curr_instr_notENC + 4)); - dres.jk_StopHere = Ijk_Boring; - dres.whatNext = Dis_StopHere; - } - DIP("b%s%s 0x%x %s\n", link ? "l" : "", nCC(INSN_COND), - dst, comment); + /* Conditional transfer to 'dst'. Terminate the SB at this point. */ + stmt( IRStmt_Exit( unop(Iop_32to1, mkexpr(condT)), + jk, IRConst_U32(dst), OFFB_R15T )); + llPutIReg(15, mkU32(guest_R15_curr_instr_notENC + 4)); + dres.jk_StopHere = Ijk_Boring; + dres.whatNext = Dis_StopHere; + DIP("b%s%s 0x%x\n", link ? "l" : "", nCC(INSN_COND), dst); } goto decode_success; } @@ -18896,7 +18836,6 @@ DisResult disInstr_ARM_WRK ( dres.len = 0; dres.whatNext = Dis_StopHere; dres.jk_StopHere = Ijk_NoDecode; - dres.continueAt = 0; return dres; decode_success: @@ -18953,10 +18892,6 @@ DisResult disInstr_ARM_WRK ( case Dis_Continue: llPutIReg(15, mkU32(dres.len + guest_R15_curr_instr_notENC)); break; - case Dis_ResteerU: - case Dis_ResteerC: - llPutIReg(15, mkU32(dres.continueAt)); - break; case Dis_StopHere: break; default: @@ -18989,9 +18924,6 @@ static const UChar it_length_table[256]; /* fwds */ static DisResult disInstr_THUMB_WRK ( - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_instr, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, @@ -19016,7 +18948,6 @@ DisResult disInstr_THUMB_WRK ( /* Set result defaults. 
*/ dres.whatNext = Dis_Continue; dres.len = 2; - dres.continueAt = 0; dres.jk_StopHere = Ijk_INVALID; dres.hint = Dis_HintNone; @@ -20761,7 +20692,6 @@ DisResult disInstr_THUMB_WRK ( /* Change result defaults to suit 32-bit insns. */ vassert(dres.whatNext == Dis_Continue); vassert(dres.len == 2); - vassert(dres.continueAt == 0); dres.len = 4; /* ---------------- BL/BLX simm26 ---------------- */ @@ -23531,7 +23461,6 @@ DisResult disInstr_THUMB_WRK ( dres.len = 0; dres.whatNext = Dis_StopHere; dres.jk_StopHere = Ijk_NoDecode; - dres.continueAt = 0; return dres; decode_success: @@ -23541,10 +23470,6 @@ DisResult disInstr_THUMB_WRK ( case Dis_Continue: llPutIReg(15, mkU32(dres.len + (guest_R15_curr_instr_notENC | 1))); break; - case Dis_ResteerU: - case Dis_ResteerC: - llPutIReg(15, mkU32(dres.continueAt)); - break; case Dis_StopHere: break; default: @@ -23650,9 +23575,6 @@ static const UChar it_length_table[256] is located in host memory at &guest_code[delta]. */ DisResult disInstr_ARM ( IRSB* irsb_IN, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code_IN, Long delta_ENCODED, Addr guest_IP_ENCODED, @@ -23679,14 +23601,10 @@ DisResult disInstr_ARM ( IRSB* irsb_IN, } if (isThumb) { - dres = disInstr_THUMB_WRK ( resteerOkFn, - resteerCisOk, callback_opaque, - &guest_code_IN[delta_ENCODED - 1], + dres = disInstr_THUMB_WRK ( &guest_code_IN[delta_ENCODED - 1], archinfo, abiinfo, sigill_diag_IN ); } else { - dres = disInstr_ARM_WRK ( resteerOkFn, - resteerCisOk, callback_opaque, - &guest_code_IN[delta_ENCODED], + dres = disInstr_ARM_WRK ( &guest_code_IN[delta_ENCODED], archinfo, abiinfo, sigill_diag_IN ); } diff --git a/VEX/priv/guest_generic_bb_to_IR.c b/VEX/priv/guest_generic_bb_to_IR.c index 959678789..4cb813f2a 100644 --- a/VEX/priv/guest_generic_bb_to_IR.c +++ b/VEX/priv/guest_generic_bb_to_IR.c @@ -37,290 +37,738 @@ #include "main_util.h" #include "main_globals.h" #include "guest_generic_bb_to_IR.h" +#include 
"ir_opt.h" +/*--------------------------------------------------------------*/ +/*--- Forwards for fns called by self-checking translations ---*/ +/*--------------------------------------------------------------*/ + /* Forwards .. */ -VEX_REGPARM(2) -static UInt genericg_compute_checksum_4al ( HWord first_w32, HWord n_w32s ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_1 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_2 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_3 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_4 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_5 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_6 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_7 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_8 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_9 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_10 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_11 ( HWord first_w32 ); -VEX_REGPARM(1) -static UInt genericg_compute_checksum_4al_12 ( HWord first_w32 ); +VEX_REGPARM(2) static UInt genericg_compute_checksum_4al ( HWord first_w32, + HWord n_w32s ); +VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_1 ( HWord first_w32 ); +VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_2 ( HWord first_w32 ); +VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_3 ( HWord first_w32 ); +VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_4 ( HWord first_w32 ); +VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_5 ( HWord first_w32 ); +VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_6 ( HWord first_w32 ); +VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_7 ( HWord first_w32 ); +VEX_REGPARM(1) static 
 UInt genericg_compute_checksum_4al_8 ( HWord first_w32 );
+VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_9 ( HWord first_w32 );
+VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_10 ( HWord first_w32 );
+VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_11 ( HWord first_w32 );
+VEX_REGPARM(1) static UInt genericg_compute_checksum_4al_12 ( HWord first_w32 );
+
+VEX_REGPARM(2) static ULong genericg_compute_checksum_8al ( HWord first_w64,
+                                                            HWord n_w64s );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_1 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_2 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_3 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_4 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_5 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_6 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_7 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_8 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_9 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_10 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_11 ( HWord first_w64 );
+VEX_REGPARM(1) static ULong genericg_compute_checksum_8al_12 ( HWord first_w64 );
+
+
+/*--------------------------------------------------------------*/
+/*--- Creation of self-check IR                              ---*/
+/*--------------------------------------------------------------*/
+
+static void create_self_checks_as_needed(
+           /*MOD*/ IRSB*                irsb,
+           /*OUT*/ UInt*                n_sc_extents,
+           /*MOD*/ VexRegisterUpdates*  pxControl,
+           /*MOD*/ void*                callback_opaque,
+           /*IN*/  UInt (*needs_self_check)
+                        (void*, /*MB_MOD*/VexRegisterUpdates*,
+                                const VexGuestExtents*),
+           const   VexGuestExtents*     vge,
+           const   VexAbiInfo*          abiinfo_both,
+           const   IRType               guest_word_type,
+           const   Int                  selfcheck_idx,
+           /*IN*/  Int                  offB_GUEST_CMSTART,
+           /*IN*/  Int                  offB_GUEST_CMLEN,
+           /*IN*/  Int                  offB_GUEST_IP,
+           const   Addr                 guest_IP_sbstart
+        )
+{
+   /* The scheme is to compute a rather crude checksum of the code
+      we're making a translation of, and add to the IR a call to a
+      helper routine which recomputes the checksum every time the
+      translation is run, and requests a retranslation if it doesn't
+      match.  This is obviously very expensive and considerable
+      efforts are made to speed it up:
-VEX_REGPARM(2)
-static ULong genericg_compute_checksum_8al ( HWord first_w64, HWord n_w64s );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_1 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_2 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_3 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_4 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_5 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_6 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_7 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_8 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_9 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_10 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_11 ( HWord first_w64 );
-VEX_REGPARM(1)
-static ULong genericg_compute_checksum_8al_12 ( HWord first_w64 );
+
+   * the checksum is computed from all the naturally aligned
+     host-sized words that overlap the translated code.  That means
+     it could depend on up to 7 bytes before and 7 bytes after
+     which aren't part of the translated area, and so if those
+     change then we'll unnecessarily have to discard and
+     retranslate.  This seems like a pretty remote possibility and
+     it seems as if the benefit of not having to deal with the ends
+     of the range at byte precision far outweighs any possible extra
+     translations needed.
-
-/* Small helpers */
-static Bool const_False ( void* callback_opaque, Addr a ) {
-   return False;
-}
+   * there's a generic routine and 12 specialised cases, which
+     handle the cases of 1 through 12-word lengths respectively.
+     They seem to cover about 90% of the cases that occur in
+     practice.
-/* Disassemble a complete basic block, starting at guest_IP_start,
-   returning a new IRSB.  The disassembler may chase across basic
-   block boundaries if it wishes and if chase_into_ok allows it.
-   The precise guest address ranges from which code has been taken
-   are written into vge.  guest_IP_bbstart is taken to be the IP in
-   the guest's address space corresponding to the instruction at
-   &guest_code[0].
+   We ask the caller, via needs_self_check, which of the 3 vge
+   extents needs a check, and only generate check code for those
+   that do.
+   */
+   {
+      Addr     base2check;
+      UInt     len2check;
+      HWord    expectedhW;
+      IRTemp   tistart_tmp, tilen_tmp;
+      HWord VEX_REGPARM(2) (*fn_generic)(HWord, HWord);
+      HWord VEX_REGPARM(1) (*fn_spec)(HWord);
+      const HChar* nm_generic;
+      const HChar* nm_spec;
+      HWord    fn_generic_entry = 0;
+      HWord    fn_spec_entry = 0;
+      UInt     host_word_szB = sizeof(HWord);
+      IRType   host_word_type = Ity_INVALID;
-   dis_instr_fn is the arch-specific fn to disassemble on function; it
-   is this that does the real work.
+      UInt extents_needing_check
+         = needs_self_check(callback_opaque, pxControl, vge);
-   needs_self_check is a callback used to ask the caller which of the
-   extents, if any, a self check is required for.  The returned value
-   is a bitmask with a 1 in position i indicating that the i'th extent
-   needs a check.  Since there can be at most 3 extents, the returned
-   values must be between 0 and 7.
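As an aside, the alignment arithmetic sketched in the comment above (checksum over all naturally aligned host-sized words that overlap the translated range) can be modelled standalone. This is an illustrative sketch only: `words_overlapping` mirrors the `first_hW`/`last_hW`/`hWs_to_check` computation below, while `crude_checksum` is a hypothetical stand-in for the actual `genericg_compute_checksum_*` routines, whose mixing function is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

typedef uintptr_t HWord;

/* How many naturally-aligned host-sized words overlap the byte range
   [base2check, base2check + len2check)?  Mirrors the first_hW / last_hW /
   hWs_to_check arithmetic in create_self_checks_as_needed. */
static size_t words_overlapping ( HWord base2check, unsigned len2check,
                                  unsigned host_word_szB )
{
   assert(len2check > 0);
   HWord first_hW = base2check & ~(HWord)(host_word_szB - 1);
   HWord last_hW  = (base2check + len2check - 1) & ~(HWord)(host_word_szB - 1);
   return (size_t)((last_hW - first_hW) / host_word_szB) + 1;
}

/* Hypothetical stand-in for the generic checksum routine: a crude
   rotate-and-add over the chosen words (the real mixing step differs). */
static HWord crude_checksum ( const HWord* first_w, size_t n_ws )
{
   HWord sum = 0;
   for (size_t i = 0; i < n_ws; i++)
      sum = ((sum << 1) | (sum >> (8 * sizeof(HWord) - 1))) + first_w[i];
   return sum;
}
```

Note how a 2-byte range that straddles a word boundary pulls in two words, which is exactly why the checksum can depend on up to `host_word_szB - 1` bytes either side of the translated area.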
+      if (host_word_szB == 4) host_word_type = Ity_I32;
+      if (host_word_szB == 8) host_word_type = Ity_I64;
+      vassert(host_word_type != Ity_INVALID);
-   The number of extents which did get a self check (0 to 3) is put in
-   n_sc_extents.  The caller already knows this because it told us
-   which extents to add checks for, via the needs_self_check callback,
-   but we ship the number back out here for the caller's convenience.
+      vassert(vge->n_used >= 1 && vge->n_used <= 3);
-   preamble_function is a callback which allows the caller to add
-   its own IR preamble (following the self-check, if any).  May be
-   NULL.  If non-NULL, the IRSB under construction is handed to
-   this function, which presumably adds IR statements to it.  The
-   callback may optionally complete the block and direct bb_to_IR
-   not to disassemble any instructions into it; this is indicated
-   by the callback returning True.
+      /* Caller shouldn't claim that nonexistent extents need a
+         check. */
+      vassert((extents_needing_check >> vge->n_used) == 0);
-   offB_CMADDR and offB_CMLEN are the offsets of guest_CMADDR and
-   guest_CMLEN.  Since this routine has to work for any guest state,
-   without knowing what it is, those offsets have to passed in.
+      /* Guest addresses as IRConsts.  Used in self-checks to specify the
+         restart-after-discard point. */
+      IRConst* guest_IP_sbstart_IRConst
+         = guest_word_type==Ity_I32
+              ? IRConst_U32(toUInt(guest_IP_sbstart))
+              : IRConst_U64(guest_IP_sbstart);
-   callback_opaque is a caller-supplied pointer to data which the
-   callbacks may want to see.  Vex has no idea what it is.
-   (In fact it's a VgInstrumentClosure.)
-*/
+      const Int n_extent_slots = sizeof(vge->base) / sizeof(vge->base[0]);
+      vassert(n_extent_slots == 3);
-/* Regarding IP updating.  dis_instr_fn (that does the guest specific
-   work of disassembling an individual instruction) must finish the
-   resulting IR with "PUT(guest_IP) = ".  Hence in all cases it must
-   state the next instruction address.
+      vassert(selfcheck_idx + (n_extent_slots - 1) * 5 + 4 < irsb->stmts_used);
-   If the block is to be ended at that point, then this routine
-   (bb_to_IR) will set up the next/jumpkind/offsIP fields so as to
-   make a transfer (of the right kind) to "GET(guest_IP)".  Hence if
-   dis_instr_fn generates incorrect IP updates we will see it
-   immediately (due to jumping to the wrong next guest address).
+      for (Int i = 0; i < vge->n_used; i++) {
+         /* Do we need to generate a check for this extent? */
+         if ((extents_needing_check & (1 << i)) == 0)
+            continue;
-   However it is also necessary to set this up so it can be optimised
-   nicely.  The IRSB exit is defined to update the guest IP, so that
-   chaining works -- since the chain_me stubs expect the chain-to
-   address to be in the guest state.  Hence what the IRSB next fields
-   will contain initially is (implicitly)
+         /* Tell the caller */
+         (*n_sc_extents)++;
-      PUT(guest_IP) [implicitly] = GET(guest_IP) [explicit expr on ::next]
+         /* the extent we're generating a check for */
+         base2check = vge->base[i];
+         len2check  = vge->len[i];
+
+         /* stay sane */
+         vassert(len2check >= 0 && len2check < 2000/*arbitrary*/);
+
+         /* Skip the check if the translation involved zero bytes */
+         if (len2check == 0)
+            continue;
+
+         HWord first_hW = ((HWord)base2check)
+                          & ~(HWord)(host_word_szB-1);
+         HWord last_hW  = (((HWord)base2check) + len2check - 1)
+                          & ~(HWord)(host_word_szB-1);
+         vassert(first_hW <= last_hW);
+         HWord hW_diff = last_hW - first_hW;
+         vassert(0 == (hW_diff & (host_word_szB-1)));
+         HWord hWs_to_check = (hW_diff + host_word_szB) / host_word_szB;
+         vassert(hWs_to_check > 0
+                 && hWs_to_check < 2004/*arbitrary*/ / host_word_szB);
+
+         /* vex_printf("%lx %lx %ld\n", first_hW, last_hW, hWs_to_check); */
+
+         if (host_word_szB == 8) {
+            fn_generic = (VEX_REGPARM(2) HWord(*)(HWord, HWord))
+                         genericg_compute_checksum_8al;
+            nm_generic = "genericg_compute_checksum_8al";
+         } else {
+            fn_generic = (VEX_REGPARM(2) HWord(*)(HWord, HWord))
+                         genericg_compute_checksum_4al;
+            nm_generic = "genericg_compute_checksum_4al";
+         }
+
+         fn_spec = NULL;
+         nm_spec = NULL;
+
+         if (host_word_szB == 8) {
+            const HChar* nm = NULL;
+            ULong VEX_REGPARM(1) (*fn)(HWord) = NULL;
+            switch (hWs_to_check) {
+               case 1:  fn = genericg_compute_checksum_8al_1;
+                        nm = "genericg_compute_checksum_8al_1"; break;
+               case 2:  fn = genericg_compute_checksum_8al_2;
+                        nm = "genericg_compute_checksum_8al_2"; break;
+               case 3:  fn = genericg_compute_checksum_8al_3;
+                        nm = "genericg_compute_checksum_8al_3"; break;
+               case 4:  fn = genericg_compute_checksum_8al_4;
+                        nm = "genericg_compute_checksum_8al_4"; break;
+               case 5:  fn = genericg_compute_checksum_8al_5;
+                        nm = "genericg_compute_checksum_8al_5"; break;
+               case 6:  fn = genericg_compute_checksum_8al_6;
+                        nm = "genericg_compute_checksum_8al_6"; break;
+               case 7:  fn = genericg_compute_checksum_8al_7;
+                        nm = "genericg_compute_checksum_8al_7"; break;
+               case 8:  fn = genericg_compute_checksum_8al_8;
+                        nm = "genericg_compute_checksum_8al_8"; break;
+               case 9:  fn = genericg_compute_checksum_8al_9;
+                        nm = "genericg_compute_checksum_8al_9"; break;
+               case 10: fn = genericg_compute_checksum_8al_10;
+                        nm = "genericg_compute_checksum_8al_10"; break;
+               case 11: fn = genericg_compute_checksum_8al_11;
+                        nm = "genericg_compute_checksum_8al_11"; break;
+               case 12: fn = genericg_compute_checksum_8al_12;
+                        nm = "genericg_compute_checksum_8al_12"; break;
+               default: break;
+            }
+            fn_spec = (VEX_REGPARM(1) HWord(*)(HWord)) fn;
+            nm_spec = nm;
+         } else {
+            const HChar* nm = NULL;
+            UInt VEX_REGPARM(1) (*fn)(HWord) = NULL;
+            switch (hWs_to_check) {
+               case 1:  fn = genericg_compute_checksum_4al_1;
+                        nm = "genericg_compute_checksum_4al_1"; break;
+               case 2:  fn = genericg_compute_checksum_4al_2;
+                        nm = "genericg_compute_checksum_4al_2"; break;
+               case 3:  fn = genericg_compute_checksum_4al_3;
+                        nm = "genericg_compute_checksum_4al_3"; break;
+               case 4:  fn = genericg_compute_checksum_4al_4;
+                        nm = "genericg_compute_checksum_4al_4"; break;
+               case 5:  fn = genericg_compute_checksum_4al_5;
+                        nm = "genericg_compute_checksum_4al_5"; break;
+               case 6:  fn = genericg_compute_checksum_4al_6;
+                        nm = "genericg_compute_checksum_4al_6"; break;
+               case 7:  fn = genericg_compute_checksum_4al_7;
+                        nm = "genericg_compute_checksum_4al_7"; break;
+               case 8:  fn = genericg_compute_checksum_4al_8;
+                        nm = "genericg_compute_checksum_4al_8"; break;
+               case 9:  fn = genericg_compute_checksum_4al_9;
+                        nm = "genericg_compute_checksum_4al_9"; break;
+               case 10: fn = genericg_compute_checksum_4al_10;
+                        nm = "genericg_compute_checksum_4al_10"; break;
+               case 11: fn = genericg_compute_checksum_4al_11;
+                        nm = "genericg_compute_checksum_4al_11"; break;
+               case 12: fn = genericg_compute_checksum_4al_12;
+                        nm = "genericg_compute_checksum_4al_12"; break;
+               default: break;
+            }
+            fn_spec = (VEX_REGPARM(1) HWord(*)(HWord))fn;
+            nm_spec = nm;
+         }
+
+         expectedhW = fn_generic( first_hW, hWs_to_check );
+         /* If we got a specialised version, check it produces the same
+            result as the generic version! */
+         if (fn_spec) {
+            vassert(nm_spec);
+            vassert(expectedhW == fn_spec( first_hW ));
+         } else {
+            vassert(!nm_spec);
+         }
+
+         /* Set CMSTART and CMLEN.  These will describe to the despatcher
+            the area of guest code to invalidate should we exit with a
+            self-check failure. */
+         tistart_tmp = newIRTemp(irsb->tyenv, guest_word_type);
+         tilen_tmp   = newIRTemp(irsb->tyenv, guest_word_type);
+
+         IRConst* base2check_IRConst
+            = guest_word_type==Ity_I32 ? IRConst_U32(toUInt(base2check))
+                                       : IRConst_U64(base2check);
+         IRConst* len2check_IRConst
+            = guest_word_type==Ity_I32 ? IRConst_U32(len2check)
+                                       : IRConst_U64(len2check);
+
+         IRStmt** stmt0 = &irsb->stmts[selfcheck_idx + i * 5 + 0];
+         IRStmt** stmt1 = &irsb->stmts[selfcheck_idx + i * 5 + 1];
+         IRStmt** stmt2 = &irsb->stmts[selfcheck_idx + i * 5 + 2];
+         IRStmt** stmt3 = &irsb->stmts[selfcheck_idx + i * 5 + 3];
+         IRStmt** stmt4 = &irsb->stmts[selfcheck_idx + i * 5 + 4];
+         vassert((*stmt0)->tag == Ist_NoOp);
+         vassert((*stmt1)->tag == Ist_NoOp);
+         vassert((*stmt2)->tag == Ist_NoOp);
+         vassert((*stmt3)->tag == Ist_NoOp);
+         vassert((*stmt4)->tag == Ist_NoOp);
+
+         *stmt0 = IRStmt_WrTmp(tistart_tmp, IRExpr_Const(base2check_IRConst) );
+         *stmt1 = IRStmt_WrTmp(tilen_tmp, IRExpr_Const(len2check_IRConst) );
+         *stmt2 = IRStmt_Put( offB_GUEST_CMSTART, IRExpr_RdTmp(tistart_tmp) );
+         *stmt3 = IRStmt_Put( offB_GUEST_CMLEN, IRExpr_RdTmp(tilen_tmp) );
+
+         /* Generate the entry point descriptors */
+         if (abiinfo_both->host_ppc_calls_use_fndescrs) {
+            HWord* descr = (HWord*)fn_generic;
+            fn_generic_entry = descr[0];
+            if (fn_spec) {
+               descr = (HWord*)fn_spec;
+               fn_spec_entry = descr[0];
+            } else {
+               fn_spec_entry = (HWord)NULL;
+            }
+         } else {
+            fn_generic_entry = (HWord)fn_generic;
+            if (fn_spec) {
+               fn_spec_entry = (HWord)fn_spec;
+            } else {
+               fn_spec_entry = (HWord)NULL;
+            }
+         }
+
+         IRExpr* callexpr = NULL;
+         if (fn_spec) {
+            callexpr = mkIRExprCCall(
+                          host_word_type, 1/*regparms*/,
+                          nm_spec, (void*)fn_spec_entry,
+                          mkIRExprVec_1(
+                             mkIRExpr_HWord( (HWord)first_hW )
+                          )
+                       );
+         } else {
+            callexpr = mkIRExprCCall(
+                          host_word_type, 2/*regparms*/,
+                          nm_generic, (void*)fn_generic_entry,
+                          mkIRExprVec_2(
+                             mkIRExpr_HWord( (HWord)first_hW ),
+                             mkIRExpr_HWord( (HWord)hWs_to_check )
+                          )
+                       );
+         }
+
+         *stmt4
+            = IRStmt_Exit(
+                 IRExpr_Binop(
+                    host_word_type==Ity_I64 ? Iop_CmpNE64 : Iop_CmpNE32,
+                    callexpr,
+                    host_word_type==Ity_I64
+                       ? IRExpr_Const(IRConst_U64(expectedhW))
+                       : IRExpr_Const(IRConst_U32(expectedhW))
+                 ),
+                 Ijk_InvalICache,
+                 /* Where we must restart if there's a failure: at the
+                    first extent, regardless of which extent the
+                    failure actually happened in. */
+                 guest_IP_sbstart_IRConst,
+                 offB_GUEST_IP
+              );
+      } /* for (i = 0; i < vge->n_used; i++) */
+
+      for (Int i = vge->n_used;
+           i < sizeof(vge->base) / sizeof(vge->base[0]); i++) {
+         IRStmt* stmt0 = irsb->stmts[selfcheck_idx + i * 5 + 0];
+         IRStmt* stmt1 = irsb->stmts[selfcheck_idx + i * 5 + 1];
+         IRStmt* stmt2 = irsb->stmts[selfcheck_idx + i * 5 + 2];
+         IRStmt* stmt3 = irsb->stmts[selfcheck_idx + i * 5 + 3];
+         IRStmt* stmt4 = irsb->stmts[selfcheck_idx + i * 5 + 4];
+         vassert(stmt0->tag == Ist_NoOp);
+         vassert(stmt1->tag == Ist_NoOp);
+         vassert(stmt2->tag == Ist_NoOp);
+         vassert(stmt3->tag == Ist_NoOp);
+         vassert(stmt4->tag == Ist_NoOp);
+      }
+   }
+}
+
+
+/*--------------------------------------------------------------*/
+/*--- To do with speculation of IRStmts                      ---*/
+/*--------------------------------------------------------------*/
+
+static Bool expr_is_speculatable ( const IRExpr* e )
+{
+   switch (e->tag) {
+      case Iex_Load:
+         return False;
+      case Iex_Unop:   // FIXME BOGUS, since it might trap
+      case Iex_Binop:  // FIXME ditto
+      case Iex_ITE:    // this is OK
+         return True;
+      case Iex_CCall:
+         return True;  // This is probably correct
+      case Iex_Get:
+         return True;
+      default:
+         vex_printf("\n"); ppIRExpr(e);
+         vpanic("expr_is_speculatable: unhandled expr");
+   }
+}
+
+static Bool stmt_is_speculatable ( const IRStmt* st )
+{
+   switch (st->tag) {
+      case Ist_IMark:
+      case Ist_Put:
+         return True;
+      case Ist_Store:  // definitely not
+      case Ist_CAS:    // definitely not
+      case Ist_Exit:   // We could in fact spec this, if required
+         return False;
+      case Ist_WrTmp:
+         return expr_is_speculatable(st->Ist.WrTmp.data);
+      default:
+         vex_printf("\n"); ppIRStmt(st);
+         vpanic("stmt_is_speculatable: unhandled stmt");
+   }
+}
+
+static Bool block_is_speculatable ( const IRSB* bb )
+{
+   Int i = bb->stmts_used;
+   vassert(i >= 2);  // Must have at least: IMark, final Exit
+   i--;
+   vassert(bb->stmts[i]->tag == Ist_Exit);
+   i--;
+   for (; i >= 0; i--) {
+      if (!stmt_is_speculatable(bb->stmts[i]))
+         return False;
+   }
+   return True;
+}
+
+static void speculate_stmt_to_end_of ( /*MOD*/IRSB* bb,
+                                       /*IN*/ IRStmt* st, IRTemp guard )
+{
+   // We assume all stmts we're presented with here have previously been OK'd
+   // by stmt_is_speculatable above.
+   switch (st->tag) {
+      case Ist_IMark:
+      case Ist_WrTmp:  // FIXME is this ok?
+         addStmtToIRSB(bb, st);
+         break;
+      case Ist_Put: {
+         // Put(offs, e) ==> Put(offs, ITE(guard, e, Get(offs, sizeof(e))))
+         // Which when flattened out is:
+         //   t1 = Get(offs, sizeof(e))
+         //   t2 = ITE(guard, e, t1)
+         //   Put(offs, t2)
+         Int offset = st->Ist.Put.offset;
+         IRExpr* e = st->Ist.Put.data;
+         IRType ty = typeOfIRExpr(bb->tyenv, e);
+         IRTemp t1 = newIRTemp(bb->tyenv, ty);
+         IRTemp t2 = newIRTemp(bb->tyenv, ty);
+         addStmtToIRSB(bb, IRStmt_WrTmp(t1, IRExpr_Get(offset, ty)));
+         addStmtToIRSB(bb, IRStmt_WrTmp(t2, IRExpr_ITE(IRExpr_RdTmp(guard),
+                                                       e, IRExpr_RdTmp(t1))));
+         addStmtToIRSB(bb, IRStmt_Put(offset, IRExpr_RdTmp(t2)));
+         break;
+      }
+      case Ist_Exit: {
+         // Exit(xguard, dst, jk, offsIP)
+         // ==> t1 = And1(xguard, guard)
+         //     Exit(t1, dst, jk, offsIP)
+         IRExpr* xguard = st->Ist.Exit.guard;
+         IRTemp t1 = newIRTemp(bb->tyenv, Ity_I1);
+         addStmtToIRSB(bb, IRStmt_WrTmp(t1, IRExpr_Binop(Iop_And1, xguard,
+                                                         IRExpr_RdTmp(guard))));
+         addStmtToIRSB(bb, IRStmt_Exit(IRExpr_RdTmp(t1), st->Ist.Exit.jk,
+                                       st->Ist.Exit.dst, st->Ist.Exit.offsIP));
+         break;
+      }
+      default:
+         vex_printf("\n"); ppIRStmt(st);
+         vpanic("speculate_stmt_to_end_of: unhandled stmt");
+   }
+}
+
+
+/*--------------------------------------------------------------*/
+/*--- Analysis of block ends                                 ---*/
+/*--------------------------------------------------------------*/
+
+typedef
+   enum {
+      Be_Unknown=1, // Unknown end
+      Be_UnCond,    // Unconditional branch to known destination, unassisted
+      Be_Cond       // Conditional branch to known destinations, unassisted
+   }
+   BlockEndTag;
+
+typedef
+   struct {
+      BlockEndTag tag;
+      union {
+         struct {
+         } Unknown;
+         struct {
+            Long delta;
+         } UnCond;
+         struct {
+            IRTemp condSX;
+            Long   deltaSX;
+            Long   deltaFT;
+         } Cond;
+      } Be;
+   }
+   BlockEnd;
-   which looks pretty strange at first.  Eg so unconditional branch
-   to some address 0x123456 looks like this:
+static void ppBlockEnd ( const BlockEnd* be )
+{
+   switch (be->tag) {
+      case Be_Unknown:
+         vex_printf("!!Unknown!!");
+         break;
+      case Be_UnCond:
+         vex_printf("UnCond{delta=%lld}", be->Be.UnCond.delta);
+         break;
+      case Be_Cond:
+         vex_printf("Cond{condSX=");
+         ppIRTemp(be->Be.Cond.condSX);
+         vex_printf(", deltaSX=%lld, deltaFT=%lld}",
+                    be->Be.Cond.deltaSX, be->Be.Cond.deltaFT);
+         break;
+      default:
+         vassert(0);
+   }
+}
-      PUT(guest_IP) = 0x123456;  // dis_instr_fn generates this
-      // the exit
-      PUT(guest_IP) [implicitly] = GET(guest_IP); exit-Boring
+// Return True if |be| definitely does not jump to |delta|.  In case of
+// doubt, returns False.
+static Bool definitely_does_not_jump_to_delta ( const BlockEnd* be, Long delta )
+{
+   switch (be->tag) {
+      case Be_Unknown: return False;
+      case Be_UnCond:  return be->Be.UnCond.delta != delta;
+      case Be_Cond:    return be->Be.Cond.deltaSX != delta
+                              && be->Be.Cond.deltaFT != delta;
+      default: vassert(0);
+   }
+}
-   after redundant-GET and -PUT removal by iropt, we get what we want:
+static Bool irconst_to_maybe_delta ( /*OUT*/Long* delta,
+                                     const IRConst* known_dst,
+                                     const Addr guest_IP_sbstart,
+                                     const IRType guest_word_type,
+                                     Bool (*chase_into_ok)(void*,Addr),
+                                     void* callback_opaque )
+{
+   vassert(typeOfIRConst(known_dst) == guest_word_type);
+
+   *delta = 0;
+
+   // Extract the destination guest address.
+   Addr dst_ga = 0;
+   switch (known_dst->tag) {
+      case Ico_U32:
+         vassert(guest_word_type == Ity_I32);
+         dst_ga = known_dst->Ico.U32;
+         break;
+      case Ico_U64:
+         vassert(guest_word_type == Ity_I64);
+         dst_ga = known_dst->Ico.U64;
+         break;
+      default:
+         vassert(0);
+   }
-      // the exit
-      PUT(guest_IP) [implicitly] = 0x123456; exit-Boring
+   // Check we're allowed to chase into it.
+   if (!chase_into_ok(callback_opaque, dst_ga))
+      return False;
-   This makes the IRSB-end case the same as the side-exit case: update
-   IP, then transfer.  There is no redundancy of representation for
-   the destination, and we use the destination specified by
-   dis_instr_fn, so any errors it makes show up sooner.
-*/
+   Addr delta_as_Addr = dst_ga - guest_IP_sbstart;
+   // Either |delta_as_Addr| is a 64-bit value, in which case copy it directly
+   // to |delta|, or it's a 32 bit value, in which case sign extend it.
+   *delta = sizeof(Addr) == 8 ? (Long)delta_as_Addr : (Long)(Int)delta_as_Addr;
+   return True;
+}
-IRSB* bb_to_IR (
-         /*OUT*/VexGuestExtents* vge,
-         /*OUT*/UInt*            n_sc_extents,
-         /*OUT*/UInt*            n_guest_instrs, /* stats only */
-         /*MOD*/VexRegisterUpdates* pxControl,
-         /*IN*/ void*            callback_opaque,
-         /*IN*/ DisOneInstrFn    dis_instr_fn,
-         /*IN*/ const UChar*     guest_code,
-         /*IN*/ Addr             guest_IP_bbstart,
-         /*IN*/ Bool             (*chase_into_ok)(void*,Addr),
-         /*IN*/ VexEndness       host_endness,
-         /*IN*/ Bool             sigill_diag,
-         /*IN*/ VexArch          arch_guest,
-         /*IN*/ const VexArchInfo* archinfo_guest,
-         /*IN*/ const VexAbiInfo*  abiinfo_both,
-         /*IN*/ IRType           guest_word_type,
-         /*IN*/ UInt             (*needs_self_check)
-                                    (void*, /*MB_MOD*/VexRegisterUpdates*,
-                                            const VexGuestExtents*),
-         /*IN*/ Bool             (*preamble_function)(void*,IRSB*),
-         /*IN*/ Int              offB_GUEST_CMSTART,
-         /*IN*/ Int              offB_GUEST_CMLEN,
-         /*IN*/ Int              offB_GUEST_IP,
-         /*IN*/ Int              szB_GUEST_IP
-      )
+/* Scan |stmts|, starting at |scan_start| and working backwards, to detect the
+   case where there are no IRStmt_Exits before we find the IMark.  In other
+   words, it scans backwards through some prefix of an instruction's IR to see
+   if there is an exit there. */
+static Bool insn_has_no_other_exits ( IRStmt** const stmts, Int scan_start )
+{
-   Long       delta;
-   Int        i, n_instrs, first_stmt_idx;
-   Bool       resteerOK, debug_print;
-   DisResult  dres;
-   IRStmt*    imark;
-   IRStmt*    nop;
-   static Int n_resteers = 0;
-   Int        d_resteers = 0;
-   Int        selfcheck_idx = 0;
-   IRSB*      irsb;
-   Addr       guest_IP_curr_instr;
-   IRConst*   guest_IP_bbstart_IRConst = NULL;
-   Int        n_cond_resteers_allowed = 2;
-
-   Bool (*resteerOKfn)(void*,Addr) = NULL;
-
-   debug_print = toBool(vex_traceflags & VEX_TRACE_FE);
+   Bool found_exit = False;
+   Int i = scan_start;
+   while (True) {
+      if (i < 0)
+         break;
+      const IRStmt* st = stmts[i];
+      if (st->tag == Ist_IMark)
+         break;
+      if (st->tag == Ist_Exit) {
+         found_exit = True;
+         break;
+      }
+      i--;
+   }
+   // We expect IR for all instructions to start with an IMark.
+   vassert(i >= 0);
+   return !found_exit;
+}
-   /* check sanity .. */
-   vassert(sizeof(HWord) == sizeof(void*));
-   vassert(vex_control.guest_max_insns >= 1);
-   vassert(vex_control.guest_max_insns <= 100);
-   vassert(vex_control.guest_chase_thresh >= 0);
-   vassert(vex_control.guest_chase_thresh < vex_control.guest_max_insns);
-   vassert(guest_word_type == Ity_I32 || guest_word_type == Ity_I64);
+// FIXME make this able to recognise all block ends
+static void analyse_block_end ( /*OUT*/BlockEnd* be, const IRSB* irsb,
+                                const Addr guest_IP_sbstart,
+                                const IRType guest_word_type,
+                                Bool (*chase_into_ok)(void*,Addr),
+                                void* callback_opaque,
+                                Bool debug_print )
+{
+   vex_bzero(be, sizeof(*be));
+
+   // -- Conditional branch to known destination
+   /* In short, detect the following end form:
+        ------ IMark(0x4002009, 2, 0) ------
+        // Zero or more non-exit statements
+        if (t14) { PUT(184) = 0x4002040:I64; exit-Boring }
+        PUT(184) = 0x400200B:I64; exit-Boring
+      Checks:
+      - Both transfers are 'boring'
+      - Both dsts are constants
+      - The cond is non-constant (an IRExpr_Tmp)
+      - There are no other exits in this instruction
+      - The client allows chasing into both destinations
+   */
+   if (irsb->jumpkind == Ijk_Boring && irsb->stmts_used >= 2) {
+      const IRStmt* maybe_exit = irsb->stmts[irsb->stmts_used - 1];
+      if (maybe_exit->tag == Ist_Exit
+          && maybe_exit->Ist.Exit.guard->tag == Iex_RdTmp
+          && maybe_exit->Ist.Exit.jk == Ijk_Boring
+          && irsb->next->tag == Iex_Const
+          && insn_has_no_other_exits(irsb->stmts, irsb->stmts_used - 2)) {
+         vassert(maybe_exit->Ist.Exit.offsIP == irsb->offsIP);
+         IRConst* dst_SX  = maybe_exit->Ist.Exit.dst;
+         IRConst* dst_FT  = irsb->next->Iex.Const.con;
+         IRTemp   cond_SX = maybe_exit->Ist.Exit.guard->Iex.RdTmp.tmp;
+         Long delta_SX = 0;
+         Long delta_FT = 0;
+         Bool ok_SX
+            = irconst_to_maybe_delta(&delta_SX, dst_SX,
+                                     guest_IP_sbstart, guest_word_type,
+                                     chase_into_ok, callback_opaque);
+         Bool ok_FT
+            = irconst_to_maybe_delta(&delta_FT, dst_FT,
+                                     guest_IP_sbstart, guest_word_type,
+                                     chase_into_ok, callback_opaque);
+         if (ok_SX && ok_FT) {
+            be->tag = Be_Cond;
+            be->Be.Cond.condSX  = cond_SX;
+            be->Be.Cond.deltaSX = delta_SX;
+            be->Be.Cond.deltaFT = delta_FT;
+            goto out;
+         }
+      }
+   }
-   if (guest_word_type == Ity_I32) {
-      vassert(szB_GUEST_IP == 4);
-      vassert((offB_GUEST_IP % 4) == 0);
-   } else {
-      vassert(szB_GUEST_IP == 8);
-      vassert((offB_GUEST_IP % 8) == 0);
+   // -- Unconditional branch/call to known destination
+   /* Four checks:
+      - The transfer is 'boring' or 'call', so that no assistance is needed
+      - The dst is a constant (known at jit time)
+      - There are no other exits in this instruction.  In other words, the
+        transfer is unconditional.
+      - The client allows chasing into the destination.
+   */
+   if ((irsb->jumpkind == Ijk_Boring || irsb->jumpkind == Ijk_Call)
+       && irsb->next->tag == Iex_Const) {
+      if (insn_has_no_other_exits(irsb->stmts, irsb->stmts_used - 1)) {
+         // We've got the right pattern.  Check whether we can chase into the
+         // destination, and if so convert that to a delta value.
+         const IRConst* known_dst = irsb->next->Iex.Const.con;
+         Long delta = 0;
+         // This call also checks the type of the dst addr, and that the client
+         // allows chasing into it.
+         Bool ok = irconst_to_maybe_delta(&delta, known_dst,
+                                          guest_IP_sbstart, guest_word_type,
+                                          chase_into_ok, callback_opaque);
+         if (ok) {
+            be->tag = Be_UnCond;
+            be->Be.UnCond.delta = delta;
+            goto out;
+         }
+      }
    }
-   /* Although we will try to disassemble up to vex_control.guest_max_insns
-      insns into the block, the individual insn assemblers may hint to us that a
-      disassembled instruction is verbose.  In that case we will lower the limit
-      so as to ensure that the JIT doesn't run out of space.  See bug 375839 for
-      the motivating example. */
-   Int guest_max_insns_really = vex_control.guest_max_insns;
+   be->tag = Be_Unknown;
+   // Not identified as anything in particular.
-   /* Start a new, empty extent. */
-   vge->n_used  = 1;
-   vge->base[0] = guest_IP_bbstart;
-   vge->len[0]  = 0;
-   *n_sc_extents = 0;
+  out:
+   if (debug_print) {
+      vex_printf("\nBlockEnd: ");
+      ppBlockEnd(be);
+      vex_printf("\n");
+   }
+}
-   /* And a new IR superblock to dump the result into. */
-   irsb = emptyIRSB();
-   /* Delta keeps track of how far along the guest_code array we have
-      so far gone. */
-   delta    = 0;
-   n_instrs = 0;
-   *n_guest_instrs = 0;
+
+/*--------------------------------------------------------------*/
+/*--- Disassembly of basic (not super) blocks                ---*/
+/*--------------------------------------------------------------*/
-   /* Guest addresses as IRConsts.  Used in self-checks to specify the
-      restart-after-discard point. */
-   guest_IP_bbstart_IRConst
-      = guest_word_type==Ity_I32
-           ? IRConst_U32(toUInt(guest_IP_bbstart))
-           : IRConst_U64(guest_IP_bbstart);
+
+/* Disassemble instructions, starting at |&guest_code[delta_IN]|, into |irbb|,
+   and terminate the block properly.  At most |n_instrs_allowed_IN| may be
+   disassembled, and this function may choose to disassemble fewer.
- /* Leave 15 spaces in which to put the check statements for a self - checking translation (up to 3 extents, and 5 stmts required for - each). We won't know until later the extents and checksums of - the areas, if any, that need to be checked. */ - nop = IRStmt_NoOp(); - selfcheck_idx = irsb->stmts_used; - for (i = 0; i < 3 * 5; i++) - addStmtToIRSB( irsb, nop ); + Also do minimal simplifications on the resulting block, so as to convert the + end of the block into something that |analyse_block_end| can reliably + recognise. - /* If the caller supplied a function to add its own preamble, use - it now. */ - if (preamble_function) { - Bool stopNow = preamble_function( callback_opaque, irsb ); - if (stopNow) { - /* The callback has completed the IR block without any guest - insns being disassembled into it, so just return it at - this point, even if a self-check was requested - as there - is nothing to self-check. The 15 self-check no-ops will - still be in place, but they are harmless. */ - return irsb; - } - } + |irbb| will both be modified, and replaced by a new, simplified version, + which is returned. +*/ +static IRSB* disassemble_basic_block_till_stop( + /*OUT*/ Int* n_instrs, // #instrs actually used + /*OUT*/ Bool* is_verbose_seen, // did we get a 'verbose' hint? + /*OUT*/ Addr* extent_base, // VexGuestExtents[..].base + /*OUT*/ UShort* extent_len, // VexGuestExtents[..].len + /*MOD*/ IRSB* irbb, + const Long delta_IN, + const Int n_instrs_allowed_IN, + const Addr guest_IP_sbstart, + const VexEndness host_endness, + const Bool sigill_diag, + const VexArch arch_guest, + const VexArchInfo* archinfo_guest, + const VexAbiInfo* abiinfo_both, + const IRType guest_word_type, + const Bool debug_print, + const DisOneInstrFn dis_instr_fn, + const UChar* guest_code, + const Int offB_GUEST_IP + ) +{ + /* This is the max instrs we allow in the block. 
It starts off at + |n_instrs_allowed_IN| but we may choose to reduce it in the case where the + instruction disassembler returns an 'is verbose' hint. This is so as to + ensure that the JIT doesn't run out of space. See bug 375839 for a + motivating example. */ /* Process instructions. */ + Long delta = delta_IN; + Int n_instrs_allowed = n_instrs_allowed_IN; + + *n_instrs = 0; + *is_verbose_seen = False; + *extent_base = guest_IP_sbstart + delta; + *extent_len = 0; + while (True) { - vassert(n_instrs < guest_max_insns_really); - - /* Regardless of what chase_into_ok says, is chasing permissible - at all right now? Set resteerOKfn accordingly. */ - resteerOK - = toBool( - n_instrs < vex_control.guest_chase_thresh - /* we can't afford to have a resteer once we're on the - last extent slot. */ - && vge->n_used < 3 - ); - - resteerOKfn - = resteerOK ? chase_into_ok : const_False; - - /* n_cond_resteers_allowed keeps track of whether we're still - allowing dis_instr_fn to chase conditional branches. It - starts (at 2) and gets decremented each time dis_instr_fn - tells us it has chased a conditional branch. We then - decrement it, and use it to tell later calls to dis_instr_fn - whether or not it is allowed to chase conditional - branches. */ - vassert(n_cond_resteers_allowed >= 0 && n_cond_resteers_allowed <= 2); + vassert(*n_instrs < n_instrs_allowed); /* This is the IP of the instruction we're just about to deal with. */ - guest_IP_curr_instr = guest_IP_bbstart + delta; + Addr guest_IP_curr_instr = guest_IP_sbstart + delta; - /* This is the irsb statement array index of the first stmt in + /* This is the irbb statement array index of the first stmt in this insn. That will always be the instruction-mark descriptor. */ - first_stmt_idx = irsb->stmts_used; + Int first_stmt_idx = irbb->stmts_used; /* Add an instruction-mark statement. 
We won't know until after disassembling the instruction how long it instruction is, so @@ -339,7 +787,7 @@ IRSB* bb_to_IR ( libvex_guest_arm.h. */ if (arch_guest == VexArchARM && (guest_IP_curr_instr & 1)) { /* Thumb insn => mask out the T bit, but put it in delta */ - addStmtToIRSB( irsb, + addStmtToIRSB( irbb, IRStmt_IMark(guest_IP_curr_instr & ~(Addr)1, 0, /* len */ 1 /* delta */ @@ -347,7 +795,7 @@ IRSB* bb_to_IR ( ); } else { /* All other targets: store IP as-is, and set delta to zero. */ - addStmtToIRSB( irsb, + addStmtToIRSB( irbb, IRStmt_IMark(guest_IP_curr_instr, 0, /* len */ 0 /* delta */ @@ -355,38 +803,20 @@ IRSB* bb_to_IR ( ); } - if (debug_print && n_instrs > 0) + if (debug_print && *n_instrs > 0) vex_printf("\n"); /* Finally, actually disassemble an instruction. */ - vassert(irsb->next == NULL); - dres = dis_instr_fn ( irsb, - resteerOKfn, - toBool(n_cond_resteers_allowed > 0), - callback_opaque, - guest_code, - delta, - guest_IP_curr_instr, - arch_guest, - archinfo_guest, - abiinfo_both, - host_endness, - sigill_diag ); + vassert(irbb->next == NULL); + DisResult dres + = dis_instr_fn ( irbb, guest_code, delta, guest_IP_curr_instr, + arch_guest, archinfo_guest, abiinfo_both, + host_endness, sigill_diag ); /* stay sane ... */ - vassert(dres.whatNext == Dis_StopHere - || dres.whatNext == Dis_Continue - || dres.whatNext == Dis_ResteerU - || dres.whatNext == Dis_ResteerC); + vassert(dres.whatNext == Dis_StopHere || dres.whatNext == Dis_Continue); /* ... disassembled insn length is sane ... */ vassert(dres.len >= 0 && dres.len <= 24); - /* ... continueAt is zero if no resteer requested ... */ - if (dres.whatNext != Dis_ResteerU && dres.whatNext != Dis_ResteerC) - vassert(dres.continueAt == 0); - /* ... if we disallowed conditional resteers, check that one - didn't actually happen anyway ... */ - if (n_cond_resteers_allowed == 0) - vassert(dres.whatNext != Dis_ResteerC); /* If the disassembly function passed us a hint, take note of it. 
*/ if (LIKELY(dres.hint == Dis_HintNone)) { @@ -397,17 +827,23 @@ IRSB* bb_to_IR ( if necessary so as to avoid running the JIT out of space in the event that we've encountered the start of a long sequence of them. This is expected to be a very rare event. In any case the remaining - limit (30 insns) is still so high that most blocks will terminate - anyway before then. So this is very unlikely to give a perf hit in - practice. See bug 375839 for the motivating example. */ - if (guest_max_insns_really > 30) { - guest_max_insns_really = 30; + limit (in the default setting, 30 insns) is still so high that most + blocks will terminate anyway before then. So this is very unlikely + to give a perf hit in practice. See bug 375839 for the motivating + example. */ + if (!(*is_verbose_seen)) { + *is_verbose_seen = True; + // Halve the number of allowed insns, but only above 2 + if (n_instrs_allowed > 2) { + n_instrs_allowed = ((n_instrs_allowed - 2) / 2) + 2; + //vassert(*n_instrs <= n_instrs_allowed); + } } } /* Fill in the insn-mark length field. */ - vassert(first_stmt_idx >= 0 && first_stmt_idx < irsb->stmts_used); - imark = irsb->stmts[first_stmt_idx]; + vassert(first_stmt_idx >= 0 && first_stmt_idx < irbb->stmts_used); + IRStmt* imark = irbb->stmts[first_stmt_idx]; vassert(imark); vassert(imark->tag == Ist_IMark); vassert(imark->Ist.IMark.len == 0); @@ -415,24 +851,24 @@ IRSB* bb_to_IR ( /* Print the resulting IR, if needed. */ if (vex_traceflags & VEX_TRACE_FE) { - for (i = first_stmt_idx; i < irsb->stmts_used; i++) { + for (Int i = first_stmt_idx; i < irbb->stmts_used; i++) { vex_printf(" "); - ppIRStmt(irsb->stmts[i]); + ppIRStmt(irbb->stmts[i]); vex_printf("\n"); } } - /* Individual insn disassembly may not mess with irsb->next. + /* Individual insn disassembly may not mess with irbb->next. This function is the only place where it can be set. 
*/ - vassert(irsb->next == NULL); - vassert(irsb->jumpkind == Ijk_Boring); - vassert(irsb->offsIP == 0); + vassert(irbb->next == NULL); + vassert(irbb->jumpkind == Ijk_Boring); + vassert(irbb->offsIP == 0); /* Individual insn disassembly must finish the IR for each instruction with an assignment to the guest PC. */ - vassert(first_stmt_idx < irsb->stmts_used); - /* it follows that irsb->stmts_used must be > 0 */ - { IRStmt* st = irsb->stmts[irsb->stmts_used-1]; + vassert(first_stmt_idx < irbb->stmts_used); + /* it follows that irbb->stmts_used must be > 0 */ + { IRStmt* st = irbb->stmts[irbb->stmts_used-1]; vassert(st); vassert(st->tag == Ist_Put); vassert(st->Ist.Put.offset == offB_GUEST_IP); @@ -440,361 +876,641 @@ IRSB* bb_to_IR ( == guest_word_type, but that's a bit expensive. */ } - /* Update the VexGuestExtents we are constructing. */ + /* Update the extents entry that we are constructing. */ /* If vex_control.guest_max_insns is required to be < 100 and each insn is at max 20 bytes long, this limit of 5000 then seems reasonable since the max possible extent length will be 100 * 20 == 2000. */ - vassert(vge->len[vge->n_used-1] < 5000); - vge->len[vge->n_used-1] - = toUShort(toUInt( vge->len[vge->n_used-1] + dres.len )); - n_instrs++; + vassert(*extent_len < 5000); + (*extent_len) += dres.len; + (*n_instrs)++; /* Advance delta (inconspicuous but very important :-) */ delta += (Long)dres.len; + Bool stopNow = False; switch (dres.whatNext) { case Dis_Continue: - vassert(dres.continueAt == 0); vassert(dres.jk_StopHere == Ijk_INVALID); - if (n_instrs < guest_max_insns_really) { - /* keep going */ - } else { - /* We have to stop. See comment above re irsb field + if (*n_instrs >= n_instrs_allowed) { + /* We have to stop. See comment above re irbb field settings here. 
 */
-            irsb->next = IRExpr_Get(offB_GUEST_IP, guest_word_type);
-            /* irsb->jumpkind must already by Ijk_Boring */
-            irsb->offsIP = offB_GUEST_IP;
-            goto done;
+            irbb->next = IRExpr_Get(offB_GUEST_IP, guest_word_type);
+            /* irbb->jumpkind must already be Ijk_Boring */
+            irbb->offsIP = offB_GUEST_IP;
+            stopNow = True;
          }
          break;
       case Dis_StopHere:
-         vassert(dres.continueAt == 0);
          vassert(dres.jk_StopHere != Ijk_INVALID);
-         /* See comment above re irsb field settings here. */
-         irsb->next = IRExpr_Get(offB_GUEST_IP, guest_word_type);
-         irsb->jumpkind = dres.jk_StopHere;
-         irsb->offsIP = offB_GUEST_IP;
-         goto done;
-
-      case Dis_ResteerU:
-      case Dis_ResteerC:
-         /* Check that we actually allowed a resteer .. */
-         vassert(resteerOK);
-         if (dres.whatNext == Dis_ResteerC) {
-            vassert(n_cond_resteers_allowed > 0);
-            n_cond_resteers_allowed--;
-         }
-         /* figure out a new delta to continue at. */
-         vassert(resteerOKfn(callback_opaque,dres.continueAt));
-         delta = dres.continueAt - guest_IP_bbstart;
-         /* we now have to start a new extent slot. */
-         vge->n_used++;
-         vassert(vge->n_used <= 3);
-         vge->base[vge->n_used-1] = dres.continueAt;
-         vge->len[vge->n_used-1] = 0;
-         n_resteers++;
-         d_resteers++;
-         if (0 && (n_resteers & 0xFF) == 0)
-            vex_printf("resteer[%d,%d] to 0x%lx (delta = %lld)\n",
-                       n_resteers, d_resteers,
-                       dres.continueAt, delta);
+         /* See comment above re irbb field settings here. */
+         irbb->next = IRExpr_Get(offB_GUEST_IP, guest_word_type);
+         irbb->jumpkind = dres.jk_StopHere;
+         irbb->offsIP = offB_GUEST_IP;
+         stopNow = True;
          break;
       default:
          vpanic("bb_to_IR");
       }
+
+      if (stopNow)
+         break;
+
+   } /* while (True) */
+
+   /* irbb->next must now be set, since we've finished the block.
+ Print it if necessary.*/ + vassert(irbb->next != NULL); + if (debug_print) { + vex_printf(" "); + vex_printf( "PUT(%d) = ", irbb->offsIP); + ppIRExpr( irbb->next ); + vex_printf( "; exit-"); + ppIRJumpKind(irbb->jumpkind); + vex_printf( "\n"); + vex_printf( "\n"); } - /*NOTREACHED*/ - vassert(0); - done: - /* We're done. The only thing that might need attending to is that - a self-checking preamble may need to be created. If so it gets - placed in the 15 slots reserved above. + /* And clean it up. */ + irbb = do_minimal_initial_iropt_BB ( irbb ); + if (debug_print) { + ppIRSB(irbb); + } - The scheme is to compute a rather crude checksum of the code - we're making a translation of, and add to the IR a call to a - helper routine which recomputes the checksum every time the - translation is run, and requests a retranslation if it doesn't - match. This is obviously very expensive and considerable - efforts are made to speed it up: + return irbb; +} - * the checksum is computed from all the naturally aligned - host-sized words that overlap the translated code. That means - it could depend on up to 7 bytes before and 7 bytes after - which aren't part of the translated area, and so if those - change then we'll unnecessarily have to discard and - retranslate. This seems like a pretty remote possibility and - it seems as if the benefit of not having to deal with the ends - of the range at byte precision far outweigh any possible extra - translations needed. - * there's a generic routine and 12 specialised cases, which - handle the cases of 1 through 12-word lengths respectively. - They seem to cover about 90% of the cases that occur in - practice. +/*--------------------------------------------------------------*/ +/*--- Disassembly of traces: helper functions ---*/ +/*--------------------------------------------------------------*/ - We ask the caller, via needs_self_check, which of the 3 vge - extents needs a check, and only generate check code for those - that do. 
- */ - { - Addr base2check; - UInt len2check; - HWord expectedhW; - IRTemp tistart_tmp, tilen_tmp; - HWord VEX_REGPARM(2) (*fn_generic)(HWord, HWord); - HWord VEX_REGPARM(1) (*fn_spec)(HWord); - const HChar* nm_generic; - const HChar* nm_spec; - HWord fn_generic_entry = 0; - HWord fn_spec_entry = 0; - UInt host_word_szB = sizeof(HWord); - IRType host_word_type = Ity_INVALID; +// Swap the side exit and fall through exit for |bb|. Update |be| so as to be +// consistent. +static void swap_sx_and_ft ( /*MOD*/IRSB* bb, /*MOD*/BlockEnd* be ) +{ + vassert(be->tag == Be_Cond); + vassert(bb->stmts_used >= 2); // Must have at least: IMark, Exit + IRStmt* exit = bb->stmts[bb->stmts_used - 1]; + vassert(exit->tag == Ist_Exit); + vassert(exit->Ist.Exit.guard->tag == Iex_RdTmp); + vassert(exit->Ist.Exit.guard->Iex.RdTmp.tmp == be->Be.Cond.condSX); + vassert(bb->next->tag == Iex_Const); + vassert(bb->jumpkind == Ijk_Boring); + // We need to insert a new stmt, just before the exit, that computes 'Not1' + // of the guard condition. Replace |bb->stmts[bb->stmts_used - 1]| by the + // new stmt, and then place |exit| immediately after it. + IRTemp invertedGuard = newIRTemp(bb->tyenv, Ity_I1); + bb->stmts[bb->stmts_used - 1] + = IRStmt_WrTmp(invertedGuard, + IRExpr_Unop(Iop_Not1, IRExpr_RdTmp(exit->Ist.Exit.guard + ->Iex.RdTmp.tmp))); + exit->Ist.Exit.guard->Iex.RdTmp.tmp = invertedGuard; + addStmtToIRSB(bb, exit); + + // Swap the actual destination constants. + { IRConst* tmp = exit->Ist.Exit.dst; + exit->Ist.Exit.dst = bb->next->Iex.Const.con; + bb->next->Iex.Const.con = tmp; + } - UInt extents_needing_check - = needs_self_check(callback_opaque, pxControl, vge); + // And update |be|. 
+ { be->Be.Cond.condSX = invertedGuard; + Long tmp = be->Be.Cond.deltaSX; + be->Be.Cond.deltaSX = be->Be.Cond.deltaFT; + be->Be.Cond.deltaFT = tmp; + } +} - if (host_word_szB == 4) host_word_type = Ity_I32; - if (host_word_szB == 8) host_word_type = Ity_I64; - vassert(host_word_type != Ity_INVALID); - vassert(vge->n_used >= 1 && vge->n_used <= 3); +static void update_instr_budget( /*MOD*/Int* instrs_avail, + /*MOD*/Bool* verbose_mode, + const Int bb_instrs_used, + const Bool bb_verbose_seen ) +{ + if (0) + vex_printf("UIB: verbose_mode %d, instrs_avail %d, " + "bb_instrs_used %d, bb_verbose_seen %d\n", + *verbose_mode ? 1 : 0, *instrs_avail, + bb_instrs_used, bb_verbose_seen ? 1 : 0); + + vassert(bb_instrs_used <= *instrs_avail); + + if (bb_verbose_seen && !(*verbose_mode)) { + *verbose_mode = True; + // Adjust *instrs_avail so that, when it becomes zero, we haven't used + // more than 50% of vex_control.guest_max_instrs. + if (bb_instrs_used > vex_control.guest_max_insns / 2) { + *instrs_avail = 0; + } else { + *instrs_avail = vex_control.guest_max_insns / 2; + } + vassert(*instrs_avail >= 0); + } + + // Subtract bb_instrs_used from *instrs_avail, clamping at 0 if necessary. + if (bb_instrs_used > *instrs_avail) { + *instrs_avail = 0; + } else { + *instrs_avail -= bb_instrs_used; + } + + vassert(*instrs_avail >= 0); +} + +// Add the extent [base, +len) to |vge|. Asserts if |vge| is already full. +// As an optimisation only, tries to also merge the new extent with the +// previous one, if possible. 
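The merging rule described in the comment above can be exercised in isolation. The following is a standalone sketch, not VEX code: the names `Extents` and `add_extent_sk` are invented here, and a plain struct stands in for `VexGuestExtents`, but the fold-into-previous logic mirrors the function that follows.

```c
#include <assert.h>

/* Simplified stand-in for VexGuestExtents: up to 3 address ranges. */
typedef struct {
   unsigned long  base[3];
   unsigned short len[3];
   int            n_used;
} Extents;

/* A new extent that starts exactly where the previous one ends is
   folded into it (subject to a worst-case length cap); otherwise it
   takes the next free slot. */
void add_extent_sk(Extents* vge, unsigned long base, unsigned short len)
{
   assert(vge->n_used < 3);
   int i = vge->n_used++;
   vge->base[i] = base;
   vge->len[i]  = len;
   if (i > 0
       && (unsigned)vge->len[i-1] + (unsigned)len < 200*25
       && vge->base[i-1] + vge->len[i-1] == base) {
      vge->len[i-1] += len;
      vge->n_used--;   /* merged: hand the new slot back */
   }
}
```

The payoff is that chasing into contiguous code consumes no extra extent slots, which matters because only three slots exist in total.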
+static void add_extent ( /*MOD*/VexGuestExtents* vge, Addr base, UShort len ) +{ + const UInt limit = sizeof(vge->base) / sizeof(vge->base[0]); + vassert(limit == 3); + const UInt i = vge->n_used; + vassert(i < limit); + vge->n_used++; + vge->base[i] = base; + vge->len[i] = len; + // Try to merge with the previous extent + if (i > 0 + && (((UInt)vge->len[i-1]) + ((UInt)len)) + < 200*25 /* say, 200 insns of size 25 bytes, absolute worst case */ + && vge->base[i-1] + vge->len[i-1] == base) { + vge->len[i-1] += len; + vge->n_used--; + //vex_printf("MERGE\n"); + } +} + + +/*--------------------------------------------------------------*/ +/*--- Disassembly of traces: main function ---*/ +/*--------------------------------------------------------------*/ + +/* Disassemble a complete basic block, starting at guest_IP_start, + returning a new IRSB. The disassembler may chase across basic + block boundaries if it wishes and if chase_into_ok allows it. + The precise guest address ranges from which code has been taken + are written into vge. guest_IP_sbstart is taken to be the IP in + the guest's address space corresponding to the instruction at + &guest_code[0]. + + dis_instr_fn is the arch-specific fn to disassemble on function; it + is this that does the real work. + + needs_self_check is a callback used to ask the caller which of the + extents, if any, a self check is required for. The returned value + is a bitmask with a 1 in position i indicating that the i'th extent + needs a check. Since there can be at most 3 extents, the returned + values must be between 0 and 7. + + The number of extents which did get a self check (0 to 3) is put in + n_sc_extents. The caller already knows this because it told us + which extents to add checks for, via the needs_self_check callback, + but we ship the number back out here for the caller's convenience. + + preamble_function is a callback which allows the caller to add + its own IR preamble (following the self-check, if any). 
May be
+   NULL.  If non-NULL, the IRSB under construction is handed to
+   this function, which presumably adds IR statements to it.  The
+   callback may optionally complete the block and direct bb_to_IR
+   not to disassemble any instructions into it; this is indicated
+   by the callback returning True.
+
+   offB_GUEST_CMSTART and offB_GUEST_CMLEN are the offsets of
+   guest_CMSTART and guest_CMLEN.  Since this routine has to work
+   for any guest state, without knowing what it is, those offsets
+   have to be passed in.
+
+   callback_opaque is a caller-supplied pointer to data which the
+   callbacks may want to see.  Vex has no idea what it is.
+   (In fact it's a VgInstrumentClosure.)
+*/
+
+/* Regarding IP updating.  dis_instr_fn (that does the guest-specific
+   work of disassembling an individual instruction) must finish the
+   resulting IR with "PUT(guest_IP) = ".  Hence in all cases it must
+   state the next instruction address.
+
+   If the block is to be ended at that point, then this routine
+   (bb_to_IR) will set up the next/jumpkind/offsIP fields so as to
+   make a transfer (of the right kind) to "GET(guest_IP)".  Hence if
+   dis_instr_fn generates incorrect IP updates we will see it
+   immediately (due to jumping to the wrong next guest address).
+
+   However it is also necessary to set this up so it can be optimised
+   nicely.  The IRSB exit is defined to update the guest IP, so that
+   chaining works -- since the chain_me stubs expect the chain-to
+   address to be in the guest state.  Hence what the IRSB next fields
+   will contain initially is (implicitly)
+
+      PUT(guest_IP) [implicitly] = GET(guest_IP) [explicit expr on ::next]
+
+   which looks pretty strange at first.
Eg so unconditional branch + to some address 0x123456 looks like this: + + PUT(guest_IP) = 0x123456; // dis_instr_fn generates this + // the exit + PUT(guest_IP) [implicitly] = GET(guest_IP); exit-Boring + + after redundant-GET and -PUT removal by iropt, we get what we want: + + // the exit + PUT(guest_IP) [implicitly] = 0x123456; exit-Boring + + This makes the IRSB-end case the same as the side-exit case: update + IP, then transfer. There is no redundancy of representation for + the destination, and we use the destination specified by + dis_instr_fn, so any errors it makes show up sooner. +*/ +IRSB* bb_to_IR ( + /*OUT*/VexGuestExtents* vge, + /*OUT*/UInt* n_sc_extents, + /*OUT*/UInt* n_guest_instrs, /* stats only */ + /*MOD*/VexRegisterUpdates* pxControl, + /*IN*/ void* callback_opaque, + /*IN*/ DisOneInstrFn dis_instr_fn, + /*IN*/ const UChar* guest_code, + /*IN*/ Addr guest_IP_sbstart, + /*IN*/ Bool (*chase_into_ok)(void*,Addr), + /*IN*/ VexEndness host_endness, + /*IN*/ Bool sigill_diag, + /*IN*/ VexArch arch_guest, + /*IN*/ const VexArchInfo* archinfo_guest, + /*IN*/ const VexAbiInfo* abiinfo_both, + /*IN*/ IRType guest_word_type, + /*IN*/ UInt (*needs_self_check) + (void*, /*MB_MOD*/VexRegisterUpdates*, + const VexGuestExtents*), + /*IN*/ Bool (*preamble_function)(void*,IRSB*), + /*IN*/ Int offB_GUEST_CMSTART, + /*IN*/ Int offB_GUEST_CMLEN, + /*IN*/ Int offB_GUEST_IP, + /*IN*/ Int szB_GUEST_IP + ) +{ + Bool debug_print = toBool(vex_traceflags & VEX_TRACE_FE); + + /* check sanity .. */ + vassert(sizeof(HWord) == sizeof(void*)); + vassert(vex_control.guest_max_insns >= 1); + vassert(vex_control.guest_max_insns <= 100); + vassert(vex_control.guest_chase_thresh >= 0); + vassert(vex_control.guest_chase_thresh < vex_control.guest_max_insns); + vassert(guest_word_type == Ity_I32 || guest_word_type == Ity_I64); - /* Caller shouldn't claim that nonexistent extents need a - check. 
*/ - vassert((extents_needing_check >> vge->n_used) == 0); + if (guest_word_type == Ity_I32) { + vassert(szB_GUEST_IP == 4); + vassert((offB_GUEST_IP % 4) == 0); + } else { + vassert(szB_GUEST_IP == 8); + vassert((offB_GUEST_IP % 8) == 0); + } - for (i = 0; i < vge->n_used; i++) { + /* Initialise all return-by-ref state. */ + vge->n_used = 0; + *n_sc_extents = 0; + *n_guest_instrs = 0; - /* Do we need to generate a check for this extent? */ - if ((extents_needing_check & (1 << i)) == 0) - continue; + /* And a new IR superblock to dump the result into. */ + IRSB* irsb = emptyIRSB(); - /* Tell the caller */ - (*n_sc_extents)++; + /* Leave 15 spaces in which to put the check statements for a self + checking translation (up to 3 extents, and 5 stmts required for + each). We won't know until later the extents and checksums of + the areas, if any, that need to be checked. */ + IRStmt* nop = IRStmt_NoOp(); + Int selfcheck_idx = irsb->stmts_used; + for (Int i = 0; i < 3 * 5; i++) + addStmtToIRSB( irsb, nop ); - /* the extent we're generating a check for */ - base2check = vge->base[i]; - len2check = vge->len[i]; + /* If the caller supplied a function to add its own preamble, use + it now. */ + if (preamble_function) { + Bool stopNow = preamble_function( callback_opaque, irsb ); + if (stopNow) { + /* The callback has completed the IR block without any guest + insns being disassembled into it, so just return it at + this point, even if a self-check was requested - as there + is nothing to self-check. The 15 self-check no-ops will + still be in place, but they are harmless. 
*/ + vge->n_used = 1; + vge->base[0] = guest_IP_sbstart; + vge->len[0] = 0; + return irsb; + } + } - /* stay sane */ - vassert(len2check >= 0 && len2check < 2000/*arbitrary*/); + /* Running state: + irsb the SB we are incrementally constructing + vge associated extents for irsb + instrs_used instrs incorporated in irsb so far + instrs_avail number of instrs we have space for + verbose_mode did we see an 'is verbose' hint at some point? + */ + Int instrs_used = 0; + Int instrs_avail = vex_control.guest_max_insns; + Bool verbose_mode = False; - /* Skip the check if the translation involved zero bytes */ - if (len2check == 0) - continue; + /* Disassemble the initial block until we have to stop. */ + { + Int ib_instrs_used = 0; + Bool ib_verbose_seen = False; + Addr ib_base = 0; + UShort ib_len = 0; + irsb = disassemble_basic_block_till_stop( + /*OUT*/ &ib_instrs_used, &ib_verbose_seen, &ib_base, &ib_len, + /*MOD*/ irsb, + /*IN*/ 0/*delta for the first block in the trace*/, + instrs_avail, guest_IP_sbstart, host_endness, sigill_diag, + arch_guest, archinfo_guest, abiinfo_both, guest_word_type, + debug_print, dis_instr_fn, guest_code, offB_GUEST_IP + ); + vassert(ib_instrs_used <= instrs_avail); + + // Update instrs_used, extents, budget. + instrs_used += ib_instrs_used; + add_extent(vge, ib_base, ib_len); + update_instr_budget(&instrs_avail, &verbose_mode, + ib_instrs_used, ib_verbose_seen); + } - HWord first_hW = ((HWord)base2check) - & ~(HWord)(host_word_szB-1); - HWord last_hW = (((HWord)base2check) + len2check - 1) - & ~(HWord)(host_word_szB-1); - vassert(first_hW <= last_hW); - HWord hW_diff = last_hW - first_hW; - vassert(0 == (hW_diff & (host_word_szB-1))); - HWord hWs_to_check = (hW_diff + host_word_szB) / host_word_szB; - vassert(hWs_to_check > 0 - && hWs_to_check < 2004/*arbitrary*/ / host_word_szB); + /* Now, see if we can extend the initial block. 
 */
+   while (True) {
+      const Int n_extent_slots = sizeof(vge->base) / sizeof(vge->base[0]);
+      vassert(n_extent_slots == 3);
+
+      // Reasons to give up immediately:
+      // User or tool asked us not to chase
+      if (vex_control.guest_chase_thresh == 0)
+         break;
+
+      // Out of extent slots
+      vassert(vge->n_used <= n_extent_slots);
+      if (vge->n_used == n_extent_slots)
+         break;
+
+      // Almost out of available instructions
+      vassert(instrs_avail >= 0);
+      if (instrs_avail < 3)
+         break;
+
+      // Try for an extend.  What kind we do depends on how the current trace
+      // ends.
+      BlockEnd irsb_be;
+      analyse_block_end(&irsb_be, irsb, guest_IP_sbstart, guest_word_type,
+                        chase_into_ok, callback_opaque, debug_print);
+
+      // Try for an extend based on an unconditional branch or call to a known
+      // destination.
+      if (irsb_be.tag == Be_UnCond) {
+         if (debug_print) {
+            vex_printf("\n-+-+ Unconditional follow (ext# %d) to 0x%llx "
+                       "-+-+\n\n",
+                       (Int)vge->n_used,
+                       (ULong)((Long)guest_IP_sbstart+ irsb_be.Be.UnCond.delta));
+         }
+         Int    bb_instrs_used  = 0;
+         Bool   bb_verbose_seen = False;
+         Addr   bb_base         = 0;
+         UShort bb_len          = 0;
+         IRSB*  bb
+            = disassemble_basic_block_till_stop(
+                 /*OUT*/ &bb_instrs_used, &bb_verbose_seen, &bb_base, &bb_len,
+                 /*MOD*/ emptyIRSB(),
+                 /*IN*/  irsb_be.Be.UnCond.delta,
+                 instrs_avail, guest_IP_sbstart, host_endness, sigill_diag,
+                 arch_guest, archinfo_guest, abiinfo_both, guest_word_type,
+                 debug_print, dis_instr_fn, guest_code, offB_GUEST_IP
+              );
+         vassert(bb_instrs_used <= instrs_avail);
+
+         /* Now we have to append 'bb' to 'irsb'. */
+         concatenate_irsbs(irsb, bb);
+
+         // Update instrs_used, extents, budget.
+         instrs_used += bb_instrs_used;
+         add_extent(vge, bb_base, bb_len);
+         update_instr_budget(&instrs_avail, &verbose_mode,
+                             bb_instrs_used, bb_verbose_seen);
+      } // if (be.tag == Be_UnCond)
+
+      // Try for an extend based on a conditional branch, specifically in the
+      // hope of identifying and recovering an "A && B" condition spread across
+      // two basic blocks.
+ if (irsb_be.tag == Be_Cond) { + if (debug_print) { + vex_printf("\n-+-+ (ext# %d) Considering cbranch to" + " SX=0x%llx FT=0x%llx -+-+\n\n", + (Int)vge->n_used, + (ULong)((Long)guest_IP_sbstart+ irsb_be.Be.Cond.deltaSX), + (ULong)((Long)guest_IP_sbstart+ irsb_be.Be.Cond.deltaFT)); + } + const Int instrs_avail_spec = 3; - /* vex_printf("%lx %lx %ld\n", first_hW, last_hW, hWs_to_check); */ + if (debug_print) { + vex_printf("-+-+ SPEC side exit -+-+\n\n"); + } + Int sx_instrs_used = 0; + Bool sx_verbose_seen = False; + Addr sx_base = 0; + UShort sx_len = 0; + IRSB* sx_bb + = disassemble_basic_block_till_stop( + /*OUT*/ &sx_instrs_used, &sx_verbose_seen, &sx_base, &sx_len, + /*MOD*/ emptyIRSB(), + /*IN*/ irsb_be.Be.Cond.deltaSX, + instrs_avail_spec, guest_IP_sbstart, host_endness, sigill_diag, + arch_guest, archinfo_guest, abiinfo_both, guest_word_type, + debug_print, dis_instr_fn, guest_code, offB_GUEST_IP + ); + vassert(sx_instrs_used <= instrs_avail_spec); + BlockEnd sx_be; + analyse_block_end(&sx_be, sx_bb, guest_IP_sbstart, guest_word_type, + chase_into_ok, callback_opaque, debug_print); - if (host_word_szB == 8) { - fn_generic = (VEX_REGPARM(2) HWord(*)(HWord, HWord)) - genericg_compute_checksum_8al; - nm_generic = "genericg_compute_checksum_8al"; - } else { - fn_generic = (VEX_REGPARM(2) HWord(*)(HWord, HWord)) - genericg_compute_checksum_4al; - nm_generic = "genericg_compute_checksum_4al"; + if (debug_print) { + vex_printf("\n-+-+ SPEC fall through -+-+\n\n"); + } + Int ft_instrs_used = 0; + Bool ft_verbose_seen = False; + Addr ft_base = 0; + UShort ft_len = 0; + IRSB* ft_bb + = disassemble_basic_block_till_stop( + /*OUT*/ &ft_instrs_used, &ft_verbose_seen, &ft_base, &ft_len, + /*MOD*/ emptyIRSB(), + /*IN*/ irsb_be.Be.Cond.deltaFT, + instrs_avail_spec, guest_IP_sbstart, host_endness, sigill_diag, + arch_guest, archinfo_guest, abiinfo_both, guest_word_type, + debug_print, dis_instr_fn, guest_code, offB_GUEST_IP + ); + vassert(ft_instrs_used <= 
instrs_avail_spec);
+         BlockEnd ft_be;
+         analyse_block_end(&ft_be, ft_bb, guest_IP_sbstart, guest_word_type,
+                           chase_into_ok, callback_opaque, debug_print);
+
+         /* In order for the transformation to be remotely valid, we need:
+            - At least one of sx_bb or ft_bb to have a Be_Cond end.
+            - sx_bb and ft_bb definitely don't form a loop.
+         */
+         Bool ok = sx_be.tag == Be_Cond || ft_be.tag == Be_Cond;
+         if (ok) {
+            ok = definitely_does_not_jump_to_delta(&sx_be,
+                                                   irsb_be.Be.Cond.deltaFT)
+                 || definitely_does_not_jump_to_delta(&ft_be,
+                                                      irsb_be.Be.Cond.deltaSX);
          }
-         fn_spec = NULL;
-         nm_spec = NULL;
+         // Check for other mutancy:
+         //    irsb ft == sx, or the same for ft itself or sx itself
+         if (ok) {
+            if (irsb_be.Be.Cond.deltaSX == irsb_be.Be.Cond.deltaFT
+                || (sx_be.tag == Be_Cond
+                    && sx_be.Be.Cond.deltaSX == sx_be.Be.Cond.deltaFT)
+                || (ft_be.tag == Be_Cond
+                    && ft_be.Be.Cond.deltaSX == ft_be.Be.Cond.deltaFT)) {
+               ok = False;
+            }
+         }
-         if (host_word_szB == 8) {
-            const HChar* nm = NULL;
-            ULong VEX_REGPARM(1) (*fn)(HWord) = NULL;
-            switch (hWs_to_check) {
-               case 1:  fn = genericg_compute_checksum_8al_1;
-                        nm = "genericg_compute_checksum_8al_1"; break;
-               case 2:  fn = genericg_compute_checksum_8al_2;
-                        nm = "genericg_compute_checksum_8al_2"; break;
-               case 3:  fn = genericg_compute_checksum_8al_3;
-                        nm = "genericg_compute_checksum_8al_3"; break;
-               case 4:  fn = genericg_compute_checksum_8al_4;
-                        nm = "genericg_compute_checksum_8al_4"; break;
-               case 5:  fn = genericg_compute_checksum_8al_5;
-                        nm = "genericg_compute_checksum_8al_5"; break;
-               case 6:  fn = genericg_compute_checksum_8al_6;
-                        nm = "genericg_compute_checksum_8al_6"; break;
-               case 7:  fn = genericg_compute_checksum_8al_7;
-                        nm = "genericg_compute_checksum_8al_7"; break;
-               case 8:  fn = genericg_compute_checksum_8al_8;
-                        nm = "genericg_compute_checksum_8al_8"; break;
-               case 9:  fn = genericg_compute_checksum_8al_9;
-                        nm = "genericg_compute_checksum_8al_9"; break;
-               case 10: fn = genericg_compute_checksum_8al_10;
-                        nm 
= "genericg_compute_checksum_8al_10"; break; - case 11: fn = genericg_compute_checksum_8al_11; - nm = "genericg_compute_checksum_8al_11"; break; - case 12: fn = genericg_compute_checksum_8al_12; - nm = "genericg_compute_checksum_8al_12"; break; - default: break; + /* Now let's see if any of our four cases actually holds (viz, is this + really an && idiom? */ + UInt idiom = 4; + if (ok) { + vassert(irsb_be.tag == Be_Cond); + UInt iom1 = 4/*invalid*/; + if (sx_be.tag == Be_Cond) { + /**/ if (sx_be.Be.Cond.deltaFT == irsb_be.Be.Cond.deltaFT) + iom1 = 0; + else if (sx_be.Be.Cond.deltaSX == irsb_be.Be.Cond.deltaFT) + iom1 = 1; } - fn_spec = (VEX_REGPARM(1) HWord(*)(HWord)) fn; - nm_spec = nm; - } else { - const HChar* nm = NULL; - UInt VEX_REGPARM(1) (*fn)(HWord) = NULL; - switch (hWs_to_check) { - case 1: fn = genericg_compute_checksum_4al_1; - nm = "genericg_compute_checksum_4al_1"; break; - case 2: fn = genericg_compute_checksum_4al_2; - nm = "genericg_compute_checksum_4al_2"; break; - case 3: fn = genericg_compute_checksum_4al_3; - nm = "genericg_compute_checksum_4al_3"; break; - case 4: fn = genericg_compute_checksum_4al_4; - nm = "genericg_compute_checksum_4al_4"; break; - case 5: fn = genericg_compute_checksum_4al_5; - nm = "genericg_compute_checksum_4al_5"; break; - case 6: fn = genericg_compute_checksum_4al_6; - nm = "genericg_compute_checksum_4al_6"; break; - case 7: fn = genericg_compute_checksum_4al_7; - nm = "genericg_compute_checksum_4al_7"; break; - case 8: fn = genericg_compute_checksum_4al_8; - nm = "genericg_compute_checksum_4al_8"; break; - case 9: fn = genericg_compute_checksum_4al_9; - nm = "genericg_compute_checksum_4al_9"; break; - case 10: fn = genericg_compute_checksum_4al_10; - nm = "genericg_compute_checksum_4al_10"; break; - case 11: fn = genericg_compute_checksum_4al_11; - nm = "genericg_compute_checksum_4al_11"; break; - case 12: fn = genericg_compute_checksum_4al_12; - nm = "genericg_compute_checksum_4al_12"; break; - default: break; + 
UInt iom2 = 4/*invalid*/; + if (ft_be.tag == Be_Cond) { + /**/ if (ft_be.Be.Cond.deltaFT == irsb_be.Be.Cond.deltaSX) + iom2 = 2; + else if (ft_be.Be.Cond.deltaSX == irsb_be.Be.Cond.deltaSX) + iom2 = 3; } - fn_spec = (VEX_REGPARM(1) HWord(*)(HWord))fn; - nm_spec = nm; - } - expectedhW = fn_generic( first_hW, hWs_to_check ); - /* If we got a specialised version, check it produces the same - result as the generic version! */ - if (fn_spec) { - vassert(nm_spec); - vassert(expectedhW == fn_spec( first_hW )); - } else { - vassert(!nm_spec); + /* We should only have identified at most one of the four idioms. */ + vassert(iom1 == 4 || iom2 == 4); + idiom = (iom1 < 4) ? iom1 : (iom2 < 4 ? iom2 : 4); + if (idiom == 4) { + ok = False; + if (debug_print) { + vex_printf("\n-+-+ &&-idiom not recognised, " + "giving up. -+-+\n\n"); + } + } } - /* Set CMSTART and CMLEN. These will describe to the despatcher - the area of guest code to invalidate should we exit with a - self-check failure. */ - - tistart_tmp = newIRTemp(irsb->tyenv, guest_word_type); - tilen_tmp = newIRTemp(irsb->tyenv, guest_word_type); - - IRConst* base2check_IRConst - = guest_word_type==Ity_I32 ? IRConst_U32(toUInt(base2check)) - : IRConst_U64(base2check); - IRConst* len2check_IRConst - = guest_word_type==Ity_I32 ? IRConst_U32(len2check) - : IRConst_U64(len2check); - - irsb->stmts[selfcheck_idx + i * 5 + 0] - = IRStmt_WrTmp(tistart_tmp, IRExpr_Const(base2check_IRConst) ); - - irsb->stmts[selfcheck_idx + i * 5 + 1] - = IRStmt_WrTmp(tilen_tmp, IRExpr_Const(len2check_IRConst) ); - - irsb->stmts[selfcheck_idx + i * 5 + 2] - = IRStmt_Put( offB_GUEST_CMSTART, IRExpr_RdTmp(tistart_tmp) ); + if (ok) { + vassert(idiom < 4); + // "Normalise" the data so as to ensure we only have one of the four + // idioms to transform. 
+ if (idiom == 2 || idiom == 3) { + swap_sx_and_ft(irsb, &irsb_be); +# define SWAP(_ty, _aa, _bb) \ + do { _ty _tmp = _aa; _aa = _bb; _bb = _tmp; } while (0) + SWAP(Int, sx_instrs_used, ft_instrs_used); + SWAP(Bool, sx_verbose_seen, ft_verbose_seen); + SWAP(Addr, sx_base, ft_base); + SWAP(UShort, sx_len, ft_len); + SWAP(IRSB*, sx_bb, ft_bb); + SWAP(BlockEnd, sx_be, ft_be); +# undef SWAP + } + if (idiom == 1 || idiom == 3) { + swap_sx_and_ft(sx_bb, &sx_be); + } + vassert(sx_be.tag == Be_Cond); + vassert(sx_be.Be.Cond.deltaFT == irsb_be.Be.Cond.deltaFT); + + if (debug_print) { + vex_printf("\n-+-+ After normalisation (idiom=%u) -+-+\n", idiom); + vex_printf("\n-+-+ IRSB -+-+\n"); + ppIRSB(irsb); + ppBlockEnd(&irsb_be); + vex_printf("\n\n-+-+ SX -+-+\n"); + ppIRSB(sx_bb); + ppBlockEnd(&sx_be); + vex_printf("\n"); + } + // Finally, check the sx block actually is speculatable. + ok = block_is_speculatable(sx_bb); + if (!ok && debug_print) { + vex_printf("\n-+-+ SX not speculatable, giving up. -+-+\n\n"); + } + } - irsb->stmts[selfcheck_idx + i * 5 + 3] - = IRStmt_Put( offB_GUEST_CMLEN, IRExpr_RdTmp(tilen_tmp) ); + if (ok) { + if (0 || debug_print) { + vex_printf("\n-+-+ DOING &&-TRANSFORM -+-+\n"); + } + // Finally really actually do the transformation. + // 0. remove the last Exit on irsb. + // 1. Add irsb->tyenv->types_used to all the tmps in sx_bb, + // by calling deltaIRStmt on all stmts. + // 2. Speculate all stmts in sx_bb on irsb_be.Be.Cond.condSX, + // **including** the last stmt (which must be an Exit). It's + // here that the And1 is generated. + // 3. Copy all speculated stmts to the end of irsb. 
+ vassert(irsb->stmts_used >= 2); + irsb->stmts_used--; + Int delta = irsb->tyenv->types_used; + + // Append sx_bb's tyenv to irsb's + for (Int i = 0; i < sx_bb->tyenv->types_used; i++) { + (void)newIRTemp(irsb->tyenv, sx_bb->tyenv->types[i]); + } - /* Generate the entry point descriptors */ - if (abiinfo_both->host_ppc_calls_use_fndescrs) { - HWord* descr = (HWord*)fn_generic; - fn_generic_entry = descr[0]; - if (fn_spec) { - descr = (HWord*)fn_spec; - fn_spec_entry = descr[0]; - } else { - fn_spec_entry = (HWord)NULL; + for (Int i = 0; i < sx_bb->stmts_used; i++) { + IRStmt* st = deepCopyIRStmt(sx_bb->stmts[i]); + deltaIRStmt(st, delta); + speculate_stmt_to_end_of(irsb, st, irsb_be.Be.Cond.condSX); } - } else { - fn_generic_entry = (HWord)fn_generic; - if (fn_spec) { - fn_spec_entry = (HWord)fn_spec; - } else { - fn_spec_entry = (HWord)NULL; + + if (debug_print) { + vex_printf("\n-+-+ FINAL RESULT -+-+\n\n"); + ppIRSB(irsb); + vex_printf("\n"); } - } - IRExpr* callexpr = NULL; - if (fn_spec) { - callexpr = mkIRExprCCall( - host_word_type, 1/*regparms*/, - nm_spec, (void*)fn_spec_entry, - mkIRExprVec_1( - mkIRExpr_HWord( (HWord)first_hW ) - ) - ); - } else { - callexpr = mkIRExprCCall( - host_word_type, 2/*regparms*/, - nm_generic, (void*)fn_generic_entry, - mkIRExprVec_2( - mkIRExpr_HWord( (HWord)first_hW ), - mkIRExpr_HWord( (HWord)hWs_to_check ) - ) - ); + // Update instrs_used, extents, budget. + instrs_used += sx_instrs_used; + add_extent(vge, sx_base, sx_len); + update_instr_budget(&instrs_avail, &verbose_mode, + sx_instrs_used, sx_verbose_seen); } + break; + } // if (be.tag == Be_Cond) - irsb->stmts[selfcheck_idx + i * 5 + 4] - = IRStmt_Exit( - IRExpr_Binop( - host_word_type==Ity_I64 ? Iop_CmpNE64 : Iop_CmpNE32, - callexpr, - host_word_type==Ity_I64 - ? 
IRExpr_Const(IRConst_U64(expectedhW)) - : IRExpr_Const(IRConst_U32(expectedhW)) - ), - Ijk_InvalICache, - /* Where we must restart if there's a failure: at the - first extent, regardless of which extent the - failure actually happened in. */ - guest_IP_bbstart_IRConst, - offB_GUEST_IP - ); - } /* for (i = 0; i < vge->n_used; i++) */ - } + // We don't know any other way to extend the block. Give up. + else { + break; + } - /* irsb->next must now be set, since we've finished the block. - Print it if necessary.*/ - vassert(irsb->next != NULL); - if (debug_print) { - vex_printf(" "); - vex_printf( "PUT(%d) = ", irsb->offsIP); - ppIRExpr( irsb->next ); - vex_printf( "; exit-"); - ppIRJumpKind(irsb->jumpkind); - vex_printf( "\n"); - vex_printf( "\n"); - } + } // while (True) + + /* We're almost done. The only thing that might need attending to is that + a self-checking preamble may need to be created. If so it gets placed + in the 15 slots reserved above. */ + create_self_checks_as_needed( + irsb, n_sc_extents, pxControl, callback_opaque, needs_self_check, + vge, abiinfo_both, guest_word_type, selfcheck_idx, offB_GUEST_CMSTART, + offB_GUEST_CMLEN, offB_GUEST_IP, guest_IP_sbstart + ); - *n_guest_instrs = n_instrs; + *n_guest_instrs = instrs_used; return irsb; } -/*------------------------------------------------------------- - A support routine for doing self-checking translations. - -------------------------------------------------------------*/ +/*--------------------------------------------------------------*/ +/*--- Functions called by self-checking transations ---*/ +/*--------------------------------------------------------------*/ -/* CLEAN HELPER */ -/* CALLED FROM GENERATED CODE */ +/* All of these are CLEAN HELPERs */ +/* All of these are CALLED FROM GENERATED CODE */ /* Compute a checksum of host memory at [addr .. addr+len-1], as fast as possible. 
All _4al versions assume that the supplied address is diff --git a/VEX/priv/guest_generic_bb_to_IR.h b/VEX/priv/guest_generic_bb_to_IR.h index 21c902d88..08d33ad3a 100644 --- a/VEX/priv/guest_generic_bb_to_IR.h +++ b/VEX/priv/guest_generic_bb_to_IR.h @@ -50,13 +50,11 @@ Result of disassembling an instruction --------------------------------------------------------------- */ -/* The results of disassembling an instruction. There are three - possible outcomes. For Dis_Resteer, the disassembler _must_ - continue at the specified address. For Dis_StopHere, the - disassembler _must_ terminate the BB. For Dis_Continue, we may at - our option either disassemble the next insn, or terminate the BB; - but in the latter case we must set the bb's ->next field to point - to the next instruction. */ +/* The results of disassembling an instruction. There are three possible + outcomes. For Dis_StopHere, the disassembler _must_ terminate the BB. For + Dis_Continue, we may at our option either disassemble the next insn, or + terminate the BB; but in the latter case we must set the bb's ->next field + to point to the next instruction. */ typedef @@ -69,13 +67,8 @@ typedef /* What happens next? Dis_StopHere: this insn terminates the BB; we must stop. Dis_Continue: we can optionally continue into the next insn - Dis_ResteerU: followed an unconditional branch; continue at - 'continueAt' - Dis_ResteerC: (speculatively, of course) followed a - conditional branch; continue at 'continueAt' */ - enum { Dis_StopHere=0x10, Dis_Continue, - Dis_ResteerU, Dis_ResteerC } whatNext; + enum { Dis_StopHere=0x10, Dis_Continue } whatNext; /* Any other hints that we should feed back to the disassembler? Dis_HintNone: no hint @@ -90,9 +83,6 @@ typedef cases. */ IRJumpKind jk_StopHere; - /* For Dis_Resteer, this is the guest address we should continue - at. Otherwise ignored (should be zero). 
*/ - Addr continueAt; } DisResult; @@ -100,22 +90,15 @@ typedef /* --------------------------------------------------------------- The type of a function which disassembles one instruction. - C's function-type syntax is really astonishing bizarre. --------------------------------------------------------------- */ /* A function of this type (DisOneInstrFn) disassembles an instruction located at host address &guest_code[delta], whose guest IP is guest_IP (this may be entirely unrelated to where the insn is actually located in the host's address space.). The returned - DisResult.len field carries its size. If the returned - DisResult.whatNext field is Dis_Resteer then DisResult.continueAt - should hold the guest IP of the next insn to disassemble. + DisResult.len field carries its size. - disInstr is not permitted to return Dis_Resteer if resteerOkFn, - when applied to the address which it wishes to resteer into, - returns False. - - The resulting IR is added to the end of irbb. + The resulting IR is added to the end of irsb. */ typedef @@ -123,21 +106,7 @@ typedef DisResult (*DisOneInstrFn) ( /* This is the IRSB to which the resulting IR is to be appended. */ - /*OUT*/ IRSB* irbb, - - /* Return True iff resteering to the given addr is allowed (for - branches/calls to destinations that are known at JIT-time) */ - /*IN*/ Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - - /* Should we speculatively resteer across conditional branches? - (Experimental and not enabled by default). The strategy is - to assume that backward branches are taken and forward - branches are not taken. */ - /*IN*/ Bool resteerCisOk, - - /* Vex-opaque data passed to all caller (valgrind) supplied - callbacks. */ - /*IN*/ void* callback_opaque, + /*OUT*/ IRSB* irsb, /* Where is the guest code? 
*/ /*IN*/ const UChar* guest_code, diff --git a/VEX/priv/guest_mips_defs.h b/VEX/priv/guest_mips_defs.h index cfc817020..3bbb3eeed 100644 --- a/VEX/priv/guest_mips_defs.h +++ b/VEX/priv/guest_mips_defs.h @@ -41,9 +41,6 @@ /* Convert one MIPS insn to IR. See the type DisOneInstrFn in guest_generic_bb_to_IR.h. */ extern DisResult disInstr_MIPS ( IRSB* irbb, - Bool (*resteerOkFn) (void *, Addr), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code, Long delta, Addr guest_IP, diff --git a/VEX/priv/guest_mips_toIR.c b/VEX/priv/guest_mips_toIR.c index 32632077d..87f53e334 100644 --- a/VEX/priv/guest_mips_toIR.c +++ b/VEX/priv/guest_mips_toIR.c @@ -1301,7 +1301,6 @@ static void jmp_lit32 ( /*MOD*/ DisResult* dres, IRJumpKind kind, Addr32 d32 ) { vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 0); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); dres->whatNext = Dis_StopHere; dres->jk_StopHere = kind; @@ -1312,7 +1311,6 @@ static void jmp_lit64 ( /*MOD*/ DisResult* dres, IRJumpKind kind, Addr64 d64 ) { vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 0); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); dres->whatNext = Dis_StopHere; dres->jk_StopHere = kind; @@ -2734,9 +2732,7 @@ static Bool dis_instr_CCondFmt ( UInt cins ) /*********************************************************/ /*--- Branch Instructions for mips64 ---*/ /*********************************************************/ -static Bool dis_instr_branch ( UInt theInstr, DisResult * dres, - Bool(*resteerOkFn) (void *, Addr), - void *callback_opaque, IRStmt ** set ) +static Bool dis_instr_branch ( UInt theInstr, DisResult * dres, IRStmt ** set ) { UInt jmpKind = 0; UChar opc1 = get_opcode(theInstr); @@ -16742,10 +16738,7 @@ extern UInt disDSPInstr_MIPS_WRK ( UInt ); static UInt disInstr_MIPS_WRK_Special(UInt cins, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, DisResult* dres, - IRStmt** bstmt, IRExpr** 
lastn, - Bool(*resteerOkFn) (/*opaque */void *, - Addr), - void* callback_opaque) + IRStmt** bstmt, IRExpr** lastn) { IRTemp t0, t1 = 0, t2, t3, t4, t5; UInt rs, rt, rd, sa, tf, function, trap_code, imm, instr_index, rot, sel; @@ -18369,10 +18362,7 @@ static UInt disInstr_MIPS_WRK_Special(UInt cins, const VexArchInfo* archinfo, static UInt disInstr_MIPS_WRK_Special2(UInt cins, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, DisResult* dres, - IRStmt** bstmt, IRExpr** lastn, - Bool(*resteerOkFn) (/*opaque */void *, - Addr), - void* callback_opaque) + IRStmt** bstmt, IRExpr** lastn) { IRTemp t0, t1 = 0, t2, t3, t4, t5, t6; UInt rs, rt, rd, function; @@ -18852,10 +18842,7 @@ static UInt disInstr_MIPS_WRK_Special2(UInt cins, const VexArchInfo* archinfo, static UInt disInstr_MIPS_WRK_Special3(UInt cins, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, DisResult* dres, - IRStmt** bstmt, IRExpr** lastn, - Bool(*resteerOkFn) (/*opaque */void *, - Addr), - void* callback_opaque) + IRStmt** bstmt, IRExpr** lastn) { IRTemp t0, t1 = 0, t2, t3, t4, t5, t6; @@ -19726,10 +19713,7 @@ static UInt disInstr_MIPS_WRK_Special3(UInt cins, const VexArchInfo* archinfo, static UInt disInstr_MIPS_WRK_00(UInt cins, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, DisResult* dres, - IRStmt** bstmt, IRExpr** lastn, - Bool(*resteerOkFn) (/*opaque */void *, - Addr), - void* callback_opaque) + IRStmt** bstmt, IRExpr** lastn) { IRTemp t0; UInt opcode, rs, rt, trap_code, imm, instr_index, p; @@ -19745,8 +19729,8 @@ static UInt disInstr_MIPS_WRK_00(UInt cins, const VexArchInfo* archinfo, switch (opcode & 0x0F) { case 0x00: /* Special */ - return disInstr_MIPS_WRK_Special(cins, archinfo, abiinfo, dres, bstmt, lastn, - resteerOkFn, callback_opaque); + return disInstr_MIPS_WRK_Special(cins, archinfo, abiinfo, + dres, bstmt, lastn); case 0x01: /* Regimm */ switch (rt) { @@ -19754,8 +19738,7 @@ static UInt disInstr_MIPS_WRK_00(UInt cins, const VexArchInfo* archinfo, DIP("bltz r%u, 
%u", rs, imm); if (mode64) { - if (!dis_instr_branch(cins, dres, resteerOkFn, - callback_opaque, bstmt)) + if (!dis_instr_branch(cins, dres, bstmt)) return -1; } else dis_branch(False, binop(Iop_CmpEQ32, binop(Iop_And32, getIReg(rs), @@ -19767,8 +19750,7 @@ static UInt disInstr_MIPS_WRK_00(UInt cins, const VexArchInfo* archinfo, DIP("bgez r%u, %u", rs, imm); if (mode64) { - if (!dis_instr_branch(cins, dres, resteerOkFn, - callback_opaque, bstmt)) + if (!dis_instr_branch(cins, dres, bstmt)) return -1; } else dis_branch(False, binop(Iop_CmpEQ32, binop(Iop_And32, getIReg(rs), @@ -19936,8 +19918,7 @@ static UInt disInstr_MIPS_WRK_00(UInt cins, const VexArchInfo* archinfo, DIP("bltzal r%u, %u", rs, imm); if (mode64) { - if (!dis_instr_branch(cins, dres, resteerOkFn, - callback_opaque, bstmt)) + if (!dis_instr_branch(cins, dres, bstmt)) return -1; } else dis_branch(True, binop(Iop_CmpEQ32, binop(Iop_And32, getIReg(rs), @@ -19949,8 +19930,7 @@ static UInt disInstr_MIPS_WRK_00(UInt cins, const VexArchInfo* archinfo, DIP("bgezal r%u, %u", rs, imm); if (mode64) { - if (!dis_instr_branch(cins, dres, resteerOkFn, - callback_opaque, bstmt)) + if (!dis_instr_branch(cins, dres, bstmt)) return -1; } else dis_branch(True, binop(Iop_CmpEQ32, binop(Iop_And32, getIReg(rs), @@ -20456,10 +20436,7 @@ static UInt disInstr_MIPS_WRK_00(UInt cins, const VexArchInfo* archinfo, static UInt disInstr_MIPS_WRK_10(UInt cins, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, DisResult* dres, - IRStmt** bstmt, IRExpr** lastn, - Bool(*resteerOkFn) (/*opaque */void *, - Addr), - void* callback_opaque) + IRStmt** bstmt, IRExpr** lastn) { IRTemp t0, t1 = 0, t2, t3, t4, t5, t6, t7; UInt opcode, rs, rt, ft, fs, fd, fmt, tf, nd, function, imm; @@ -23469,8 +23446,8 @@ static UInt disInstr_MIPS_WRK_10(UInt cins, const VexArchInfo* archinfo, } case 0x0C: /* Special2 */ - return disInstr_MIPS_WRK_Special2(cins, archinfo, abiinfo, dres, bstmt, lastn, - resteerOkFn, callback_opaque); + return 
disInstr_MIPS_WRK_Special2(cins, archinfo, abiinfo, + dres, bstmt, lastn); case 0x0D: /* DAUI */ if (VEX_MIPS_CPU_HAS_MIPSR6(archinfo->hwcaps)) { @@ -23500,8 +23477,8 @@ static UInt disInstr_MIPS_WRK_10(UInt cins, const VexArchInfo* archinfo, return -1; case 0x0F: /* Special3 */ - return disInstr_MIPS_WRK_Special3(cins, archinfo, abiinfo, dres, bstmt, lastn, - resteerOkFn, callback_opaque); + return disInstr_MIPS_WRK_Special3(cins, archinfo, abiinfo, + dres, bstmt, lastn); default: return -1; @@ -24785,11 +24762,7 @@ static UInt disInstr_MIPS_WRK_30(UInt cins, const VexArchInfo* archinfo, return 0; } -static DisResult disInstr_MIPS_WRK ( Bool(*resteerOkFn) (/*opaque */void *, - Addr), - Bool resteerCisOk, - void* callback_opaque, - Long delta64, +static DisResult disInstr_MIPS_WRK ( Long delta64, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, Bool sigill_diag ) @@ -24815,7 +24788,6 @@ static DisResult disInstr_MIPS_WRK ( Bool(*resteerOkFn) (/*opaque */void *, /* Set result defaults. 
*/ dres.whatNext = Dis_Continue; dres.len = 0; - dres.continueAt = 0; dres.jk_StopHere = Ijk_INVALID; dres.hint = Dis_HintNone; @@ -24951,8 +24923,8 @@ static DisResult disInstr_MIPS_WRK ( Bool(*resteerOkFn) (/*opaque */void *, switch (opcode & 0x30) { case 0x00: - result = disInstr_MIPS_WRK_00(cins, archinfo, abiinfo, &dres, &bstmt, &lastn, - resteerOkFn, callback_opaque); + result = disInstr_MIPS_WRK_00(cins, archinfo, abiinfo, + &dres, &bstmt, &lastn); if (result == -1) goto decode_failure; @@ -24962,8 +24934,8 @@ static DisResult disInstr_MIPS_WRK ( Bool(*resteerOkFn) (/*opaque */void *, case 0x10: - result = disInstr_MIPS_WRK_10(cins, archinfo, abiinfo, &dres, &bstmt, &lastn, - resteerOkFn, callback_opaque); + result = disInstr_MIPS_WRK_10(cins, archinfo, abiinfo, + &dres, &bstmt, &lastn); if (result == -1) goto decode_failure; @@ -25066,15 +25038,6 @@ decode_success: break; - case Dis_ResteerU: - case Dis_ResteerC: - if (mode64) - putPC(mkU64(dres.continueAt)); - else - putPC(mkU32(dres.continueAt)); - - break; - case Dis_StopHere: break; @@ -25110,9 +25073,6 @@ decode_success: /* Disassemble a single instruction into IR. The instruction is located in host memory at &guest_code[delta]. */ DisResult disInstr_MIPS( IRSB* irsb_IN, - Bool (*resteerOkFn) ( void *, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code_IN, Long delta, Addr guest_IP, @@ -25142,8 +25102,7 @@ DisResult disInstr_MIPS( IRSB* irsb_IN, guest_PC_curr_instr = (Addr64)guest_IP; #endif - dres = disInstr_MIPS_WRK(resteerOkFn, resteerCisOk, callback_opaque, - delta, archinfo, abiinfo, sigill_diag_IN); + dres = disInstr_MIPS_WRK(delta, archinfo, abiinfo, sigill_diag_IN); return dres; } diff --git a/VEX/priv/guest_nanomips_defs.h b/VEX/priv/guest_nanomips_defs.h index 490ef4bc5..39176713e 100644 --- a/VEX/priv/guest_nanomips_defs.h +++ b/VEX/priv/guest_nanomips_defs.h @@ -50,9 +50,6 @@ /* Convert one nanoMIPS insn to IR. 
See the type DisOneInstrFn in guest_generic_bb_to_IR.h. */ extern DisResult disInstr_nanoMIPS ( IRSB* irbb, - Bool (*resteerOkFn) (void *, Addr), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code, Long delta, Addr guest_IP, diff --git a/VEX/priv/guest_nanomips_toIR.c b/VEX/priv/guest_nanomips_toIR.c index 1a6ed0d7f..ad099eddc 100644 --- a/VEX/priv/guest_nanomips_toIR.c +++ b/VEX/priv/guest_nanomips_toIR.c @@ -3037,9 +3037,6 @@ static Bool check_for_special_requests_nanoMIPS(DisResult *dres, /* Disassemble a single instruction into IR. The instruction is located in host memory at &guest_code[delta]. */ DisResult disInstr_nanoMIPS( IRSB* irsb_IN, - Bool (*resteerOkFn) ( void *, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code_IN, Long delta, Addr guest_IP, @@ -3056,7 +3053,6 @@ DisResult disInstr_nanoMIPS( IRSB* irsb_IN, /* Set result defaults. */ dres.whatNext = Dis_Continue; dres.len = 0; - dres.continueAt = 0; dres.jk_StopHere = Ijk_INVALID; dres.hint = Dis_HintNone; @@ -3082,7 +3078,6 @@ DisResult disInstr_nanoMIPS( IRSB* irsb_IN, } if ((dres.whatNext == Dis_Continue) || - (dres.whatNext == Dis_ResteerC) || (dres.jk_StopHere == Ijk_Sys_syscall) || (dres.jk_StopHere == Ijk_SigTRAP) || (dres.jk_StopHere == Ijk_SigILL) || diff --git a/VEX/priv/guest_ppc_defs.h b/VEX/priv/guest_ppc_defs.h index bbca1d8e4..802e75a32 100644 --- a/VEX/priv/guest_ppc_defs.h +++ b/VEX/priv/guest_ppc_defs.h @@ -50,9 +50,6 @@ guest_generic_bb_to_IR.h. 
*/ extern DisResult disInstr_PPC ( IRSB* irbb, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code, Long delta, Addr guest_IP, diff --git a/VEX/priv/guest_ppc_toIR.c b/VEX/priv/guest_ppc_toIR.c index 12480b3d1..6e3fac74f 100644 --- a/VEX/priv/guest_ppc_toIR.c +++ b/VEX/priv/guest_ppc_toIR.c @@ -8188,9 +8188,7 @@ static IRExpr* /* :: Ity_I32 */ branch_cond_ok( UInt BO, UInt BI ) */ static Bool dis_branch ( UInt theInstr, const VexAbiInfo* vbi, - /*OUT*/DisResult* dres, - Bool (*resteerOkFn)(void*,Addr), - void* callback_opaque ) + /*OUT*/DisResult* dres ) { UChar opc1 = ifieldOPC(theInstr); UChar BO = ifieldRegDS(theInstr); @@ -8250,13 +8248,8 @@ static Bool dis_branch ( UInt theInstr, } } - if (resteerOkFn( callback_opaque, tgt )) { - dres->whatNext = Dis_ResteerU; - dres->continueAt = tgt; - } else { - dres->jk_StopHere = flag_LK ? Ijk_Call : Ijk_Boring; ; - putGST( PPC_GST_CIA, mkSzImm(ty, tgt) ); - } + dres->jk_StopHere = flag_LK ? Ijk_Call : Ijk_Boring; + putGST( PPC_GST_CIA, mkSzImm(ty, tgt) ); break; case 0x10: // bc (Branch Conditional, PPC32 p361) @@ -27939,9 +27932,7 @@ static Bool dis_av_fp_convert ( UInt theInstr ) static Bool dis_transactional_memory ( UInt theInstr, UInt nextInstr, const VexAbiInfo* vbi, - /*OUT*/DisResult* dres, - Bool (*resteerOkFn)(void*,Addr), - void* callback_opaque ) + /*OUT*/DisResult* dres ) { UInt opc2 = IFIELD( theInstr, 1, 10 ); @@ -28419,9 +28410,6 @@ static UInt get_VSX60_opc2(UInt opc2_full, UInt theInstr) static DisResult disInstr_PPC_WRK ( - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, Long delta64, const VexArchInfo* archinfo, const VexAbiInfo* abiinfo, @@ -28474,7 +28462,6 @@ DisResult disInstr_PPC_WRK ( /* Set result defaults.
*/ dres.whatNext = Dis_Continue; dres.len = 0; - dres.continueAt = 0; dres.jk_StopHere = Ijk_INVALID; dres.hint = Dis_HintNone; @@ -28662,8 +28649,7 @@ DisResult disInstr_PPC_WRK ( /* Branch Instructions */ case 0x12: case 0x10: // b, bc - if (dis_branch(theInstr, abiinfo, &dres, - resteerOkFn, callback_opaque)) + if (dis_branch(theInstr, abiinfo, &dres)) goto decode_success; goto decode_failure; @@ -29318,8 +29304,7 @@ DisResult disInstr_PPC_WRK ( /* Branch Instructions */ case 0x210: case 0x010: // bcctr, bclr - if (dis_branch(theInstr, abiinfo, &dres, - resteerOkFn, callback_opaque)) + if (dis_branch(theInstr, abiinfo, &dres)) goto decode_success; goto decode_failure; @@ -29420,8 +29405,7 @@ DisResult disInstr_PPC_WRK ( case 0x38E: case 0x3AE: case 0x3EE: // tabort., treclaim., trechkpt. if (dis_transactional_memory( theInstr, getUIntPPCendianly( &guest_code[delta + 4]), - abiinfo, &dres, - resteerOkFn, callback_opaque)) + abiinfo, &dres)) goto decode_success; goto decode_failure; @@ -30137,7 +30121,6 @@ DisResult disInstr_PPC_WRK ( dres.len = 0; dres.whatNext = Dis_StopHere; dres.jk_StopHere = Ijk_NoDecode; - dres.continueAt = 0; return dres; } /* switch (opc) for the main (primary) opcode switch. */ @@ -30147,10 +30130,6 @@ DisResult disInstr_PPC_WRK ( case Dis_Continue: putGST( PPC_GST_CIA, mkSzImm(ty, guest_CIA_curr_instr + 4)); break; - case Dis_ResteerU: - case Dis_ResteerC: - putGST( PPC_GST_CIA, mkSzImm(ty, dres.continueAt)); - break; case Dis_StopHere: break; default: @@ -30178,9 +30157,6 @@ DisResult disInstr_PPC_WRK ( is located in host memory at &guest_code[delta]. 
*/ DisResult disInstr_PPC ( IRSB* irsb_IN, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code_IN, Long delta, Addr guest_IP, @@ -30205,7 +30181,6 @@ DisResult disInstr_PPC ( IRSB* irsb_IN, dres.len = 0; dres.whatNext = Dis_StopHere; dres.jk_StopHere = Ijk_NoDecode; - dres.continueAt = 0; dres.hint = Dis_HintNone; return dres; } @@ -30233,8 +30208,7 @@ DisResult disInstr_PPC ( IRSB* irsb_IN, guest_CIA_curr_instr = mkSzAddr(ty, guest_IP); guest_CIA_bbstart = mkSzAddr(ty, guest_IP - delta); - dres = disInstr_PPC_WRK ( resteerOkFn, resteerCisOk, callback_opaque, - delta, archinfo, abiinfo, sigill_diag_IN); + dres = disInstr_PPC_WRK ( delta, archinfo, abiinfo, sigill_diag_IN ); return dres; } diff --git a/VEX/priv/guest_s390_defs.h b/VEX/priv/guest_s390_defs.h index 1470558ce..b9f038aaa 100644 --- a/VEX/priv/guest_s390_defs.h +++ b/VEX/priv/guest_s390_defs.h @@ -39,9 +39,6 @@ /* Convert one s390 insn to IR. See the type DisOneInstrFn in guest_generic_bb_to_IR.h. */ DisResult disInstr_S390 ( IRSB* irbb, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code, Long delta, Addr guest_IP, diff --git a/VEX/priv/guest_s390_toIR.c b/VEX/priv/guest_s390_toIR.c index 06ec27fae..a8f0d3a07 100644 --- a/VEX/priv/guest_s390_toIR.c +++ b/VEX/priv/guest_s390_toIR.c @@ -68,10 +68,6 @@ static Addr64 guest_IA_next_instr; /* Result of disassembly step. */ static DisResult *dis_res; -/* Resteer function and callback data */ -static Bool (*resteer_fn)(void *, Addr); -static void *resteer_data; - /* Whether to print diagnostics for illegal instructions. 
*/ static Bool sigill_diag; @@ -427,15 +423,10 @@ call_function(IRExpr *callee_address) static void call_function_and_chase(Addr64 callee_address) { - if (resteer_fn(resteer_data, callee_address)) { - dis_res->whatNext = Dis_ResteerU; - dis_res->continueAt = callee_address; - } else { - put_IA(mkaddr_expr(callee_address)); + put_IA(mkaddr_expr(callee_address)); - dis_res->whatNext = Dis_StopHere; - dis_res->jk_StopHere = Ijk_Call; - } + dis_res->whatNext = Dis_StopHere; + dis_res->jk_StopHere = Ijk_Call; } /* Function return sequence */ @@ -501,19 +492,14 @@ always_goto(IRExpr *target) /* An unconditional branch to a known target. */ +// QQQQ fixme this is now the same as always_goto static void always_goto_and_chase(Addr64 target) { - if (resteer_fn(resteer_data, target)) { - /* Follow into the target */ - dis_res->whatNext = Dis_ResteerU; - dis_res->continueAt = target; - } else { - put_IA(mkaddr_expr(target)); + put_IA(mkaddr_expr(target)); - dis_res->whatNext = Dis_StopHere; - dis_res->jk_StopHere = Ijk_Boring; - } + dis_res->whatNext = Dis_StopHere; + dis_res->jk_StopHere = Ijk_Boring; } /* A system call */ @@ -21953,7 +21939,6 @@ disInstr_S390_WRK(const UChar *insn) /* ---------------------------------------------------- */ dres.whatNext = Dis_Continue; dres.len = insn_length; - dres.continueAt = 0; dres.jk_StopHere = Ijk_INVALID; dres.hint = Dis_HintNone; @@ -21973,17 +21958,12 @@ disInstr_S390_WRK(const UChar *insn) dres.len = 0; dres.whatNext = Dis_StopHere; dres.jk_StopHere = Ijk_NoDecode; - dres.continueAt = 0; } else { /* Decode success */ switch (dres.whatNext) { case Dis_Continue: put_IA(mkaddr_expr(guest_IA_next_instr)); break; - case Dis_ResteerU: - case Dis_ResteerC: - put_IA(mkaddr_expr(dres.continueAt)); - break; case Dis_StopHere: if (dres.jk_StopHere == Ijk_EmWarn || dres.jk_StopHere == Ijk_EmFail) { @@ -22011,9 +21991,6 @@ disInstr_S390_WRK(const UChar *insn) DisResult disInstr_S390(IRSB *irsb_IN, - Bool (*resteerOkFn)(void *, Addr), - Bool 
resteerCisOk, - void *callback_opaque, const UChar *guest_code, Long delta, Addr guest_IP, @@ -22028,8 +22005,6 @@ disInstr_S390(IRSB *irsb_IN, /* Set globals (see top of this file) */ guest_IA_curr_instr = guest_IP; irsb = irsb_IN; - resteer_fn = resteerOkFn; - resteer_data = callback_opaque; sigill_diag = sigill_diag_IN; return disInstr_S390_WRK(guest_code + delta); diff --git a/VEX/priv/guest_x86_defs.h b/VEX/priv/guest_x86_defs.h index d10ca2771..3f86339bc 100644 --- a/VEX/priv/guest_x86_defs.h +++ b/VEX/priv/guest_x86_defs.h @@ -49,9 +49,6 @@ guest_generic_bb_to_IR.h. */ extern DisResult disInstr_X86 ( IRSB* irbb, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code, Long delta, Addr guest_IP, diff --git a/VEX/priv/guest_x86_toIR.c b/VEX/priv/guest_x86_toIR.c index fd45a2ae2..01bcc8a95 100644 --- a/VEX/priv/guest_x86_toIR.c +++ b/VEX/priv/guest_x86_toIR.c @@ -1346,7 +1346,6 @@ static void jmp_lit( /*MOD*/DisResult* dres, { vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 0); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); dres->whatNext = Dis_StopHere; dres->jk_StopHere = kind; @@ -1358,7 +1357,6 @@ static void jmp_treg( /*MOD*/DisResult* dres, { vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 0); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); dres->whatNext = Dis_StopHere; dres->jk_StopHere = kind; @@ -1373,7 +1371,6 @@ void jcc_01( /*MOD*/DisResult* dres, X86Condcode condPos; vassert(dres->whatNext == Dis_Continue); vassert(dres->len == 0); - vassert(dres->continueAt == 0); vassert(dres->jk_StopHere == Ijk_INVALID); dres->whatNext = Dis_StopHere; dres->jk_StopHere = Ijk_Boring; @@ -8070,9 +8067,6 @@ static IRTemp math_BSWAP ( IRTemp t1, IRType ty ) static DisResult disInstr_X86_WRK ( /*OUT*/Bool* expect_CAS, - Bool (*resteerOkFn) ( /*opaque*/void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, Long delta64, 
const VexArchInfo* archinfo, const VexAbiInfo* vbi, @@ -8111,7 +8105,6 @@ DisResult disInstr_X86_WRK ( /* Set result defaults. */ dres.whatNext = Dis_Continue; dres.len = 0; - dres.continueAt = 0; dres.hint = Dis_HintNone; dres.jk_StopHere = Ijk_INVALID; @@ -13143,14 +13136,8 @@ DisResult disInstr_X86_WRK ( assign(t1, binop(Iop_Sub32, getIReg(4,R_ESP), mkU32(4))); putIReg(4, R_ESP, mkexpr(t1)); storeLE( mkexpr(t1), mkU32(guest_EIP_bbstart+delta)); - if (resteerOkFn( callback_opaque, (Addr32)d32 )) { - /* follow into the call target. */ - dres.whatNext = Dis_ResteerU; - dres.continueAt = (Addr32)d32; - } else { - jmp_lit(&dres, Ijk_Call, d32); - vassert(dres.whatNext == Dis_StopHere); - } + jmp_lit(&dres, Ijk_Call, d32); + vassert(dres.whatNext == Dis_StopHere); DIP("call 0x%x\n",d32); } break; @@ -13460,13 +13447,8 @@ DisResult disInstr_X86_WRK ( case 0xEB: /* Jb (jump, byte offset) */ d32 = (((Addr32)guest_EIP_bbstart)+delta+1) + getSDisp8(delta); delta++; - if (resteerOkFn( callback_opaque, (Addr32)d32) ) { - dres.whatNext = Dis_ResteerU; - dres.continueAt = (Addr32)d32; - } else { - jmp_lit(&dres, Ijk_Boring, d32); - vassert(dres.whatNext == Dis_StopHere); - } + jmp_lit(&dres, Ijk_Boring, d32); + vassert(dres.whatNext == Dis_StopHere); DIP("jmp-8 0x%x\n", d32); break; @@ -13474,13 +13456,8 @@ DisResult disInstr_X86_WRK ( vassert(sz == 4); /* JRS added 2004 July 11 */ d32 = (((Addr32)guest_EIP_bbstart)+delta+sz) + getSDisp(sz,delta); delta += sz; - if (resteerOkFn( callback_opaque, (Addr32)d32) ) { - dres.whatNext = Dis_ResteerU; - dres.continueAt = (Addr32)d32; - } else { - jmp_lit(&dres, Ijk_Boring, d32); - vassert(dres.whatNext == Dis_StopHere); - } + jmp_lit(&dres, Ijk_Boring, d32); + vassert(dres.whatNext == Dis_StopHere); DIP("jmp 0x%x\n", d32); break; @@ -13506,53 +13483,10 @@ DisResult disInstr_X86_WRK ( vassert(-128 <= jmpDelta && jmpDelta < 128); d32 = (((Addr32)guest_EIP_bbstart)+delta+1) + jmpDelta; delta++; - if (resteerCisOk - && 
vex_control.guest_chase_cond - && (Addr32)d32 != (Addr32)guest_EIP_bbstart - && jmpDelta < 0 - && resteerOkFn( callback_opaque, (Addr32)d32) ) { - /* Speculation: assume this backward branch is taken. So we - need to emit a side-exit to the insn following this one, - on the negation of the condition, and continue at the - branch target address (d32). If we wind up back at the - first instruction of the trace, just stop; it's better to - let the IR loop unroller handle that case. */ - stmt( IRStmt_Exit( - mk_x86g_calculate_condition((X86Condcode)(1 ^ (opc - 0x70))), - Ijk_Boring, - IRConst_U32(guest_EIP_bbstart+delta), - OFFB_EIP ) ); - dres.whatNext = Dis_ResteerC; - dres.continueAt = (Addr32)d32; - comment = "(assumed taken)"; - } - else - if (resteerCisOk - && vex_control.guest_chase_cond - && (Addr32)d32 != (Addr32)guest_EIP_bbstart - && jmpDelta >= 0 - && resteerOkFn( callback_opaque, - (Addr32)(guest_EIP_bbstart+delta)) ) { - /* Speculation: assume this forward branch is not taken. So - we need to emit a side-exit to d32 (the dest) and continue - disassembling at the insn immediately following this - one. */ - stmt( IRStmt_Exit( - mk_x86g_calculate_condition((X86Condcode)(opc - 0x70)), - Ijk_Boring, - IRConst_U32(d32), - OFFB_EIP ) ); - dres.whatNext = Dis_ResteerC; - dres.continueAt = guest_EIP_bbstart + delta; - comment = "(assumed not taken)"; - } - else { - /* Conservative default translation - end the block at this - point. */ - jcc_01( &dres, (X86Condcode)(opc - 0x70), - (Addr32)(guest_EIP_bbstart+delta), d32); - vassert(dres.whatNext == Dis_StopHere); - } + /* End the block at this point. 
*/ + jcc_01( &dres, (X86Condcode)(opc - 0x70), + (Addr32)(guest_EIP_bbstart+delta), d32); + vassert(dres.whatNext == Dis_StopHere); DIP("j%s-8 0x%x %s\n", name_X86Condcode(opc - 0x70), d32, comment); break; } @@ -15075,54 +15009,10 @@ DisResult disInstr_X86_WRK ( jmpDelta = (Int)getUDisp32(delta); d32 = (((Addr32)guest_EIP_bbstart)+delta+4) + jmpDelta; delta += 4; - if (resteerCisOk - && vex_control.guest_chase_cond - && (Addr32)d32 != (Addr32)guest_EIP_bbstart - && jmpDelta < 0 - && resteerOkFn( callback_opaque, (Addr32)d32) ) { - /* Speculation: assume this backward branch is taken. So - we need to emit a side-exit to the insn following this - one, on the negation of the condition, and continue at - the branch target address (d32). If we wind up back at - the first instruction of the trace, just stop; it's - better to let the IR loop unroller handle that case.*/ - stmt( IRStmt_Exit( - mk_x86g_calculate_condition((X86Condcode) - (1 ^ (opc - 0x80))), - Ijk_Boring, - IRConst_U32(guest_EIP_bbstart+delta), - OFFB_EIP ) ); - dres.whatNext = Dis_ResteerC; - dres.continueAt = (Addr32)d32; - comment = "(assumed taken)"; - } - else - if (resteerCisOk - && vex_control.guest_chase_cond - && (Addr32)d32 != (Addr32)guest_EIP_bbstart - && jmpDelta >= 0 - && resteerOkFn( callback_opaque, - (Addr32)(guest_EIP_bbstart+delta)) ) { - /* Speculation: assume this forward branch is not taken. - So we need to emit a side-exit to d32 (the dest) and - continue disassembling at the insn immediately - following this one. */ - stmt( IRStmt_Exit( - mk_x86g_calculate_condition((X86Condcode)(opc - 0x80)), - Ijk_Boring, - IRConst_U32(d32), - OFFB_EIP ) ); - dres.whatNext = Dis_ResteerC; - dres.continueAt = guest_EIP_bbstart + delta; - comment = "(assumed not taken)"; - } - else { - /* Conservative default translation - end the block at - this point. 
*/ - jcc_01( &dres, (X86Condcode)(opc - 0x80), - (Addr32)(guest_EIP_bbstart+delta), d32); - vassert(dres.whatNext == Dis_StopHere); - } + /* End the block at this point. */ + jcc_01( &dres, (X86Condcode)(opc - 0x80), + (Addr32)(guest_EIP_bbstart+delta), d32); + vassert(dres.whatNext == Dis_StopHere); DIP("j%s-32 0x%x %s\n", name_X86Condcode(opc - 0x80), d32, comment); break; } @@ -15462,10 +15352,6 @@ DisResult disInstr_X86_WRK ( case Dis_Continue: stmt( IRStmt_Put( OFFB_EIP, mkU32(guest_EIP_bbstart + delta) ) ); break; - case Dis_ResteerU: - case Dis_ResteerC: - stmt( IRStmt_Put( OFFB_EIP, mkU32(dres.continueAt) ) ); - break; case Dis_StopHere: break; default: @@ -15489,9 +15375,6 @@ DisResult disInstr_X86_WRK ( is located in host memory at &guest_code[delta]. */ DisResult disInstr_X86 ( IRSB* irsb_IN, - Bool (*resteerOkFn) ( void*, Addr ), - Bool resteerCisOk, - void* callback_opaque, const UChar* guest_code_IN, Long delta, Addr guest_IP, @@ -15515,9 +15398,7 @@ DisResult disInstr_X86 ( IRSB* irsb_IN, x1 = irsb_IN->stmts_used; expect_CAS = False; - dres = disInstr_X86_WRK ( &expect_CAS, resteerOkFn, - resteerCisOk, - callback_opaque, + dres = disInstr_X86_WRK ( &expect_CAS, delta, archinfo, abiinfo, sigill_diag_IN ); x2 = irsb_IN->stmts_used; vassert(x2 >= x1); @@ -15535,9 +15416,7 @@ DisResult disInstr_X86 ( IRSB* irsb_IN, /* inconsistency detected. re-disassemble the instruction so as to generate a useful error message; then assert. 
*/ vex_traceflags |= VEX_TRACE_FE; - dres = disInstr_X86_WRK ( &expect_CAS, resteerOkFn, - resteerCisOk, - callback_opaque, + dres = disInstr_X86_WRK ( &expect_CAS, delta, archinfo, abiinfo, sigill_diag_IN ); for (i = x1; i < x2; i++) { vex_printf("\t\t"); diff --git a/VEX/priv/host_amd64_isel.c b/VEX/priv/host_amd64_isel.c index e19dcb34f..a389e8178 100644 --- a/VEX/priv/host_amd64_isel.c +++ b/VEX/priv/host_amd64_isel.c @@ -2292,9 +2292,7 @@ static AMD64CondCode iselCondCode_wrk ( ISelEnv* env, const IRExpr* e ) /* var */ if (e->tag == Iex_RdTmp) { HReg r64 = lookupIRTemp(env, e->Iex.RdTmp.tmp); - HReg dst = newVRegI(env); - addInstr(env, mk_iMOVsd_RR(r64,dst)); - addInstr(env, AMD64Instr_Alu64R(Aalu_AND,AMD64RMI_Imm(1),dst)); + addInstr(env, AMD64Instr_Test64(1,r64)); return Acc_NZ; } @@ -2536,6 +2534,25 @@ static AMD64CondCode iselCondCode_wrk ( ISelEnv* env, const IRExpr* e ) } } + /* And1(x,y), Or1(x,y) */ + /* FIXME: We could (and probably should) do a lot better here. If both args + are in temps already then we can just emit a reg-reg And/Or directly, + followed by the final Test. */ + if (e->tag == Iex_Binop + && (e->Iex.Binop.op == Iop_And1 || e->Iex.Binop.op == Iop_Or1)) { + // We could probably be cleverer about this. In the meantime .. + HReg x_as_64 = newVRegI(env); + AMD64CondCode cc_x = iselCondCode(env, e->Iex.Binop.arg1); + addInstr(env, AMD64Instr_Set64(cc_x, x_as_64)); + HReg y_as_64 = newVRegI(env); + AMD64CondCode cc_y = iselCondCode(env, e->Iex.Binop.arg2); + addInstr(env, AMD64Instr_Set64(cc_y, y_as_64)); + AMD64AluOp aop = e->Iex.Binop.op == Iop_And1 ? 
Aalu_AND : Aalu_OR; + addInstr(env, AMD64Instr_Alu64R(aop, AMD64RMI_Reg(x_as_64), y_as_64)); + addInstr(env, AMD64Instr_Test64(1, y_as_64)); + return Acc_NZ; + } + ppIRExpr(e); vpanic("iselCondCode(amd64)"); } diff --git a/VEX/priv/ir_defs.c b/VEX/priv/ir_defs.c index 30e936a2d..603557466 100644 --- a/VEX/priv/ir_defs.c +++ b/VEX/priv/ir_defs.c @@ -168,6 +168,8 @@ void ppIROp ( IROp op ) case Iop_64to8: vex_printf("64to8"); return; case Iop_Not1: vex_printf("Not1"); return; + case Iop_And1: vex_printf("And1"); return; + case Iop_Or1: vex_printf("Or1"); return; case Iop_32to1: vex_printf("32to1"); return; case Iop_64to1: vex_printf("64to1"); return; case Iop_1Uto8: vex_printf("1Uto8"); return; @@ -2849,7 +2851,12 @@ void typeOfPrimop ( IROp op, case Iop_64HLto128: BINARY(Ity_I64,Ity_I64, Ity_I128); - case Iop_Not1: UNARY(Ity_I1, Ity_I1); + case Iop_Not1: + UNARY(Ity_I1, Ity_I1); + case Iop_And1: + case Iop_Or1: + BINARY(Ity_I1,Ity_I1, Ity_I1); + case Iop_1Uto8: UNARY(Ity_I1, Ity_I8); case Iop_1Sto8: UNARY(Ity_I1, Ity_I8); case Iop_1Sto16: UNARY(Ity_I1, Ity_I16); diff --git a/VEX/priv/ir_opt.c b/VEX/priv/ir_opt.c index bbe9401d5..9e9c02601 100644 --- a/VEX/priv/ir_opt.c +++ b/VEX/priv/ir_opt.c @@ -891,7 +891,7 @@ static void redundant_put_removal_BB ( IRStmt* st; UInt key = 0; /* keep gcc -O happy */ - vassert(pxControl < VexRegUpdAllregsAtEachInsn); +// vassert(pxControl < VexRegUpdAllregsAtEachInsn); HashHW* env = newHHW(); @@ -2819,7 +2819,7 @@ static IRStmt* subst_and_fold_Stmt ( IRExpr** env, IRStmt* st ) } -IRSB* cprop_BB ( IRSB* in ) +static IRSB* cprop_BB_wrk ( IRSB* in, Bool mustRetainNoOps ) { Int i; IRSB* out; @@ -2854,7 +2854,7 @@ IRSB* cprop_BB ( IRSB* in ) st2 = in->stmts[i]; /* perhaps st2 is already a no-op? 
          */
-      if (st2->tag == Ist_NoOp) continue;
+      if (st2->tag == Ist_NoOp && !mustRetainNoOps) continue;
       st2 = subst_and_fold_Stmt( env, st2 );
@@ -2864,7 +2864,11 @@ IRSB* cprop_BB ( IRSB* in )
          /* If the statement has been folded into a no-op, forget
            it. */
          case Ist_NoOp:
-            continue;
+            if (mustRetainNoOps) {
+               break;
+            } else {
+               continue;
+            }
          /* If the statement assigns to an IRTemp add it to the
            running environment. This is for the benefit of copy
@@ -2982,6 +2986,11 @@ IRSB* cprop_BB ( IRSB* in )
 }
+IRSB* cprop_BB ( IRSB* in ) {
+   return cprop_BB_wrk(in, /*mustRetainNoOps=*/False);
+}
+
+
 /*---------------------------------------------------------------*/
 /*--- Dead code (t = E) removal                                ---*/
 /*---------------------------------------------------------------*/
@@ -4665,7 +4674,7 @@ static void deltaIRExpr ( IRExpr* e, Int delta )
 /* Adjust all tmp values (names) in st by delta.  st is destructively
    modified. */
-static void deltaIRStmt ( IRStmt* st, Int delta )
+/*static*/ void deltaIRStmt ( IRStmt* st, Int delta )
 {
    Int      i;
    IRDirty* d;
@@ -6632,8 +6641,6 @@ static void considerExpensives ( /*OUT*/Bool* hasGetIorPutI,
    may get shared.  So never change a field of such a tree node;
    instead construct and return a new one if needed. */
-
-
 IRSB* do_iropt_BB(
          IRSB* bb0,
          IRExpr* (*specHelper) (const HChar*, IRExpr**, IRStmt**, Int),
@@ -6653,7 +6660,8 @@ IRSB* do_iropt_BB(
    /* First flatten the block out, since all other phases assume flat
      code.
      */
-
+   // FIXME this is no longer necessary, since minimal_iropt should have
+   // flattened it
    bb = flatten_BB ( bb0 );
    if (iropt_verbose) {
@@ -6746,6 +6754,81 @@ IRSB* do_iropt_BB(
    return bb;
 }
+//static Bool alwaysPrecise ( Int minoff, Int maxoff,
+//                            VexRegisterUpdates pxControl )
+//{
+//   return True;
+//}
+
+// FIXME make this as cheap as possible
+IRSB* do_minimal_initial_iropt_BB(
+        IRSB* bb0
+        //IRExpr* (*specHelper) (const HChar*, IRExpr**, IRStmt**, Int),
+        //Bool (*preciseMemExnsFn)(Int,Int,VexRegisterUpdates),
+        //VexRegisterUpdates pxControl,
+        //Addr guest_addr,
+        //VexArch guest_arch
+     )
+{
+   /* First flatten the block out, since all other phases assume flat code. */
+   IRSB* bb = flatten_BB ( bb0 );
+
+   if (iropt_verbose) {
+      vex_printf("\n========= FLAT\n\n" );
+      ppIRSB(bb);
+   }
+
+   redundant_get_removal_BB ( bb );
+   bb = cprop_BB_wrk ( bb, /*mustRetainNoOps=*/True );  // FIXME
+   // This is overkill.  We only really want constant prop, not folding
+
+   // Minor tidying of the block end, to remove a redundant Put of the IP right
+   // at the end:
+   /*
+      ------ IMark(0x401FEC9, 2, 0) ------
+      t18 = GET:I64(168)
+      t19 = amd64g_calculate_condition[mcx=0x13]{0x58155130}(0x4:I64,0x5:I64,..
+      t14 = 64to1(t19)
+      if (t14) { PUT(184) = 0x401FED6:I64; exit-Boring }
+      PUT(184) = 0x401FECB:I64   <--------------------------------
+      PUT(184) = 0x401FECB:I64; exit-Boring
+   */
+   if (bb->stmts_used > 0) {
+      const IRStmt* last = bb->stmts[bb->stmts_used - 1];
+      if (last->tag == Ist_Put && last->Ist.Put.offset == bb->offsIP
+          && eqIRAtom(last->Ist.Put.data, bb->next)) {
+         bb->stmts_used--;
+      }
+   }
+
+   return bb;
+}
+
+/* Copy the contents of |src| to the end of |dst|.  This entails fixing up the
+   tmp numbers in |src| accordingly.  The final destination of |dst| is thrown
+   away and replaced by the final destination of |src|.  This function doesn't
+   make any assessment of whether it's meaningful or valid to concatenate the
+   two IRSBs; it just *does* the concatenation.
+*/
+void concatenate_irsbs ( IRSB* dst, IRSB* src )
+{
+   // FIXME this is almost identical to code at the end of maybe_unroll_loop_BB.
+   // Maybe try to common it up.
+   Int delta = dst->tyenv->types_used;
+   for (Int i = 0; i < src->tyenv->types_used; i++) {
+      (void)newIRTemp(dst->tyenv, src->tyenv->types[i]);
+   }
+
+   for (Int i = 0; i < src->stmts_used; i++) {
+      IRStmt* s = deepCopyIRStmt(src->stmts[i]);
+      deltaIRStmt(s, delta);
+      addStmtToIRSB(dst, s);
+   }
+   deltaIRExpr(src->next, delta);
+
+   dst->next = src->next;
+   dst->jumpkind = src->jumpkind;
+   vassert(dst->offsIP == src->offsIP);
+}
 /*---------------------------------------------------------------*/
 /*--- end                                             ir_opt.c ---*/
diff --git a/VEX/priv/ir_opt.h b/VEX/priv/ir_opt.h
index 3ea4a5aa6..e5127be6c 100644
--- a/VEX/priv/ir_opt.h
+++ b/VEX/priv/ir_opt.h
@@ -70,6 +70,12 @@ Addr ado_treebuild_BB (
         VexRegisterUpdates pxControl
      );
+IRSB* do_minimal_initial_iropt_BB(
+        IRSB* bb0
+     );
+void concatenate_irsbs ( IRSB* dst, IRSB* src );
+void deltaIRStmt ( IRStmt* st, Int delta );
+
 #endif /* ndef __VEX_IR_OPT_H */
 /*---------------------------------------------------------------*/
diff --git a/VEX/pub/libvex.h b/VEX/pub/libvex.h
index 1d1979cb5..5a76066d7 100644
--- a/VEX/pub/libvex.h
+++ b/VEX/pub/libvex.h
@@ -517,9 +517,11 @@ typedef
         meaning that if a block contains less than 10 guest insns so far,
         the front end(s) will attempt to chase into its successor. A
         setting of zero disables chasing.  */
+     // FIXME change this to a Bool
      Int guest_chase_thresh;
      /* EXPERIMENTAL: chase across conditional branches?  Not all
        front ends honour this.  Default: NO. */
+     // FIXME remove this completely.
      Bool guest_chase_cond;
      /* Register allocator version. Allowed values are:
        - '2': previous, good and slow implementation.
diff --git a/VEX/pub/libvex_ir.h b/VEX/pub/libvex_ir.h
index d5e7f5036..087a41457 100644
--- a/VEX/pub/libvex_ir.h
+++ b/VEX/pub/libvex_ir.h
@@ -557,6 +557,8 @@ typedef
       Iop_64HLto128,   // :: (I64,I64) -> I128
       /* 1-bit stuff */
       Iop_Not1,   /* :: Ity_Bit -> Ity_Bit */
+      Iop_And1,   /* :: (Ity_Bit, Ity_Bit) -> Ity_Bit.  Evaluates both args! */
+      Iop_Or1,    /* :: (Ity_Bit, Ity_Bit) -> Ity_Bit.  Evaluates both args! */
       Iop_32to1,  /* :: Ity_I32 -> Ity_Bit, just select bit[0] */
       Iop_64to1,  /* :: Ity_I64 -> Ity_Bit, just select bit[0] */
       Iop_1Uto8,  /* :: Ity_Bit -> Ity_I8,  unsigned widen */
diff --git a/memcheck/mc_translate.c b/memcheck/mc_translate.c
index 02b93ba51..bd29ea09f 100644
--- a/memcheck/mc_translate.c
+++ b/memcheck/mc_translate.c
@@ -610,6 +610,12 @@ static IRExpr *i128_const_zero(void)
 /* --------- Defined-if-either-defined --------- */
+static IRAtom* mkDifD1 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
+   tl_assert(isShadowAtom(mce,a1));
+   tl_assert(isShadowAtom(mce,a2));
+   return assignNew('V', mce, Ity_I1, binop(Iop_And1, a1, a2));
+}
+
 static IRAtom* mkDifD8 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
    tl_assert(isShadowAtom(mce,a1));
    tl_assert(isShadowAtom(mce,a2));
@@ -648,6 +654,12 @@ static IRAtom* mkDifDV256 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
 /* --------- Undefined-if-either-undefined --------- */
+static IRAtom* mkUifU1 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
+   tl_assert(isShadowAtom(mce,a1));
+   tl_assert(isShadowAtom(mce,a2));
+   return assignNew('V', mce, Ity_I1, binop(Iop_Or1, a1, a2));
+}
+
 static IRAtom* mkUifU8 ( MCEnv* mce, IRAtom* a1, IRAtom* a2 ) {
    tl_assert(isShadowAtom(mce,a1));
    tl_assert(isShadowAtom(mce,a2));
@@ -768,6 +780,14 @@ static IRAtom* mkRight64 ( MCEnv* mce, IRAtom* a1 )
 /* ImproveAND(data, vbits) = data OR vbits.  Defined (0) data 0s give
    defined (0); all other -> undefined (1).
 */
+static IRAtom* mkImproveAND1 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
+{
+   tl_assert(isOriginalAtom(mce, data));
+   tl_assert(isShadowAtom(mce, vbits));
+   tl_assert(sameKindedAtoms(data, vbits));
+   return assignNew('V', mce, Ity_I1, binop(Iop_Or1, data, vbits));
+}
+
 static IRAtom* mkImproveAND8 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
 {
    tl_assert(isOriginalAtom(mce, data));
@@ -819,6 +839,18 @@ static IRAtom* mkImproveANDV256 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
 /* ImproveOR(data, vbits) = ~data OR vbits.  Defined (0) data 1s give
    defined (0); all other -> undefined (1).
 */
+static IRAtom* mkImproveOR1 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
+{
+   tl_assert(isOriginalAtom(mce, data));
+   tl_assert(isShadowAtom(mce, vbits));
+   tl_assert(sameKindedAtoms(data, vbits));
+   return assignNew(
+             'V', mce, Ity_I1,
+             binop(Iop_Or1,
+                   assignNew('V', mce, Ity_I1, unop(Iop_Not1, data)),
+                   vbits) );
+}
+
 static IRAtom* mkImproveOR8 ( MCEnv* mce, IRAtom* data, IRAtom* vbits )
 {
    tl_assert(isOriginalAtom(mce, data));
@@ -3392,10 +3424,10 @@ IRAtom* expr2vbits_Binop ( MCEnv* mce,
                            IRAtom* atom1, IRAtom* atom2,
                            HowUsed hu/*use HuOth if unknown*/ )
 {
-   IRType  and_or_ty;
-   IRAtom* (*uifu)    (MCEnv*, IRAtom*, IRAtom*);
-   IRAtom* (*difd)    (MCEnv*, IRAtom*, IRAtom*);
-   IRAtom* (*improve) (MCEnv*, IRAtom*, IRAtom*);
+   IRType  and_or_ty = Ity_INVALID;
+   IRAtom* (*uifu)    (MCEnv*, IRAtom*, IRAtom*) = NULL;
+   IRAtom* (*difd)    (MCEnv*, IRAtom*, IRAtom*) = NULL;
+   IRAtom* (*improve) (MCEnv*, IRAtom*, IRAtom*) = NULL;
    IRAtom* vatom1 = expr2vbits( mce, atom1, HuOth );
    IRAtom* vatom2 = expr2vbits( mce, atom2, HuOth );
@@ -4654,6 +4686,9 @@ IRAtom* expr2vbits_Binop ( MCEnv* mce,
       case Iop_And8:
          uifu = mkUifU8; difd = mkDifD8;
          and_or_ty = Ity_I8; improve = mkImproveAND8; goto do_And_Or;
+      case Iop_And1:
+         uifu = mkUifU1; difd = mkDifD1;
+         and_or_ty = Ity_I1; improve = mkImproveAND1; goto do_And_Or;
       case Iop_OrV256:
          uifu = mkUifUV256; difd = mkDifDV256;
@@ -4673,6 +4708,9 @@ IRAtom*
 expr2vbits_Binop ( MCEnv* mce,
       case Iop_Or8:
          uifu = mkUifU8; difd = mkDifD8;
          and_or_ty = Ity_I8; improve = mkImproveOR8; goto do_And_Or;
+      case Iop_Or1:
+         uifu = mkUifU1; difd = mkDifD1;
+         and_or_ty = Ity_I1; improve = mkImproveOR1; goto do_And_Or;
       do_And_Or:
          return
diff --git a/memcheck/tests/vbit-test/binary.c b/memcheck/tests/vbit-test/binary.c
index 473a631d6..045221b05 100644
--- a/memcheck/tests/vbit-test/binary.c
+++ b/memcheck/tests/vbit-test/binary.c
@@ -38,6 +38,7 @@ and_combine(vbits_t v1, vbits_t v2, value_t val2, int invert_val2)
    if (invert_val2) {
       switch (v2.num_bits) {
+      case 1:  val2.u1  = ~val2.u1  & 1;      break;
       case 8:  val2.u8  = ~val2.u8  & 0xff;   break;
       case 16: val2.u16 = ~val2.u16 & 0xffff; break;
       case 32: val2.u32 = ~val2.u32;          break;
@@ -48,6 +49,9 @@ and_combine(vbits_t v1, vbits_t v2, value_t val2, int invert_val2)
    }
    switch (v2.num_bits) {
+   case 1:
+      new.bits.u1 = (v1.bits.u1 & ~v2.bits.u1 & val2.u1) & 1;
+      break;
    case 8:
       new.bits.u8 = (v1.bits.u8 & ~v2.bits.u8 & val2.u8) & 0xff;
      break;
@@ -423,6 +427,7 @@ all_bits_zero_value(unsigned num_bits)
    value_t val;
    switch (num_bits) {
+   case 1:  val.u1  = 0;   break;
    case 8:  val.u8  = 0;   break;
    case 16: val.u16 = 0;   break;
    case 32: val.u32 = 0;   break;
@@ -440,6 +445,7 @@ all_bits_one_value(unsigned num_bits)
    value_t val;
    switch (num_bits) {
+   case 1:  val.u1  = 1;        break;
    case 8:  val.u8  = 0xff;     break;
    case 16: val.u16 = 0xffff;   break;
    case 32: val.u32 = ~0u;      break;
diff --git a/memcheck/tests/vbit-test/irops.c b/memcheck/tests/vbit-test/irops.c
index 29451b435..79745f593 100644
--- a/memcheck/tests/vbit-test/irops.c
+++ b/memcheck/tests/vbit-test/irops.c
@@ -46,10 +46,12 @@ static irop_t irops[] = {
   { DEFOP(Iop_Mul16, UNDEF_LEFT), .s390x = 0, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 0, .ppc32 = 0, .mips32 = 0, .mips64 = 0 },
   { DEFOP(Iop_Mul32, UNDEF_LEFT), .s390x = 0, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_Mul64, UNDEF_LEFT), .s390x = 0, .amd64 = 1, .x86 =
 0, .arm = 0, .ppc64 = 1, .ppc32 = 0, .mips32 = 0, .mips64 = 1 },   // ppc32, mips assert
+  { DEFOP(Iop_Or1,  UNDEF_OR),  .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_Or8,  UNDEF_OR),  .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_Or16, UNDEF_OR),  .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_Or32, UNDEF_OR),  .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_Or64, UNDEF_OR),  .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 0, .mips64 = 1 },   // mips asserts
+  { DEFOP(Iop_And1, UNDEF_AND), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_And8, UNDEF_AND), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_And16, UNDEF_AND), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_And32, UNDEF_AND), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
@@ -78,6 +80,7 @@ static irop_t irops[] = {
   { DEFOP(Iop_CmpNE16, UNDEF_CMP_EQ_NE), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 0, .ppc32 = 0, .mips32 = 0, .mips64 = 0 },
   { DEFOP(Iop_CmpNE32, UNDEF_CMP_EQ_NE), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_CmpNE64, UNDEF_CMP_EQ_NE), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 1, .ppc32 = 0, .mips32 = 0, .mips64 = 1 },   // ppc32, mips assert
+  { DEFOP(Iop_Not1, UNDEF_SAME), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_Not8, UNDEF_SAME), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_Not16, UNDEF_SAME),
 .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_Not32, UNDEF_SAME), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
@@ -183,7 +186,6 @@ static irop_t irops[] = {
   { DEFOP(Iop_128to64, UNDEF_TRUNC), .s390x = 1, .amd64 = 1, .x86 = 0, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 0, .mips64 = 1 },   // mips asserts
   { DEFOP(Iop_128HIto64, UNDEF_UPPER), .s390x = 1, .amd64 = 1, .x86 = 0, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 0, .mips64 = 1 },   // mips asserts
   { DEFOP(Iop_64HLto128, UNDEF_CONCAT), .s390x = 1, .amd64 = 1, .x86 = 0, .arm = 0, .ppc64 = 1, .ppc32 = 1, .mips32 = 0, .mips64 = 1 },   // mips asserts
-  { DEFOP(Iop_Not1, UNDEF_ALL), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_32to1, UNDEF_TRUNC), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
   { DEFOP(Iop_64to1, UNDEF_TRUNC), .s390x = 1, .amd64 = 1, .x86 = 0, .arm = 0, .ppc64 = 1, .ppc32 = 0, .mips32 = 0, .mips64 = 1 },   // ppc32, mips assert
   { DEFOP(Iop_1Uto8, UNDEF_ZEXT), .s390x = 1, .amd64 = 1, .x86 = 1, .arm = 1, .ppc64 = 1, .ppc32 = 1, .mips32 = 1, .mips64 = 1 },
diff --git a/memcheck/tests/vbit-test/vbits.c b/memcheck/tests/vbit-test/vbits.c
index 58f82d68d..9307efb35 100644
--- a/memcheck/tests/vbit-test/vbits.c
+++ b/memcheck/tests/vbit-test/vbits.c
@@ -656,6 +656,7 @@ or_vbits(vbits_t v1, vbits_t v2)
    vbits_t new = { .num_bits = v1.num_bits };
    switch (v1.num_bits) {
+   case 1:  new.bits.u1  = (v1.bits.u1  | v2.bits.u1) & 1; break;
    case 8:  new.bits.u8  = v1.bits.u8  | v2.bits.u8;  break;
    case 16: new.bits.u16 = v1.bits.u16 | v2.bits.u16; break;
    case 32: new.bits.u32 = v1.bits.u32 | v2.bits.u32; break;
@@ -684,6 +685,7 @@ and_vbits(vbits_t v1, vbits_t v2)
    vbits_t new = { .num_bits = v1.num_bits };
    switch (v1.num_bits) {
+   case 1:  new.bits.u1  = (v1.bits.u1  & v2.bits.u1) & 1; break;
    case 8:  new.bits.u8 =
 v1.bits.u8  & v2.bits.u8;  break;
    case 16: new.bits.u16 = v1.bits.u16 & v2.bits.u16; break;
    case 32: new.bits.u32 = v1.bits.u32 & v2.bits.u32; break;
diff --git a/memcheck/tests/vbit-test/vbits.h b/memcheck/tests/vbit-test/vbits.h
index 545ec592a..ebeb33ad9 100644
--- a/memcheck/tests/vbit-test/vbits.h
+++ b/memcheck/tests/vbit-test/vbits.h
@@ -36,6 +36,7 @@ typedef uint64_t uint256_t[4];
 typedef struct {
    unsigned num_bits;
    union {
+     uint8_t   u1;
      uint8_t   u8;
      uint16_t  u16;
      uint32_t  u32;
@@ -46,10 +47,11 @@ typedef struct {
 } vbits_t;
-/* A type large enough to hold any IRtype'd value. At this point
+/* A type large enough to hold any IRType'd value. At this point
    we do not expect to test with specific floating point values.
    So we don't need to represent them. */
 typedef union {
+  uint8_t   u1;
   uint8_t   u8;
   uint16_t  u16;
   uint32_t  u32;
-- 
2.11.4.GIT