.. _tcg-ops-ref:

*******************************
TCG Intermediate Representation
*******************************

Introduction
============

TCG (Tiny Code Generator) began as a generic backend for a C compiler.
It was simplified to be used in QEMU.  It also has its roots in the
QOP code generator written by Paul Brook.

Definitions
===========

The TCG *target* is the architecture for which we generate the code.
It is of course not the same as the "target" of QEMU which is the
emulated architecture.  As TCG started as a generic C backend used
for cross compiling, the assumption was that TCG target might be
different from the host, although this is never the case for QEMU.

In this document, we use *guest* to specify what architecture we are
emulating; *target* always means the TCG target, the machine on which
we are running QEMU.

An operation with *undefined behavior* may result in a crash.

An operation with *unspecified behavior* shall not crash.  However,
the result may be one of several possibilities so may be considered
an *undefined result*.

Basic Blocks
============

A TCG *basic block* is a single entry, multiple exit region which
corresponds to a list of instructions terminated by a label, or
any branch instruction.

A TCG *extended basic block* is a single entry, multiple exit region
which corresponds to a list of instructions terminated by a label or
an unconditional branch.  Specifically, an extended basic block is
a sequence of basic blocks connected by the fall-through paths of
zero or more conditional branch instructions.
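
The difference between the two block kinds can be illustrated with a
small Python model.  This is only a sketch: the tuple encoding of ops
and the function name ``split_blocks`` are hypothetical and do not
correspond to any QEMU API.  A label is treated as the first op of the
block it starts.

.. code-block:: python

   # Hypothetical op encoding: a tuple whose first element is the opcode name.
   def split_blocks(ops, extended=False):
       """Split ops into basic blocks, or extended basic blocks if requested."""
       blocks, current = [], []
       for op in ops:
           name = op[0]
           if name == "set_label":
               # A label terminates the previous block and starts a new one.
               if current:
                   blocks.append(current)
               current = [op]
               continue
           current.append(op)
           unconditional = name == "br"
           conditional = name == "brcond"
           # Any branch ends a basic block; only an unconditional branch
           # ends an extended basic block.
           if unconditional or (conditional and not extended):
               blocks.append(current)
               current = []
       if current:
           blocks.append(current)
       return blocks

With a conditional branch in the middle of a sequence, the basic-block
split cuts at the ``brcond`` while the extended split continues along
the fall-through path.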

Operations
==========

TCG instructions or *ops* operate on TCG *variables*, both of which
are strongly typed.  Each instruction has a fixed number of output
variable operands, input variable operands and constant operands.
Vector instructions have a field specifying the element size within
the vector.  The notable exception to the fixed operand count is the
call instruction, which has a variable number of outputs and inputs.

In the textual form, output operands usually come first, followed by
input operands, followed by constant operands. The output type is
included in the instruction name. Constants are prefixed with a '$'.

.. code-block:: none

   add_i32 t0, t1, t2    /* (t0 <- t1 + t2) */

Variables
=========

* ``TEMP_FIXED``

  There is one TCG *fixed global* variable, ``cpu_env``, which is
  live in all translation blocks, and holds a pointer to ``CPUArchState``.
  This variable is held in a host cpu register at all times in all
  translation blocks.

* ``TEMP_GLOBAL``

  A TCG *global* is a variable which is live in all translation blocks,
  and corresponds to a memory location that is within ``CPUArchState``.
  These may be specified as an offset from ``cpu_env``, in which case
  they are called *direct globals*, or may be specified as an offset
  from a direct global, in which case they are called *indirect globals*.
  Even indirect globals should still reference memory within
  ``CPUArchState``.  All TCG globals are defined during
  ``TCGCPUOps.initialize``, before any translation blocks are generated.

* ``TEMP_CONST``

  A TCG *constant* is a variable which is live throughout the entire
  translation block, and contains a constant value.  These variables
  are allocated on demand during translation and are hashed so that
  there is exactly one variable holding a given value.

* ``TEMP_TB``

  A TCG *translation block temporary* is a variable which is live
  throughout the entire translation block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

* ``TEMP_EBB``

  A TCG *extended basic block temporary* is a variable which is live
  throughout an extended basic block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

Types
=====

* ``TCG_TYPE_I32``

  A 32-bit integer.

* ``TCG_TYPE_I64``

  A 64-bit integer.  For 32-bit hosts, such variables are split into a pair
  of variables with ``type=TCG_TYPE_I32`` and ``base_type=TCG_TYPE_I64``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_PTR``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of a pointer for the host.

* ``TCG_TYPE_REG``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of the integer registers for the host.  This may be larger
  than ``TCG_TYPE_PTR`` depending on the host ABI.

* ``TCG_TYPE_I128``

  A 128-bit integer.  For all hosts, such variables are split into a number
  of variables with ``type=TCG_TYPE_REG`` and ``base_type=TCG_TYPE_I128``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_V64``

  A 64-bit vector.  This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v64``.

* ``TCG_TYPE_V128``

  A 128-bit vector.  This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v128``.

* ``TCG_TYPE_V256``

  A 256-bit vector.  This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v256``.

Helpers
=======

Helpers are registered in a guest-specific ``helper.h``,
which is processed to generate ``tcg_gen_helper_*`` functions.
With these functions it is possible to call a function taking
i32, i64, i128 or pointer types.

By default, before calling a helper, all globals are stored at their
canonical location.  By default, the helper is allowed to modify the
CPU state (including the state represented by TCG globals)
or may raise an exception.  This default can be overridden using the
following function modifiers:

* ``TCG_CALL_NO_WRITE_GLOBALS``

  The helper does not modify any globals, but may read them.
  Globals will be saved to their canonical location before calling helpers,
  but need not be reloaded afterwards.

* ``TCG_CALL_NO_READ_GLOBALS``

  The helper does not read globals, either directly or via an exception.
  They will not be saved to their canonical locations before calling
  the helper.  This implies ``TCG_CALL_NO_WRITE_GLOBALS``.

* ``TCG_CALL_NO_SIDE_EFFECTS``

  The call to the helper function may be removed if the return value is
  not used.  This means that it may not modify any CPU state nor may it
  raise an exception.
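
The effect of the first two modifiers on globals can be summarised as a
small decision table.  The Python sketch below only restates the rules
above; the function ``global_sync_actions`` and its result keys are
hypothetical names chosen for illustration, not part of TCG.

.. code-block:: python

   def global_sync_actions(no_write_globals=False, no_read_globals=False):
       """Model what must happen to globals around a helper call."""
       # TCG_CALL_NO_READ_GLOBALS implies TCG_CALL_NO_WRITE_GLOBALS.
       no_write_globals = no_write_globals or no_read_globals
       return {
           # Globals must be in their canonical location if the helper
           # may read them.
           "save_before_call": not no_read_globals,
           # Cached values are stale afterwards only if the helper
           # may have written them.
           "reload_after_call": not no_write_globals,
       }

A helper with neither modifier gets both actions, which matches the
default behavior described at the start of this section.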

Code Optimizations
==================

When generating instructions, you can count on at least the following
optimizations:

- Single instructions are simplified, e.g.

  .. code-block:: none

     and_i32 t0, t0, $0xffffffff

  is suppressed.

- A liveness analysis is done at the basic block level. The
  information is used to suppress moves from a dead variable to
  another one. It is also used to remove instructions which compute
  dead results. The latter is especially useful for condition code
  optimization in QEMU.

  In the following example:

  .. code-block:: none

     add_i32 t0, t1, t2
     add_i32 t0, t0, $1
     mov_i32 t0, $1

  only the last instruction is kept.
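
The removal of dead results can be modelled as a single backward pass
over a basic block.  The Python sketch below is an illustration only:
the ``(name, output, inputs)`` encoding and the function
``eliminate_dead_ops`` are hypothetical, every op is assumed to have one
output and no side effects, and this is not the code used by TCG.

.. code-block:: python

   def eliminate_dead_ops(ops, live_out):
       """Drop ops whose output is not live, scanning the block backwards.

       ops      -- list of (name, output, inputs); constants start with '$'
       live_out -- variables assumed live at the end of the block
       """
       live = set(live_out)
       kept = []
       for name, out, inputs in reversed(ops):
           if out not in live:
               continue  # the result is never read: the op is dead
           kept.append((name, out, inputs))
           live.discard(out)  # the value is defined here, so dead above
           live.update(v for v in inputs if not v.startswith("$"))
       kept.reverse()
       return kept

   example = [
       ("add_i32", "t0", ("t1", "t2")),
       ("add_i32", "t0", ("t0", "$1")),
       ("mov_i32", "t0", ("$1",)),
   ]
   remaining = eliminate_dead_ops(example, live_out={"t0"})

Applied to the three-instruction example above with ``t0`` live at the
end, only the final ``mov_i32`` survives, because it overwrites ``t0``
without reading it.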


Instruction Reference
=====================

Function call
-------------

.. list-table::

   * - call *<ret>* *<params>* ptr

     - |  call function 'ptr' (pointer type)
       |
       |  *<ret>* optional 32 bit or 64 bit return value
       |  *<params>* optional 32 bit or 64 bit parameters

Jumps/Labels
------------

.. list-table::

   * - set_label $label

     - | Define label 'label' at the current program point.

   * - br $label

     - | Jump to label.

   * - brcond *t0*, *t1*, *cond*, *label*

     - | Conditional jump if *t0* *cond* *t1* is true. *cond* can be:
       |
       |   ``TCG_COND_EQ``
       |   ``TCG_COND_NE``
       |   ``TCG_COND_LT /* signed */``
       |   ``TCG_COND_GE /* signed */``
       |   ``TCG_COND_LE /* signed */``
       |   ``TCG_COND_GT /* signed */``
       |   ``TCG_COND_LTU /* unsigned */``
       |   ``TCG_COND_GEU /* unsigned */``
       |   ``TCG_COND_LEU /* unsigned */``
       |   ``TCG_COND_GTU /* unsigned */``
       |   ``TCG_COND_TSTEQ /* (t0 & t1) == 0 */``
       |   ``TCG_COND_TSTNE /* (t0 & t1) != 0 */``
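
The signed, unsigned and test conditions differ only in how the two
operands are interpreted.  The following Python sketch evaluates each
condition on 32-bit values; the function ``eval_cond`` and the use of
the suffix of each ``TCG_COND_*`` name as a string key are assumptions
made for this illustration.

.. code-block:: python

   MASK32 = 0xFFFFFFFF

   def to_signed32(value):
       """Interpret the low 32 bits of value as a two's complement integer."""
       value &= MASK32
       return value - (1 << 32) if value & 0x80000000 else value

   def eval_cond(cond, a, b):
       """Evaluate 'a cond b' for 32-bit operands."""
       a &= MASK32
       b &= MASK32
       sa, sb = to_signed32(a), to_signed32(b)
       results = {
           "EQ": a == b,
           "NE": a != b,
           "LT": sa < sb,   # signed comparisons
           "GE": sa >= sb,
           "LE": sa <= sb,
           "GT": sa > sb,
           "LTU": a < b,    # unsigned comparisons
           "GEU": a >= b,
           "LEU": a <= b,
           "GTU": a > b,
           "TSTEQ": (a & b) == 0,  # no common bits set
           "TSTNE": (a & b) != 0,  # at least one common bit set
       }
       return results[cond]

For example, ``0xffffffff`` compares as less than 1 under ``LT``, where
it is read as -1, but not under ``LTU``.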

Arithmetic
----------

.. list-table::

   * - add *t0*, *t1*, *t2*

     - | *t0* = *t1* + *t2*

   * - sub *t0*, *t1*, *t2*

     - | *t0* = *t1* - *t2*

   * - neg *t0*, *t1*

     - | *t0* = -*t1* (two's complement)

   * - mul *t0*, *t1*, *t2*

     - | *t0* = *t1* * *t2*

   * - divs *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - divu *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (unsigned)
       | Undefined behavior if division by zero.

   * - rems *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - remu *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (unsigned)
       | Undefined behavior if division by zero.

   * - divs2 *q*, *r*, *nl*, *nh*, *d*

     - | *q* = *nh:nl* / *d* (signed)
       | *r* = *nh:nl* % *d*
       | Undefined behavior if division by zero, or the double-word
         numerator divided by the single-word divisor does not fit
         within the single-word quotient.  The code generator will
         pass *nh* as a simple sign-extension of *nl*, so the only
         overflow should be *INT_MIN* / -1.

   * - divu2 *q*, *r*, *nl*, *nh*, *d*

     - | *q* = *nh:nl* / *d* (unsigned)
       | *r* = *nh:nl* % *d*
       | Undefined behavior if division by zero, or the double-word
         numerator divided by the single-word divisor does not fit
         within the single-word quotient.  The code generator will
         pass 0 to *nh* to make a simple zero-extension of *nl*,
         so overflow should never occur.
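
As an illustration of the double-word division, the Python sketch below
models ``divs2`` for a 32-bit word size.  It assumes C-style semantics
(quotient truncated toward zero, remainder taking the sign of the
numerator), which this document does not state explicitly, and it raises
an exception where the op has undefined behavior.  The function names
are hypothetical.

.. code-block:: python

   MASK32 = 0xFFFFFFFF

   def to_signed(value, bits):
       """Interpret the low `bits` bits of value as two's complement."""
       value &= (1 << bits) - 1
       return value - (1 << bits) if value >> (bits - 1) else value

   def divs2_32(nl, nh, d):
       """Return (q, r) as 32-bit patterns for the signed nh:nl / d."""
       n = to_signed(((nh & MASK32) << 32) | (nl & MASK32), 64)
       d = to_signed(d, 32)
       if d == 0:
           raise ZeroDivisionError("undefined behavior: division by zero")
       # Assumed C-like rounding: truncate the quotient toward zero.
       q = abs(n) // abs(d)
       if (n < 0) != (d < 0):
           q = -q
       r = n - q * d
       if not -(1 << 31) <= q < (1 << 31):
           raise OverflowError("undefined behavior: quotient overflow")
       return q & MASK32, r & MASK32

With *nh* passed as the sign-extension of *nl*, the only input that
reaches the overflow branch is *INT_MIN* / -1, as noted in the table.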

Logical
-------

.. list-table::

   * - and *t0*, *t1*, *t2*

     - | *t0* = *t1* & *t2*

   * - or *t0*, *t1*, *t2*

     - | *t0* = *t1* | *t2*

   * - xor *t0*, *t1*, *t2*

     - | *t0* = *t1* ^ *t2*

   * - not *t0*, *t1*

     - | *t0* = ~\ *t1*

   * - andc *t0*, *t1*, *t2*

     - | *t0* = *t1* & ~\ *t2*

   * - eqv *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* ^ *t2*), or equivalently, *t0* = *t1* ^ ~\ *t2*

   * - nand *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* & *t2*)

   * - nor *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* | *t2*)

   * - orc *t0*, *t1*, *t2*

     - | *t0* = *t1* | ~\ *t2*

   * - clz *t0*, *t1*, *t2*

     - | *t0* = *t1* ? clz(*t1*) : *t2*

   * - ctz *t0*, *t1*, *t2*

     - | *t0* = *t1* ? ctz(*t1*) : *t2*

   * - ctpop *t0*, *t1*

     - | *t0* = number of bits set in *t1*
       |
       | The name *ctpop* is short for "count population", and matches
         the function name used in ``include/qemu/host-utils.h``.
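
The bit-counting ops take the value to return for a zero input as an
explicit operand.  A minimal Python model for 32-bit operands follows;
the function names are chosen for this sketch and are not QEMU
functions.

.. code-block:: python

   MASK32 = 0xFFFFFFFF

   def clz32(t1, t2):
       """Count leading zeros of t1, or return t2 when t1 is zero."""
       t1 &= MASK32
       return 32 - t1.bit_length() if t1 else t2

   def ctz32(t1, t2):
       """Count trailing zeros of t1, or return t2 when t1 is zero."""
       t1 &= MASK32
       # t1 & -t1 isolates the lowest set bit.
       return (t1 & -t1).bit_length() - 1 if t1 else t2

   def ctpop32(t1):
       """Count the bits set in t1."""
       return bin(t1 & MASK32).count("1")

Passing 32 as *t2* gives the common convention in which a zero input
counts as 32 leading or trailing zeros.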


Shifts/Rotates
--------------

.. list-table::

   * - shl *t0*, *t1*, *t2*

     - | *t0* = *t1* << *t2*
       | Unspecified behavior for negative or out-of-range shifts.

   * - shr *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (unsigned)
       | Unspecified behavior for negative or out-of-range shifts.

   * - sar *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (signed)
       | Unspecified behavior for negative or out-of-range shifts.

   * - rotl *t0*, *t1*, *t2*

     - | *t0* = *t1* rotated left by *t2* bits.
       | Unspecified behavior for negative or out-of-range shifts.

   * - rotr *t0*, *t1*, *t2*

     - | *t0* = *t1* rotated right by *t2* bits.
       | Unspecified behavior for negative or out-of-range shifts.
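
The following Python sketch models the logical shift, the arithmetic
shift and the rotates on 32-bit values.  It rejects shift counts outside
0..31 instead of producing an unspecified result; that choice, and the
function names, are assumptions of the sketch.

.. code-block:: python

   MASK32 = 0xFFFFFFFF

   def _check(count):
       if not 0 <= count < 32:
           raise ValueError("shift count out of range (unspecified behavior)")

   def shr32(t1, t2):
       _check(t2)
       return (t1 & MASK32) >> t2

   def sar32(t1, t2):
       _check(t2)
       t1 &= MASK32
       signed = t1 - (1 << 32) if t1 & 0x80000000 else t1
       # Python's >> on a negative int replicates the sign bit.
       return (signed >> t2) & MASK32

   def rotl32(t1, t2):
       _check(t2)
       t1 &= MASK32
       return ((t1 << t2) | (t1 >> (32 - t2))) & MASK32

   def rotr32(t1, t2):
       _check(t2)
       t1 &= MASK32
       return ((t1 >> t2) | (t1 << (32 - t2))) & MASK32

Rotating left and then right by the same count returns the original
value, which is a convenient sanity check for a backend implementation.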


Misc
----

.. list-table::

   * - mov *t0*, *t1*

     - | *t0* = *t1*
       | Move *t1* to *t0*.

   * - bswap16 *t0*, *t1*, *flags*

     - | 16 bit byte swap on the low bits of a 32/64 bit input.
       |
       | If *flags* & ``TCG_BSWAP_IZ``, then *t1* is known to be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OZ``, then *t0* will be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OS``, then *t0* will be sign-extended from bit 15.
       |
       | If neither ``TCG_BSWAP_OZ`` nor ``TCG_BSWAP_OS`` is set, then the bits of *t0* above bit 15 may contain any value.

   * - bswap32 *t0*, *t1*, *flags*

     - | 32 bit byte swap.  The flags are the same as for bswap16, except
         they apply from bit 31 instead of bit 15.  On TCG_TYPE_I32, the
         flags should be zero.

   * - bswap64 *t0*, *t1*, *flags*

     - | 64 bit byte swap. The flags are ignored, but still present
         for consistency with the other bswap opcodes. For future
         compatibility, the flags should be zero.

   * - discard_i32/i64 *t0*

     - | Indicate that the value of *t0* won't be used later. It is useful to
         force dead code elimination.

   * - deposit *dest*, *t1*, *t2*, *pos*, *len*

     - | Deposit *t2* as a bitfield into *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values:
       |
       |     *len* - the length of the bitfield
       |     *pos* - the position of the first bit, counting from the LSB
       |
       | For example, "deposit dest, t1, t2, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |     *dest* = (*t1* & ~0x0f00) | ((*t2* << 8) & 0x0f00)
       |
       | on TCG_TYPE_I32.

   * - extract *dest*, *t1*, *pos*, *len*

       sextract *dest*, *t1*, *pos*, *len*

     - | Extract a bitfield from *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values,
         as above for deposit.  For extract, the result will be extended
         to the left with zeros; for sextract, the result will be extended
         to the left with copies of the bitfield sign bit at *pos* + *len* - 1.
       |
       | For example, "sextract dest, t1, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |    *dest* = (*t1* << 20) >> 28
       |
       | (using an arithmetic right shift) on TCG_TYPE_I32.

   * - extract2 *dest*, *t1*, *t2*, *pos*

     - | For TCG_TYPE_I{N}, extract an N-bit quantity from the concatenation
         of *t2*:*t1*, beginning at *pos*. The tcg_gen_extract2_{i32,i64} expander
         accepts 0 <= *pos* <= N as inputs. The backend code generator will
         not see either 0 or N as inputs for these opcodes.

   * - extrl_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the low 32 bits of input *t1* and place it
         into 32-bit output *t0*.  Depending on the host, this may be a simple move,
         or may require additional canonicalization.

   * - extrh_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the high 32 bits of input *t1* and place it
         into 32-bit output *t0*.  Depending on the host, this may be a simple shift,
         or may require additional canonicalization.
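
The bitfield ops can be modelled directly from the formulas in the
table.  The Python sketch below does so for ``TCG_TYPE_I32``; the
function names are chosen for this illustration and are not the TCG
expanders.

.. code-block:: python

   MASK32 = 0xFFFFFFFF

   def deposit32(t1, t2, pos, length):
       """Insert the low `length` bits of t2 into t1 at bit `pos`."""
       mask = ((1 << length) - 1) << pos
       return ((t1 & ~mask) | ((t2 << pos) & mask)) & MASK32

   def extract32(t1, pos, length):
       """Zero-extended bitfield of t1."""
       return ((t1 & MASK32) >> pos) & ((1 << length) - 1)

   def sextract32(t1, pos, length):
       """Bitfield of t1, sign-extended from bit pos + length - 1."""
       field = extract32(t1, pos, length)
       sign = 1 << (length - 1)
       # (field ^ sign) - sign replicates the field's top bit upward.
       return ((field ^ sign) - sign) & MASK32

   def extract2_32(t1, t2, pos):
       """32 bits of the concatenation t2:t1, starting at bit `pos`."""
       wide = ((t2 & MASK32) << 32) | (t1 & MASK32)
       return (wide >> pos) & MASK32

For the 4-bit field at bit 8 used in the examples above,
``sextract32(0x00000a00, 8, 4)`` agrees with the shift-based formula
``(t1 << 20) >> 28`` evaluated with an arithmetic right shift.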


Conditional moves
-----------------

.. list-table::

   * - setcond *dest*, *t1*, *t2*, *cond*

     - | *dest* = (*t1* *cond* *t2*)
       |
       | Set *dest* to 1 if (*t1* *cond* *t2*) is true, otherwise set to 0.

   * - negsetcond *dest*, *t1*, *t2*, *cond*

     - | *dest* = -(*t1* *cond* *t2*)
       |
       | Set *dest* to -1 if (*t1* *cond* *t2*) is true, otherwise set to 0.

   * - movcond *dest*, *c1*, *c2*, *v1*, *v2*, *cond*

     - | *dest* = (*c1* *cond* *c2* ? *v1* : *v2*)
       |
       | Set *dest* to *v1* if (*c1* *cond* *c2*) is true, otherwise set to *v2*.
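
A short Python sketch of the three ops follows.  To stay compact it
takes the condition as a Python callable, such as one from the standard
``operator`` module, rather than a ``TCG_COND_*`` value; that
simplification and the function names are assumptions of the sketch.

.. code-block:: python

   import operator

   MASK32 = 0xFFFFFFFF

   def setcond(t1, t2, cond):
       """1 if the comparison holds, else 0."""
       return 1 if cond(t1, t2) else 0

   def negsetcond32(t1, t2, cond):
       """All ones (-1 as a 32-bit pattern) if the comparison holds, else 0."""
       return (-setcond(t1, t2, cond)) & MASK32

   def movcond(c1, c2, v1, v2, cond):
       """Select v1 if the comparison holds, else v2."""
       return v1 if cond(c1, c2) else v2

   all_ones = negsetcond32(1, 2, operator.lt)

The all-ones result of ``negsetcond`` is convenient as a mask: ANDing it
with a value keeps the value when the condition holds and clears it
otherwise.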


Type conversions
----------------

.. list-table::

   * - ext_i32_i64 *t0*, *t1*

     - | Convert *t1* (32 bit) to *t0* (64 bit) with sign extension.

   * - extu_i32_i64 *t0*, *t1*

     - | Convert *t1* (32 bit) to *t0* (64 bit) with zero extension.

   * - trunc_i64_i32 *t0*, *t1*

     - | Truncate *t1* (64 bit) to *t0* (32 bit)

   * - concat_i32_i64 *t0*, *t1*, *t2*

     - | Construct *t0* (64 bit) taking the low half from *t1* (32 bit) and the high half
         from *t2* (32 bit).

   * - concat32_i64 *t0*, *t1*, *t2*

     - | Construct *t0* (64 bit) taking the low half from *t1* (64 bit) and the high half
         from *t2* (64 bit).
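
The conversions between 32-bit and 64-bit values reduce to masking and
shifting.  A minimal Python model follows, with values held as unsigned
bit patterns; the Python function names simply mirror the op names and
are not QEMU functions.

.. code-block:: python

   MASK32 = 0xFFFFFFFF

   def ext_i32_i64(t1):
       """Sign-extend a 32-bit pattern to 64 bits."""
       t1 &= MASK32
       return t1 | 0xFFFFFFFF00000000 if t1 & 0x80000000 else t1

   def extu_i32_i64(t1):
       """Zero-extend a 32-bit pattern to 64 bits."""
       return t1 & MASK32

   def trunc_i64_i32(t1):
       """Keep the low 32 bits of a 64-bit pattern."""
       return t1 & MASK32

   def concat_i32_i64(low, high):
       """Build a 64-bit pattern from two 32-bit halves."""
       return ((high & MASK32) << 32) | (low & MASK32)

Truncating the result of either extension returns the original 32-bit
value, while the two extensions differ only when bit 31 of the input is
set.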
5495e97a28aSMark Cave-Ayland

Load/Store
----------

.. list-table::

   * - ld_i32/i64 *t0*, *t1*, *offset*

       ld8s_i32/i64 *t0*, *t1*, *offset*

       ld8u_i32/i64 *t0*, *t1*, *offset*

       ld16s_i32/i64 *t0*, *t1*, *offset*

       ld16u_i32/i64 *t0*, *t1*, *offset*

       ld32s_i64 *t0*, *t1*, *offset*

       ld32u_i64 *t0*, *t1*, *offset*

     - | *t0* = read(*t1* + *offset*)
       |
       | Load 8, 16, 32 or 64 bits with or without sign extension from host memory.
         *offset* must be a constant.

   * - st_i32/i64 *t0*, *t1*, *offset*

       st8_i32/i64 *t0*, *t1*, *offset*

       st16_i32/i64 *t0*, *t1*, *offset*

       st32_i64 *t0*, *t1*, *offset*

     - | write(*t0*, *t1* + *offset*)
       |
       | Write 8, 16, 32 or 64 bits to host memory.

All these opcodes assume that the host memory they access does not correspond
to a global. If it does, the behaviour is unpredictable.

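The host loads and stores above can be pictured as plain memory accesses at
``base + offset``. The following C sketch models three of them; the
``model_*`` names are hypothetical, and the narrow store is assumed to write
the low bits of *t0*.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   /* ld16s_i32: read 16 bits at base + offset and sign-extend to 32 bits. */
   static inline int32_t model_ld16s_i32(const void *base, ptrdiff_t offset)
   {
       int16_t v;
       memcpy(&v, (const char *)base + offset, sizeof(v));
       return v;
   }

   /* ld8u_i32: read 8 bits at base + offset and zero-extend to 32 bits. */
   static inline uint32_t model_ld8u_i32(const void *base, ptrdiff_t offset)
   {
       uint8_t v;
       memcpy(&v, (const char *)base + offset, sizeof(v));
       return v;
   }

   /* st16_i32: write the low 16 bits of t0 at base + offset. */
   static inline void model_st16_i32(uint32_t t0, void *base, ptrdiff_t offset)
   {
       uint16_t v = (uint16_t)t0;
       memcpy((char *)base + offset, &v, sizeof(v));
   }

Storing ``0x1234FFFE`` with the 16-bit store and reading it back with the
signed 16-bit load yields ``-2`` in this model, whatever the host byte order.
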

Multiword arithmetic support
----------------------------

.. list-table::

   * - addco *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* + *t2* and in addition output to the
         carry bit provided by the host architecture.

   * - addci *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* + *t2* + *C*, where *C* is the
         input carry bit provided by the host architecture.
         The output carry bit need not be computed.

   * - addcio *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* + *t2* + *C*, where *C* is the
         input carry bit provided by the host architecture,
         and also compute the output carry bit.

   * - addc1o *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* + *t2* + 1, and in addition output to the
         carry bit provided by the host architecture.  This is akin to
         *addcio* with a fixed carry-in value of 1.
       | This is intended to be used by the optimization pass,
         intermediate to complete folding of the addition chain.
         In some cases complete folding is not possible and this
         opcode will remain until output.  If this happens, the
         code generator will use ``tcg_out_set_carry`` and then
         the output routine for *addcio*.

   * - subbo *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* - *t2* and in addition output to the
         borrow bit provided by the host architecture.
       | Depending on the host architecture, the carry bit may or may not be
         identical to the borrow bit.  Thus the addc\* and subb\*
         opcodes must not be mixed.

   * - subbi *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* - *t2* - *B*, where *B* is the
         input borrow bit provided by the host architecture.
         The output borrow bit need not be computed.

   * - subbio *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* - *t2* - *B*, where *B* is the
         input borrow bit provided by the host architecture,
         and also compute the output borrow bit.

   * - subb1o *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* - *t2* - 1, and in addition output to the
         borrow bit provided by the host architecture.  This is akin to
         *subbio* with a fixed borrow-in value of 1.
       | This is intended to be used by the optimization pass,
         intermediate to complete folding of the subtraction chain.
         In some cases complete folding is not possible and this
         opcode will remain until output.  If this happens, the
         code generator will use ``tcg_out_set_borrow`` and then
         the output routine for *subbio*.

   * - mulu2 *t0_low*, *t0_high*, *t1*, *t2*

     - | Similar to mul, except that the two unsigned inputs *t1* and *t2* yield
         the full double-word product *t0*, which is returned in two single-word
         outputs.

   * - muls2 *t0_low*, *t0_high*, *t1*, *t2*

     - | Similar to mulu2, except the two inputs *t1* and *t2* are signed.

   * - mulsh *t0*, *t1*, *t2*

       muluh *t0*, *t1*, *t2*

     - | Provide the high part of a signed or unsigned multiply, respectively.
       |
       | If mulu2/muls2 are not provided by the backend, the tcg-op generator
         can obtain the same results by emitting a pair of opcodes, mul + muluh/mulsh.

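To make the carry chain and the mul + muluh decomposition concrete, here is a
small C model for a 64-bit word size. It is a sketch only: ``U128`` and the
``model_*`` functions are hypothetical names, and the carry and borrow bits
are modelled as ordinary variables rather than host flags.

.. code-block:: c

   #include <stdint.h>

   typedef struct { uint64_t lo, hi; } U128;   /* hypothetical helper type */

   /* Double-word add in the shape of addco (low word) + addci (high word). */
   static inline U128 model_add128(U128 a, U128 b)
   {
       U128 r;
       r.lo = a.lo + b.lo;               /* addco: sum plus carry-out */
       uint64_t carry = r.lo < a.lo;     /* carry is set when the sum wrapped */
       r.hi = a.hi + b.hi + carry;       /* addci: consumes the carry */
       return r;
   }

   /* Double-word subtract in the shape of subbo + subbi. */
   static inline U128 model_sub128(U128 a, U128 b)
   {
       U128 r;
       uint64_t borrow = a.lo < b.lo;    /* subbo: difference plus borrow-out */
       r.lo = a.lo - b.lo;
       r.hi = a.hi - b.hi - borrow;      /* subbi: consumes the borrow */
       return r;
   }

   /* muluh: high 64 bits of the unsigned product, built from 32-bit pieces. */
   static inline uint64_t model_muluh(uint64_t a, uint64_t b)
   {
       uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
       uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
       uint64_t p0 = a_lo * b_lo;
       uint64_t p1 = a_lo * b_hi;
       uint64_t p2 = a_hi * b_lo;
       uint64_t p3 = a_hi * b_hi;
       /* Sum of the three contributions to bits 32..63; cannot overflow. */
       uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
       return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
   }

   /* mulu2 expressed as the opcode pair mul + muluh. */
   static inline U128 model_mulu2(uint64_t t1, uint64_t t2)
   {
       U128 r;
       r.lo = t1 * t2;                   /* mul: low word */
       r.hi = model_muluh(t1, t2);       /* muluh: high word */
       return r;
   }

The add and subtract chains are kept separate in the model, mirroring the
rule that addc\* and subb\* opcodes must not be mixed.
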

Memory Barrier support
----------------------

.. list-table::

   * - mb *<$arg>*

     - | Generate a target memory barrier instruction to ensure memory ordering
         as enforced by a corresponding guest memory barrier instruction.
       |
       | The ordering enforced by the backend may be stricter than the ordering
         required by the guest. It cannot be weaker. This opcode takes a constant
         argument which is required to generate the appropriate barrier
         instruction. The backend should take care to emit the target barrier
         instruction only when necessary, i.e. for SMP guests and when MTTCG is
         enabled.
       |
       | The guest translators should generate this opcode for all guest instructions
         which have ordering side effects.
       |
       | Please see :ref:`atomics-ref` for more information on memory barriers.

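The "may be stricter, cannot be weaker" rule can be read as a subset check on
ordering requirements. The sketch below uses a hypothetical bitmask with one
bit per pair of accesses to be ordered; the ``MODEL_*`` constants and
functions are illustrative only and are not the encoding TCG uses for the
*mb* argument.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical ordering bits: earlier access, then later access. */
   enum {
       MODEL_MO_LD_LD = 1 << 0,
       MODEL_MO_ST_LD = 1 << 1,
       MODEL_MO_LD_ST = 1 << 2,
       MODEL_MO_ST_ST = 1 << 3,
       MODEL_MO_ALL   = 0xF,
   };

   /* True if the host barrier orders everything the guest barrier requires. */
   static inline bool model_barrier_sufficient(unsigned guest_required,
                                               unsigned host_provided)
   {
       return (guest_required & ~host_provided) == 0;
   }

   /* The barrier instruction is only needed for SMP guests under MTTCG. */
   static inline bool model_should_emit(bool smp_guest, bool mttcg_enabled)
   {
       return smp_guest && mttcg_enabled;
   }

In this model a full barrier satisfies a store-store requirement, while a
store-store barrier does not satisfy a request for full ordering.
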

64-bit guest on 32-bit host support
-----------------------------------

The following opcodes are internal to TCG.  Thus they are to be implemented by
32-bit host code generators, but are not to be emitted by guest translators.
They are emitted as needed by inline functions within ``tcg-op.h``.

.. list-table::

   * - brcond2_i32 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *cond*, *label*

     - | Similar to brcond, except that the 64-bit values *t0* and *t1*
         are formed from two 32-bit arguments.

   * - setcond2_i32 *dest*, *t1_low*, *t1_high*, *t2_low*, *t2_high*, *cond*

     - | Similar to setcond, except that the 64-bit values *t1* and *t2* are
         formed from two 32-bit arguments. The result is a 32-bit value.

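A 64-bit comparison on two register pairs can be decided from the high halves
first, falling back to an unsigned comparison of the low halves. The C sketch
below models setcond2_i32 for three conditions; the ``model_*`` names are
hypothetical and each condition is written as its own function for clarity.

.. code-block:: c

   #include <stdint.h>

   /* Unsigned "less than" on 64-bit values given as 32-bit halves. */
   static inline uint32_t model_setcond2_ltu(uint32_t t1_low, uint32_t t1_high,
                                             uint32_t t2_low, uint32_t t2_high)
   {
       if (t1_high != t2_high) {
           return t1_high < t2_high;
       }
       return t1_low < t2_low;
   }

   /* Signed "less than": high halves compare signed, low halves unsigned. */
   static inline uint32_t model_setcond2_lt(uint32_t t1_low, uint32_t t1_high,
                                            uint32_t t2_low, uint32_t t2_high)
   {
       if (t1_high != t2_high) {
           return (int32_t)t1_high < (int32_t)t2_high;
       }
       return t1_low < t2_low;
   }

   /* Equality: both halves must match. */
   static inline uint32_t model_setcond2_eq(uint32_t t1_low, uint32_t t1_high,
                                            uint32_t t2_low, uint32_t t2_high)
   {
       return t1_low == t2_low && t1_high == t2_high;
   }

brcond2_i32 follows the same logic, branching to *label* where this model
returns 1.
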

QEMU specific operations
------------------------

.. list-table::

   * - exit_tb *t0*

     - | Exit the current TB and return the value *t0* (word type).

   * - goto_tb *index*

     - | Exit the current TB and jump to the TB index *index* (constant) if the
         current TB was linked to this TB. Otherwise execute the next
         instructions. Only indices 0 and 1 are valid and ``tcg_gen_goto_tb`` may be
         issued at most once with each slot index per TB.

   * - lookup_and_goto_ptr *tb_addr*

     - | Look up a TB address *tb_addr* and jump to it if valid. If not valid,
         jump to the TCG epilogue to go back to the exec loop.
       |
       | This operation is optional. If the TCG backend does not implement the
         goto_ptr opcode, emitting this op is equivalent to emitting exit_tb(0).

   * - qemu_ld_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

       qemu_st_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

     - | Load data at the guest address *t1* into *t0*, or store data in *t0* at guest
         address *t1*.  The _i32/_i64/_i128 size applies to the size of the input/output
         register *t0* only.  The address *t1* is always sized according to the guest,
         and the width of the memory operation is controlled by *flags*.
       |
       | Both *t0* and *t1* may be split into little-endian ordered pairs of registers
         if dealing with 64-bit quantities on a 32-bit host, or 128-bit quantities on
         a 64-bit host.
       |
       | The *memidx* selects the QEMU TLB index to use (e.g. user or kernel access).
         The *flags* are the MemOp bits, selecting the sign, width, and endianness
         of the memory access.
       |
       | For a 32-bit host, qemu_ld/st_i64 is guaranteed to only be used with a
         64-bit memory access specified in *flags*.
       |
       | The qemu_ld/st_i128 opcodes are only supported for a 64-bit host.

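To illustrate how *flags* decouples the width of the memory access from the
size of *t0*, the sketch below models a qemu_ld_i64 from a flat byte array.
The flag bits are hypothetical and do not reproduce the real MemOp encoding,
and the model ignores *memidx* and address translation entirely.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical flag layout, for illustration only. */
   enum {
       MODEL_SIZE_MASK  = 0x3,   /* log2 of the access size in bytes */
       MODEL_SIGN       = 0x4,   /* sign-extend the loaded value */
       MODEL_BIG_ENDIAN = 0x8,   /* guest memory is big-endian */
   };

   /* Load 1, 2, 4 or 8 bytes from guest_ram[addr] into a 64-bit result. */
   static inline uint64_t model_qemu_ld_i64(const uint8_t *guest_ram,
                                            size_t addr, unsigned flags)
   {
       unsigned size = 1u << (flags & MODEL_SIZE_MASK);
       uint64_t v = 0;

       for (unsigned i = 0; i < size; i++) {
           unsigned shift = (flags & MODEL_BIG_ENDIAN) ? (size - 1 - i) * 8
                                                       : i * 8;
           v |= (uint64_t)guest_ram[addr + i] << shift;
       }
       if ((flags & MODEL_SIGN) && size < 8) {
           uint64_t sign_bit = 1ull << (size * 8 - 1);
           v = (v ^ sign_bit) - sign_bit;   /* sign-extend from the access width */
       }
       return v;
   }

The same two guest bytes therefore produce different 64-bit results depending
only on the sign and endianness bits passed in *flags*.
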

Host vector operations
----------------------

All of the vector ops have two parameters, ``TCGOP_TYPE`` & ``TCGOP_VECE``.
The former specifies the length of the vector as a TCGType; the latter
specifies the length of the element (if applicable) in log2 8-bit units.

.. list-table::

   * - mov_vec *v0*, *v1*

       ld_vec *v0*, *t1*

       st_vec *v0*, *t1*

     - | Move, load and store.

   * - dup_vec *v0*, *r1*

     - | Duplicate the low N bits of *r1* into TYPE/VECE copies across *v0*.

   * - dupi_vec *v0*, *c*

     - | Similarly, for a constant.
       | Smaller values will be replicated to host register size by the expanders.

   * - dup2_vec *v0*, *r1*, *r2*

     - | Duplicate *r2*:*r1* into TYPE/64 copies across *v0*. This opcode is
         only present for 32-bit hosts.

   * - add_vec *v0*, *v1*, *v2*

     - | *v0* = *v1* + *v2*, in elements across the vector.

   * - sub_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* - *v2*.

   * - mul_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* * *v2*.

   * - neg_vec *v0*, *v1*

     - | Similarly, *v0* = -*v1*.

   * - abs_vec *v0*, *v1*

     - | Similarly, *v0* = *v1* < 0 ? -*v1* : *v1*, in elements across the vector.

   * - smin_vec *v0*, *v1*, *v2*

       umin_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MIN(*v1*, *v2*), for signed and unsigned element types.

   * - smax_vec *v0*, *v1*, *v2*

       umax_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MAX(*v1*, *v2*), for signed and unsigned element types.

   * - ssadd_vec *v0*, *v1*, *v2*

       sssub_vec *v0*, *v1*, *v2*

       usadd_vec *v0*, *v1*, *v2*

       ussub_vec *v0*, *v1*, *v2*

     - | Signed and unsigned saturating addition and subtraction.
       |
       | If the true result is not representable within the element type, the
         element is set to the minimum or maximum value for the type.

   * - and_vec *v0*, *v1*, *v2*

       or_vec *v0*, *v1*, *v2*

       xor_vec *v0*, *v1*, *v2*

       andc_vec *v0*, *v1*, *v2*

       orc_vec *v0*, *v1*, *v2*

       not_vec *v0*, *v1*

     - | Similarly, logical operations with and without complement.
       |
       | Note that VECE is unused.

   * - shli_vec *v0*, *v1*, *i2*

       shls_vec *v0*, *v1*, *s2*

     - | Shift all elements from *v1* by a scalar *i2*/*s2*. I.e.

       .. code-block:: c

          for (i = 0; i < TYPE/VECE; ++i) {
              v0[i] = v1[i] << s2;
          }

   * - shri_vec *v0*, *v1*, *i2*

       sari_vec *v0*, *v1*, *i2*

       rotli_vec *v0*, *v1*, *i2*

       shrs_vec *v0*, *v1*, *s2*

       sars_vec *v0*, *v1*, *s2*

     - | Similarly for logical and arithmetic right shift, and left rotate.

   * - shlv_vec *v0*, *v1*, *v2*

     - | Shift elements from *v1* by elements from *v2*. I.e.

       .. code-block:: c

          for (i = 0; i < TYPE/VECE; ++i) {
              v0[i] = v1[i] << v2[i];
          }

   * - shrv_vec *v0*, *v1*, *v2*

       sarv_vec *v0*, *v1*, *v2*

       rotlv_vec *v0*, *v1*, *v2*

       rotrv_vec *v0*, *v1*, *v2*

     - | Similarly for logical and arithmetic right shift, and rotates.

   * - cmp_vec *v0*, *v1*, *v2*, *cond*

     - | Compare vectors by element, storing -1 for true and 0 for false.

   * - bitsel_vec *v0*, *v1*, *v2*, *v3*

     - | Bitwise select, *v0* = (*v2* & *v1*) | (*v3* & ~\ *v1*), across the entire vector.

   * - cmpsel_vec *v0*, *c1*, *c2*, *v3*, *v4*, *cond*

     - | Select elements based on comparison results:

       .. code-block:: c

          for (i = 0; i < n; ++i) {
              v0[i] = (c1[i] cond c2[i]) ? v3[i] : v4[i];
          }

**Note 1**: Some shortcuts are defined when the last operand is known to be
a constant (e.g. addi for add, movi for mov).

**Note 2**: When using TCG, the opcodes must never be generated directly
as some of them may not be available as "real" opcodes. Always use the
function ``tcg_gen_xxx(args)``.

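The element-wise behaviour of the host vector opcodes listed above can be
sketched in scalar C. The model below assumes a 128-bit vector of sixteen
8-bit elements and shows dup, unsigned saturating add, an unsigned
"less than" compare, and bitwise select; all ``model_*`` names and
``MODEL_N`` are hypothetical.

.. code-block:: c

   #include <stdint.h>

   enum { MODEL_N = 16 };   /* 128-bit vector of 8-bit elements */

   /* dup_vec: replicate the low 8 bits of r1 across the vector. */
   static inline void model_dup_vec8(uint8_t v0[MODEL_N], uint64_t r1)
   {
       for (int i = 0; i < MODEL_N; ++i) {
           v0[i] = (uint8_t)r1;
       }
   }

   /* usadd_vec: unsigned saturating add, clamped to UINT8_MAX. */
   static inline void model_usadd_vec8(uint8_t v0[MODEL_N],
                                       const uint8_t v1[MODEL_N],
                                       const uint8_t v2[MODEL_N])
   {
       for (int i = 0; i < MODEL_N; ++i) {
           unsigned sum = (unsigned)v1[i] + v2[i];
           v0[i] = sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
       }
   }

   /* cmp_vec with an unsigned "less than" condition: all ones or zero. */
   static inline void model_cmp_ltu_vec8(uint8_t v0[MODEL_N],
                                         const uint8_t v1[MODEL_N],
                                         const uint8_t v2[MODEL_N])
   {
       for (int i = 0; i < MODEL_N; ++i) {
           v0[i] = v1[i] < v2[i] ? 0xFF : 0x00;
       }
   }

   /* bitsel_vec: v0 = (v2 & v1) | (v3 & ~v1). */
   static inline void model_bitsel_vec8(uint8_t v0[MODEL_N],
                                        const uint8_t v1[MODEL_N],
                                        const uint8_t v2[MODEL_N],
                                        const uint8_t v3[MODEL_N])
   {
       for (int i = 0; i < MODEL_N; ++i) {
           v0[i] = (uint8_t)((v2[i] & v1[i]) | (v3[i] & ~v1[i]));
       }
   }

Feeding the all-ones/all-zeros mask produced by the compare into the bitwise
select gives the same per-element choice that cmpsel_vec describes.
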

Backend
=======

``tcg-target.h`` contains the target specific definitions. ``tcg-target.c.inc``
contains the target specific code; it is #included by ``tcg/tcg.c``, rather
than being a standalone C file.

Assumptions
-----------

The target word size (``TCG_TARGET_REG_BITS``) is expected to be 32 bit or
64 bit. It is expected that the pointer has the same size as the word.

On a 32 bit target, all 64 bit operations are converted to 32 bits.
A few specific operations must be implemented to allow it
(see brcond2_i32, setcond2_i32).

On a 64 bit target, the values are transferred between 32 and 64-bit
registers using the following ops:

- extrl_i64_i32
- extrh_i64_i32
- ext_i32_i64
- extu_i32_i64

They ensure that the values are correctly truncated or extended when
moved from a 32-bit to a 64-bit register or vice-versa. Note that the
extrl_i64_i32 and extrh_i64_i32 are optional ops. It is not necessary
to implement them if all the following conditions are met:

- 64-bit registers can hold 32-bit values
- 32-bit values in a 64-bit register do not need to stay zero or
  sign extended
- all 32-bit TCG ops ignore the high part of 64-bit registers

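As a small C sketch of the points above, the two extraction ops return the
low and high halves of a 64-bit value, and a 32-bit add carried out in 64-bit
registers gives a correct low half whatever the high halves contain. The
``model_*`` names are hypothetical.

.. code-block:: c

   #include <stdint.h>

   /* extrl_i64_i32: low 32 bits of a 64-bit value. */
   static inline uint32_t model_extrl_i64_i32(uint64_t t1)
   {
       return (uint32_t)t1;
   }

   /* extrh_i64_i32: high 32 bits of a 64-bit value. */
   static inline uint32_t model_extrh_i64_i32(uint64_t t1)
   {
       return (uint32_t)(t1 >> 32);
   }

   /* A 32-bit add performed in 64-bit registers: the low 32 bits of the
    * result depend only on the low 32 bits of the inputs. */
   static inline uint64_t model_add_i32_in_reg64(uint64_t r1, uint64_t r2)
   {
       return r1 + r2;
   }

When every 32-bit op behaves like the add in this model, stale high bits are
harmless and no explicit truncation is needed.
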
Floating point operations are not supported in this version. A
previous incarnation of the code generator had full support of them,
but it is better to concentrate on integer operations first.

Constraints
-----------

GCC-like constraints are used to define the constraints of every
instruction. Memory constraints are not supported in this
version. Aliases are specified in the input operands as for GCC.

The same register may be used for both an input and an output, even when
they are not explicitly aliased.  If an op expands to multiple target
instructions then care must be taken to avoid clobbering input values.
GCC style "early clobber" outputs are supported, with '``&``'.

A target can define specific register or constant constraints. If an
operation uses a constant input constraint which does not allow all
constants, it must also accept registers in order to have a fallback.
The constraint '``i``' is defined generically to accept any constant.
The constraint '``r``' is not defined generically, but is consistently
used by each backend to indicate all registers.  If ``TCG_REG_ZERO``
is defined by the backend, the constraint '``z``' is defined generically
to map constant 0 to the hardware zero register.

The movi_i32 and movi_i64 operations must accept any constants.

The mov_i32 and mov_i64 operations must accept any registers of the
same type.

The ld/st/sti instructions must accept signed 32 bit constant offsets.
This can be implemented by reserving a specific register in which to
compute the address if the offset is too big.

The ld/st instructions must accept any destination (ld) or source (st)
register.

The sti instruction may fail if it cannot store the given constant.

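The offset rule for ld/st can be sketched as a range check followed by a
fallback. The example assumes a hypothetical backend whose load/store
instructions encode a 12-bit signed offset; the type and function names are
illustrative and do not come from any real backend.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* True if v fits in a signed immediate of the given width (1..63 bits). */
   static inline bool model_fits_simm(int64_t v, unsigned bits)
   {
       int64_t limit = (int64_t)1 << (bits - 1);
       return v >= -limit && v < limit;
   }

   typedef enum {
       MODEL_USE_IMMEDIATE,     /* offset is encoded in the instruction */
       MODEL_USE_SCRATCH_REG,   /* address is computed in a reserved register */
   } ModelAddrMode;

   /* Decide how a hypothetical 12-bit-offset backend handles an offset. */
   static inline ModelAddrMode model_pick_addr_mode(int32_t offset)
   {
       return model_fits_simm(offset, 12) ? MODEL_USE_IMMEDIATE
                                          : MODEL_USE_SCRATCH_REG;
   }

The same shape applies to restricted constant constraints in general: when
the constant does not fit, the operand is materialised in a register instead.
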
Function call assumptions
-------------------------

- The only supported types for parameters and return value are 32 and
  64 bit integers and pointers.
- The stack grows downwards.
- The first N parameters are passed in registers.
- The next parameters are passed on the stack by storing them as words.
- Some registers are clobbered during the call.
- The function can return 0 or 1 value in registers. On a 32 bit
  target, functions must be able to return 2 values in registers for
  64 bit return type.


Recommended coding rules for best performance
=============================================

- Use globals to represent the parts of the QEMU CPU state which are
  often modified, e.g. the integer registers and the condition
  codes. TCG will be able to use host registers to store them.

- Don't hesitate to use helpers for complicated or seldom used guest
  instructions. There is little performance advantage in using TCG to
  implement guest instructions taking more than about twenty TCG
  instructions. Note that this rule of thumb is more applicable to
  helpers doing complex logic or arithmetic, where the C compiler has
  scope to do a good job of optimisation; it is less relevant where
  the instruction is mostly doing loads and stores, and in those cases
  inline TCG may still be faster for longer sequences.

- Use the 'discard' instruction if you know that TCG won't be able to
  prove that a given global is "dead" at a given program point. The
  x86 guest uses it to improve the condition codes optimisation.