.. _tcg-ops-ref:

*******************************
TCG Intermediate Representation
*******************************

Introduction
============

TCG (Tiny Code Generator) began as a generic backend for a C compiler.
It was simplified to be used in QEMU. It also has its roots in the
QOP code generator written by Paul Brook.

Definitions
===========

The TCG *target* is the architecture for which we generate the code.
It is of course not the same as the "target" of QEMU, which is the
emulated architecture. As TCG started as a generic C backend used
for cross compiling, the assumption was that the TCG target might be
different from the host, although this is never the case for QEMU.

In this document, we use *guest* to specify what architecture we are
emulating; *target* always means the TCG target, the machine on which
we are running QEMU.

An operation with *undefined behavior* may result in a crash.

An operation with *unspecified behavior* shall not crash. However,
the result may be one of several possibilities so may be considered
an *undefined result*.


Basic Blocks
============

A TCG *basic block* is a single entry, multiple exit region which
corresponds to a list of instructions terminated by a label, or
any branch instruction.

A TCG *extended basic block* is a single entry, multiple exit region
which corresponds to a list of instructions terminated by a label or
an unconditional branch. Specifically, an extended basic block is
a sequence of basic blocks connected by the fall-through paths of
zero or more conditional branch instructions.

Operations
==========

TCG instructions or *ops* operate on TCG *variables*, both of which
are strongly typed. Each instruction has a fixed number of output
variable operands, input variable operands and constant operands.
Vector instructions have a field specifying the element size within
the vector. The notable exception is the call instruction which has
a variable number of outputs and inputs.

In the textual form, output operands usually come first, followed by
input operands, followed by constant operands. The output type is
included in the instruction name. Constants are prefixed with a '$'.

.. code-block:: none

   add_i32 t0, t1, t2    /* (t0 <- t1 + t2) */

Variables
=========

* ``TEMP_FIXED``

  There is one TCG *fixed global* variable, ``cpu_env``, which is
  live in all translation blocks, and holds a pointer to ``CPUArchState``.
  This variable is held in a host CPU register at all times in all
  translation blocks.

* ``TEMP_GLOBAL``

  A TCG *global* is a variable which is live in all translation blocks,
  and corresponds to a memory location that is within ``CPUArchState``.
  These may be specified as an offset from ``cpu_env``, in which case
  they are called *direct globals*, or may be specified as an offset
  from a direct global, in which case they are called *indirect globals*.
  Even indirect globals should still reference memory within
  ``CPUArchState``. All TCG globals are defined during
  ``TCGCPUOps.initialize``, before any translation blocks are generated.

* ``TEMP_CONST``

  A TCG *constant* is a variable which is live throughout the entire
  translation block, and contains a constant value. These variables
  are allocated on demand during translation and are hashed so that
  there is exactly one variable holding a given value.


* ``TEMP_TB``

  A TCG *translation block temporary* is a variable which is live
  throughout the entire translation block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

* ``TEMP_EBB``

  A TCG *extended basic block temporary* is a variable which is live
  throughout an extended basic block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

Types
=====

* ``TCG_TYPE_I32``

  A 32-bit integer.

* ``TCG_TYPE_I64``

  A 64-bit integer. For 32-bit hosts, such variables are split into a pair
  of variables with ``type=TCG_TYPE_I32`` and ``base_type=TCG_TYPE_I64``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_PTR``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of a pointer for the host.


* ``TCG_TYPE_REG``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of the integer registers for the host. This may be larger
  than ``TCG_TYPE_PTR`` depending on the host ABI.

* ``TCG_TYPE_I128``

  A 128-bit integer. For all hosts, such variables are split into a number
  of variables with ``type=TCG_TYPE_REG`` and ``base_type=TCG_TYPE_I128``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_V64``

  A 64-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v64``.

* ``TCG_TYPE_V128``

  A 128-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v128``.

* ``TCG_TYPE_V256``

  A 256-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v256``.

Helpers
=======

Helpers are registered in a guest-specific ``helper.h``,
which is processed to generate ``tcg_gen_helper_*`` functions.
With these functions it is possible to call a function taking
i32, i64, i128 or pointer types.

By default, before calling a helper, all globals are stored at their
canonical location. By default, the helper is allowed to modify the
CPU state (including the state represented by tcg globals)
or may raise an exception. This default can be overridden using the
following function modifiers:

* ``TCG_CALL_NO_WRITE_GLOBALS``

  The helper does not modify any globals, but may read them.
  Globals will be saved to their canonical location before calling helpers,
  but need not be reloaded afterwards.

* ``TCG_CALL_NO_READ_GLOBALS``

  The helper does not read globals, either directly or via an exception.
  They will not be saved to their canonical locations before calling
  the helper. This implies ``TCG_CALL_NO_WRITE_GLOBALS``.

* ``TCG_CALL_NO_SIDE_EFFECTS``

  The call to the helper function may be removed if the return value is
  not used. This means that it may not modify any CPU state nor may it
  raise an exception.

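
The effect of these modifiers on the code surrounding a helper call can be
summarised with a small model. This is only an illustrative sketch in Python:
the function ``plan_helper_call`` and the flag values below are invented for
this example and are not taken from QEMU's headers.

.. code-block:: python

   # Hypothetical stand-ins for the TCG_CALL_* modifiers (values are made up).
   NO_WRITE_GLOBALS = 1 << 0
   NO_READ_GLOBALS = 1 << 1
   NO_SIDE_EFFECTS = 1 << 2


   def plan_helper_call(flags=0, result_used=True):
       """Return what must happen around one helper call with these flags."""
       # NO_READ_GLOBALS implies NO_WRITE_GLOBALS.
       if flags & NO_READ_GLOBALS:
           flags |= NO_WRITE_GLOBALS
       return {
           # Globals are stored before the call unless the helper never reads them.
           "save_globals_before": not (flags & NO_READ_GLOBALS),
           # Globals are reloaded after the call unless the helper never writes them.
           "reload_globals_after": not (flags & NO_WRITE_GLOBALS),
           # A side-effect-free call whose result is unused may be dropped.
           "call_removable": bool(flags & NO_SIDE_EFFECTS) and not result_used,
       }

With no flags, the model saves and reloads all globals and keeps the call;
each modifier relaxes one of those requirements.
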

Code Optimizations
==================

When generating instructions, you can count on at least the following
optimizations:

- Single instructions are simplified, e.g.

  .. code-block:: none

     and_i32 t0, t0, $0xffffffff

  is suppressed.

- A liveness analysis is done at the basic block level. The
  information is used to suppress moves from a dead variable to
  another one. It is also used to remove instructions which compute
  dead results. The latter is especially useful for condition code
  optimization in QEMU.

  In the following example:

  .. code-block:: none

     add_i32 t0, t1, t2
     add_i32 t0, t0, $1
     mov_i32 t0, $1

  only the last instruction is kept.


Instruction Reference
=====================

Function call
-------------

.. list-table::

   * - call *<ret>* *<params>* ptr

     - | call function 'ptr' (pointer type)
       |
       | *<ret>* optional 32 bit or 64 bit return value
       | *<params>* optional 32 bit or 64 bit parameters

Jumps/Labels
------------

.. list-table::

   * - set_label $label

     - | Define label 'label' at the current program point.

   * - br $label

     - | Jump to label.

   * - brcond *t0*, *t1*, *cond*, *label*

     - | Conditional jump if *t0* *cond* *t1* is true. *cond* can be:
       |
       |   ``TCG_COND_EQ``
       |   ``TCG_COND_NE``
       |   ``TCG_COND_LT /* signed */``
       |   ``TCG_COND_GE /* signed */``
       |   ``TCG_COND_LE /* signed */``
       |   ``TCG_COND_GT /* signed */``
       |   ``TCG_COND_LTU /* unsigned */``
       |   ``TCG_COND_GEU /* unsigned */``
       |   ``TCG_COND_LEU /* unsigned */``
       |   ``TCG_COND_GTU /* unsigned */``
       |   ``TCG_COND_TSTEQ /* (t0 & t1) == 0 */``
       |   ``TCG_COND_TSTNE /* (t0 & t1) != 0 */``

Arithmetic
----------

.. list-table::

   * - add *t0*, *t1*, *t2*

     - | *t0* = *t1* + *t2*

   * - sub *t0*, *t1*, *t2*

     - | *t0* = *t1* - *t2*

   * - neg *t0*, *t1*

     - | *t0* = -*t1* (two's complement)

   * - mul *t0*, *t1*, *t2*

     - | *t0* = *t1* * *t2*

   * - divs *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - divu *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (unsigned)
       | Undefined behavior if division by zero.

   * - rems *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - remu *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (unsigned)
       | Undefined behavior if division by zero.

   * - divs2 *q*, *r*, *nl*, *nh*, *d*

     - | *q* = *nh:nl* / *d* (signed)
       | *r* = *nh:nl* % *d*
       | Undefined behaviour if division by zero, or the double-word
         numerator divided by the single-word divisor does not fit
         within the single-word quotient. The code generator will
         pass *nh* as a simple sign-extension of *nl*, so the only
         overflow should be *INT_MIN* / -1.

   * - divu2 *q*, *r*, *nl*, *nh*, *d*

     - | *q* = *nh:nl* / *d* (unsigned)
       | *r* = *nh:nl* % *d*
       | Undefined behaviour if division by zero, or the double-word
         numerator divided by the single-word divisor does not fit
         within the single-word quotient. The code generator will
         pass 0 to *nh* to make a simple zero-extension of *nl*,
         so overflow should never occur.

Logical
-------

.. list-table::

   * - and *t0*, *t1*, *t2*

     - | *t0* = *t1* & *t2*

   * - or *t0*, *t1*, *t2*

     - | *t0* = *t1* | *t2*

   * - xor *t0*, *t1*, *t2*

     - | *t0* = *t1* ^ *t2*

   * - not *t0*, *t1*

     - | *t0* = ~\ *t1*

   * - andc *t0*, *t1*, *t2*

     - | *t0* = *t1* & ~\ *t2*

   * - eqv *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* ^ *t2*), or equivalently, *t0* = *t1* ^ ~\ *t2*

   * - nand *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* & *t2*)

   * - nor *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* | *t2*)

   * - orc *t0*, *t1*, *t2*

     - | *t0* = *t1* | ~\ *t2*

   * - clz *t0*, *t1*, *t2*

     - | *t0* = *t1* ? clz(*t1*) : *t2*

   * - ctz *t0*, *t1*, *t2*

     - | *t0* = *t1* ? ctz(*t1*) : *t2*

   * - ctpop *t0*, *t1*

     - | *t0* = number of bits set in *t1*
       |
       | The name *ctpop* is short for "count population", and matches
         the function name used in ``include/qemu/host-utils.h``.


Shifts/Rotates
--------------

.. list-table::

   * - shl *t0*, *t1*, *t2*

     - | *t0* = *t1* << *t2*
       | Unspecified behavior for negative or out-of-range shifts.

   * - shr *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (unsigned)
       | Unspecified behavior for negative or out-of-range shifts.

   * - sar *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (signed)
       | Unspecified behavior for negative or out-of-range shifts.

   * - rotl *t0*, *t1*, *t2*

     - | *t0* = *t1* rotated left by *t2* bits.
       | Unspecified behavior for negative or out-of-range shifts.

   * - rotr *t0*, *t1*, *t2*

     - | *t0* = *t1* rotated right by *t2* bits.
       | Unspecified behavior for negative or out-of-range shifts.


Misc
----

.. list-table::

   * - mov *t0*, *t1*

     - | *t0* = *t1*
       | Move *t1* to *t0*.

   * - bswap16 *t0*, *t1*, *flags*

     - | 16 bit byte swap on the low bits of a 32/64 bit input.
       |
       | If *flags* & ``TCG_BSWAP_IZ``, then *t1* is known to be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OZ``, then *t0* will be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OS``, then *t0* will be sign-extended from bit 15.
       |
       | If neither ``TCG_BSWAP_OZ`` nor ``TCG_BSWAP_OS`` is set, then the bits of *t0* above bit 15 may contain any value.

   * - bswap32 *t0*, *t1*, *flags*

     - | 32 bit byte swap. The flags are the same as for bswap16, except
         they apply from bit 31 instead of bit 15. On TCG_TYPE_I32, the
         flags should be zero.

   * - bswap64 *t0*, *t1*, *flags*

     - | 64 bit byte swap. The flags are ignored, but still present
         for consistency with the other bswap opcodes. For future
         compatibility, the flags should be zero.

   * - discard_i32/i64 *t0*

     - | Indicate that the value of *t0* won't be used later. It is useful to
         force dead code elimination.

   * - deposit *dest*, *t1*, *t2*, *pos*, *len*

     - | Deposit *t2* as a bitfield into *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values:
       |
       |     *len* - the length of the bitfield
       |     *pos* - the position of the first bit, counting from the LSB
       |
       | For example, "deposit dest, t1, t2, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |     *dest* = (*t1* & ~0x0f00) | ((*t2* << 8) & 0x0f00)
       |
       | on TCG_TYPE_I32.

   * - extract *dest*, *t1*, *pos*, *len*

       sextract *dest*, *t1*, *pos*, *len*

     - | Extract a bitfield from *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values,
         as above for deposit. For extract, the result will be extended
         to the left with zeros; for sextract, the result will be extended
         to the left with copies of the bitfield sign bit at *pos* + *len* - 1.
       |
       | For example, "sextract dest, t1, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |    *dest* = (*t1* << 20) >> 28
       |
       | (using an arithmetic right shift) on TCG_TYPE_I32.

   * - extract2 *dest*, *t1*, *t2*, *pos*

     - | For TCG_TYPE_I{N}, extract an N-bit quantity from the concatenation
         of *t2*:*t1*, beginning at *pos*. The tcg_gen_extract2_{i32,i64} expander
         accepts 0 <= *pos* <= N as inputs. The backend code generator will
         not see either 0 or N as inputs for these opcodes.

   * - extrl_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the low 32-bits of input *t1* and place it
         into 32-bit output *t0*. Depending on the host, this may be a simple move,
         or may require additional canonicalization.

   * - extrh_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the high 32-bits of input *t1* and place it
         into 32-bit output *t0*. Depending on the host, this may be a simple shift,
         or may require additional canonicalization.

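
The bitfield operations in the table above can be modelled in a few lines of
Python. This is only a sketch of the semantics described for ``TCG_TYPE_I32``:
values are treated as unsigned 32-bit integers, and the function names
(``deposit32``, ``extract32``, ``sextract32``, ``extract2_32``) are chosen for
this example rather than taken from the TCG sources.

.. code-block:: python

   MASK32 = 0xFFFFFFFF  # model registers as unsigned 32-bit values


   def deposit32(t1, t2, pos, length):
       """Model of 'deposit dest, t1, t2, pos, len'."""
       assert 0 <= pos and 0 < length and pos + length <= 32
       field = ((1 << length) - 1) << pos
       return ((t1 & ~field) | ((t2 << pos) & field)) & MASK32


   def extract32(t1, pos, length):
       """Model of 'extract': the bitfield is zero-extended."""
       assert 0 <= pos and 0 < length and pos + length <= 32
       return (t1 >> pos) & ((1 << length) - 1)


   def sextract32(t1, pos, length):
       """Model of 'sextract': sign-extend from bit pos + len - 1."""
       value = extract32(t1, pos, length)
       sign = 1 << (length - 1)
       return ((value ^ sign) - sign) & MASK32


   def extract2_32(t1, t2, pos):
       """Model of 'extract2': 32 bits of t2:t1 starting at bit pos."""
       assert 0 <= pos <= 32
       return (((t2 << 32) | t1) >> pos) & MASK32

For instance, ``deposit32(0x12345678, 0xAB, 8, 4)`` replaces bits 8 to 11 of
the first argument with the low four bits of the second, and
``sextract32(0x00000F00, 8, 4)`` yields an all-ones result because the top bit
of the extracted field is set.
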
4975e97a28aSMark Cave-Ayland 4985e97a28aSMark Cave-Ayland 4995e97a28aSMark Cave-AylandConditional moves 5005e97a28aSMark Cave-Ayland----------------- 5015e97a28aSMark Cave-Ayland 5025e97a28aSMark Cave-Ayland.. list-table:: 5035e97a28aSMark Cave-Ayland 504a363e1e1SRichard Henderson * - setcond *dest*, *t1*, *t2*, *cond* 5055e97a28aSMark Cave-Ayland 5065e97a28aSMark Cave-Ayland - | *dest* = (*t1* *cond* *t2*) 5075e97a28aSMark Cave-Ayland | 5085e97a28aSMark Cave-Ayland | Set *dest* to 1 if (*t1* *cond* *t2*) is true, otherwise set to 0. 5095e97a28aSMark Cave-Ayland 510a363e1e1SRichard Henderson * - negsetcond *dest*, *t1*, *t2*, *cond* 5113635502dSRichard Henderson 5123635502dSRichard Henderson - | *dest* = -(*t1* *cond* *t2*) 5133635502dSRichard Henderson | 5143635502dSRichard Henderson | Set *dest* to -1 if (*t1* *cond* *t2*) is true, otherwise set to 0. 5153635502dSRichard Henderson 516ea46c4bcSRichard Henderson * - movcond *dest*, *c1*, *c2*, *v1*, *v2*, *cond* 5175e97a28aSMark Cave-Ayland 5185e97a28aSMark Cave-Ayland - | *dest* = (*c1* *cond* *c2* ? *v1* : *v2*) 5195e97a28aSMark Cave-Ayland | 5205e97a28aSMark Cave-Ayland | Set *dest* to *v1* if (*c1* *cond* *c2*) is true, otherwise set to *v2*. 5215e97a28aSMark Cave-Ayland 5225e97a28aSMark Cave-Ayland 5235e97a28aSMark Cave-AylandType conversions 5245e97a28aSMark Cave-Ayland---------------- 5255e97a28aSMark Cave-Ayland 5265e97a28aSMark Cave-Ayland.. 
list-table:: 5275e97a28aSMark Cave-Ayland 5285e97a28aSMark Cave-Ayland * - ext_i32_i64 *t0*, *t1* 5295e97a28aSMark Cave-Ayland 5305e97a28aSMark Cave-Ayland - | Convert *t1* (32 bit) to *t0* (64 bit) and does sign extension 5315e97a28aSMark Cave-Ayland 5325e97a28aSMark Cave-Ayland * - extu_i32_i64 *t0*, *t1* 5335e97a28aSMark Cave-Ayland 5345e97a28aSMark Cave-Ayland - | Convert *t1* (32 bit) to *t0* (64 bit) and does zero extension 5355e97a28aSMark Cave-Ayland 5365e97a28aSMark Cave-Ayland * - trunc_i64_i32 *t0*, *t1* 5375e97a28aSMark Cave-Ayland 5385e97a28aSMark Cave-Ayland - | Truncate *t1* (64 bit) to *t0* (32 bit) 5395e97a28aSMark Cave-Ayland 5405e97a28aSMark Cave-Ayland * - concat_i32_i64 *t0*, *t1*, *t2* 5415e97a28aSMark Cave-Ayland 5425e97a28aSMark Cave-Ayland - | Construct *t0* (64-bit) taking the low half from *t1* (32 bit) and the high half 5435e97a28aSMark Cave-Ayland from *t2* (32 bit). 5445e97a28aSMark Cave-Ayland 5455e97a28aSMark Cave-Ayland * - concat32_i64 *t0*, *t1*, *t2* 5465e97a28aSMark Cave-Ayland 5475e97a28aSMark Cave-Ayland - | Construct *t0* (64-bit) taking the low half from *t1* (64 bit) and the high half 5485e97a28aSMark Cave-Ayland from *t2* (64 bit). 5495e97a28aSMark Cave-Ayland 5505e97a28aSMark Cave-Ayland 5515e97a28aSMark Cave-AylandLoad/Store 5525e97a28aSMark Cave-Ayland---------- 5535e97a28aSMark Cave-Ayland 5545e97a28aSMark Cave-Ayland.. 
list-table:: 5555e97a28aSMark Cave-Ayland 5565e97a28aSMark Cave-Ayland * - ld_i32/i64 *t0*, *t1*, *offset* 5575e97a28aSMark Cave-Ayland 5585e97a28aSMark Cave-Ayland ld8s_i32/i64 *t0*, *t1*, *offset* 5595e97a28aSMark Cave-Ayland 5605e97a28aSMark Cave-Ayland ld8u_i32/i64 *t0*, *t1*, *offset* 5615e97a28aSMark Cave-Ayland 5625e97a28aSMark Cave-Ayland ld16s_i32/i64 *t0*, *t1*, *offset* 5635e97a28aSMark Cave-Ayland 5645e97a28aSMark Cave-Ayland ld16u_i32/i64 *t0*, *t1*, *offset* 5655e97a28aSMark Cave-Ayland 5665e97a28aSMark Cave-Ayland ld32s_i64 t0, *t1*, *offset* 5675e97a28aSMark Cave-Ayland 5685e97a28aSMark Cave-Ayland ld32u_i64 t0, *t1*, *offset* 5695e97a28aSMark Cave-Ayland 5705e97a28aSMark Cave-Ayland - | *t0* = read(*t1* + *offset*) 5715e97a28aSMark Cave-Ayland | 5725e97a28aSMark Cave-Ayland | Load 8, 16, 32 or 64 bits with or without sign extension from host memory. 5735e97a28aSMark Cave-Ayland *offset* must be a constant. 5745e97a28aSMark Cave-Ayland 5755e97a28aSMark Cave-Ayland * - st_i32/i64 *t0*, *t1*, *offset* 5765e97a28aSMark Cave-Ayland 5775e97a28aSMark Cave-Ayland st8_i32/i64 *t0*, *t1*, *offset* 5785e97a28aSMark Cave-Ayland 5795e97a28aSMark Cave-Ayland st16_i32/i64 *t0*, *t1*, *offset* 5805e97a28aSMark Cave-Ayland 5815e97a28aSMark Cave-Ayland st32_i64 *t0*, *t1*, *offset* 5825e97a28aSMark Cave-Ayland 5835e97a28aSMark Cave-Ayland - | write(*t0*, *t1* + *offset*) 5845e97a28aSMark Cave-Ayland | 5855e97a28aSMark Cave-Ayland | Write 8, 16, 32 or 64 bits to host memory. 5865e97a28aSMark Cave-Ayland 5875e97a28aSMark Cave-AylandAll this opcodes assume that the pointed host memory doesn't correspond 5885e97a28aSMark Cave-Aylandto a global. In the latter case the behaviour is unpredictable. 5895e97a28aSMark Cave-Ayland 5905e97a28aSMark Cave-Ayland 5915e97a28aSMark Cave-AylandMultiword arithmetic support 5925e97a28aSMark Cave-Ayland---------------------------- 5935e97a28aSMark Cave-Ayland 5945e97a28aSMark Cave-Ayland.. 
.. list-table::

   * - addco *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* + *t2* and in addition output to the
         carry bit provided by the host architecture.

   * - addci *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* + *t2* + *C*, where *C* is the
         input carry bit provided by the host architecture.
         The output carry bit need not be computed.

   * - addcio *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* + *t2* + *C*, where *C* is the
         input carry bit provided by the host architecture,
         and also compute the output carry bit.

   * - addc1o *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* + *t2* + 1, and in addition output to the
         carry bit provided by the host architecture. This is akin to
         *addcio* with a fixed carry-in value of 1.
       | This is intended to be used by the optimization pass,
         intermediate to complete folding of the addition chain.
         In some cases complete folding is not possible and this
         opcode will remain until output. If this happens, the
         code generator will use ``tcg_out_set_carry`` and then
         the output routine for *addcio*.

   * - subbo *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* - *t2* and in addition output to the
         borrow bit provided by the host architecture.
       | Depending on the host architecture, the carry bit may or may not be
         identical to the borrow bit. Thus the addc\* and subb\*
         opcodes must not be mixed.

   * - subbi *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* - *t2* - *B*, where *B* is the
         input borrow bit provided by the host architecture.
         The output borrow bit need not be computed.

   * - subbio *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* - *t2* - *B*, where *B* is the
         input borrow bit provided by the host architecture,
         and also compute the output borrow bit.

   * - subb1o *t0*, *t1*, *t2*

     - | Compute *t0* = *t1* - *t2* - 1, and in addition output to the
         borrow bit provided by the host architecture. This is akin to
         *subbio* with a fixed borrow-in value of 1.
       | This is intended to be used by the optimization pass,
         intermediate to complete folding of the subtraction chain.
         In some cases complete folding is not possible and this
         opcode will remain until output.
         If this happens, the
         code generator will use ``tcg_out_set_borrow`` and then
         the output routine for *subbio*.

   * - mulu2 *t0_low*, *t0_high*, *t1*, *t2*

     - | Similar to mul, except two unsigned inputs *t1* and *t2* yielding the full
         double-word product *t0*. The latter is returned in two single-word outputs.

   * - muls2 *t0_low*, *t0_high*, *t1*, *t2*

     - | Similar to mulu2, except the two inputs *t1* and *t2* are signed.

   * - mulsh *t0*, *t1*, *t2*

       muluh *t0*, *t1*, *t2*

     - | Provide the high part of a signed or unsigned multiply, respectively.
       |
       | If mulu2/muls2 are not provided by the backend, the tcg-op generator
         can obtain the same results by emitting a pair of opcodes, mul + muluh/mulsh.


Memory Barrier support
----------------------

.. list-table::

   * - mb *<$arg>*

     - | Generate a target memory barrier instruction to ensure memory ordering
         as being enforced by a corresponding guest memory barrier instruction.
       |
       | The ordering enforced by the backend may be stricter than the ordering
         required by the guest. It cannot be weaker. This opcode takes a constant
         argument which is required to generate the appropriate barrier
         instruction. The backend should take care to emit the target barrier
         instruction only when necessary i.e., for SMP guests and when MTTCG is
         enabled.
       |
       | The guest translators should generate this opcode for all guest instructions
         which have ordering side effects.
       |
       | Please see :ref:`atomics-ref` for more information on memory barriers.


64-bit guest on 32-bit host support
-----------------------------------

The following opcodes are internal to TCG. Thus they are to be implemented by
32-bit host code generators, but are not to be emitted by guest translators.
They are emitted as needed by inline functions within ``tcg-op.h``.

.. list-table::

   * - brcond2_i32 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *cond*, *label*

     - | Similar to brcond, except that the 64-bit values *t0* and *t1*
         are formed from two 32-bit arguments.

   * - setcond2_i32 *dest*, *t1_low*, *t1_high*, *t2_low*, *t2_high*, *cond*

     - | Similar to setcond, except that the 64-bit values *t1* and *t2* are
         formed from two 32-bit arguments. The result is a 32-bit value.


QEMU specific operations
------------------------

.. list-table::

   * - exit_tb *t0*

     - | Exit the current TB and return the value *t0* (word type).

   * - goto_tb *index*

     - | Exit the current TB and jump to the TB index *index* (constant) if the
         current TB was linked to this TB. Otherwise execute the next
         instructions. Only indices 0 and 1 are valid and tcg_gen_goto_tb may be issued
         at most once with each slot index per TB.

   * - lookup_and_goto_ptr *tb_addr*

     - | Look up a TB address *tb_addr* and jump to it if valid. If not valid,
         jump to the TCG epilogue to go back to the exec loop.
       |
       | This operation is optional. If the TCG backend does not implement the
         goto_ptr opcode, emitting this op is equivalent to emitting exit_tb(0).

   * - qemu_ld_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

       qemu_st_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

     - | Load data at the guest address *t1* into *t0*, or store data in *t0* at guest
         address *t1*. The _i32/_i64/_i128 size applies to the size of the input/output
         register *t0* only. The address *t1* is always sized according to the guest,
         and the width of the memory operation is controlled by *flags*.
       |
       | Both *t0* and *t1* may be split into little-endian ordered pairs of registers
         if dealing with 64-bit quantities on a 32-bit host, or 128-bit quantities on
         a 64-bit host.
       |
       | The *memidx* selects the QEMU TLB index to use (e.g. user or kernel access).
         The *flags* are the MemOp bits, selecting the sign, width, and endianness
         of the memory access.
       |
       | For a 32-bit host, qemu_ld/st_i64 is guaranteed to only be used with a
         64-bit memory access specified in *flags*.
       |
       | qemu_ld/st_i128 are only supported on a 64-bit host.


Host vector operations
----------------------

All of the vector ops have two parameters, ``TCGOP_TYPE`` & ``TCGOP_VECE``.
The former specifies the length of the vector as a TCGType; the latter
specifies the length of the element (if applicable) in log2 8-bit units.

.. list-table::

   * - mov_vec *v0*, *v1*

       ld_vec *v0*, *t1*

       st_vec *v0*, *t1*

     - | Move, load and store.

   * - dup_vec *v0*, *r1*

     - | Duplicate the low N bits of *r1* into TYPE/VECE copies across *v0*.

   * - dupi_vec *v0*, *c*

     - | Similarly, for a constant.
       | Smaller values will be replicated to host register size by the expanders.

   * - dup2_vec *v0*, *r1*, *r2*

     - | Duplicate *r2*:*r1* into TYPE/64 copies across *v0*. This opcode is
         only present for 32-bit hosts.

   * - add_vec *v0*, *v1*, *v2*

     - | *v0* = *v1* + *v2*, in elements across the vector.

   * - sub_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* - *v2*.

   * - mul_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* * *v2*.

   * - neg_vec *v0*, *v1*

     - | Similarly, *v0* = -*v1*.

   * - abs_vec *v0*, *v1*

     - | Similarly, *v0* = *v1* < 0 ? -*v1* : *v1*, in elements across the vector.

   * - smin_vec *v0*, *v1*, *v2*

       umin_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MIN(*v1*, *v2*), for signed and unsigned element types.

   * - smax_vec *v0*, *v1*, *v2*

       umax_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MAX(*v1*, *v2*), for signed and unsigned element types.

   * - ssadd_vec *v0*, *v1*, *v2*

       sssub_vec *v0*, *v1*, *v2*

       usadd_vec *v0*, *v1*, *v2*

       ussub_vec *v0*, *v1*, *v2*

     - | Signed and unsigned saturating addition and subtraction.
       |
       | If the true result is not representable within the element type, the
         element is set to the minimum or maximum value for the type.

   * - and_vec *v0*, *v1*, *v2*

       or_vec *v0*, *v1*, *v2*

       xor_vec *v0*, *v1*, *v2*

       andc_vec *v0*, *v1*, *v2*

       orc_vec *v0*, *v1*, *v2*

       not_vec *v0*, *v1*

     - | Similarly, logical operations with and without complement.
       |
       | Note that VECE is unused.

   * - shli_vec *v0*, *v1*, *i2*

       shls_vec *v0*, *v1*, *s2*

     - | Shift all elements from *v1* by a scalar *i2*/*s2*. I.e.

       .. code-block:: c

          for (i = 0; i < TYPE/VECE; ++i) {
              v0[i] = v1[i] << s2;
          }

   * - shri_vec *v0*, *v1*, *i2*

       sari_vec *v0*, *v1*, *i2*

       rotli_vec *v0*, *v1*, *i2*

       shrs_vec *v0*, *v1*, *s2*

       sars_vec *v0*, *v1*, *s2*

     - | Similarly for logical and arithmetic right shift, and left rotate.

   * - shlv_vec *v0*, *v1*, *v2*

     - | Shift elements from *v1* by elements from *v2*.
         I.e.

       .. code-block:: c

          for (i = 0; i < TYPE/VECE; ++i) {
              v0[i] = v1[i] << v2[i];
          }

   * - shrv_vec *v0*, *v1*, *v2*

       sarv_vec *v0*, *v1*, *v2*

       rotlv_vec *v0*, *v1*, *v2*

       rotrv_vec *v0*, *v1*, *v2*

     - | Similarly for logical and arithmetic right shift, and rotates.

   * - cmp_vec *v0*, *v1*, *v2*, *cond*

     - | Compare vectors by element, storing -1 for true and 0 for false.

   * - bitsel_vec *v0*, *v1*, *v2*, *v3*

     - | Bitwise select, *v0* = (*v2* & *v1*) | (*v3* & ~\ *v1*), across the entire vector.

   * - cmpsel_vec *v0*, *c1*, *c2*, *v3*, *v4*, *cond*

     - | Select elements based on comparison results:

       .. code-block:: c

          for (i = 0; i < n; ++i) {
              v0[i] = (c1[i] cond c2[i]) ? v3[i] : v4[i];
          }

**Note 1**: Some shortcuts are defined when the last operand is known to be
a constant (e.g. addi for add, movi for mov).
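
As a rough illustration of how some of the vector ops listed above relate to
one another, the following C sketch models *cmp_vec*, *bitsel_vec*,
*cmpsel_vec* and *usadd_vec* on plain arrays. It is not QEMU code: the
``vec128`` type, the ``LANES`` constant and the ``model_*`` function names are
hypothetical, and the sketch assumes a 128-bit vector of 32-bit elements with
an unsigned less-than condition.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Assumption: a 128-bit vector of four 32-bit elements (VECE = 2). */
   #define LANES 4

   /* Illustrative stand-in for a vector value; not a TCG type. */
   typedef struct {
       uint32_t e[LANES];
   } vec128;

   /* cmp_vec with an unsigned "less than" condition: -1 if true, else 0. */
   static vec128 model_cmp_ltu(vec128 a, vec128 b)
   {
       vec128 r;
       for (size_t i = 0; i < LANES; ++i) {
           r.e[i] = a.e[i] < b.e[i] ? UINT32_MAX : 0;
       }
       return r;
   }

   /* bitsel_vec: v0 = (v2 & v1) | (v3 & ~v1), bit by bit. */
   static vec128 model_bitsel(vec128 v1, vec128 v2, vec128 v3)
   {
       vec128 r;
       for (size_t i = 0; i < LANES; ++i) {
           r.e[i] = (v2.e[i] & v1.e[i]) | (v3.e[i] & ~v1.e[i]);
       }
       return r;
   }

   /* cmpsel_vec written as cmp_vec followed by bitsel_vec on the mask. */
   static vec128 model_cmpsel_ltu(vec128 c1, vec128 c2, vec128 v3, vec128 v4)
   {
       return model_bitsel(model_cmp_ltu(c1, c2), v3, v4);
   }

   /* usadd_vec: unsigned add that clamps to the maximum on overflow. */
   static vec128 model_usadd(vec128 a, vec128 b)
   {
       vec128 r;
       for (size_t i = 0; i < LANES; ++i) {
           uint32_t sum = a.e[i] + b.e[i];          /* wraps modulo 2^32 */
           r.e[i] = sum < a.e[i] ? UINT32_MAX : sum; /* wrapped: saturate */
       }
       return r;
   }

Because *cmp_vec* produces an all-ones or all-zeros element, feeding its
result to *bitsel_vec* selects whole elements, which is why the two-step form
gives the same result as the *cmpsel_vec* loop shown in the table.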

**Note 2**: When using TCG, the opcodes must never be generated directly
as some of them may not be available as "real" opcodes. Always use the
function tcg_gen_xxx(args).


Backend
=======

``tcg-target.h`` contains the target specific definitions. ``tcg-target.c.inc``
contains the target specific code; it is #included by ``tcg/tcg.c``, rather
than being a standalone C file.

Assumptions
-----------

The target word size (``TCG_TARGET_REG_BITS``) is expected to be 32 bit or
64 bit. It is expected that the pointer has the same size as the word.

On a 32 bit target, all 64 bit operations are converted to 32 bits.
A few specific operations must be implemented to allow it
(see brcond2_i32, setcond2_i32).

On a 64 bit target, the values are transferred between 32 and 64-bit
registers using the following ops:

- extrl_i64_i32
- extrh_i64_i32
- ext_i32_i64
- extu_i32_i64

They ensure that the values are correctly truncated or extended when
moved from a 32-bit to a 64-bit register or vice-versa.
Note that extrl_i64_i32 and extrh_i64_i32 are optional ops. It is not
necessary to implement them if all the following conditions are met:

- 64-bit registers can hold 32-bit values
- 32-bit values in a 64-bit register do not need to stay zero or
  sign extended
- all 32-bit TCG ops ignore the high part of 64-bit registers

Floating point operations are not supported in this version. A
previous incarnation of the code generator had full support of them,
but it is better to concentrate on integer operations first.

Constraints
-----------

GCC-like constraints are used to define the constraints of every
instruction. Memory constraints are not supported in this
version. Aliases are specified in the input operands as for GCC.

The same register may be used for both an input and an output, even when
they are not explicitly aliased. If an op expands to multiple target
instructions then care must be taken to avoid clobbering input values.
GCC style "early clobber" outputs are supported, with '``&``'.

A target can define specific register or constant constraints. If an
operation uses a constant input constraint which does not allow all
constants, it must also accept registers in order to have a fallback.
The constraint '``i``' is defined generically to accept any constant.
The constraint '``r``' is not defined generically, but is consistently
used by each backend to indicate all registers. If ``TCG_REG_ZERO``
is defined by the backend, the constraint '``z``' is defined generically
to map constant 0 to the hardware zero register.

The movi_i32 and movi_i64 operations must accept any constants.

The mov_i32 and mov_i64 operations must accept any registers of the
same type.

The ld/st/sti instructions must accept signed 32 bit constant offsets.
This can be implemented by reserving a specific register in which to
compute the address if the offset is too big.

The ld/st instructions must accept any destination (ld) or source (st)
register.

The sti instruction may fail if it cannot store the given constant.

Function call assumptions
-------------------------

- The only supported types for parameters and return value are: 32 and
  64 bit integers and pointer.
- The stack grows downwards.
- The first N parameters are passed in registers.
- The next parameters are passed on the stack by storing them as words.
- Some registers are clobbered during the call.
- The function can return 0 or 1 value in registers. On a 32 bit
  target, functions must be able to return 2 values in registers for
  64 bit return type.


Recommended coding rules for best performance
=============================================

- Use globals to represent the parts of the QEMU CPU state which are
  often modified, e.g. the integer registers and the condition
  codes. TCG will be able to use host registers to store them.

- Don't hesitate to use helpers for complicated or seldom used guest
  instructions. There is little performance advantage in using TCG to
  implement guest instructions taking more than about twenty TCG
  instructions. Note that this rule of thumb is more applicable to
  helpers doing complex logic or arithmetic, where the C compiler has
  scope to do a good job of optimisation; it is less relevant where
  the instruction is mostly doing loads and stores, and in those cases
  inline TCG may still be faster for longer sequences.

- Use the 'discard' instruction if you know that TCG won't be able to
  prove that a given global is "dead" at a given program point. The
  x86 guest uses it to improve the condition codes optimisation.