.. _tcg-ops-ref:

*******************************
TCG Intermediate Representation
*******************************

Introduction
============

TCG (Tiny Code Generator) began as a generic backend for a C compiler.
It was simplified to be used in QEMU. It also has its roots in the
QOP code generator written by Paul Brook.

Definitions
===========

The TCG *target* is the architecture for which we generate the code.
It is of course not the same as the "target" of QEMU which is the
emulated architecture. As TCG started as a generic C backend used
for cross compiling, the assumption was that the TCG target might be
different from the host, although this is never the case for QEMU.

In this document, we use *guest* to specify what architecture we are
emulating; *target* always means the TCG target, the machine on which
we are running QEMU.

An operation with *undefined behavior* may result in a crash.

An operation with *unspecified behavior* shall not crash. However,
the result may be one of several possibilities so may be considered
an *undefined result*.

Basic Blocks
============

A TCG *basic block* is a single entry, multiple exit region which
corresponds to a list of instructions terminated by a label, or
any branch instruction.

A TCG *extended basic block* is a single entry, multiple exit region
which corresponds to a list of instructions terminated by a label or
an unconditional branch. Specifically, an extended basic block is
a sequence of basic blocks connected by the fall-through paths of
zero or more conditional branch instructions.

Operations
==========

TCG instructions or *ops* operate on TCG *variables*, both of which
are strongly typed. Each instruction has a fixed number of output
variable operands, input variable operands and constant operands.
Vector instructions have a field specifying the element size within
the vector. The notable exception is the call instruction which has
a variable number of outputs and inputs.

In the textual form, output operands usually come first, followed by
input operands, followed by constant operands. The output type is
included in the instruction name. Constants are prefixed with a '$'.

.. code-block:: none

   add_i32 t0, t1, t2    /* (t0 <- t1 + t2) */

Variables
=========

* ``TEMP_FIXED``

  There is one TCG *fixed global* variable, ``cpu_env``, which is
  live in all translation blocks, and holds a pointer to ``CPUArchState``.
  This variable is held in a host CPU register at all times in all
  translation blocks.

* ``TEMP_GLOBAL``

  A TCG *global* is a variable which is live in all translation blocks,
  and corresponds to a memory location that is within ``CPUArchState``.
  These may be specified as an offset from ``cpu_env``, in which case
  they are called *direct globals*, or may be specified as an offset
  from a direct global, in which case they are called *indirect globals*.
  Even indirect globals should still reference memory within
  ``CPUArchState``. All TCG globals are defined during
  ``TCGCPUOps.initialize``, before any translation blocks are generated.

* ``TEMP_CONST``

  A TCG *constant* is a variable which is live throughout the entire
  translation block, and contains a constant value. These variables
  are allocated on demand during translation and are hashed so that
  there is exactly one variable holding a given value.

* ``TEMP_TB``

  A TCG *translation block temporary* is a variable which is live
  throughout the entire translation block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

* ``TEMP_EBB``

  A TCG *extended basic block temporary* is a variable which is live
  throughout an extended basic block, but dies on any exit.
  These temporaries are allocated explicitly during translation.

Types
=====

* ``TCG_TYPE_I32``

  A 32-bit integer.

* ``TCG_TYPE_I64``

  A 64-bit integer. For 32-bit hosts, such variables are split into a pair
  of variables with ``type=TCG_TYPE_I32`` and ``base_type=TCG_TYPE_I64``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_PTR``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of a pointer for the host.

* ``TCG_TYPE_REG``

  An alias for ``TCG_TYPE_I32`` or ``TCG_TYPE_I64``, depending on the size
  of the integer registers for the host. This may be larger
  than ``TCG_TYPE_PTR`` depending on the host ABI.

* ``TCG_TYPE_I128``

  A 128-bit integer. For all hosts, such variables are split into a number
  of variables with ``type=TCG_TYPE_REG`` and ``base_type=TCG_TYPE_I128``.
  The ``temp_subindex`` for each indicates where it falls within the
  host-endian representation.

* ``TCG_TYPE_V64``

  A 64-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v64``.

* ``TCG_TYPE_V128``

  A 128-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v128``.

* ``TCG_TYPE_V256``

  A 256-bit vector. This type is valid only if the TCG target
  sets ``TCG_TARGET_HAS_v256``.

Helpers
=======

Helpers are registered in a guest-specific ``helper.h``,
which is processed to generate ``tcg_gen_helper_*`` functions.
With these functions it is possible to call a function taking
i32, i64, i128 or pointer types.

By default, before calling a helper, all globals are stored at their
canonical location. By default, the helper is allowed to modify the
CPU state (including the state represented by TCG globals)
or may raise an exception. This default can be overridden using the
following function modifiers:

* ``TCG_CALL_NO_WRITE_GLOBALS``

  The helper does not modify any globals, but may read them.
  Globals will be saved to their canonical location before calling helpers,
  but need not be reloaded afterwards.

* ``TCG_CALL_NO_READ_GLOBALS``

  The helper does not read globals, either directly or via an exception.
  They will not be saved to their canonical locations before calling
  the helper. This implies ``TCG_CALL_NO_WRITE_GLOBALS``.

* ``TCG_CALL_NO_SIDE_EFFECTS``

  The call to the helper function may be removed if the return value is
  not used. This means that it may not modify any CPU state nor may it
  raise an exception.
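
As a rough illustration of how the modifiers above combine, the following
self-contained C sketch models the decisions described in this section.
The ``MODEL_*`` bit values and the ``model_*`` function names are
hypothetical and exist only for this example; they are not the definitions
used by TCG itself.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical flag bits, chosen for this sketch only. */
   enum {
       MODEL_NO_READ_GLOBALS  = 1 << 0,
       MODEL_NO_WRITE_GLOBALS = 1 << 1,
       MODEL_NO_SIDE_EFFECTS  = 1 << 2,
   };

   /*
    * Globals are stored to their canonical location before the call
    * unless the helper promises not to read them.
    */
   static bool model_save_globals_before(unsigned flags)
   {
       return !(flags & MODEL_NO_READ_GLOBALS);
   }

   /*
    * Globals must be reloaded after the call unless the helper promises
    * not to write them.  "No read" implies "no write", so either bit
    * is sufficient to skip the reload.
    */
   static bool model_reload_globals_after(unsigned flags)
   {
       return !(flags & (MODEL_NO_READ_GLOBALS | MODEL_NO_WRITE_GLOBALS));
   }

   /*
    * A call may be dropped only when the helper has no side effects
    * and its return value is unused.
    */
   static bool model_call_removable(unsigned flags, bool result_used)
   {
       return (flags & MODEL_NO_SIDE_EFFECTS) && !result_used;
   }

For example, under this model a helper marked only with the "no write"
bit still has its globals saved before the call, but nothing is reloaded
afterwards.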

Code Optimizations
==================

When generating instructions, you can count on at least the following
optimizations:

- Single instructions are simplified, e.g.

  .. code-block:: none

     and_i32 t0, t0, $0xffffffff

  is suppressed.

- A liveness analysis is done at the basic block level. The
  information is used to suppress moves from a dead variable to
  another one. It is also used to remove instructions which compute
  dead results. The latter is especially useful for condition code
  optimization in QEMU.

  In the following example:

  .. code-block:: none

     add_i32 t0, t1, t2
     add_i32 t0, t0, $1
     mov_i32 t0, $1

  only the last instruction is kept.


Instruction Reference
=====================

Function call
-------------

.. list-table::

   * - call *<ret>* *<params>* ptr

     - | call function 'ptr' (pointer type)
       |
       | *<ret>* optional 32 bit or 64 bit return value
       | *<params>* optional 32 bit or 64 bit parameters

Jumps/Labels
------------

.. list-table::

   * - set_label $label

     - | Define label 'label' at the current program point.

   * - br $label

     - | Jump to label.

   * - brcond *t0*, *t1*, *cond*, *label*

     - | Conditional jump if *t0* *cond* *t1* is true. *cond* can be:
       |
       | ``TCG_COND_EQ``
       | ``TCG_COND_NE``
       | ``TCG_COND_LT /* signed */``
       | ``TCG_COND_GE /* signed */``
       | ``TCG_COND_LE /* signed */``
       | ``TCG_COND_GT /* signed */``
       | ``TCG_COND_LTU /* unsigned */``
       | ``TCG_COND_GEU /* unsigned */``
       | ``TCG_COND_LEU /* unsigned */``
       | ``TCG_COND_GTU /* unsigned */``
       | ``TCG_COND_TSTEQ /* t1 & t2 == 0 */``
       | ``TCG_COND_TSTNE /* t1 & t2 != 0 */``

Arithmetic
----------

.. list-table::

   * - add *t0*, *t1*, *t2*

     - | *t0* = *t1* + *t2*

   * - sub *t0*, *t1*, *t2*

     - | *t0* = *t1* - *t2*

   * - neg *t0*, *t1*

     - | *t0* = -*t1* (two's complement)

   * - mul *t0*, *t1*, *t2*

     - | *t0* = *t1* * *t2*

   * - divs *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - divu *t0*, *t1*, *t2*

     - | *t0* = *t1* / *t2* (unsigned)
       | Undefined behavior if division by zero.

   * - rems *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (signed)
       | Undefined behavior if division by zero or overflow.

   * - remu *t0*, *t1*, *t2*

     - | *t0* = *t1* % *t2* (unsigned)
       | Undefined behavior if division by zero.

   * - divs2 *q*, *r*, *nl*, *nh*, *d*

     - | *q* = *nh:nl* / *d* (signed)
       | *r* = *nh:nl* % *d*
       | Undefined behavior if division by zero, or the double-word
         numerator divided by the single-word divisor does not fit
         within the single-word quotient. The code generator will
         pass *nh* as a simple sign-extension of *nl*, so the only
         overflow should be *INT_MIN* / -1.

   * - divu2 *q*, *r*, *nl*, *nh*, *d*

     - | *q* = *nh:nl* / *d* (unsigned)
       | *r* = *nh:nl* % *d*
       | Undefined behavior if division by zero, or the double-word
         numerator divided by the single-word divisor does not fit
         within the single-word quotient. The code generator will
         pass 0 to *nh* to make a simple zero-extension of *nl*,
         so overflow should never occur.

Logical
-------

.. list-table::

   * - and *t0*, *t1*, *t2*

     - | *t0* = *t1* & *t2*

   * - or *t0*, *t1*, *t2*

     - | *t0* = *t1* | *t2*

   * - xor *t0*, *t1*, *t2*

     - | *t0* = *t1* ^ *t2*

   * - not *t0*, *t1*

     - | *t0* = ~\ *t1*

   * - andc *t0*, *t1*, *t2*

     - | *t0* = *t1* & ~\ *t2*

   * - eqv *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* ^ *t2*), or equivalently, *t0* = *t1* ^ ~\ *t2*

   * - nand *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* & *t2*)

   * - nor *t0*, *t1*, *t2*

     - | *t0* = ~(*t1* | *t2*)

   * - orc *t0*, *t1*, *t2*

     - | *t0* = *t1* | ~\ *t2*

   * - clz *t0*, *t1*, *t2*

     - | *t0* = *t1* ? clz(*t1*) : *t2*

   * - ctz *t0*, *t1*, *t2*

     - | *t0* = *t1* ? ctz(*t1*) : *t2*

   * - ctpop *t0*, *t1*

     - | *t0* = number of bits set in *t1*
       |
       | The name *ctpop* is short for "count population", and matches
         the function name used in ``include/qemu/host-utils.h``.


Shifts/Rotates
--------------

.. list-table::

   * - shl *t0*, *t1*, *t2*

     - | *t0* = *t1* << *t2*
       | Unspecified behavior for negative or out-of-range shifts.

   * - shr *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (unsigned)
       | Unspecified behavior for negative or out-of-range shifts.

   * - sar *t0*, *t1*, *t2*

     - | *t0* = *t1* >> *t2* (signed)
       | Unspecified behavior for negative or out-of-range shifts.

   * - rotl *t0*, *t1*, *t2*

     - | Rotation of *t2* bits to the left.
       | Unspecified behavior for negative or out-of-range shifts.

   * - rotr *t0*, *t1*, *t2*

     - | Rotation of *t2* bits to the right.
       | Unspecified behavior for negative or out-of-range shifts.


Misc
----

.. list-table::

   * - mov *t0*, *t1*

     - | *t0* = *t1*
       | Move *t1* to *t0*.

   * - bswap16 *t0*, *t1*, *flags*

     - | 16 bit byte swap on the low bits of a 32/64 bit input.
       |
       | If *flags* & ``TCG_BSWAP_IZ``, then *t1* is known to be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OZ``, then *t0* will be zero-extended from bit 15.
       | If *flags* & ``TCG_BSWAP_OS``, then *t0* will be sign-extended from bit 15.
       |
       | If neither ``TCG_BSWAP_OZ`` nor ``TCG_BSWAP_OS`` are set, then the bits of *t0* above bit 15 may contain any value.

   * - bswap32 *t0*, *t1*, *flags*

     - | 32 bit byte swap. The flags are the same as for bswap16, except
         they apply from bit 31 instead of bit 15. On TCG_TYPE_I32, the
         flags should be zero.

   * - bswap64 *t0*, *t1*, *flags*

     - | 64 bit byte swap. The flags are ignored, but still present
         for consistency with the other bswap opcodes. For future
         compatibility, the flags should be zero.

   * - discard_i32/i64 *t0*

     - | Indicate that the value of *t0* won't be used later. It is useful to
         force dead code elimination.

   * - deposit_i32/i64 *dest*, *t1*, *t2*, *pos*, *len*

     - | Deposit *t2* as a bitfield into *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values:
       |
       |     *len* - the length of the bitfield
       |     *pos* - the position of the first bit, counting from the LSB
       |
       | For example, "deposit_i32 dest, t1, t2, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |     *dest* = (*t1* & ~0x0f00) | ((*t2* << 8) & 0x0f00)

   * - extract *dest*, *t1*, *pos*, *len*

       sextract *dest*, *t1*, *pos*, *len*

     - | Extract a bitfield from *t1*, placing the result in *dest*.
       |
       | The bitfield is described by *pos*/*len*, which are immediate values,
         as above for deposit. For extract_*, the result will be extended
         to the left with zeros; for sextract_*, the result will be extended
         to the left with copies of the bitfield sign bit at *pos* + *len* - 1.
       |
       | For example, "sextract dest, t1, 8, 4" indicates a 4-bit field
         at bit 8. This operation would be equivalent to
       |
       |    *dest* = (*t1* << 20) >> 28
       |
       | (using an arithmetic right shift) on TCG_TYPE_I32.

   * - extract2_i32/i64 *dest*, *t1*, *t2*, *pos*

     - | For N = {32,64}, extract an N-bit quantity from the concatenation
         of *t2*:*t1*, beginning at *pos*. The tcg_gen_extract2_{i32,i64} expander
         accepts 0 <= *pos* <= N as inputs. The backend code generator will
         not see either 0 or N as inputs for these opcodes.

   * - extrl_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the low 32-bits of input *t1* and place it
         into 32-bit output *t0*. Depending on the host, this may be a simple move,
         or may require additional canonicalization.

   * - extrh_i64_i32 *t0*, *t1*

     - | For 64-bit hosts only, extract the high 32-bits of input *t1* and place it
         into 32-bit output *t0*. Depending on the host, this may be a simple shift,
         or may require additional canonicalization.
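
The deposit, extract and sextract semantics described in the table above
can be checked with a small stand-alone C model. This is only a sketch
for 32-bit operands: the ``model_*`` names are hypothetical, the code is
not taken from TCG, and it assumes ``1 <= len`` and ``pos + len <= 32``.

.. code-block:: c

   #include <stdint.h>

   /* Model of "deposit_i32 dest, t1, t2, pos, len". */
   static uint32_t model_deposit_i32(uint32_t t1, uint32_t t2,
                                     unsigned pos, unsigned len)
   {
       /* len one-bits, shifted up to start at bit pos. */
       uint32_t mask = (UINT32_MAX >> (32 - len)) << pos;

       return (t1 & ~mask) | ((t2 << pos) & mask);
   }

   /* Model of "extract dest, t1, pos, len": zero-extended field. */
   static uint32_t model_extract_i32(uint32_t t1, unsigned pos, unsigned len)
   {
       return (t1 >> pos) & (UINT32_MAX >> (32 - len));
   }

   /*
    * Model of "sextract dest, t1, pos, len": sign-extended field.
    * Shift the field to the top, then shift it back down.  This assumes
    * the compiler implements >> on negative values as an arithmetic
    * shift, which is the case for the usual hosted compilers.
    */
   static int32_t model_sextract_i32(uint32_t t1, unsigned pos, unsigned len)
   {
       return (int32_t)(t1 << (32 - len - pos)) >> (32 - len);
   }

With ``pos = 8`` and ``len = 4`` the shift counts in
``model_sextract_i32`` are 20 and 28, matching the worked example in the
table, and ``model_deposit_i32`` uses the mask ``0x0f00``.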


Conditional moves
-----------------

.. list-table::

   * - setcond *dest*, *t1*, *t2*, *cond*

     - | *dest* = (*t1* *cond* *t2*)
       |
       | Set *dest* to 1 if (*t1* *cond* *t2*) is true, otherwise set to 0.

   * - negsetcond *dest*, *t1*, *t2*, *cond*

     - | *dest* = -(*t1* *cond* *t2*)
       |
       | Set *dest* to -1 if (*t1* *cond* *t2*) is true, otherwise set to 0.

   * - movcond *dest*, *c1*, *c2*, *v1*, *v2*, *cond*

     - | *dest* = (*c1* *cond* *c2* ? *v1* : *v2*)
       |
       | Set *dest* to *v1* if (*c1* *cond* *c2*) is true, otherwise set to *v2*.


Type conversions
----------------

.. list-table::

   * - ext_i32_i64 *t0*, *t1*

     - | Convert *t1* (32 bit) to *t0* (64 bit) with sign extension.

   * - extu_i32_i64 *t0*, *t1*

     - | Convert *t1* (32 bit) to *t0* (64 bit) with zero extension.

   * - trunc_i64_i32 *t0*, *t1*

     - | Truncate *t1* (64 bit) to *t0* (32 bit).

   * - concat_i32_i64 *t0*, *t1*, *t2*

     - | Construct *t0* (64-bit) taking the low half from *t1* (32 bit) and the high half
         from *t2* (32 bit).

   * - concat32_i64 *t0*, *t1*, *t2*

     - | Construct *t0* (64-bit) taking the low half from *t1* (64 bit) and the high half
         from *t2* (64 bit).


Load/Store
----------

.. list-table::

   * - ld_i32/i64 *t0*, *t1*, *offset*

       ld8s_i32/i64 *t0*, *t1*, *offset*

       ld8u_i32/i64 *t0*, *t1*, *offset*

       ld16s_i32/i64 *t0*, *t1*, *offset*

       ld16u_i32/i64 *t0*, *t1*, *offset*

       ld32s_i64 *t0*, *t1*, *offset*

       ld32u_i64 *t0*, *t1*, *offset*

     - | *t0* = read(*t1* + *offset*)
       |
       | Load 8, 16, 32 or 64 bits with or without sign extension from host memory.
         *offset* must be a constant.

   * - st_i32/i64 *t0*, *t1*, *offset*

       st8_i32/i64 *t0*, *t1*, *offset*

       st16_i32/i64 *t0*, *t1*, *offset*

       st32_i64 *t0*, *t1*, *offset*

     - | write(*t0*, *t1* + *offset*)
       |
       | Write 8, 16, 32 or 64 bits to host memory.

All these opcodes assume that the host memory they access does not
correspond to a global. If it does, the behavior is unpredictable.


Multiword arithmetic support
----------------------------

.. list-table::

   * - add2_i32/i64 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *t2_low*, *t2_high*

       sub2_i32/i64 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *t2_low*, *t2_high*

     - | Similar to add/sub, except that the double-word inputs *t1* and *t2* are
         formed from two single-word arguments, and the double-word output *t0*
         is returned in two single-word outputs.

   * - mulu2 *t0_low*, *t0_high*, *t1*, *t2*

     - | Similar to mul, except two unsigned inputs *t1* and *t2* yielding the full
         double-word product *t0*. The latter is returned in two single-word outputs.

   * - muls2 *t0_low*, *t0_high*, *t1*, *t2*

     - | Similar to mulu2, except the two inputs *t1* and *t2* are signed.

   * - mulsh *t0*, *t1*, *t2*

       muluh *t0*, *t1*, *t2*

     - | Provide the high part of a signed or unsigned multiply, respectively.
       |
       | If mulu2/muls2 are not provided by the backend, the tcg-op generator
         can obtain the same results by emitting a pair of opcodes, mul + muluh/mulsh.


Memory Barrier support
----------------------

.. list-table::

   * - mb *<$arg>*

     - | Generate a target memory barrier instruction to ensure memory ordering
         as being enforced by a corresponding guest memory barrier instruction.
       |
       | The ordering enforced by the backend may be stricter than the ordering
         required by the guest. It cannot be weaker. This opcode takes a constant
         argument which is required to generate the appropriate barrier
         instruction. The backend should take care to emit the target barrier
         instruction only when necessary, i.e. for SMP guests and when MTTCG is
         enabled.
       |
       | The guest translators should generate this opcode for all guest instructions
         which have ordering side effects.
       |
       | Please see :ref:`atomics-ref` for more information on memory barriers.


64-bit guest on 32-bit host support
-----------------------------------

The following opcodes are internal to TCG. Thus they are to be implemented by
32-bit host code generators, but are not to be emitted by guest translators.
They are emitted as needed by inline functions within ``tcg-op.h``.

.. 
list-table:: 6525e97a28aSMark Cave-Ayland 6535e97a28aSMark Cave-Ayland * - brcond2_i32 *t0_low*, *t0_high*, *t1_low*, *t1_high*, *cond*, *label* 6545e97a28aSMark Cave-Ayland 6555e97a28aSMark Cave-Ayland - | Similar to brcond, except that the 64-bit values *t0* and *t1* 6565e97a28aSMark Cave-Ayland are formed from two 32-bit arguments. 6575e97a28aSMark Cave-Ayland 6585e97a28aSMark Cave-Ayland * - setcond2_i32 *dest*, *t1_low*, *t1_high*, *t2_low*, *t2_high*, *cond* 6595e97a28aSMark Cave-Ayland 6605e97a28aSMark Cave-Ayland - | Similar to setcond, except that the 64-bit values *t1* and *t2* are 6615e97a28aSMark Cave-Ayland formed from two 32-bit arguments. The result is a 32-bit value. 6625e97a28aSMark Cave-Ayland 6635e97a28aSMark Cave-Ayland 6645e97a28aSMark Cave-AylandQEMU specific operations 6655e97a28aSMark Cave-Ayland------------------------ 6665e97a28aSMark Cave-Ayland 6675e97a28aSMark Cave-Ayland.. list-table:: 6685e97a28aSMark Cave-Ayland 6695e97a28aSMark Cave-Ayland * - exit_tb *t0* 6705e97a28aSMark Cave-Ayland 6715e97a28aSMark Cave-Ayland - | Exit the current TB and return the value *t0* (word type). 6725e97a28aSMark Cave-Ayland 6735e97a28aSMark Cave-Ayland * - goto_tb *index* 6745e97a28aSMark Cave-Ayland 6755e97a28aSMark Cave-Ayland - | Exit the current TB and jump to the TB index *index* (constant) if the 6765e97a28aSMark Cave-Ayland current TB was linked to this TB. Otherwise execute the next 6775e97a28aSMark Cave-Ayland instructions. Only indices 0 and 1 are valid and tcg_gen_goto_tb may be issued 6785e97a28aSMark Cave-Ayland at most once with each slot index per TB. 6795e97a28aSMark Cave-Ayland 6805e97a28aSMark Cave-Ayland * - lookup_and_goto_ptr *tb_addr* 6815e97a28aSMark Cave-Ayland 6825e97a28aSMark Cave-Ayland - | Look up a TB address *tb_addr* and jump to it if valid. If not valid, 6835e97a28aSMark Cave-Ayland jump to the TCG epilogue to go back to the exec loop. 
       |
       | This operation is optional. If the TCG backend does not implement the
         goto_ptr opcode, emitting this op is equivalent to emitting exit_tb(0).

   * - qemu_ld_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

       qemu_st_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*

       qemu_st8_i32 *t0*, *t1*, *flags*, *memidx*

     - | Load data at the guest address *t1* into *t0*, or store data in *t0* at guest
         address *t1*.  The _i32/_i64/_i128 size applies to the size of the input/output
         register *t0* only.  The address *t1* is always sized according to the guest,
         and the width of the memory operation is controlled by *flags*.
       |
       | Both *t0* and *t1* may be split into little-endian ordered pairs of registers
         if dealing with 64-bit quantities on a 32-bit host, or 128-bit quantities on
         a 64-bit host.
       |
       | The *memidx* selects the QEMU TLB index to use (e.g. user or kernel access).
         The flags are the MemOp bits, selecting the sign, width, and endianness
         of the memory access.
       |
       | For a 32-bit host, qemu_ld/st_i64 is guaranteed to only be used with a
         64-bit memory access specified in *flags*.
       |
       | qemu_ld/st_i128 are only supported on a 64-bit host.
       |
       | For i386, qemu_st8_i32 is exactly like qemu_st_i32, except the size of
         the memory operation is known to be 8-bit.  This allows the backend to
         provide a different set of register constraints.


Host vector operations
----------------------

All of the vector ops have two parameters, ``TCGOP_TYPE`` & ``TCGOP_VECE``.
The former specifies the length of the vector as a TCGType; the latter
specifies the length of the element (if applicable) in log2 8-bit units.

.. list-table::

   * - mov_vec *v0*, *v1*

       ld_vec *v0*, *t1*

       st_vec *v0*, *t1*

     - | Move, load and store.

   * - dup_vec *v0*, *r1*

     - | Duplicate the low N bits of *r1* into TYPE/VECE copies across *v0*.

   * - dupi_vec *v0*, *c*

     - | Similarly, for a constant.
       | Smaller values will be replicated to host register size by the expanders.

   * - dup2_vec *v0*, *r1*, *r2*

     - | Duplicate *r2*:*r1* into TYPE/64 copies across *v0*. This opcode is
         only present for 32-bit hosts.

   * - add_vec *v0*, *v1*, *v2*

     - | *v0* = *v1* + *v2*, in elements across the vector.

   * - sub_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* - *v2*.

   * - mul_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = *v1* * *v2*.

   * - neg_vec *v0*, *v1*

     - | Similarly, *v0* = -*v1*.

   * - abs_vec *v0*, *v1*

     - | Similarly, *v0* = *v1* < 0 ? -*v1* : *v1*, in elements across the vector.

   * - smin_vec *v0*, *v1*, *v2*

       umin_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MIN(*v1*, *v2*), for signed and unsigned element types.

   * - smax_vec *v0*, *v1*, *v2*

       umax_vec *v0*, *v1*, *v2*

     - | Similarly, *v0* = MAX(*v1*, *v2*), for signed and unsigned element types.

   * - ssadd_vec *v0*, *v1*, *v2*

       sssub_vec *v0*, *v1*, *v2*

       usadd_vec *v0*, *v1*, *v2*

       ussub_vec *v0*, *v1*, *v2*

     - | Signed and unsigned saturating addition and subtraction.
       |
       | If the true result is not representable within the element type, the
         element is set to the minimum or maximum value for the type.

   * - and_vec *v0*, *v1*, *v2*

       or_vec *v0*, *v1*, *v2*

       xor_vec *v0*, *v1*, *v2*

       andc_vec *v0*, *v1*, *v2*

       orc_vec *v0*, *v1*, *v2*

       not_vec *v0*, *v1*

     - | Similarly, logical operations with and without complement.
       |
       | Note that VECE is unused.

   * - shli_vec *v0*, *v1*, *i2*

       shls_vec *v0*, *v1*, *s2*

     - | Shift all elements from *v1* by a scalar *i2*/*s2*. I.e.

       .. code-block:: c

          for (i = 0; i < TYPE/VECE; ++i) {
              v0[i] = v1[i] << s2;
          }

   * - shri_vec *v0*, *v1*, *i2*

       sari_vec *v0*, *v1*, *i2*

       rotli_vec *v0*, *v1*, *i2*

       shrs_vec *v0*, *v1*, *s2*

       sars_vec *v0*, *v1*, *s2*

     - | Similarly for logical and arithmetic right shift, and left rotate.

   * - shlv_vec *v0*, *v1*, *v2*

     - | Shift elements from *v1* by elements from *v2*. I.e.

       .. code-block:: c

          for (i = 0; i < TYPE/VECE; ++i) {
              v0[i] = v1[i] << v2[i];
          }

   * - shrv_vec *v0*, *v1*, *v2*

       sarv_vec *v0*, *v1*, *v2*

       rotlv_vec *v0*, *v1*, *v2*

       rotrv_vec *v0*, *v1*, *v2*

     - | Similarly for logical and arithmetic right shift, and rotates.

   * - cmp_vec *v0*, *v1*, *v2*, *cond*

     - | Compare vectors by element, storing -1 for true and 0 for false.

   * - bitsel_vec *v0*, *v1*, *v2*, *v3*

     - | Bitwise select, *v0* = (*v2* & *v1*) | (*v3* & ~\ *v1*), across the entire vector.

   * - cmpsel_vec *v0*, *c1*, *c2*, *v3*, *v4*, *cond*

     - | Select elements based on comparison results:

       .. code-block:: c

          for (i = 0; i < n; ++i) {
              v0[i] = (c1[i] cond c2[i]) ? v3[i] : v4[i];
          }

**Note 1**: Some shortcuts are defined when the last operand is known to be
a constant (e.g. addi for add, movi for mov).

**Note 2**: When using TCG, the opcodes must never be generated directly
as some of them may not be available as "real" opcodes. Always use the
``tcg_gen_xxx(args)`` functions.


Backend
=======

``tcg-target.h`` contains the target specific definitions. ``tcg-target.c.inc``
contains the target specific code; it is #included by ``tcg/tcg.c``, rather
than being a standalone C file.

Assumptions
-----------

The target word size (``TCG_TARGET_REG_BITS``) is expected to be 32 bit or
64 bit. It is expected that the pointer has the same size as the word.

On a 32 bit target, all 64 bit operations are converted to 32 bit
operations. A few specific operations must be implemented to allow it
(see add2_i32, sub2_i32, brcond2_i32).

On a 64 bit target, the values are transferred between 32 and 64-bit
registers using the following ops:

- extrl_i64_i32
- extrh_i64_i32
- ext_i32_i64
- extu_i32_i64

They ensure that the values are correctly truncated or extended when
moved from a 32-bit to a 64-bit register or vice-versa. Note that the
extrl_i64_i32 and extrh_i64_i32 are optional ops. It is not necessary
to implement them if all the following conditions are met:

- 64-bit registers can hold 32-bit values
- 32-bit values in a 64-bit register do not need to stay zero or
  sign extended
- all 32-bit TCG ops ignore the high part of 64-bit registers

Floating point operations are not supported in this version. A
previous incarnation of the code generator had full support of them,
but it is better to concentrate on integer operations first.
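
The arithmetic behind the double-word and width-conversion ops named in
this section can be modelled in plain C. The following is a minimal
sketch of the semantics only, not QEMU code: every ``model_*`` function
name is hypothetical and exists purely for illustration.

.. code-block:: c

   #include <stdint.h>

   /* add2_i32 model: (rl, rh) = (al, ah) + (bl, bh), each half 32 bit. */
   void model_add2_i32(uint32_t *rl, uint32_t *rh,
                       uint32_t al, uint32_t ah,
                       uint32_t bl, uint32_t bh)
   {
       uint32_t lo = al + bl;          /* wraps modulo 2^32 */
       uint32_t carry = lo < al;       /* 1 if the low word overflowed */

       *rl = lo;
       *rh = ah + bh + carry;
   }

   /* sub2_i32 model: (rl, rh) = (al, ah) - (bl, bh). */
   void model_sub2_i32(uint32_t *rl, uint32_t *rh,
                       uint32_t al, uint32_t ah,
                       uint32_t bl, uint32_t bh)
   {
       uint32_t borrow = al < bl;      /* 1 if the low word underflowed */

       *rl = al - bl;
       *rh = ah - bh - borrow;
   }

   /* ext_i32_i64 model: sign extension (assumes two's complement). */
   uint64_t model_ext_i32_i64(uint32_t x)
   {
       return (uint64_t)(int64_t)(int32_t)x;
   }

   /* extu_i32_i64 model: zero extension. */
   uint64_t model_extu_i32_i64(uint32_t x)
   {
       return (uint64_t)x;
   }

   /* extrl_i64_i32 model: low 32 bits of a 64-bit value. */
   uint32_t model_extrl_i64_i32(uint64_t x)
   {
       return (uint32_t)x;
   }

   /* extrh_i64_i32 model: high 32 bits of a 64-bit value. */
   uint32_t model_extrh_i64_i32(uint64_t x)
   {
       return (uint32_t)(x >> 32);
   }

In this model, adding 1 to the pair (low ``0xffffffff``, high ``0``)
gives low ``0`` and high ``1``, because the carry out of the low word
propagates into the high word.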

Constraints
-----------

GCC-like constraints are used to define the constraints of every
instruction. Memory constraints are not supported in this
version. Aliases are specified in the input operands as for GCC.

The same register may be used for both an input and an output, even when
they are not explicitly aliased.  If an op expands to multiple target
instructions then care must be taken to avoid clobbering input values.
GCC-style "early clobber" outputs are supported, with '``&``'.

A target can define specific register or constant constraints. If an
operation uses a constant input constraint which does not allow all
constants, it must also accept registers in order to have a fallback.
The constraint '``i``' is defined generically to accept any constant.
The constraint '``r``' is not defined generically, but is consistently
used by each backend to indicate all registers.  If ``TCG_REG_ZERO``
is defined by the backend, the constraint '``z``' is defined generically
to map constant 0 to the hardware zero register.

The movi_i32 and movi_i64 operations must accept any constants.

The mov_i32 and mov_i64 operations must accept any registers of the
same type.
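
The register fallback required for restricted constant constraints can
be sketched as below. This is an illustration under assumptions, not
QEMU code: the signed 12-bit immediate range is an arbitrary example of
a constraint that rejects some constants, and every ``sketch_*`` name is
hypothetical.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical constant constraint: signed 12-bit immediates only. */
   bool sketch_fits_simm12(int64_t val)
   {
       return val >= -2048 && val <= 2047;
   }

   typedef enum {
       SKETCH_USE_IMMEDIATE,   /* constant satisfies the constraint */
       SKETCH_USE_REGISTER,    /* constant must first be loaded into a register */
   } SketchOperandKind;

   /*
    * Decide how a constant operand is encoded. Because the constant
    * constraint does not cover every value, the register alternative
    * must always be available as a fallback.
    */
   SketchOperandKind sketch_place_constant(int64_t val)
   {
       return sketch_fits_simm12(val) ? SKETCH_USE_IMMEDIATE
                                      : SKETCH_USE_REGISTER;
   }

With this hypothetical range, 2047 is encoded as an immediate while
2048 takes the register path.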

The ld/st/sti instructions must accept signed 32 bit constant offsets.
This can be implemented by reserving a specific register in which to
compute the address if the offset is too big.

The ld/st instructions must accept any destination (ld) or source (st)
register.

The sti instruction may fail if it cannot store the given constant.

Function call assumptions
-------------------------

- The only supported types for parameters and return value are: 32 and
  64 bit integers and pointers.
- The stack grows downwards.
- The first N parameters are passed in registers.
- The next parameters are passed on the stack by storing them as words.
- Some registers are clobbered during the call.
- The function can return 0 or 1 values in registers. On a 32 bit
  target, functions must be able to return 2 values in registers for
  64 bit return type.


Recommended coding rules for best performance
=============================================

- Use globals to represent the parts of the QEMU CPU state which are
  often modified, e.g. the integer registers and the condition
  codes. TCG will be able to use host registers to store them.

- Don't hesitate to use helpers for complicated or seldom used guest
  instructions. There is little performance advantage in using TCG to
  implement guest instructions taking more than about twenty TCG
  instructions. Note that this rule of thumb is more applicable to
  helpers doing complex logic or arithmetic, where the C compiler has
  scope to do a good job of optimisation; it is less relevant where
  the instruction is mostly doing loads and stores, and in those cases
  inline TCG may still be faster for longer sequences.

- Use the 'discard' instruction if you know that TCG won't be able to
  prove that a given global is "dead" at a given program point. The
  x86 guest uses it to improve the condition codes optimisation.