#
2a4b939c |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VCVT between floating and fixed point
Implement the MVE VCVT insns which convert between floating and fixed point. As with the Neon equivalents, these use essentially the
target/arm: Implement MVE VCVT between floating and fixed point
Implement the MVE VCVT insns which convert between floating and fixed point. As with the Neon equivalents, these use essentially the same constant encoding as right-shift-by-immediate.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
c2d8f6bb |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE fp scalar comparisons
Implement the MVE fp scalar comparisons VCMP and VPT.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard
target/arm: Implement MVE fp scalar comparisons
Implement the MVE fp scalar comparisons VCMP and VPT.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
c87fe6d2 |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE fp vector comparisons
Implement the MVE fp vector comparisons VCMP and VPT.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard
target/arm: Implement MVE fp vector comparisons
Implement the MVE fp vector comparisons VCMP and VPT.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
29f80e7d |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE FP max/min across vector
Implement the MVE VMAXNMV, VMINNMV, VMAXNMAV, VMINNMAV insns. These calculate the maximum or minimum of floating point elements across a vector, s
target/arm: Implement MVE FP max/min across vector
Implement the MVE VMAXNMV, VMINNMV, VMAXNMAV, VMINNMAV insns. These calculate the maximum or minimum of floating point elements across a vector, starting with a value in a general purpose register and returning the result there.
The pseudocode silences a possible SNaN in the accumulating result on every iteration (by calling FPConvertNaN), but we do it only on the input ra, because if none of the inputs to float*_maxnum or float*_minnum are SNaNs then the result can't be an SNaN.
Note that we can't use the float*_maxnuma() etc functions we defined earlier for VMAXNMA and VMINNMA, because we mustn't take the absolute value of the starting general-purpose register value, which could be negative.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
4773e74e |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE fp-with-scalar VFMA, VFMAS
Implement the MVE fp-with-scalar VFMA and VFMAS insns.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <r
target/arm: Implement MVE fp-with-scalar VFMA, VFMAS
Implement the MVE fp-with-scalar VFMA and VFMAS insns.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
abfe39b2 |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE scalar fp insns
Implement the MVE scalar floating point insns VADD, VSUB and VMUL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <
target/arm: Implement MVE scalar fp insns
Implement the MVE scalar floating point insns VADD, VSUB and VMUL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
90257a4f |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VMAXNMA and VMINNMA
Implement the MVE VMAXNMA and VMINNMA insns; these are 2-operand, but the destination register must be the same as one of the source registers.
We defe
target/arm: Implement MVE VMAXNMA and VMINNMA
Implement the MVE VMAXNMA and VMINNMA insns; these are 2-operand, but the destination register must be the same as one of the source registers.
We defer the decode of the size in bit 28 to the individual insn patterns rather than doing it in the format, because otherwise we would have a single insn pattern that overlapped with two groups (eg VMAXNMA with the VMULH_S and VMULH_U groups). Having two insn patterns per insn seems clearer than a complex multilevel nesting of overlapping and non-overlapping groups.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
d3cd965c |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VCMUL and VCMLA
Implement the MVE VCMUL and VCMLA insns.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.o
target/arm: Implement MVE VCMUL and VCMLA
Implement the MVE VCMUL and VCMLA insns.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
3173c0dd |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VFMA and VFMS
Implement the MVE VFMA and VFMS insns.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
#
104afc68 |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VCADD
Implement the MVE VCADD insn. Note that here the size bit is the opposite sense to the other 2-operand fp insns.
We don't check for the sz == 1 && Qd == Qm UNPREDIC
target/arm: Implement MVE VCADD
Implement the MVE VCADD insn. Note that here the size bit is the opposite sense to the other 2-operand fp insns.
We don't check for the sz == 1 && Qd == Qm UNPREDICTABLE case, because that would mean we can't use the DO_2OP_FP macro in translate-mve.c.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
82af0153 |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VSUB, VMUL, VABD, VMAXNM, VMINNM
Implement more simple 2-operand floating point MVE insns.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Pet
target/arm: Implement MVE VSUB, VMUL, VABD, VMAXNM, VMINNM
Implement more simple 2-operand floating point MVE insns.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
1e35cd91 |
| 01-Sep-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VADD (floating-point)
Implement the MVE VADD (floating-point) insn. Handling of this is similar to the 2-operand integer insns, except that we must take care to only updat
target/arm: Implement MVE VADD (floating-point)
Implement the MVE VADD (floating-point) insn. Handling of this is similar to the 2-operand integer insns, except that we must take care to only update the floating point exception status if the least significant bit of the predicate mask for each element is active.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
075e7e97 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE interleaving loads/stores
Implement the MVE interleaving load/store functions VLD2, VLD4, VST2 and VST4. VLD2 loads 16 bytes of data from memory and writes to 2 consecutiv
target/arm: Implement MVE interleaving loads/stores
Implement the MVE interleaving load/store functions VLD2, VLD4, VST2 and VST4. VLD2 loads 16 bytes of data from memory and writes to 2 consecutive Qregs; VLD4 loads 16 bytes of data from memory and writes to 4 consecutive Qregs. The 'pattern' field in the encoding determines the offset into memory which is accessed and also which elements in the Qregs are written to. (The intention is that a sequence of four consecutive VLD4 with different pattern values performs a complete de-interleaving load of 64 bytes into all elements of the 4 Qregs.) VST2 and VST4 do the same, but for stores.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
fac80f08 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE scatter-gather immediate forms
Implement the MVE VLDR/VSTR insns which do scatter-gather using base addresses from Qm plus or minus an immediate offset (possibly with write
target/arm: Implement MVE scatter-gather immediate forms
Implement the MVE VLDR/VSTR insns which do scatter-gather using base addresses from Qm plus or minus an immediate offset (possibly with writeback). Note that writeback is not predicated but it does have to honour ECI state, so we have to add an eci_mask check to the VSTR_SG macros (the VLDR_SG macros already needed this to be able to distinguish "skip beat" from "set predicated element to 0").
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
dc18628b |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE scatter-gather insns
Implement the MVE gather-loads and scatter-stores which form the address by adding a base value from a scalar register to an offset in each element of
target/arm: Implement MVE scatter-gather insns
Implement the MVE gather-loads and scatter-stores which form the address by adding a base value from a scalar register to an offset in each element of a vector.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
0f31e37c |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VCTP
Implement the MVE VCTP insn, which sets the VPR.P0 predicate bits so as to predicate any element at index Rn or greater is predicated. As with VPNOT, this insn itself
target/arm: Implement MVE VCTP
Implement the MVE VCTP insn, which sets the VPR.P0 predicate bits so as to predicate any element at index Rn or greater is predicated. As with VPNOT, this insn itself is predicable and subject to beatwise execution.
The calculation of the mask is the same as is used to determine ltpmask in mve_element_mask(), but we precalculate masklen in generated code to avoid having to have 4 helpers specialized by size.
We put the decode line in with the low-overhead-loop insns in t32.decode because it's logically part of that collection of insn patterns, even though it is an MVE only insn.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
fea3958f |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VPNOT
Implement the MVE VPNOT insn, which inverts the bits in VPR.P0 (subject to both predication and to beatwise execution).
Signed-off-by: Peter Maydell <peter.maydell@l
target/arm: Implement MVE VPNOT
Implement the MVE VPNOT insn, which inverts the bits in VPR.P0 (subject to both predication and to beatwise execution).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
1241f148 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VMOV to/from 2 general-purpose registers
Implement the MVE VMOV forms that move data between 2 general-purpose registers and 2 32-bit lanes in a vector register.
Signed-of
target/arm: Implement MVE VMOV to/from 2 general-purpose registers
Implement the MVE VMOV forms that move data between 2 general-purpose registers and 2 32-bit lanes in a vector register.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
d5c571ea |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VMAXA, VMINA
Implement the MVE VMAXA and VMINA insns, which take the absolute value of the signed elements in the input vector and then accumulate the unsigned max or min i
target/arm: Implement MVE VMAXA, VMINA
Implement the MVE VMAXA and VMINA insns, which take the absolute value of the signed elements in the input vector and then accumulate the unsigned max or min into the destination vector.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
398e7cd3 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VQABS, VQNEG
Implement the MVE 1-operand saturating operations VQABS and VQNEG.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <ric
target/arm: Implement MVE VQABS, VQNEG
Implement the MVE 1-operand saturating operations VQABS and VQNEG.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
8be9a250 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE saturating doubling multiply accumulates
Implement the MVE saturating doubling multiply accumulate insns VQDMLAH, VQRDMLAH, VQDMLASH and VQRDMLASH. These perform a multipl
target/arm: Implement MVE saturating doubling multiply accumulates
Implement the MVE saturating doubling multiply accumulate insns VQDMLAH, VQRDMLAH, VQDMLASH and VQRDMLASH. These perform a multiply, double, add the accumulator shifted by the element size, possibly round, saturate to twice the element size, then take the high half of the result. The *MLAH insns do vector * scalar + vector, and the *MLASH insns do vector * vector + scalar.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
c69e34c6 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VMLA
Implement the MVE VMLA insn, which multiplies a vector by a scalar and accumulates into another vector.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Review
target/arm: Implement MVE VMLA
Implement the MVE VMLA insn, which multiplies a vector by a scalar and accumulates into another vector.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
f0ffff51 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE VMLADAV and VMLSLDAV
Implement the MVE VMLADAV and VMLSLDAV insns. Like the VMLALDAV and VMLSLDAV insns already implemented, these accumulate multiplied vector elements; b
target/arm: Implement MVE VMLADAV and VMLSLDAV
Implement the MVE VMLADAV and VMLSLDAV insns. Like the VMLALDAV and VMLSLDAV insns already implemented, these accumulate multiplied vector elements; but they accumulate a 32-bit result rather than a 64-bit one.
Note that these encodings overlap with what would be RdaHi=0b111 for VMLALDAV, VMLSLDAV, VRMLALDAVH and VRMLSLDAVH.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
640cdf20 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Rename MVEGenDualAccOpFn to MVEGenLongDualAccOpFn
The MVEGenDualAccOpFn is a bit misnamed, since it is used for the "long dual accumulate" operations that use a 64-bit accumulator. Renam
target/arm: Rename MVEGenDualAccOpFn to MVEGenLongDualAccOpFn
The MVEGenDualAccOpFn is a bit misnamed, since it is used for the "long dual accumulate" operations that use a 64-bit accumulator. Rename it to MVEGenLongDualAccOpFn so we can use the former name for the 32-bit accumulator insns.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|
#
54dc78a9 |
| 13-Aug-2021 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement MVE narrowing moves
Implement the MVE narrowing move insns VMOVN, VQMOVN and VQMOVUN. These take a double-width input, narrow it (possibly saturating) and store the result to e
target/arm: Implement MVE narrowing moves
Implement the MVE narrowing move insns VMOVN, VQMOVN and VQMOVUN. These take a double-width input, narrow it (possibly saturating) and store the result to either the top or bottom half of the output element.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
show more ...
|