#
7b959c58 |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Convert Neon VCVT fixed-point to gvec
Convert the Neon VCVT float<->fixed-point insns to a gvec style, in preparation for adding fp16 support.
Signed-off-by: Peter Maydell <peter.maydel
target/arm: Convert Neon VCVT fixed-point to gvec
Convert the Neon VCVT float<->fixed-point insns to a gvec style, in preparation for adding fp16 support.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-38-peter.maydell@linaro.org
show more ...
|
#
7782a9af |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon float-integer VCVT
Convert the Neon float-integer VCVT insns to gvec, and use this to implement fp16 support for them.
Note that unlike the VFP int<->fp16 VCVT i
target/arm: Implement fp16 for Neon float-integer VCVT
Convert the Neon float-integer VCVT insns to gvec, and use this to implement fp16 support for them.
Note that unlike the VFP int<->fp16 VCVT insns we converted earlier and which convert to/from a 32-bit integer, these Neon insns convert to/from 16-bit integers. So we can use the existing vfp conversion helpers for the f32<->u32/i32 case but need to provide our own for f16<->u16/i16.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-37-peter.maydell@linaro.org
show more ...
|
#
1dc587ee |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon pairwise fp ops
Convert the Neon pairwise fp ops to use a single gvic-style helper to do the full operation instead of one helper call for each 32-bit part. This
target/arm: Implement fp16 for Neon pairwise fp ops
Convert the Neon pairwise fp ops to use a single gvic-style helper to do the full operation instead of one helper call for each 32-bit part. This allows us to use the same framework to implement the fp16.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-36-peter.maydell@linaro.org
show more ...
|
#
40fde72d |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon VRSQRTS
Convert the Neon VRSQRTS insn to using a gvec helper, and use this to implement the fp16 case.
As with VRECPS, we adjust the phrasing of the new implemen
target/arm: Implement fp16 for Neon VRSQRTS
Convert the Neon VRSQRTS insn to using a gvec helper, and use this to implement the fp16 case.
As with VRECPS, we adjust the phrasing of the new implementation slightly so that the fp32 version parallels the fp16 one.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-35-peter.maydell@linaro.org
show more ...
|
#
ac8c62c4 |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon VRECPS
Convert the Neon VRECPS insn to using a gvec helper, and use this to implement the fp16 case.
The phrasing of the new float32_recps_nf() is slightly diffe
target/arm: Implement fp16 for Neon VRECPS
Convert the Neon VRECPS insn to using a gvec helper, and use this to implement the fp16 case.
The phrasing of the new float32_recps_nf() is slightly different from the old recps_f32() so that it parallels the f16 version; for f16 we can't assume that flush-to-zero is always enabled.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-34-peter.maydell@linaro.org
show more ...
|
#
635187aa |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon fp compare-vs-0
Convert the neon floating-point vector compare-vs-0 insns VCEQ0, VCGT0, VCLE0, VCGE0 and VCLT0 to use a gvec helper, and use this to implement the
target/arm: Implement fp16 for Neon fp compare-vs-0
Convert the neon floating-point vector compare-vs-0 insns VCEQ0, VCGT0, VCLE0, VCGE0 and VCLT0 to use a gvec helper, and use this to implement the fp16 case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-33-peter.maydell@linaro.org
show more ...
|
#
cf722d75 |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon VFMA, VMFS
Convert the neon floating-point vector operations VFMA and VFMS to use a gvec helper, and use this to implement the fp16 case.
This is the last use of
target/arm: Implement fp16 for Neon VFMA, VMFS
Convert the neon floating-point vector operations VFMA and VFMS to use a gvec helper, and use this to implement the fp16 case.
This is the last use of do_3same_fp() so we can now delete that function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-32-peter.maydell@linaro.org
show more ...
|
#
e5adc706 |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon VMLA, VMLS operations
Convert the Neon floating-point VMLA and VMLS insns over to using a gvec helper, and use this to implement the fp16 case.
Signed-off-by: Pe
target/arm: Implement fp16 for Neon VMLA, VMLS operations
Convert the Neon floating-point VMLA and VMLS insns over to using a gvec helper, and use this to implement the fp16 case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-31-peter.maydell@linaro.org
show more ...
|
#
e22705bb |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon VMAXNM, VMINNM
Convert the Neon floating point VMAXNM and VMINNM insns to using a gvec helper and use this to implement the fp16 case.
Signed-off-by: Peter Mayde
target/arm: Implement fp16 for Neon VMAXNM, VMINNM
Convert the Neon floating point VMAXNM and VMINNM insns to using a gvec helper and use this to implement the fp16 case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-30-peter.maydell@linaro.org
show more ...
|
#
e43268c5 |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for Neon VMAX, VMIN
Convert the Neon float-point VMAX and VMIN insns over to using a gvec helper, and use this to implement the fp16 case.
Signed-off-by: Peter Maydell <p
target/arm: Implement fp16 for Neon VMAX, VMIN
Convert the Neon float-point VMAX and VMIN insns over to using a gvec helper, and use this to implement the fp16 case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-29-peter.maydell@linaro.org
show more ...
|
#
bb2741da |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for VACGE, VACGT
Convert the neon floating-point vector absolute comparison ops VACGE and VACGT over to using a gvec hepler and use this to implement the fp16 case.
Signe
target/arm: Implement fp16 for VACGE, VACGT
Convert the neon floating-point vector absolute comparison ops VACGE and VACGT over to using a gvec hepler and use this to implement the fp16 case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-28-peter.maydell@linaro.org
show more ...
|
#
ad505db2 |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement fp16 for VCEQ, VCGE, VCGT comparisons
Convert the Neon floating-point vector comparison ops VCEQ, VCGE and VCGT over to using a gvec helper and use this to implement the fp16 c
target/arm: Implement fp16 for VCEQ, VCGE, VCGT comparisons
Convert the Neon floating-point vector comparison ops VCEQ, VCGE and VCGT over to using a gvec helper and use this to implement the fp16 case.
(We put the float16_ceq() etc functions above the DO_2OP() macro definition because later when we convert the compare-against-zero instructions we'll want their definitions to be visible at that point in the source file.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-27-peter.maydell@linaro.org
show more ...
|
#
e4a6d4a6 |
| 28-Aug-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Implement FP16 for Neon VADD, VSUB, VABD, VMUL
Implement FP16 support for the Neon insns which use the DO_3S_FP_GVEC macro: VADD, VSUB, VABD, VMUL.
For VABD this requires us to implemen
target/arm: Implement FP16 for Neon VADD, VSUB, VABD, VMUL
Implement FP16 support for the Neon insns which use the DO_3S_FP_GVEC macro: VADD, VSUB, VABD, VMUL.
For VABD this requires us to implement a new gvec_fabd_h helper using the machinery we have already for the other helpers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-24-peter.maydell@linaro.org
show more ...
|
#
ed78849d |
| 28-Aug-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 2020081501
target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200815013145.539409-21-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
3607440c |
| 28-Aug-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert integer multiply-add (indexed) to gvec for aa64 advsimd
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Mess
target/arm: Convert integer multiply-add (indexed) to gvec for aa64 advsimd
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200815013145.539409-20-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
2e5a265e |
| 28-Aug-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert integer multiply (indexed) to gvec for aa64 advsimd
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-
target/arm: Convert integer multiply (indexed) to gvec for aa64 advsimd
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200815013145.539409-19-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
d2179885 |
| 28-Aug-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Generalize inl_qrdmlah_* helper functions
Unify add/sub helpers and add a parameter for rounding. This will allow saturating non-rounding to reuse this code.
Signed-off-by: Richard Hend
target/arm: Generalize inl_qrdmlah_* helper functions
Unify add/sub helpers and add a parameter for rounding. This will allow saturating non-rounding to reuse this code.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> [PMM: fixed accidental use of '=' rather than '+=' in do_sqrdmlah_s] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200815013145.539409-15-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
a04b68e1 |
| 14-May-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Convert aes and sm4 to gvec helpers
With this conversion, we will be able to use the same helpers with sve. In particular, pass 3 vector parameters for the 3-operand operations; for adv
target/arm: Convert aes and sm4 to gvec helpers
With this conversion, we will be able to use the same helpers with sve. In particular, pass 3 vector parameters for the 3-operand operations; for advsimd the destination register is also an input.
This also fixes a bug in which we failed to clear the high bits of the SVE register after an AdvSIMD operation.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200514212831.31248-2-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
a26a352b |
| 12-May-2020 |
Peter Maydell <peter.maydell@linaro.org> |
target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree. We already have gvec helpers for addition and subtraction,
target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree
Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree. We already have gvec helpers for addition and subtraction, but must add one for fabd.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
show more ...
|
#
cfdb2c0c |
| 13-May-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Vectorize SABA/UABA
Include 64-bit element size in preparation for SVE2.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro
target/arm: Vectorize SABA/UABA
Include 64-bit element size in preparation for SVE2.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-17-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
50c160d4 |
| 13-May-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Vectorize SABD/UABD
Include 64-bit element size in preparation for SVE2.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro
target/arm: Vectorize SABD/UABD
Include 64-bit element size in preparation for SVE2.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-16-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
525d9b6d |
| 13-May-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
Must clear the tail for AdvSIMD when SVE is enabled.
Fixes: ca40a6e6e39 Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.mayde
target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
Must clear the tail for AdvSIMD when SVE is enabled.
Fixes: ca40a6e6e39 Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-15-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
e286bf4a |
| 13-May-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Pass pointer to qc to qrdmla/qrdmls
Pass a pointer directly to env->vfp.qc[0], rather than env. This will allow SVE2, which does not modify QC, to pass a pointer to dummy storage.
Chang
target/arm: Pass pointer to qc to qrdmla/qrdmls
Pass a pointer directly to env->vfp.qc[0], rather than env. This will allow SVE2, which does not modify QC, to pass a pointer to dummy storage.
Change the return type of inl_qrdml.h_s16 to match the sense of the operation: signed.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-14-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
893ab054 |
| 13-May-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Create gen_gvec_{sri,sli}
The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef.
Add out-of-line helpers. We got away
target/arm: Create gen_gvec_{sri,sli}
The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef.
Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-4-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
6ccd48d4 |
| 13-May-2020 |
Richard Henderson <richard.henderson@linaro.org> |
target/arm: Create gen_gvec_{u,s}{rshr,rsra}
Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vect
target/arm: Create gen_gvec_{u,s}{rshr,rsra}
Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|