History log of /qemu/target/i386/tcg/decode-new.h (Results 26 – 46 of 46)
Revision Date Author Comments
# 8a36bbcf 19-Oct-2023 Paolo Bonzini <pbonzini@redhat.com>

target/i386: add X86_SPECIALs for MOVSX and MOVZX

Usually the registers are just moved into s->T0 without much care for
their operand size. However, in some cases we can get more efficient
code if

target/i386: add X86_SPECIALs for MOVSX and MOVZX

Usually the registers are just moved into s->T0 without much care for
their operand size. However, in some cases we can get more efficient
code if the operand fetching logic syncs with the emission function
on what is nicer.

All the current uses are mostly demonstrative and only reduce the code
in the emission functions, because the instructions do not support
memory operands. However the logic is generic and applies to several
more instructions such as MOVSXD (aka movslq), one-byte shift
instructions, multiplications, XLAT, and indirect calls/jumps.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 5baf5641 19-Oct-2023 Paolo Bonzini <pbonzini@redhat.com>

target/i386: rename zext0/zext2 and make them closer to the manual

X86_SPECIAL_ZExtOp0 and X86_SPECIAL_ZExtOp2 are poorly named; they are a hack
that is needed by scalar insertion and extraction ins

target/i386: rename zext0/zext2 and make them closer to the manual

X86_SPECIAL_ZExtOp0 and X86_SPECIAL_ZExtOp2 are poorly named; they are a hack
that is needed by scalar insertion and extraction instructions, and not really
related to zero extension: for PEXTR the zero extension is done by the generation
functions, for PINSR the high bits are not used at all and in fact are *not*
filled with zeroes when loaded into s->T1.

Rename the values to match the effect described in the manual, and explain
better in the comments.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# b609db94 19-Oct-2023 Paolo Bonzini <pbonzini@redhat.com>

target/i386: reimplement check for validity of LOCK prefix

The previous check erroneously allowed CMP to be modified with LOCK.
Instead, tag explicitly the instructions that do support LOCK.

Acked-

target/i386: reimplement check for validity of LOCK prefix

The previous check erroneously allowed CMP to be modified with LOCK.
Instead, tag explicitly the instructions that do support LOCK.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 3c95fd4e 27-Oct-2023 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: implement SHA instructions
* target/i386: check CPUID_PAE to determine 36 bit processor address space
* target

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: implement SHA instructions
* target/i386: check CPUID_PAE to determine 36 bit processor address space
* target/i386: improve validation of AVX instructions
* require Linux 4.4 for KVM

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmU5Vi4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNVbwf9HCx+C0MITWjQ+rEkmtiy/Cn+ZsF1
# gbaL31ahymEU3vUcKZX8Z4ycmBFw9b3yvotTVR38lE9p+sKtSaGKUGV0btpS7oBB
# y8IfnVmg5X1j4PtyDxFlLD48qg//2kVgJ6wtaDTSAkgQMOPM9UgHgQD+Ks7kOo8v
# rReL46XVPEZTWt3syX0y87mFinjK2hXGqIdsnJ1uT614BAVVIrmO6aFNNN1FlsRb
# NGRZevJTfEWjWVfWOhUiZdUGDz74sOXdshZX/teadeDJLtWaw0uytMN9qoTN33h/
# OsdR2fO7h8ZknGEc2F1fJEVh4sOfO4fGYAAJGzHP9AjUDV1IVVYELb79dg==
# =WYTo
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 26 Oct 2023 02:53:50 JST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (24 commits)
kvm: i8254: require KVM_CAP_PIT2 and KVM_CAP_PIT_STATE2
kvm: i386: require KVM_CAP_SET_IDENTITY_MAP_ADDR
kvm: i386: require KVM_CAP_ADJUST_CLOCK
kvm: i386: require KVM_CAP_MCE
kvm: i386: require KVM_CAP_SET_VCPU_EVENTS and KVM_CAP_X86_ROBUST_SINGLESTEP
kvm: i386: require KVM_CAP_XSAVE
kvm: i386: require KVM_CAP_DEBUGREGS
kvm: i386: move KVM_CAP_IRQ_ROUTING detection to kvm_arch_required_capabilities
kvm: unify listeners for PIO address space
kvm: require KVM_CAP_IOEVENTFD and KVM_CAP_IOEVENTFD_ANY_LENGTH
kvm: assume that many ioeventfds can be created
kvm: drop reference to KVM_CAP_PCI_2_3
kvm: require KVM_IRQFD for kernel irqchip
kvm: require KVM_IRQFD for kernel irqchip
kvm: require KVM_CAP_SIGNAL_MSI
kvm: require KVM_CAP_INTERNAL_ERROR_DATA
kvm: remove unnecessary stub
target/i386: check CPUID_PAE to determine 36 bit processor address space
target/i386: validate VEX.W for AVX instructions
target/i386: group common checks in the decoding phase
...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# e000687f 09-Oct-2023 Paolo Bonzini <pbonzini@redhat.com>

target/i386: validate VEX.W for AVX instructions

Instructions in VEX exception class 6 generally look at the value of
VEX.W. Note that the manual places some instructions incorrectly in
class 4, fo

target/i386: validate VEX.W for AVX instructions

Instructions in VEX exception class 6 generally look at the value of
VEX.W. Note that the manual places some instructions incorrectly in
class 4, for example VPERMQ which has no non-VEX encoding and no legacy
SSE analogue. AMD does a mess of its own, as documented in the comment
that this patch adds.

Most of them are checked for VEX.W=0, and are listed in the manual
(though with an omission) in table 2-16; VPERMQ and VPERMPD check for
VEX.W=1, which is only listed in the instruction description. Others,
such as VPSRLV, VPSLLV and the FMA3 instructions, use VEX.W to switch
between a 32-bit and 64-bit operation.

Fix more of the class 4/class 6 mismatches, and implement the check for
VEX.W in TCG.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 183e6679 09-Oct-2023 Paolo Bonzini <pbonzini@redhat.com>

target/i386: group common checks in the decoding phase

In preparation for adding more similar checks, move the VEX.L=0 check
and several X86_SPECIAL_* checks to a new field, where each bit represent

target/i386: group common checks in the decoding phase

In preparation for adding more similar checks, move the VEX.L=0 check
and several X86_SPECIAL_* checks to a new field, where each bit represent
a common check on unused bits, or a restriction on the processor mode.

Likewise, many SVM intercepts can be checked during the decoding phase,
the main exception being the selective CR0 write, MSR and IOIO intercepts.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# e582b629 10-Oct-2023 Paolo Bonzini <pbonzini@redhat.com>

target/i386: implement SHA instructions

The implementation was validated with OpenSSL and with the test vectors in
https://github.com/rust-lang/stdarch/blob/master/crates/core_arch/src/x86/sha.rs.

target/i386: implement SHA instructions

The implementation was validated with OpenSSL and with the test vectors in
https://github.com/rust-lang/stdarch/blob/master/crates/core_arch/src/x86/sha.rs.

The instructions provide a ~25% improvement on hashing a 64 MiB file:
runtime goes down from 1.8 seconds to 1.4 seconds; instruction count on
the host goes down from 5.8 billion to 4.8 billion with slightly better
IPC too. Good job Intel. ;)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 03a3a62f 07-Sep-2023 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* only build util/async-teardown.c when system build is requested
* target/i386: fix BQL handling of the legacy FERR interrup

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* only build util/async-teardown.c when system build is requested
* target/i386: fix BQL handling of the legacy FERR interrupts
* target/i386: fix memory operand size for CVTPS2PD
* target/i386: Add support for AMX-COMPLEX in CPUID enumeration
* compile plugins on Darwin
* configure and meson cleanups
* drop mkvenv support for Python 3.7 and Debian10
* add wrap file for libblkio
* tweak KVM stubs

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT5t6UUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMmjwf+MpvVuq+nn+3PqGUXgnzJx5ccA5ne
# O9Xy8+1GdlQPzBw/tPovxXDSKn3HQtBfxObn2CCE1tu/4uHWpBA1Vksn++NHdUf2
# P0yoHxGskJu5iYYTtIcNw5cH2i+AizdiXuEjhfNjqD5Y234cFoHnUApt9e3zBvVO
# cwGD7WpPuSb4g38hHkV6nKcx72o7b4ejDToqUVZJ2N+RkddSqB03fSdrOru0hR7x
# V+lay0DYdFszNDFm05LJzfDbcrHuSryGA91wtty7Fzj6QhR/HBHQCUZJxMB5PI7F
# Zy4Zdpu60zxtSxUqeKgIi7UhNFgMcax2Hf9QEqdc/B4ARoBbboh4q4u8kQ==
# =dH7/
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 07 Sep 2023 07:44:37 EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (51 commits)
docs/system/replay: do not show removed command line option
subprojects: add wrap file for libblkio
sysemu/kvm: Restrict kvm_pc_setup_irq_routing() to x86 targets
sysemu/kvm: Restrict kvm_has_pit_state2() to x86 targets
sysemu/kvm: Restrict kvm_get_apic_state() to x86 targets
sysemu/kvm: Restrict kvm_arch_get_supported_cpuid/msr() to x86 targets
target/i386: Restrict declarations specific to CONFIG_KVM
target/i386: Allow elision of kvm_hv_vpindex_settable()
target/i386: Allow elision of kvm_enable_x2apic()
target/i386: Remove unused KVM stubs
target/i386/cpu-sysemu: Inline kvm_apic_in_kernel()
target/i386/helper: Restrict KVM declarations to system emulation
hw/i386/fw_cfg: Include missing 'cpu.h' header
hw/i386/pc: Include missing 'cpu.h' header
hw/i386/pc: Include missing 'sysemu/tcg.h' header
Revert "mkvenv: work around broken pip installations on Debian 10"
mkvenv: assume presence of importlib.metadata
Python: Drop support for Python 3.7
configure: remove dead code
meson: list leftover CONFIG_* symbols
...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# a48b2697 29-Aug-2023 Paolo Bonzini <pbonzini@redhat.com>

target/i386: generalize operand size "ph" for use in CVTPS2PD

CVTPS2PD only loads a half-register for memory, like CVTPH2PS. It can
reuse the "ph" packed half-precision size to load a half-register

target/i386: generalize operand size "ph" for use in CVTPS2PD

CVTPS2PD only loads a half-register for memory, like CVTPH2PS. It can
reuse the "ph" packed half-precision size to load a half-register,
but rename it to "xh" because it is now a variation of "x" (it is not
used only for half-precision values).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# e52d57c8 24-Oct-2022 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: new decoder bugfix
* target/i386: complete x86-v3 support for TCG

# -----BEGIN PGP SIGNATURE-----
#
# iQFHBAA

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: new decoder bugfix
* target/i386: complete x86-v3 support for TCG

# -----BEGIN PGP SIGNATURE-----
#
# iQFHBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNTlqQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOQNQf430MHbrtN9WKKiXv3684XxmcnoRqg
# PHmaGg2SKp7UB+hI2FMYgCZWOl5s3cGTHtwX8byFCttmE4kI7HJR7IouW6znm57j
# 7QVx2TJXIZgqSYcfYzfLu46yS6pNqJUA+mBv5In3Vqt4ZQT2szefVBg6BzmuF6lT
# HXbu/llc3iVfW4SNLJOABXzKNbPacmmpmLjoporfwOHwHjv4iikuXNUOZ84FFL11
# 2tkdcff282q00IRgHm1lSyiRiqh+kAxzSDanMjOZbphBiE9gNJjLGoV5F2X63e1O
# DQGg4wqBWP68O/r8Fj8tOUMCTW212DwWyv1+d/lQB+wwpJK+P4O14dCW
# =Fd+y
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 22 Oct 2022 03:07:16 EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
target/i386: implement FMA instructions
target/i386: implement F16C instructions
target/i386: introduce function to set rounding mode from FPCW or MXCSR bits
target/i386: decode-new: avoid out-of-bounds access to xmm_regs[-1]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# 2872b0f3 19-Oct-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: implement FMA instructions

The only issue with FMA instructions is that there are _a lot_ of them (30
opcodes, each of which comes in up to 4 versions depending on VEX.W and
VEX.L; a to

target/i386: implement FMA instructions

The only issue with FMA instructions is that there are _a lot_ of them (30
opcodes, each of which comes in up to 4 versions depending on VEX.W and
VEX.L; a total of 96 possibilities). However, they can be implement with
only 6 helpers, two for scalar operations and four for packed operations.
(Scalar versions do not do any merging; they only affect the bottom 32
or 64 bits of the output operand. Therefore, there is no separate XMM
and YMM of the scalar helpers).

First, we can reduce the number of helpers to one third by passing four
operands (one output and three inputs); the reordering of which operands
go to the multiply and which go to the add is done in emit.c.

Second, the different instructions also dispatch to the same softfloat
function, so the flags for float32_muladd and float64_muladd are passed
in the helper as int arguments, with a little extra complication to
handle FMADDSUB and FMSUBADD.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# cf5ec664 19-Oct-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: implement F16C instructions

F16C only consists of two instructions, which are a bit peculiar
nevertheless.

First, they access only the low half of an YMM or XMM register for the
packed

target/i386: implement F16C instructions

F16C only consists of two instructions, which are a bit peculiar
nevertheless.

First, they access only the low half of an YMM or XMM register for the
packed-half operand; the exact size still depends on the VEX.L flag.
This is similar to the existing avx_movx flag, but not exactly because
avx_movx is hardcoded to affect operand 2. To this end I added a "ph"
format name; it's possible to reuse this approach for the VPMOVSX and
VPMOVZX instructions, though that would also require adding two more
formats for the low-quarter and low-eighth of an operand.

Second, VCVTPS2PH is somewhat weird because it *stores* the result of
the instruction into memory rather than loading it.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 214a8da2 18-Oct-2022 Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* configure: don't enable firmware for targets that are not built
* configure: don't use strings(1)
* scsi, target/i386: swit

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* configure: don't enable firmware for targets that are not built
* configure: don't use strings(1)
* scsi, target/i386: switch from device_legacy_reset() to device_cold_reset()
* target/i386: AVX support for TCG
* target/i386: fix SynIC SINT assertion failure on guest reset
* target/i386: Use atomic operations for pte updates and other cleanups
* tests/tcg: extend SSE tests to AVX
* virtio-scsi: send "REPORTED LUNS CHANGED" sense data upon disk hotplug events

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNOlOcUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNuvwgAj/Z5pI9KU33XiWKFR3bZf2lHh21P
# xmTzNtPmnP1WHDY1DNug/UB+BLg3c+carpTf5n3B8aKI4X3FfxGSJvYlXy4BONFD
# XqYMH3OZB5GaR8Wza9trNYjDs/9hOZus/0R6Hqdl/T38PlMjf8mmayULJIGdcFcJ
# WJvITVntbcCwwbpyJbRC5BNigG8ZXTNRoKBgtFVGz6Ox+n0YydwKX5qU5J7xRfCU
# lW41LjZ0Fk5lonH16+xuS4WD5EyrNt8cMKCGsxnyxhI7nehe/OGnYr9l+xZJclrh
# inQlSwJv0IpUJcrGCI4Xugwux4Z7ZXv3JQ37FzsdZcv/ZXpGonXMeXNJ9A==
# =o6x7
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 18 Oct 2022 07:58:31 EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (53 commits)
target/i386: remove old SSE decoder
target/i386: move 3DNow to the new decoder
tests/tcg: extend SSE tests to AVX
target/i386: Enable AVX cpuid bits when using TCG
target/i386: implement VLDMXCSR/VSTMXCSR
target/i386: implement XSAVE and XRSTOR of AVX registers
target/i386: reimplement 0x0f 0x28-0x2f, add AVX
target/i386: reimplement 0x0f 0x10-0x17, add AVX
target/i386: reimplement 0x0f 0xc2, 0xc4-0xc6, add AVX
target/i386: reimplement 0x0f 0x38, add AVX
target/i386: Use tcg gvec ops for pmovmskb
target/i386: reimplement 0x0f 0x3a, add AVX
target/i386: clarify (un)signedness of immediates from 0F3Ah opcodes
target/i386: reimplement 0x0f 0xd0-0xd7, 0xe0-0xe7, 0xf0-0xf7, add AVX
target/i386: reimplement 0x0f 0x70-0x77, add AVX
target/i386: reimplement 0x0f 0x78-0x7f, add AVX
target/i386: reimplement 0x0f 0x50-0x5f, add AVX
target/i386: reimplement 0x0f 0xd8-0xdf, 0xe8-0xef, 0xf8-0xff, add AVX
target/i386: reimplement 0x0f 0x60-0x6f, add AVX
target/i386: Introduce 256-bit vector helpers
...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


# 71a0891d 05-Sep-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: move 3DNow to the new decoder

This adds another kind of weirdness when you thought you had seen it all:
an opcode byte that comes _after_ the address, not before. It's not
worth adding

target/i386: move 3DNow to the new decoder

This adds another kind of weirdness when you thought you had seen it all:
an opcode byte that comes _after_ the address, not before. It's not
worth adding a new X86_SPECIAL_* constant for it, but it's actually
not unlike VCMP; so, forgive me for exploiting the similarity and just
deciding to dispatch to the right gen_helper_* call in a single code
generation function.

In fact, the old decoder had a bug where s->rip_offset should have
been set to 1 for 3DNow! instructions, and it's fixed now.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 16fc5726 14-Sep-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: reimplement 0x0f 0x38, add AVX

There are several special cases here:

1) extending moves have different widths for the helpers vs. for the
memory loads, and the width for memory loads d

target/i386: reimplement 0x0f 0x38, add AVX

There are several special cases here:

1) extending moves have different widths for the helpers vs. for the
memory loads, and the width for memory loads depends on VEX.L too.
This is represented by X86_SPECIAL_AVXExtMov.

2) some instructions, such as variable-width shifts, select the vector element
size via REX.W.

3) VSIB instructions (VGATHERxPy, VPGATHERxy) are also part of this group,
and they have (among other things) two output operands.

3) the macros for 4-operand blends (which are under 0x0f 0x3a) have to be
extended to support 2-operand blends. The 2-operand variant actually
came a few years earlier, but it is clearer to implement them in the
opposite order.

X86_TYPE_WM, introduced earlier for unaligned loads, is reused for helpers
that accept a Reg* but have a M argument.

These three-byte opcodes also include AVX new instructions, for which
the helpers were originally implemented by Paul Brook <paul@nowt.org>.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 6bbeb98d 01-Sep-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: reimplement 0x0f 0xd0-0xd7, 0xe0-0xe7, 0xf0-0xf7, add AVX

The more complicated ones here are d6-d7, e6-e7, f7. The others
are trivial.

For LDDQU, using gen_load_sse directly might cor

target/i386: reimplement 0x0f 0xd0-0xd7, 0xe0-0xe7, 0xf0-0xf7, add AVX

The more complicated ones here are d6-d7, e6-e7, f7. The others
are trivial.

For LDDQU, using gen_load_sse directly might corrupt the register if
the second part of the load fails. Therefore, add a custom X86_TYPE_WM
value; like X86_TYPE_W it does call gen_load(), but it also rejects a
value of 11 in the ModRM field like X86_TYPE_M.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 55a33286 05-Sep-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: validate SSE prefixes directly in the decoding table

Many SSE and AVX instructions are only valid with specific prefixes
(none, 66, F3, F2). Introduce a direct way to encode this in th

target/i386: validate SSE prefixes directly in the decoding table

Many SSE and AVX instructions are only valid with specific prefixes
(none, 66, F3, F2). Introduce a direct way to encode this in the
decoding table to avoid using decode groups too much.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# 20581aad 17-Sep-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: validate VEX prefixes via the instructions' exception classes

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# caa01fad 01-Sep-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: add CPUID feature checks to new decoder

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


# 6ba13999 23-Aug-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: add ALU load/writeback core

Add generic code generation that takes care of preparing operands
around calls to decode.e.gen in a table-driven manner, so that ALU
operations need not take

target/i386: add ALU load/writeback core

Add generic code generation that takes care of preparing operands
around calls to decode.e.gen in a table-driven manner, so that ALU
operations need not take care of that.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


# b3e22b23 23-Aug-2022 Paolo Bonzini <pbonzini@redhat.com>

target/i386: add core of new i386 decoder

The new decoder is based on three principles:

- use mostly table-driven decoding, using tables derived as much as possible
from the Intel manual. Centra

target/i386: add core of new i386 decoder

The new decoder is based on three principles:

- use mostly table-driven decoding, using tables derived as much as possible
from the Intel manual. Centralizing the decode the operands makes it
more homogeneous, for example all immediates are signed. All modrm
handling is in one function, and can be shared between SSE and ALU
instructions (including XMM<->GPR instructions). The SSE/AVX decoder
will also not have duplicated code between the 0F, 0F38 and 0F3A tables.

- keep the code as "non-branchy" as possible. Generally, the code for
the new decoder is more verbose, but the control flow is simpler.
Conditionals are not nested and have small bodies. All instruction
groups are resolved even before operands are decoded, and code
generation is separated as much as possible within small functions
that only handle one instruction each.

- keep address generation and (for ALU operands) memory loads and writeback
as much in common code as possible. All ALU operations for example
are implemented as T0=f(T0,T1). For non-ALU instructions,
read-modify-write memory operations are rare, but registers do not
have TCGv equivalents: therefore, the common logic sets up pointer
temporaries with the operands, while load and writeback are handled
by gvec or by helpers.

These principles make future code review and extensibility simpler, at
the cost of having a relatively large amount of code in the form of this
patch. Even EVEX should not be _too_ hard to implement (it's just a crazy
large amount of possibilities).

This patch introduces the main decoder flow, and integrates the old
decoder with the new one. The old decoder takes care of parsing
prefixes and then optionally drops to the new one. The changes to the
old decoder are minimal and allow it to be replaced incrementally with
the new one.

There is a debugging mechanism through a "LIMIT" environment variable.
In user-mode emulation, the variable is the number of instructions
decoded by the new decoder before permanently switching to the old one.
In system emulation, the variable is the highest opcode that is decoded
by the new decoder (this is less friendly, but it's the best that can
be done without requiring deterministic execution).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

show more ...


12