History log of /src/sys/amd64/include/smp.h (Results 1 – 25 of 535)
Revision Date Author Comments
# 5f0ab9d9 12-Mar-2026 Zhenlei Huang <zlei@FreeBSD.org>

amd64: Make start_all_aps() static

It is not used elsewhere since the change [1].

[1] ac3ede5371af x86/xen: remove PVHv1 code

MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D

amd64: Make start_all_aps() static

It is not used elsewhere since the change [1].

[1] ac3ede5371af x86/xen: remove PVHv1 code

MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D55668

show more ...


# 190d0a96 26-Oct-2025 Justin Hibbits <jhibbits@FreeBSD.org>

amd64: Add cpu_stop() support to go UP after SMP

Reviewed by: kib
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D51622


# fa02551d 21-Jul-2025 Mark Johnston <markj@FreeBSD.org>

amd64: Remove support for "nooptions SMP"

It does not appear to get much, if any, testing, and doesn't seem to be
worth the maintenance overhead. Virtually all amd64 hardware has
multiple cores. T

amd64: Remove support for "nooptions SMP"

It does not appear to get much, if any, testing, and doesn't seem to be
worth the maintenance overhead. Virtually all amd64 hardware has
multiple cores. The CPU and memory usage overhead of the SMP option in
single-vCPU VMs is quite marginal and not worth maintaining.

Reviewed by: alc (pmap.c), kib
Differential Revision: https://reviews.freebsd.org/D51403
Differential Revision: https://reviews.freebsd.org/D51345

show more ...


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# d6717f87 10-Jul-2021 Konstantin Belousov <kib@FreeBSD.org>

amd64: rework AP startup

Stop using temporal page table with 1:1 mapping of low 1G populated over
the whole VA. Use 1:1 mapping of low 4G temporarily installed in the
normal kernel page table.

The

amd64: rework AP startup

Stop using temporal page table with 1:1 mapping of low 1G populated over
the whole VA. Use 1:1 mapping of low 4G temporarily installed in the
normal kernel page table.

The features are:
- now there is one less step for startup asm to perform
- the startup code still needs to be at lower 1G because CPU starts in
real mode. But everything else can be located anywhere in low 4G
because it is accessed by non-paged 32bit protected mode. Note that
kernel page table root page is at low 4G, as well as the kernel itself.
- the page table pages can be allocated by normal allocator, there is
no need to carve them from the phys_avail segments at very early time.
The allocation of the page for startup code still requires some magic.
Pages are freed after APs are ignited.
- la57 startup for APs is less tricky, we directly load the final page
table and do not need to tweak the paging mode.

Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31121

show more ...


# ac3ede53 12-May-2021 Roger Pau Monné <royger@FreeBSD.org>

x86/xen: remove PVHv1 code

PVHv1 was officially removed from Xen in 4.9, so just axe the related
code from FreeBSD.

Note FreeBSD supports PVHv2, which is the replacement for PVHv1.

Sponsored by: C

x86/xen: remove PVHv1 code

PVHv1 was officially removed from Xen in 4.9, so just axe the related
code from FreeBSD.

Note FreeBSD supports PVHv2, which is the replacement for PVHv1.

Sponsored by: Citrix Systems R&D
Reviewed by: kib, Elliott Mitchell
Differential Revision: https://reviews.freebsd.org/D30228

show more ...


# c7aa572c 31-Jul-2020 Glen Barber <gjb@FreeBSD.org>

MFH

Sponsored by: Rubicon Communications, LLC (netgate.com)


# aba10e13 25-Jul-2020 Alexander Motin <mav@FreeBSD.org>

Allow swi_sched() to be called from NMI context.

For purposes of handling hardware error reported via NMIs I need a way to
escape NMI context, being too restrictive to do something significant.

To

Allow swi_sched() to be called from NMI context.

For purposes of handling hardware error reported via NMIs I need a way to
escape NMI context, being too restrictive to do something significant.

To do it this change introduces new swi_sched() flag SWI_FROMNMI, making
it careful about used KPIs. On platforms allowing IPI sending from NMI
context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI,
otherwise it works just like SWI_DELAY. To handle the delayed SWIs this
patch calls clk_intr_event on every hardclock() tick.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25754

show more ...


# e2c0e292 16-Jul-2020 Glen Barber <gjb@FreeBSD.org>

MFH

Sponsored by: Rubicon Communications, LLC (netgate.com)


# dc43978a 14-Jul-2020 Konstantin Belousov <kib@FreeBSD.org>

amd64: allow parallel shootdown IPIs

Stop using smp_ipi_mtx to protect global shootdown state, and
move/multiply the global state into pcpu. Now each CPU can initiate
shootdown IPI independently fr

amd64: allow parallel shootdown IPIs

Stop using smp_ipi_mtx to protect global shootdown state, and
move/multiply the global state into pcpu. Now each CPU can initiate
shootdown IPI independently from other CPUs. Initiator enters
critical section, then fills its local PCPU shootdown info
(pc_smp_tlb_XXX), then clears scoreboard generation at location (cpu,
my_cpuid) for each target cpu. After that IPI is sent to all targets
which scan for zeroed scoreboard generation words. Upon finding such
word the shootdown data is read from corresponding cpu' pcpu, and
generation is set. Meantime initiator loops waiting for all zeroed
generations in scoreboard to update.

Initiator does not disable interrupts, which should allow
non-invalidation IPIs from deadlocking, it only needs to disable
preemption to pin itself to the instance of the pcpu smp_tlb data.

The generation is set before the actual invalidation is performed in
handler. It is safe because target CPU cannot return to userspace
before handler finishes. In principle only NMI can preempt the
handler, but NMI would see the kernel handler frame and not touch
not-invalidated user page table.

Handlers loop until they do not see zeroed scoreboard generations.
This, together with hardware keeping one pending IPI in LAPIC IRR
should prevent lost shootdowns.

Notes.
1. The code does protect writes to LAPIC ICR with exclusion. I believe
this is fine because we in fact do not send IPIs from interrupt
handlers. More for !x2APIC mode where ICR access for write requires
two registers write, we disable interrupts around it. If considered
incorrect, I can add per-cpu spinlock around ipi_send().
2. Scoreboard lines owned by given target CPU can be padded to the
cache line, to reduce ping-pong.

Reviewed by: markj (previous version)
Discussed with: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D25510

show more ...


# 190d0a96 26-Oct-2025 Justin Hibbits <jhibbits@FreeBSD.org>

amd64: Add cpu_stop() support to go UP after SMP

Reviewed by: kib
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D51622


# fa02551d 21-Jul-2025 Mark Johnston <markj@FreeBSD.org>

amd64: Remove support for "nooptions SMP"

It does not appear to get much, if any, testing, and doesn't seem to be
worth the maintenance overhead. Virtually all amd64 hardware has
multiple cores. T

amd64: Remove support for "nooptions SMP"

It does not appear to get much, if any, testing, and doesn't seem to be
worth the maintenance overhead. Virtually all amd64 hardware has
multiple cores. The CPU and memory usage overhead of the SMP option in
single-vCPU VMs is quite marginal and not worth maintaining.

Reviewed by: alc (pmap.c), kib
Differential Revision: https://reviews.freebsd.org/D51403
Differential Revision: https://reviews.freebsd.org/D51345

show more ...


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# d6717f87 10-Jul-2021 Konstantin Belousov <kib@FreeBSD.org>

amd64: rework AP startup

Stop using temporal page table with 1:1 mapping of low 1G populated over
the whole VA. Use 1:1 mapping of low 4G temporarily installed in the
normal kernel page table.

The

amd64: rework AP startup

Stop using temporal page table with 1:1 mapping of low 1G populated over
the whole VA. Use 1:1 mapping of low 4G temporarily installed in the
normal kernel page table.

The features are:
- now there is one less step for startup asm to perform
- the startup code still needs to be at lower 1G because CPU starts in
real mode. But everything else can be located anywhere in low 4G
because it is accessed by non-paged 32bit protected mode. Note that
kernel page table root page is at low 4G, as well as the kernel itself.
- the page table pages can be allocated by normal allocator, there is
no need to carve them from the phys_avail segments at very early time.
The allocation of the page for startup code still requires some magic.
Pages are freed after APs are ignited.
- la57 startup for APs is less tricky, we directly load the final page
table and do not need to tweak the paging mode.

Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31121

show more ...


# ac3ede53 12-May-2021 Roger Pau Monné <royger@FreeBSD.org>

x86/xen: remove PVHv1 code

PVHv1 was officially removed from Xen in 4.9, so just axe the related
code from FreeBSD.

Note FreeBSD supports PVHv2, which is the replacement for PVHv1.

Sponsored by: C

x86/xen: remove PVHv1 code

PVHv1 was officially removed from Xen in 4.9, so just axe the related
code from FreeBSD.

Note FreeBSD supports PVHv2, which is the replacement for PVHv1.

Sponsored by: Citrix Systems R&D
Reviewed by: kib, Elliott Mitchell
Differential Revision: https://reviews.freebsd.org/D30228

show more ...


# c7aa572c 31-Jul-2020 Glen Barber <gjb@FreeBSD.org>

MFH

Sponsored by: Rubicon Communications, LLC (netgate.com)


# aba10e13 25-Jul-2020 Alexander Motin <mav@FreeBSD.org>

Allow swi_sched() to be called from NMI context.

For purposes of handling hardware error reported via NMIs I need a way to
escape NMI context, being too restrictive to do something significant.

To

Allow swi_sched() to be called from NMI context.

For purposes of handling hardware error reported via NMIs I need a way to
escape NMI context, being too restrictive to do something significant.

To do it this change introduces new swi_sched() flag SWI_FROMNMI, making
it careful about used KPIs. On platforms allowing IPI sending from NMI
context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI,
otherwise it works just like SWI_DELAY. To handle the delayed SWIs this
patch calls clk_intr_event on every hardclock() tick.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25754

show more ...


# e2c0e292 16-Jul-2020 Glen Barber <gjb@FreeBSD.org>

MFH

Sponsored by: Rubicon Communications, LLC (netgate.com)


# dc43978a 14-Jul-2020 Konstantin Belousov <kib@FreeBSD.org>

amd64: allow parallel shootdown IPIs

Stop using smp_ipi_mtx to protect global shootdown state, and
move/multiply the global state into pcpu. Now each CPU can initiate
shootdown IPI independently fr

amd64: allow parallel shootdown IPIs

Stop using smp_ipi_mtx to protect global shootdown state, and
move/multiply the global state into pcpu. Now each CPU can initiate
shootdown IPI independently from other CPUs. Initiator enters
critical section, then fills its local PCPU shootdown info
(pc_smp_tlb_XXX), then clears scoreboard generation at location (cpu,
my_cpuid) for each target cpu. After that IPI is sent to all targets
which scan for zeroed scoreboard generation words. Upon finding such
word the shootdown data is read from corresponding cpu' pcpu, and
generation is set. Meantime initiator loops waiting for all zeroed
generations in scoreboard to update.

Initiator does not disable interrupts, which should allow
non-invalidation IPIs from deadlocking, it only needs to disable
preemption to pin itself to the instance of the pcpu smp_tlb data.

The generation is set before the actual invalidation is performed in
handler. It is safe because target CPU cannot return to userspace
before handler finishes. In principle only NMI can preempt the
handler, but NMI would see the kernel handler frame and not touch
not-invalidated user page table.

Handlers loop until they do not see zeroed scoreboard generations.
This, together with hardware keeping one pending IPI in LAPIC IRR
should prevent lost shootdowns.

Notes.
1. The code does protect writes to LAPIC ICR with exclusion. I believe
this is fine because we in fact do not send IPIs from interrupt
handlers. More for !x2APIC mode where ICR access for write requires
two registers write, we disable interrupts around it. If considered
incorrect, I can add per-cpu spinlock around ipi_send().
2. Scoreboard lines owned by given target CPU can be padded to the
cache line, to reduce ping-pong.

Reviewed by: markj (previous version)
Discussed with: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D25510

show more ...


# 9dba82a4 05-Apr-2018 Roger Pau Monné <royger@FreeBSD.org>

x86: improve reservation of AP trampoline memory

So that it doesn't rely on physmap[1] containing an address below
1MiB. Instead scan the full physmap and search for a suitable address
to place the

x86: improve reservation of AP trampoline memory

So that it doesn't rely on physmap[1] containing an address below
1MiB. Instead scan the full physmap and search for a suitable address
to place the trampoline code (below 1MiB) and the initial memory pages
(below 4GiB).

Sponsored by: Citrix Systems R&D
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D14878

show more ...


# c8f9c1f3 27-Jan-2018 Konstantin Belousov <kib@FreeBSD.org>

Use PCID to optimize PTI.

Use PCID to avoid complete TLB shootdown when switching between user
and kernel mode with PTI enabled.

I use the model close to what I read about KAISER, user-mode PCID ha

Use PCID to optimize PTI.

Use PCID to avoid complete TLB shootdown when switching between user
and kernel mode with PTI enabled.

I use the model close to what I read about KAISER, user-mode PCID has
1:1 correspondence to the kernel-mode PCID, by setting bit 11 in PCID.
Full kernel-mode TLB shootdown is performed on context switches, since
KVA TLB invalidation only works in the current pmap. User-mode part of
TLB is flushed on the pmap activations as well.

Similarly, IPI TLB shootdowns must handle both kernel and user address
spaces for each address. Note that machines which implement PCID but
do not have INVPCID instructions, cause the usual complications in the
IPI handlers, due to the need to switch to the target PCID temporary.
This is racy, but because for PCID/no-INVPCID we disable the
interrupts in pmap_activate_sw(), IPI handler cannot see inconsistent
state of CPU PCID vs PCPU pmap/kcr3/ucr3 pointers.

On the other hand, on kernel/user switches, CR3_PCID_SAVE bit is set
and we do not clear TLB.

I can imagine alternative use of PCID, where there is only one PCID
allocated for the kernel pmap. Then, there is no need to shootdown
kernel TLB entries on context switch. But copyout(3) would need to
either use method similar to proc_rwmem() to access the userspace
data, or (in reverse) provide a temporal mapping for the kernel buffer
into user mode PCID and use trampoline for copy.

Reviewed by: markj (previous version)
Tested by: pho
Discussed with: alc (some aspects)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D13985

show more ...


# bd50262f 17-Jan-2018 Konstantin Belousov <kib@FreeBSD.org>

PTI for amd64.

The implementation of the Kernel Page Table Isolation (KPTI) for
amd64, first version. It provides a workaround for the 'meltdown'
vulnerability. PTI is turned off by default for now

PTI for amd64.

The implementation of the Kernel Page Table Isolation (KPTI) for
amd64, first version. It provides a workaround for the 'meltdown'
vulnerability. PTI is turned off by default for now, enable with the
loader tunable vm.pmap.pti=1.

The pmap page table is split into kernel-mode table and user-mode
table. Kernel-mode table is identical to the non-PTI table, while
usermode table is obtained from kernel table by leaving userspace
mappings intact, but only leaving the following parts of the kernel
mapped:

kernel text (but not modules text)
PCPU
GDT/IDT/user LDT/task structures
IST stacks for NMI and doublefault handlers.

Kernel switches to user page table before returning to usermode, and
restores full kernel page table on the entry. Initial kernel-mode
stack for PTI trampoline is allocated in PCPU, it is only 16
qwords. Kernel entry trampoline switches page tables. then the
hardware trap frame is copied to the normal kstack, and execution
continues.

IST stacks are kept mapped and no trampoline is needed for
NMI/doublefault, but of course page table switch is performed.

On return to usermode, the trampoline is used again, iret frame is
copied to the trampoline stack, page tables are switched and iretq is
executed. The case of iretq faulting due to the invalid usermode
context is tricky, since the frame for fault is appended to the
trampoline frame. Besides copying the fault frame and original
(corrupted) frame to kstack, the fault frame must be patched to make
it look as if the fault occured on the kstack, see the comment in
doret_iret detection code in trap().

Currently kernel pages which are mapped during trampoline operation
are identical for all pmaps. They are registered using
pmap_pti_add_kva(). Besides initial registrations done during boot,
LDT and non-common TSS segments are registered if user requested their
use. In principle, they can be installed into kernel page table per
pmap with some work. Similarly, PCPU can be hidden from userspace
mapping using trampoline PCPU page, but again I do not see much
benefits besides complexity.

PDPE pages for the kernel half of the user page tables are
pre-allocated during boot because we need to know pml4 entries which
are copied to the top-level paging structure page, in advance on a new
pmap creation. I enforce this to avoid iterating over the all
existing pmaps if a new PDPE page is needed for PTI kernel mappings.
The iteration is a known problematic operation on i386.

The need to flush hidden kernel translations on the switch to user
mode make global tables (PG_G) meaningless and even harming, so PG_G
use is disabled for PTI case. Our existing use of PCID is
incompatible with PTI and is automatically disabled if PTI is
enabled. PCID can be forced on only for developer's benefit.

MCE is known to be broken, it requires IST stack to operate completely
correctly even for non-PTI case, and absolutely needs dedicated IST
stack because MCE delivery while trampoline did not switched from PTI
stack is fatal. The fix is pending.

Reviewed by: markj (partially)
Tested by: pho (previous version)
Discussed with: jeff, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks

show more ...


# 02ebdc78 31-Oct-2016 Dimitry Andric <dim@FreeBSD.org>

Merge ^/head r307736 through r308146.


# 0a4c51f4 31-Oct-2016 John Baldwin <jhb@FreeBSD.org>

Move declarations of invpcid_works and pmap_pcid_enabled to pmap.h.

Previously these were only declared under #ifdef SMP in <machine/smp.h>.
However, these variables are defind in pmap.c uncondition

Move declarations of invpcid_works and pmap_pcid_enabled to pmap.h.

Previously these were only declared under #ifdef SMP in <machine/smp.h>.
However, these variables are defind in pmap.c unconditionally, and efirt.c
references them unconditionally. This fixes non-SMP kernel builds.

Discussed with: kib
MFC after: 1 week

show more ...


# b626f5a7 04-Jan-2016 Glen Barber <gjb@FreeBSD.org>

MFH r289384-r293170

Sponsored by: The FreeBSD Foundation


12345678910>>...22