| #
9718f184
|
| 12-Jan-2025 |
Konstantin Belousov <kib@FreeBSD.org> |
pthread_mutex_trylock(): init libthr if needed
Reported by: yuri Reviewed by: markj, olce Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D4
pthread_mutex_trylock(): init libthr if needed
Reported by: yuri Reviewed by: markj, olce Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D48454
show more ...
|
| #
f8bbbce4
|
| 06-Mar-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
libthr: remove explicit sys/cdefs.h includes
Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
| #
1d386b48
|
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
| #
0a5c29a6
|
| 22-Jul-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
thr_mutex.c: style
Reindend and re-fill the statement.
Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D41150
|
| #
b370ef15
|
| 07-Jul-2023 |
Greg Becker <becker.greg@att.net> |
libthr: Patch to reduce latency to acquire+release a pthread mutex.
The acquisition and release of an uncontended default/normal pthread mutex on FreeBSD is suprisingly slow, e.g., pthread wrlocks a
libthr: Patch to reduce latency to acquire+release a pthread mutex.
The acquisition and release of an uncontended default/normal pthread mutex on FreeBSD is suprisingly slow, e.g., pthread wrlocks and binary semaphores both exhibit roughly 33% lower latency, while default/normal mutexes on Linux exhibit roughly 67% lower latency than FreeBSD. This is likely explained by the fact that AFAICT in the best case to acquire an uncontended mutex on Linux one need touch only 1 page and read+modify only 1 cacheline, whereas on FreeBSD we need to touch at least 4 pages, read 6 cachelines, and modify at least 4 cachelines.
This patch does not address the pthread mutex architecture. Instead, it improves performance by adding the __always_inline attribute to mutex_lock_common() and mutex_unlock_common() to encourage constant folding and propagation, thereby lowering the latency to acquire and release a mutex due to a shorter code path with fewer compares, jumps, and mispredicts.
With this patch on a stock build I see a reduction in latency of roughly 7% for default/normal mutexes, and 17% for robust mutexes. When built without PTHREADS_ASSERTIONS enabled I see a reduction in latency of roughly 15% and 26%, respectively. Suprisingly, I see similar reductions in latency for heavily contended mutexes.
By default, this patch increases the size of libthr.so.3 by 2448 bytes, but when built without PTHREAD_ASSERTIONS enabled it only increases by 448 bytes.
Reviewed by: jhb (previous version), kib MFC after: 1 week Differential revision: https://reviews.freebsd.org/D40912
show more ...
|
| #
642cd511
|
| 07-Jul-2023 |
Greg Becker <becker.greg@att.net> |
libthr: Add src.conf variable WITHOUT_PTHREADS_ASSERTIONS
This patch fixes a bug which prevents building libthr without _PTHREADS_INVARIANTS defined. The default remains to build libthr with -D_PTHR
libthr: Add src.conf variable WITHOUT_PTHREADS_ASSERTIONS
This patch fixes a bug which prevents building libthr without _PTHREADS_INVARIANTS defined. The default remains to build libthr with -D_PTHREADS_INVARIANTS. However, with this patch, if one builds libthr with WITHOUT_PTHREADS_ASSERTIONS=true then the latency to acquire+release a default pthread mutex is reduced by roughly 5%, and a robust mutex by roughly 18% (as measured by a simple synthetic test on a Xeon E5-2697a based machine).
Reviewed by: jhb, kib, mjg MFC after: 1 week Differential revision: https://reviews.freebsd.org/D40900
show more ...
|
| #
c7904405
|
| 07-Apr-2022 |
Andrew Turner <andrew@FreeBSD.org> |
Remove PAGE_SIZE from libthr
In libthr we use PAGE_SIZE when allocating memory with mmap and to check various structs will fit into a single page so we can use this allocator for them.
Ask the kern
Remove PAGE_SIZE from libthr
In libthr we use PAGE_SIZE when allocating memory with mmap and to check various structs will fit into a single page so we can use this allocator for them.
Ask the kernel for the page size on init for use by the page allcator and add a new machine dependent macro to hold the smallest page size the architecture supports to check the structure is small enough.
This allows us to use the same libthr on arm64 with either 4k or 16k pages.
Reviewed by: kib, markj, imp Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34984
show more ...
|
| #
ec5fed75
|
| 30-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Ensure that threading library is initialized in pthread_mutex_init().
We need at least thr_malloc ready. The situation is possible e.g. in case of libthr being listed in DT_NEEDED before some of it
Ensure that threading library is initialized in pthread_mutex_init().
We need at least thr_malloc ready. The situation is possible e.g. in case of libthr being listed in DT_NEEDED before some of its consumers.
Reported and tested by: lev Sponsored by: The FreeBSD Foundation MFC after: 1 week
show more ...
|
| #
668ee101
|
| 26-Sep-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r352587 through r352763.
|
| #
f9bf9282
|
| 23-Sep-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix destruction of the robust mutexes.
If robust mutex' owner terminated, causing kernel-assisted state recovery, and then pthread_mutex_destroy() is executed as the next action, assert is triggered
Fix destruction of the robust mutexes.
If robust mutex' owner terminated, causing kernel-assisted state recovery, and then pthread_mutex_destroy() is executed as the next action, assert is triggered about mutex still being on the list. Ignore the mutex linkage in pthread_mutex_destroy() for shared robust mutexes with dead owner, same as for enqueue_mutex().
Reported by: avg Sponsored by: The FreeBSD Foundation MFC after: 1 week
show more ...
|
| #
0ab1bfc7
|
| 31-Jul-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Avoid conflicts with libc symbols in libthr jump table.
In some corner cases of static linking and unexpected libraries order on the linker command line, libc symbol might preempt the same libthr sy
Avoid conflicts with libc symbols in libthr jump table.
In some corner cases of static linking and unexpected libraries order on the linker command line, libc symbol might preempt the same libthr symbol, in which case libthr jump table points back to libc causing either infinite recursion or loop. Handle all of such symbols by using private libthr names for them, ensuring that the right pointers are installed into the table.
In collaboration with: arichardson PR: 239475 Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21088
show more ...
|
| #
7648bc9f
|
| 13-May-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead @347527
Sponsored by: The FreeBSD Foundation
|
| #
b8f75b17
|
| 12-Apr-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not access mutex memory after unlock.
PR: 237195 Reported by: freebsd@hurrikhan.eu Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
| #
8e69ae1c
|
| 05-Feb-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r343712 through r343806.
|
| #
e4314da2
|
| 04-Feb-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Fixes for very early use of the pthread_mutex_* and libthr malloc.
When libthr is statically linked into the binary, order of the constructors execution is not deterministic. It is possible for the
Fixes for very early use of the pthread_mutex_* and libthr malloc.
When libthr is statically linked into the binary, order of the constructors execution is not deterministic. It is possible for the application constructor to use pthread_mutex_* functions before the libthr initialization was done.
Handle it by: - making thr_malloc.c locking functions operational when curthread is not yet set; - making __thr_malloc_init() idempotent, allowing more than one call to it; - unconditionally calling __thr_malloc_init() before initializing a process-private mutex.
Reported and tested by: mmel Sponsored by: The FreeBSD Foundation MFC after: 1 week
show more ...
|
| #
7e565c55
|
| 30-Jan-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r343320 through r343570.
|
| #
381c2d2e
|
| 29-Jan-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Untangle jemalloc and mutexes initialization.
The need to use libc malloc(3) from some places in libthr always caused issues. For instance, per-thread key allocation was switched to use plain mmap(
Untangle jemalloc and mutexes initialization.
The need to use libc malloc(3) from some places in libthr always caused issues. For instance, per-thread key allocation was switched to use plain mmap(2) to get storage, because some third party mallocs used keys for implementation of calloc(3).
Even more important, libthr calls calloc(3) during initialization of pthread mutexes, and jemalloc uses pthread mutexes. Jemalloc provides some way to both postpone the initialization, and to make initialization to use specialized allocator, but this is very fragile and often breaks. See the referenced PR for another example.
Add the small malloc implementation used by rtld, to libthr. Use it in thr_spec.c and for mutexes initialization. This avoids the issues with mutual dependencies between malloc and libthr in principle. The drawback is that some more allocations are not interceptable for alternate malloc implementations. There should be not too much memory use from this allocator, and the alternative, direct use of mmap(2) is obviously worse.
PR: 235211 MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D18988
show more ...
|
| #
3611ec60
|
| 18-Aug-2018 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r337646 through r338014.
|
| #
9718f184
|
| 12-Jan-2025 |
Konstantin Belousov <kib@FreeBSD.org> |
pthread_mutex_trylock(): init libthr if needed
Reported by: yuri Reviewed by: markj, olce Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D4
pthread_mutex_trylock(): init libthr if needed
Reported by: yuri Reviewed by: markj, olce Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D48454
show more ...
|
| #
f8bbbce4
|
| 06-Mar-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
libthr: remove explicit sys/cdefs.h includes
Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
| #
1d386b48
|
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
| #
0a5c29a6
|
| 22-Jul-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
thr_mutex.c: style
Reindend and re-fill the statement.
Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D41150
|
| #
b370ef15
|
| 07-Jul-2023 |
Greg Becker <becker.greg@att.net> |
libthr: Patch to reduce latency to acquire+release a pthread mutex.
The acquisition and release of an uncontended default/normal pthread mutex on FreeBSD is suprisingly slow, e.g., pthread wrlocks a
libthr: Patch to reduce latency to acquire+release a pthread mutex.
The acquisition and release of an uncontended default/normal pthread mutex on FreeBSD is suprisingly slow, e.g., pthread wrlocks and binary semaphores both exhibit roughly 33% lower latency, while default/normal mutexes on Linux exhibit roughly 67% lower latency than FreeBSD. This is likely explained by the fact that AFAICT in the best case to acquire an uncontended mutex on Linux one need touch only 1 page and read+modify only 1 cacheline, whereas on FreeBSD we need to touch at least 4 pages, read 6 cachelines, and modify at least 4 cachelines.
This patch does not address the pthread mutex architecture. Instead, it improves performance by adding the __always_inline attribute to mutex_lock_common() and mutex_unlock_common() to encourage constant folding and propagation, thereby lowering the latency to acquire and release a mutex due to a shorter code path with fewer compares, jumps, and mispredicts.
With this patch on a stock build I see a reduction in latency of roughly 7% for default/normal mutexes, and 17% for robust mutexes. When built without PTHREADS_ASSERTIONS enabled I see a reduction in latency of roughly 15% and 26%, respectively. Suprisingly, I see similar reductions in latency for heavily contended mutexes.
By default, this patch increases the size of libthr.so.3 by 2448 bytes, but when built without PTHREAD_ASSERTIONS enabled it only increases by 448 bytes.
Reviewed by: jhb (previous version), kib MFC after: 1 week Differential revision: https://reviews.freebsd.org/D40912
show more ...
|
| #
642cd511
|
| 07-Jul-2023 |
Greg Becker <becker.greg@att.net> |
libthr: Add src.conf variable WITHOUT_PTHREADS_ASSERTIONS
This patch fixes a bug which prevents building libthr without _PTHREADS_INVARIANTS defined. The default remains to build libthr with -D_PTHR
libthr: Add src.conf variable WITHOUT_PTHREADS_ASSERTIONS
This patch fixes a bug which prevents building libthr without _PTHREADS_INVARIANTS defined. The default remains to build libthr with -D_PTHREADS_INVARIANTS. However, with this patch, if one builds libthr with WITHOUT_PTHREADS_ASSERTIONS=true then the latency to acquire+release a default pthread mutex is reduced by roughly 5%, and a robust mutex by roughly 18% (as measured by a simple synthetic test on a Xeon E5-2697a based machine).
Reviewed by: jhb, kib, mjg MFC after: 1 week Differential revision: https://reviews.freebsd.org/D40900
show more ...
|
| #
c7904405
|
| 07-Apr-2022 |
Andrew Turner <andrew@FreeBSD.org> |
Remove PAGE_SIZE from libthr
In libthr we use PAGE_SIZE when allocating memory with mmap and to check various structs will fit into a single page so we can use this allocator for them.
Ask the kern
Remove PAGE_SIZE from libthr
In libthr we use PAGE_SIZE when allocating memory with mmap and to check various structs will fit into a single page so we can use this allocator for them.
Ask the kernel for the page size on init for use by the page allcator and add a new machine dependent macro to hold the smallest page size the architecture supports to check the structure is small enough.
This allows us to use the same libthr on arm64 with either 4k or 16k pages.
Reviewed by: kib, markj, imp Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34984
show more ...
|