History log of /linux/tools/perf/tests/shell/coresight/asm_pure_loop.sh (Results 1 – 25 of 103)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# f4f346c3 01-Aug-2025 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'perf-tools-for-v6.17-2025-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
"Build-ID processing goodies:

Build-IDs

Merge tag 'perf-tools-for-v6.17-2025-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
"Build-ID processing goodies:

Build-IDs are content based hashes to link regions of memory to ELF
files in post processing. They have been available in distros for
quite a while:

$ file /bin/bash
/bin/bash: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV),
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2,
BuildID[sha1]=707a1c670cd72f8e55ffedfbe94ea98901b7ce3a,
for GNU/Linux 3.2.0, stripped

It is possible to ask the kernel to get it from mmap executable
backing storage at time they are being put in place and send it as
metadata at that moment to have in perf.data.

Prefer that across the board to speed up 'record' time - it post
processes the samples to find binaries touched by any samples and
to save them with build-ID. It can skip reading build-ID in
userspace if it comes from the kernel.

perf record:

* Make --buildid-mmap default. The kernel can generate MMAP2 events
with a build-ID from ELF header. Use that by default instead of using
inode and device ID to identify binaries. It also can be disabled
with --no-buildid-mmap.

* Use BPF for -u/--uid option to sample processes belong to a user.
BPF can track user processes more accurately and the existing logic
often fails to get the list of processes due to race with reading the
/proc filesystem.

* Generate PERF_RECORD_BPF_METADATA when it profiles BPF programs and
they have variables starting with "bpf_metadata_". This will help to
identify BPF objects used in the profile. This has been supported in
bpftool for some time and allows the recording of metadata such as
commit hashes, versions, etc, that now gets recorded in perf.data as
well.

* Collect list of DSOs touched in the sample callchains as well as in
the sample itself. This would increase the processing time at the end
of record, but can improve the data quality.

perf stat:

* Add a new 'drm' pseudo-PMU support like in 'hwmon'. It can collect
DRM usage stats using fdinfo in /proc.

On my Intel laptop, it shows like below:

$ perf list drm
...

drm:
drm-active-stolen-system0
[Total memory active in one or more engines. Unit: drm_i915]
drm-active-system0
[Total memory active in one or more engines. Unit: drm_i915]
drm-engine-capacity-video
[Engine capacity. Unit: drm_i915]
drm-engine-copy
[Utilization in ns. Unit: drm_i915]
drm-engine-render
[Utilization in ns. Unit: drm_i915]
drm-engine-video
[Utilization in ns. Unit: drm_i915]
...

$ sudo perf stat -a -e drm-engine-render,drm-engine-video,drm-engine-capacity-video sleep 1

Performance counter stats for 'system wide':

48,137,316,988,873 ns drm-engine-render
34,452,696,746 ns drm-engine-video
20 capacity drm-engine-capacity-video

1.002086194 seconds time elapsed

perf list

* Add description for software events. The description is in JSON format
and the event parser now can handle the software events like others
(for example, it's case-insensitive and subject to wildcard matching).

$ perf list software

List of pre-defined events (to be used in -e or -M):

software:
alignment-faults
[Number of kernel handled memory alignment faults. Unit: software]
bpf-output
[An event used by BPF programs to write to the perf ring buffer. Unit: software]
cgroup-switches
[Number of context switches to a task in a different cgroup. Unit: software]
context-switches
[Number of context switches [This event is an alias of cs]. Unit: software]
cpu-clock
[Per-CPU high-resolution timer based event. Unit: software]
cpu-migrations
[Number of times a process has migrated to a new CPU [This event is an alias of migrations]. Unit: software]
cs
[Number of context switches [This event is an alias of context-switches]. Unit: software]
dummy
[A placeholder event that doesn't count anything. Unit: software]
emulation-faults
[Number of kernel handled unimplemented instruction faults handled through emulation. Unit: software]
faults
[Number of page faults [This event is an alias of page-faults]. Unit: software]
major-faults
[Number of major page faults. Major faults require I/O to handle. Unit: software]
migrations
[Number of times a process has migrated to a new CPU [This event is an alias of cpu-migrations]. Unit: software]
minor-faults
[Number of minor page faults. Minor faults don't require I/O to handle. Unit: software]
page-faults
[Number of page faults [This event is an alias of faults]. Unit: software]
task-clock
[Per-task high-resolution timer based event. Unit: software]

perf ftrace:

* Add -e/--events option to perf ftrace latency to measure latency
between the two events instead of a function.

$ sudo perf ftrace latency -ab -e i915_request_wait_begin,i915_request_wait_end --hide-empty -- sleep 1
# DURATION | COUNT | GRAPH |
256 - 512 us | 4 | ###### |
2 - 4 ms | 2 | ### |
4 - 8 ms | 12 | ################### |
8 - 16 ms | 10 | ################ |

# statistics (in usec)
total time: 194915
avg time: 6961
max time: 12855
min time: 373
count: 28

* Add new function graph tracer options (--graph-opts) to display more
info like arguments and return value. They will be passed to the
kernel ftrace directly.

$ sudo perf ftrace -G vfs_write --graph-opts retval,retaddr
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
...
5) | mutex_unlock() { /* <-rb_simple_write+0xda/0x150 */
5) 0.188 us | local_clock(); /* <-lock_release+0x2ad/0x440 ret=0x3bf2a3cf90e */
5) | rt_mutex_slowunlock() { /* <-rb_simple_write+0xda/0x150 */
5) | _raw_spin_lock_irqsave() { /* <-rt_mutex_slowunlock+0x4f/0x200 */
5) 0.123 us | preempt_count_add(); /* <-_raw_spin_lock_irqsave+0x23/0x90 ret=0x0 */
5) 0.128 us | local_clock(); /* <-__lock_acquire.isra.0+0x17a/0x740 ret=0x3bf2a3cfc8b */
5) 0.086 us | do_raw_spin_trylock(); /* <-_raw_spin_lock_irqsave+0x4a/0x90 ret=0x1 */
5) 0.845 us | } /* _raw_spin_lock_irqsave ret=0x292 */
...

Misc:

* Add perf archive --exclude-buildids <FILE> option to skip some binaries.
The format of the FILE should be same as an output of perf buildid-list.

* Get rid of dependency of libcrypto. It was just to get SHA-1 hash so
implement it directly like in the kernel. A side effect is that it
needs -fno-strict-aliasing compiler option (again, like in the kernel).

* Convert all shell script tests to use bash"

* tag 'perf-tools-for-v6.17-2025-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits)
perf record: Cache build-ID of hit DSOs only
perf test: Ensure lock contention using pipe mode
perf python: Stop using deprecated PyUnicode_AsString()
perf list: Skip ABI PMUs when printing pmu values
perf list: Remove tracepoint printing code
perf tp_pmu: Add event APIs
perf tp_pmu: Factor existing tracepoint logic to new file
perf parse-events: Remove non-json software events
perf jevents: Add common software event json
perf tools: Remove libtraceevent in .gitignore
perf test: Fix comment ordering
perf sort: Use perf_env to set arch sort keys and header
perf test: Move PERF_SAMPLE_WEIGHT_STRUCT parsing to common test
perf sample: Remove arch notion of sample parsing
perf env: Remove global perf_env
perf trace: Avoid global perf_env with evsel__env
perf auxtrace: Pass perf_env from session through to mmap read
perf machine: Explicitly pass in host perf_env
perf bench synthesize: Avoid use of global perf_env
perf top: Make perf_env locally scoped
...

show more ...


Revision tags: v6.16, v6.16-rc7, v6.16-rc6, v6.16-rc5, v6.16-rc4
# 2f5d370d 23-Jun-2025 James Clark <james.clark@linaro.org>

perf test: Change all remaining #!/bin/sh to #!/bin/bash

There are 43 instances of posix shell tests and 35 instances of bash. To
give us a single consistent language for testing in, replace
all #!/

perf test: Change all remaining #!/bin/sh to #!/bin/bash

There are 43 instances of posix shell tests and 35 instances of bash. To
give us a single consistent language for testing in, replace
all #!/bin/sh to #!/bin/bash. Common sources that are included in both
different shells will now work as expected. And we no longer have to fix
up bashisms that appear to work when someone's system has sh symlinked
to bash, but don't work on other systems that have both shells
installed.

Although we could have chosen sh, it's not backwards compatible so it
wouldn't be possible to bulk convert without re-writing the existing
bash tests.

Choosing bash also gives us some nicer features including 'local'
variable definitions and regexes in if statements that are already
widely used in the tests.

It's not expected that there are any users with only sh available due to
the large number of bash tests that exist.

Discussed in relation to running shellcheck here:
https://lore.kernel.org/linux-perf-users/e3751a74be34bbf3781c4644f518702a7270220b.1749785642.git.collin.funk1@gmail.com/

Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250623-james-perf-bash-tests-v1-1-f572f54d4559@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

show more ...


Revision tags: v6.16-rc3, v6.16-rc2, v6.16-rc1, v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2
# c771600c 05-Feb-2025 Tvrtko Ursulin <tursulin@ursulin.net>

Merge drm/drm-next into drm-intel-gt-next

We need
4ba4f1afb6a9 ("perf: Generic hotplug support for a PMU with a scope")
in order to land a i915 PMU simplification and a fix. That landed in 6.12
and

Merge drm/drm-next into drm-intel-gt-next

We need
4ba4f1afb6a9 ("perf: Generic hotplug support for a PMU with a scope")
in order to land a i915 PMU simplification and a fix. That landed in 6.12
and we are stuck at 6.9 so lets bump things forward.

Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

show more ...


Revision tags: v6.14-rc1
# 25768de5 21-Jan-2025 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.14 merge window.


# 670af65d 20-Jan-2025 Jiri Kosina <jkosina@suse.com>

Merge branch 'for-6.14/constify-bin-attribute' into for-linus

- constification of 'struct bin_attribute' in various HID driver (Thomas Weißschuh)


Revision tags: v6.13, v6.13-rc7, v6.13-rc6
# c5fb51b7 03-Jan-2025 Rob Clark <robdclark@chromium.org>

Merge remote-tracking branch 'pm/opp/linux-next' into HEAD

Merge pm/opp tree to get dev_pm_opp_get_bw()

Signed-off-by: Rob Clark <robdclark@chromium.org>


Revision tags: v6.13-rc5, v6.13-rc4
# 60675d4c 20-Dec-2024 Ingo Molnar <mingo@kernel.org>

Merge branch 'linus' into x86/mm, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 6d4a0f4e 17-Dec-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v6.13-rc3' into next

Sync up with the mainline.


Revision tags: v6.13-rc3
# e7f0a3a6 11-Dec-2024 Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge drm/drm-next into drm-intel-next

Catching up with 6.13-rc2.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


Revision tags: v6.13-rc2
# c34e9ab9 05-Dec-2024 Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-fix-v6.13-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.13

A few small fixes for v6.13, all system specific - the biggest t

Merge tag 'asoc-fix-v6.13-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.13

A few small fixes for v6.13, all system specific - the biggest thing is
the fix for jack handling over suspend on some Intel laptops.

show more ...


# 8f109f28 02-Dec-2024 Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge drm/drm-next into drm-xe-next

A backmerge to get the PMT preparation work for
merging the BMG PMT support.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 3aba2eba 02-Dec-2024 Maxime Ripard <mripard@kernel.org>

Merge drm/drm-next into drm-misc-next

Kickstart 6.14 cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>


# bcfd5f64 02-Dec-2024 Ingo Molnar <mingo@kernel.org>

Merge tag 'v6.13-rc1' into perf/core, to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>


Revision tags: v6.13-rc1
# b50ecc5a 26-Nov-2024 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
"perf record:

- Enable leader sampling fo

Merge tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
"perf record:

- Enable leader sampling for inherited task events. It was supported
only for system-wide events but the kernel started to support such
a setup since v6.12.

This is to reduce the number of PMU interrupts. The samples of the
leader event will contain counts of other events and no samples
will be generated for the other member events.

$ perf record -e '{cycles,instructions}:S' ${MYPROG}

perf report:

- Fix --branch-history option to display more branch-related
information like prediction, abort and cycles which is available
on Intel machines.

$ perf record -bg -- perf test -w brstack

$ perf report --branch-history
...
#
# Overhead Source:Line Symbol Shared Object Predicted Abort Cycles IPC [IPC Coverage]
# ........ ........................ .............. .................... ......... ..... ...... ....................
#
8.17% copy_page_64.S:19 [k] copy_page [kernel.kallsyms] 50.0% 0 5 - -
|
---xas_load xarray.h:171
|
|--5.68%--xas_load xarray.c:245 (cycles:1)
| xas_load xarray.c:242
| xas_load xarray.h:1260 (cycles:1)
| xas_descend xarray.c:146
| xas_load xarray.c:244 (cycles:2)
| xas_load xarray.c:245
| xas_descend xarray.c:218 (cycles:10)
...

perf stat:

- Add HWMON PMU support.

The HWMON provides various system information like CPU/GPU
temperature, fan speed and so on. Expose them as PMU events so that
users can see the values using perf stat commands.

$ perf stat -e temp_cpu,fan1 true

Performance counter stats for 'true':

60.00 'C temp_cpu
0 rpm fan1

0.000745382 seconds time elapsed

0.000883000 seconds user
0.000000000 seconds sys

- Display metric threshold in JSON output.

Some metrics define thresholds to classify value ranges. It used to
be in a different color but it won't work for JSON.

Add "metric-threshold" field to the JSON that can be one of "good",
"less good", "nearly bad" and "bad".

# perf stat -a -M TopdownL1 -j true
{"counter-value" : "18693525.000000", "unit" : "", "event" : "TOPDOWN.SLOTS", "event-runtime" : 5552708, "pcnt-running" : 100.00, "metric-value" : "43.226002", "metric-unit" : "% tma_backend_bound", "metric-threshold" : "bad"}
{"metric-value" : "29.212267", "metric-unit" : "% tma_frontend_bound", "metric-threshold" : "bad"}
{"metric-value" : "7.138972", "metric-unit" : "% tma_bad_speculation", "metric-threshold" : "good"}
{"metric-value" : "20.422759", "metric-unit" : "% tma_retiring", "metric-threshold" : "good"}
{"counter-value" : "3817732.000000", "unit" : "", "event" : "topdown-retiring", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
{"counter-value" : "5472824.000000", "unit" : "", "event" : "topdown-fe-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
{"counter-value" : "7984780.000000", "unit" : "", "event" : "topdown-be-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
{"counter-value" : "1418181.000000", "unit" : "", "event" : "topdown-bad-spec", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
...

perf sched:

- Add -P/--pre-migrations option for 'timehist' sub-command to track
time a task waited on a run-queue before migrating to a different
CPU.

$ perf sched timehist -P
time cpu task name wait time sch delay run time pre-mig time
[tid/pid] (msec) (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- --------- ---------
585940.535527 [0000] perf[584885] 0.000 0.000 0.000 0.000
585940.535535 [0000] migration/0[20] 0.000 0.002 0.008 0.000
585940.535559 [0001] perf[584885] 0.000 0.000 0.000 0.000
585940.535563 [0001] migration/1[25] 0.000 0.001 0.004 0.000
585940.535678 [0002] perf[584885] 0.000 0.000 0.000 0.000
585940.535686 [0002] migration/2[31] 0.000 0.002 0.008 0.000
585940.535905 [0001] <idle> 0.000 0.000 0.342 0.000
585940.535938 [0003] perf[584885] 0.000 0.000 0.000 0.000
585940.537048 [0001] sleep[584886] 0.000 0.019 1.142 0.001
585940.537749 [0002] <idle> 0.000 0.000 2.062 0.000
...

Build:

- Make libunwind opt-in (LIBUNWIND=1) rather than opt-out.

The perf tools are generally built with libelf and libdw which has
unwinder functionality. The libunwind support predates it and no
need to have duplicate unwinders by default.

- Rename NO_DWARF=1 build option to NO_LIBDW=1 in order to clarify
it's using libdw for handling DWARF information.

Internals:

- Do not set exclude_guest bit in the perf_event_attr by default.

This was causing a trouble in AMD IBS PMU as it doesn't support the
bit. The bit will be set when it's needed later by the fallback
logic. Also update the missing feature detection logic to make sure
not clear supported bits unnecessarily.

- Run perf test in parallel by default and mark flaky tests
"exclusive" to run them serially at the end. Some test numbers are
changed but the test can complete in less than half the time.

JSON vendor events:

- Add AMD Zen 5 events and metrics.

- Add i.MX91 and i.MX95 DDR metrics

- Fix HiSilicon HIP08 Topdown metric name.

- Support compat events on PowerPC"

* tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (232 commits)
perf tests: Fix hwmon parsing with PMU name test
perf hwmon_pmu: Ensure hwmon key union is zeroed before use
perf tests hwmon_pmu: Remove double evlist__delete()
perf/test: fix perf ftrace test on s390
perf bpf-filter: Return -ENOMEM directly when pfi allocation fails
perf test: Correct hwmon test PMU detection
perf: Remove unused del_perf_probe_events()
perf pmu: Move pmu_metrics_table__find and remove ARM override
perf jevents: Add map_for_cpu()
perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
perf header: Avoid transitive PMU includes
perf arm64 header: Use cpu argument in get_cpuid
perf header: Refactor get_cpuid to take a CPU for ARM
perf header: Move is_cpu_online to numa bench
perf jevents: fix breakage when do perf stat on system metric
perf test: Add missing __exit calls in tool/hwmon tests
perf tests: Make leader sampling test work without branch event
perf util: Remove kernel version deadcode
perf test shell trace_exit_race: Use --no-comm to avoid cases where COMM isn't resolved
perf test shell trace_exit_race: Show what went wrong in verbose mode
...

show more ...


Revision tags: v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5
# 2532be3d 25-Oct-2024 Ian Rogers <irogers@google.com>

perf test: Tag parallel failing shell tests with "(exclusive)"

Some shell tests compete for resources and so can't run with other
tests, tag such tests. The "(exclusive)" stems from shared/exclusiv

perf test: Tag parallel failing shell tests with "(exclusive)"

Some shell tests compete for resources and so can't run with other
tests, tag such tests. The "(exclusive)" stems from shared/exclusive
to describe how the tests run as if holding a lock.

For ARM/coresight tests:
Suggested-by: James Clark <james.clark@linaro.org>

Additional failing tests:
Suggested-by: Namhyung Kim <namhyung@kernel.org>

Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Link: https://lore.kernel.org/r/20241025192109.132482-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

show more ...


Revision tags: v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1
# a23e1966 15-Jul-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.11 merge window.


Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2
# 6f47c7ae 28-May-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v6.9' into next

Sync up with the mainline to bring in the new cleanup API.


Revision tags: v6.10-rc1
# 60a2f25d 16-May-2024 Tvrtko Ursulin <tursulin@ursulin.net>

Merge drm/drm-next into drm-intel-gt-next

Some display refactoring patches are needed in order to allow conflict-
less merging.

Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>


Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5
# 03c11eb3 14-Feb-2024 Ingo Molnar <mingo@kernel.org>

Merge tag 'v6.8-rc4' into x86/percpu, to resolve conflicts and refresh the branch

Conflicts:
arch/x86/include/asm/percpu.h
arch/x86/include/asm/text-patching.h

Signed-off-by: Ingo Molnar <mingo@k

Merge tag 'v6.8-rc4' into x86/percpu, to resolve conflicts and refresh the branch

Conflicts:
arch/x86/include/asm/percpu.h
arch/x86/include/asm/text-patching.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


Revision tags: v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1
# 0ea5c948 15-Jan-2024 Jani Nikula <jani.nikula@intel.com>

Merge drm/drm-next into drm-intel-next

Backmerge to bring Xe driver to drm-intel-next.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 6b93f350 08-Jan-2024 Jiri Kosina <jkosina@suse.com>

Merge branch 'for-6.8/amd-sfh' into for-linus

- addition of new interfaces to export User presence information and
Ambient light from amd-sfh to other drivers within the kernel (Basavaraj
Natika

Merge branch 'for-6.8/amd-sfh' into for-linus

- addition of new interfaces to export User presence information and
Ambient light from amd-sfh to other drivers within the kernel (Basavaraj
Natikar)

show more ...


Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2
# 3bf3e21c 15-Nov-2023 Maxime Ripard <mripard@kernel.org>

Merge drm/drm-next into drm-misc-next

Let's kickstart the v6.8 release cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>


# 5d2d4a9f 15-Nov-2023 Peter Zijlstra <peterz@infradead.org>

Merge branch 'tip/perf/urgent'

Avoid conflicts, base on fixes.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>


Revision tags: v6.7-rc1
# 7ab89417 03-Nov-2023 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
"Build:

- Compile BPF programs by defaul

Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
"Build:

- Compile BPF programs by default if clang (>= 12.0.1) is available
to enable more features like kernel lock contention, off-cpu
profiling, kwork, sample filtering and so on.

This can be disabled by passing BUILD_BPF_SKEL=0 to make.

- Produce better error messages for bison on debug build (make
DEBUG=1) by defining YYDEBUG symbol internally.

perf record:

- Track sideband events (like FORK/MMAP) from all CPUs even if perf
record targets a subset of CPUs only (using -C option). Otherwise
it may lose some information happened on a CPU out of the target
list.

- Fix checking raw sched_switch tracepoint argument using system BTF.
This affects off-cpu profiling which attaches a BPF program to the
raw tracepoint.

perf lock contention:

- Add --lock-cgroup option to see contention by cgroups. This should
be used with BPF only (using -b option).

$ sudo perf lock con -ab --lock-cgroup -- sleep 1
contended total wait max wait avg wait cgroup

835 14.06 ms 41.19 us 16.83 us /system.slice/led.service
25 122.38 us 13.77 us 4.89 us /
44 23.73 us 3.87 us 539 ns /user.slice/user-657345.slice/session-c4.scope
1 491 ns 491 ns 491 ns /system.slice/connectd.service

- Add -G/--cgroup-filter option to see contention only for given
cgroups.

This can be useful when you identified a cgroup in the above
command and want to investigate more on it. It also works with
other output options like -t/--threads and -l/--lock-addr.

$ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1
contended total wait max wait avg wait type caller

8 77.11 us 17.98 us 9.64 us spinlock futex_wake+0xc8
2 24.56 us 14.66 us 12.28 us spinlock tick_do_update_jiffies64+0x25
1 4.97 us 4.97 us 4.97 us spinlock futex_q_lock+0x2a

- Use per-cpu array for better spinlock tracking. This is to improve
performance of the BPF program and to avoid nested contention on a
lock in the BPF hash map.

- Update callstack check for PowerPC. To find a representative caller
of a lock, it needs to look up the call stacks. It ends the lookup
when it sees 0 in the call stack buffer. However, PowerPC call
stacks can have 0 values in the beginning so skip them when it
expects valid call stacks after.

perf kwork:

- Support 'sched' class (for -k option) so that it can see task
scheduling event (using sched_switch tracepoint) as well as irq and
workqueue items.

- Add perf kwork top subcommand to show more accurate cpu utilization
with sched class above. It works both with a recorded data (using
perf kwork record command) and BPF (using -b option). Unlike perf
top command, it does not support interactive mode (yet).

$ sudo perf kwork top -b -k sched
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 160702.425 ms, 8 cpus
%Cpu(s): 36.00% id, 0.00% hi, 0.00% si
%Cpu0 [|||||||||||||||||| 61.66%]
%Cpu1 [|||||||||||||||||| 61.27%]
%Cpu2 [||||||||||||||||||| 66.40%]
%Cpu3 [|||||||||||||||||| 61.28%]
%Cpu4 [|||||||||||||||||| 61.82%]
%Cpu5 [||||||||||||||||||||||| 77.41%]
%Cpu6 [|||||||||||||||||| 61.73%]
%Cpu7 [|||||||||||||||||| 63.25%]

PID SPID %CPU RUNTIME COMMMAND
-------------------------------------------------------------
0 0 38.72 8089.463 ms [swapper/1]
0 0 38.71 8084.547 ms [swapper/3]
0 0 38.33 8007.532 ms [swapper/0]
0 0 38.26 7992.985 ms [swapper/6]
0 0 38.17 7971.865 ms [swapper/4]
0 0 36.74 7447.765 ms [swapper/7]
0 0 33.59 6486.942 ms [swapper/2]
0 0 22.58 3771.268 ms [swapper/5]
9545 9351 2.48 447.136 ms sched-messaging
9574 9351 2.09 418.583 ms sched-messaging
9724 9351 2.05 372.407 ms sched-messaging
9531 9351 2.01 368.804 ms sched-messaging
9512 9351 2.00 362.250 ms sched-messaging
9514 9351 1.95 357.767 ms sched-messaging
9538 9351 1.86 384.476 ms sched-messaging
9712 9351 1.84 386.490 ms sched-messaging
9723 9351 1.83 380.021 ms sched-messaging
9722 9351 1.82 382.738 ms sched-messaging
9517 9351 1.81 354.794 ms sched-messaging
9559 9351 1.79 344.305 ms sched-messaging
9725 9351 1.77 365.315 ms sched-messaging
<SNIP>

- Add hard/soft-irq statistics to perf kwork top. This will show the
total CPU utilization with IRQ stats like below:

$ sudo perf kwork top -b -k sched,irq,softirq
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 12554.889 ms, 8 cpus
%Cpu(s): 96.23% id, 0.10% hi, 0.19% si <---- here
%Cpu0 [| 4.60%]
%Cpu1 [| 4.59%]
%Cpu2 [ 2.73%]
%Cpu3 [| 3.81%]
<SNIP>

perf bench:

- Add -G/--cgroups option to perf bench sched pipe. The pipe bench is
good to measure context switch overhead. With this option, it puts
the reader and writer tasks in separate cgroups to enforce context
switch between two different cgroups.

Also it needs to set CPU affinity of the tasks in a CPU to
accurately measure the impact of cgroup context switches.

$ sudo perf stat -e context-switches,cgroup-switches -- \
> taskset -c 0 perf bench sched pipe -l 100000
# Running 'sched/pipe' benchmark:
# Executed 100000 pipe operations between two processes

Total time: 0.307 [sec]

3.078180 usecs/op
324867 ops/sec

Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000':

200,026 context-switches
63 cgroup-switches

0.321637922 seconds time elapsed

You can see small number of cgroup-switches because both write and
read tasks are in the same cgroup.

$ sudo mkdir /sys/fs/cgroup/{AAA,BBB}

$ sudo perf stat -e context-switches,cgroup-switches -- \
> taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB
# Running 'sched/pipe' benchmark:
# Executed 100000 pipe operations between two processes

Total time: 0.351 [sec]

3.512990 usecs/op
284657 ops/sec

Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB':

200,020 context-switches
200,019 cgroup-switches

0.365034567 seconds time elapsed

Now context-switches and cgroup-switches are almost same. And you
can see the pipe operation took little more.

- Kill child processes when perf bench sched messaging exited
abnormally. Otherwise it'd leave the child doing unnecessary work.

perf test:

- Fix various shellcheck issues on the tests written in shell script.

- Skip tests when condition is not satisfied:
- object code reading test for non-text section addresses.
- CoreSight test if cs_etm// event is not available.
- lock contention test if not enough CPUs.

Event parsing:

- Make PMU alias name loading lazy to reduce the startup time in the
event parsing code for perf record, stat and others in the general
case.

- Lazily compute PMU default config. In the same sense, delay PMU
initialization until it's really needed to reduce the startup cost.

- Fix event term values that are raw events. The event specification
can have several terms including event name. But sometimes it
clashes with raw event encoding which starts with 'r' and has
hex-digits.

For example, an event named 'read' should be processed as a normal
event but it was mis-treated as a raw encoding and caused a
failure.

$ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1
event syntax error: '..nning/event=read/'
\___ parser error
Run 'perf list' for a list of valid events

Usage: perf stat [<options>] [<command>]

-e, --event <event> event selector. use 'perf list' to list available events

Event metrics:

- Add "Compat" regex to match event with multiple identifiers.

- Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne.

Misc:

- Assorted memory leak fixes and footprint reduction.

- Add "bpf_skeletons" to perf version --build-options so that users
can check whether their perf tools have BPF support easily.

- Fix unaligned access in Intel-PT packet decoder found by
undefined-behavior sanitizer.

- Avoid frequency mode for the dummy event. Surprisingly it'd impact
kernel timer tick handler performance by force iterating all PMU
events.

- Update bash shell completion for events and metrics"

* tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (187 commits)
perf vendor events intel: Update tsx_cycles_per_elision metrics
perf vendor events intel: Update bonnell version number to v5
perf vendor events intel: Update westmereex events to v4
perf vendor events intel: Update meteorlake events to v1.06
perf vendor events intel: Update knightslanding events to v16
perf vendor events intel: Add typo fix for ivybridge FP
perf vendor events intel: Update a spelling in haswell/haswellx
perf vendor events intel: Update emeraldrapids to v1.01
perf vendor events intel: Update alderlake/alderlake events to v1.23
perf build: Disable BPF skeletons if clang version is < 12.0.1
perf callchain: Fix spelling mistake "statisitcs" -> "statistics"
perf report: Fix spelling mistake "heirachy" -> "hierarchy"
perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit()
perf tests: test_arm_coresight: Simplify source iteration
perf vendor events intel: Add tigerlake two metrics
perf vendor events intel: Add broadwellde two metrics
perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric
perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit
perf callchain: Minor layout changes to callchain_list
perf callchain: Make brtype_stat in callchain_list optional
...

show more ...


# 20cd569d 31-Oct-2023 Jiri Kosina <jkosina@suse.cz>

Merge branch 'for-6.7/config_pm' into for-linus

- #ifdef CONFIG_PM removal from HID code (Thomas Weißschuh)


12345