|
Revision tags: v6.18-rc3, v6.18-rc2, v6.18-rc1 |
|
| #
8b87f67b
|
| 08-Oct-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.18 merge window.
|
| #
d325efac
|
| 30-Sep-2025 |
Benjamin Tissoires <bentiss@kernel.org> |
Merge branch 'for-6.18/core' into for-linus
- allow HID-BPF to rebind a driver to hid-multitouch (Benjamin Tissoires) - Change hid_driver to use a const char* for .name (Rahul Rameshbabu)
|
|
Revision tags: v6.17, v6.17-rc7 |
|
| #
71b28769
|
| 19-Sep-2025 |
Jiri Kosina <jkosina@suse.com> |
Merge remote-tracking branch 'origin' into for-6.18/intel-thc-hid
Needed as a basisi for followup support for quicki2c advanced BIOS features.
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
| #
b4d90dbc
|
| 15-Sep-2025 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next-fixes
Backmerging to drm-misc-next-fixes to get features and fixes from v6.17-rc6.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
Revision tags: v6.17-rc6 |
|
| #
702fdf35
|
| 10-Sep-2025 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Catching up with some display dependencies.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Revision tags: v6.17-rc5, v6.17-rc4, v6.17-rc3 |
|
| #
4b051897
|
| 21-Aug-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.17-rc2' into HEAD
Sync up with mainline to bring in changes to include/linux/sprintf.h
|
|
Revision tags: v6.17-rc2 |
|
| #
ca994e89
|
| 12-Aug-2025 |
Lucas De Marchi <lucas.demarchi@intel.com> |
Merge drm/drm-next into drm-xe-next
Bring v6.17-rc1 to propagate commits from other subsystems, particularly PCI, which has some new functions needed for SR-IOV integration.
Signed-off-by: Lucas De
Merge drm/drm-next into drm-xe-next
Bring v6.17-rc1 to propagate commits from other subsystems, particularly PCI, which has some new functions needed for SR-IOV integration.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
show more ...
|
| #
8d2b0853
|
| 11-Aug-2025 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Updating drm-misc-fixes to the state of v6.17-rc1. Begins a new release cycle.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
| #
08c51f5b
|
| 11-Aug-2025 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-n
Updating drm-misc-next to the state of v6.17-rc1. Begins a new release cycle.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
Revision tags: v6.17-rc1 |
|
| #
f4f346c3
|
| 01-Aug-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.17-2025-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim: "Build-ID processing goodies:
Build-IDs
Merge tag 'perf-tools-for-v6.17-2025-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim: "Build-ID processing goodies:
Build-IDs are content based hashes to link regions of memory to ELF files in post processing. They have been available in distros for quite a while:
$ file /bin/bash /bin/bash: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=707a1c670cd72f8e55ffedfbe94ea98901b7ce3a, for GNU/Linux 3.2.0, stripped
It is possible to ask the kernel to get it from mmap executable backing storage at time they are being put in place and send it as metadata at that moment to have in perf.data.
Prefer that across the board to speed up 'record' time - it post processes the samples to find binaries touched by any samples and to save them with build-ID. It can skip reading build-ID in userspace if it comes from the kernel.
perf record:
* Make --buildid-mmap default. The kernel can generate MMAP2 events with a build-ID from ELF header. Use that by default instead of using inode and device ID to identify binaries. It also can be disabled with --no-buildid-mmap.
* Use BPF for -u/--uid option to sample processes belong to a user. BPF can track user processes more accurately and the existing logic often fails to get the list of processes due to race with reading the /proc filesystem.
* Generate PERF_RECORD_BPF_METADATA when it profiles BPF programs and they have variables starting with "bpf_metadata_". This will help to identify BPF objects used in the profile. This has been supported in bpftool for some time and allows the recording of metadata such as commit hashes, versions, etc, that now gets recorded in perf.data as well.
* Collect list of DSOs touched in the sample callchains as well as in the sample itself. This would increase the processing time at the end of record, but can improve the data quality.
perf stat:
* Add a new 'drm' pseudo-PMU support like in 'hwmon'. It can collect DRM usage stats using fdinfo in /proc.
On my Intel laptop, it shows like below:
$ perf list drm ...
drm: drm-active-stolen-system0 [Total memory active in one or more engines. Unit: drm_i915] drm-active-system0 [Total memory active in one or more engines. Unit: drm_i915] drm-engine-capacity-video [Engine capacity. Unit: drm_i915] drm-engine-copy [Utilization in ns. Unit: drm_i915] drm-engine-render [Utilization in ns. Unit: drm_i915] drm-engine-video [Utilization in ns. Unit: drm_i915] ...
$ sudo perf stat -a -e drm-engine-render,drm-engine-video,drm-engine-capacity-video sleep 1
Performance counter stats for 'system wide':
48,137,316,988,873 ns drm-engine-render 34,452,696,746 ns drm-engine-video 20 capacity drm-engine-capacity-video
1.002086194 seconds time elapsed
perf list
* Add description for software events. The description is in JSON format and the event parser now can handle the software events like others (for example, it's case-insensitive and subject to wildcard matching).
$ perf list software
List of pre-defined events (to be used in -e or -M):
software: alignment-faults [Number of kernel handled memory alignment faults. Unit: software] bpf-output [An event used by BPF programs to write to the perf ring buffer. Unit: software] cgroup-switches [Number of context switches to a task in a different cgroup. Unit: software] context-switches [Number of context switches [This event is an alias of cs]. Unit: software] cpu-clock [Per-CPU high-resolution timer based event. Unit: software] cpu-migrations [Number of times a process has migrated to a new CPU [This event is an alias of migrations]. Unit: software] cs [Number of context switches [This event is an alias of context-switches]. Unit: software] dummy [A placeholder event that doesn't count anything. Unit: software] emulation-faults [Number of kernel handled unimplemented instruction faults handled through emulation. Unit: software] faults [Number of page faults [This event is an alias of page-faults]. Unit: software] major-faults [Number of major page faults. Major faults require I/O to handle. Unit: software] migrations [Number of times a process has migrated to a new CPU [This event is an alias of cpu-migrations]. Unit: software] minor-faults [Number of minor page faults. Minor faults don't require I/O to handle. Unit: software] page-faults [Number of page faults [This event is an alias of faults]. Unit: software] task-clock [Per-task high-resolution timer based event. Unit: software]
perf ftrace:
* Add -e/--events option to perf ftrace latency to measure latency between the two events instead of a function.
$ sudo perf ftrace latency -ab -e i915_request_wait_begin,i915_request_wait_end --hide-empty -- sleep 1 # DURATION | COUNT | GRAPH | 256 - 512 us | 4 | ###### | 2 - 4 ms | 2 | ### | 4 - 8 ms | 12 | ################### | 8 - 16 ms | 10 | ################ |
# statistics (in usec) total time: 194915 avg time: 6961 max time: 12855 min time: 373 count: 28
* Add new function graph tracer options (--graph-opts) to display more info like arguments and return value. They will be passed to the kernel ftrace directly.
$ sudo perf ftrace -G vfs_write --graph-opts retval,retaddr # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | ... 5) | mutex_unlock() { /* <-rb_simple_write+0xda/0x150 */ 5) 0.188 us | local_clock(); /* <-lock_release+0x2ad/0x440 ret=0x3bf2a3cf90e */ 5) | rt_mutex_slowunlock() { /* <-rb_simple_write+0xda/0x150 */ 5) | _raw_spin_lock_irqsave() { /* <-rt_mutex_slowunlock+0x4f/0x200 */ 5) 0.123 us | preempt_count_add(); /* <-_raw_spin_lock_irqsave+0x23/0x90 ret=0x0 */ 5) 0.128 us | local_clock(); /* <-__lock_acquire.isra.0+0x17a/0x740 ret=0x3bf2a3cfc8b */ 5) 0.086 us | do_raw_spin_trylock(); /* <-_raw_spin_lock_irqsave+0x4a/0x90 ret=0x1 */ 5) 0.845 us | } /* _raw_spin_lock_irqsave ret=0x292 */ ...
Misc:
* Add perf archive --exclude-buildids <FILE> option to skip some binaries. The format of the FILE should be same as an output of perf buildid-list.
* Get rid of dependency of libcrypto. It was just to get SHA-1 hash so implement it directly like in the kernel. A side effect is that it needs -fno-strict-aliasing compiler option (again, like in the kernel).
* Convert all shell script tests to use bash"
* tag 'perf-tools-for-v6.17-2025-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits) perf record: Cache build-ID of hit DSOs only perf test: Ensure lock contention using pipe mode perf python: Stop using deprecated PyUnicode_AsString() perf list: Skip ABI PMUs when printing pmu values perf list: Remove tracepoint printing code perf tp_pmu: Add event APIs perf tp_pmu: Factor existing tracepoint logic to new file perf parse-events: Remove non-json software events perf jevents: Add common software event json perf tools: Remove libtraceevent in .gitignore perf test: Fix comment ordering perf sort: Use perf_env to set arch sort keys and header perf test: Move PERF_SAMPLE_WEIGHT_STRUCT parsing to common test perf sample: Remove arch notion of sample parsing perf env: Remove global perf_env perf trace: Avoid global perf_env with evsel__env perf auxtrace: Pass perf_env from session through to mmap read perf machine: Explicitly pass in host perf_env perf bench synthesize: Avoid use of global perf_env perf top: Make perf_env locally scoped ...
show more ...
|
|
Revision tags: v6.16, v6.16-rc7, v6.16-rc6 |
|
| #
bcc7693a
|
| 10-Jul-2025 |
Ian Rogers <irogers@google.com> |
perf spark: Fix includes and add SPDX
scnprintf is declared in linux/kernel.h, directly depend upon it. Add missing SPDX comments.
Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.
perf spark: Fix includes and add SPDX
scnprintf is declared in linux/kernel.h, directly depend upon it. Add missing SPDX comments.
Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
show more ...
|
|
Revision tags: v6.16-rc5, v6.16-rc4, v6.16-rc3, v6.16-rc2, v6.16-rc1, v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3 |
|
| #
c95baf12
|
| 20-Feb-2020 |
Zhenyu Wang <zhenyuw@linux.intel.com> |
Merge drm-intel-next-queued into gvt-next
Backmerge to pull in https://patchwork.freedesktop.org/patch/353621/?series=73544&rev=1
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
|
|
Revision tags: v5.6-rc2, v5.6-rc1 |
|
| #
b19efcab
|
| 01-Feb-2020 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 5.6 merge window.
|
|
Revision tags: v5.5, v5.5-rc7, v5.5-rc6 |
|
| #
1bdd3e05
|
| 10-Jan-2020 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v5.5-rc5' into next
Sync up with mainline to get SPI "delay" API changes.
|
| #
22164fbe
|
| 06-Jan-2020 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
Merge drm/drm-next into drm-misc-next
Requested, and we need v5.5-rc1 backported as our current branch is still based on v5.4.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
Revision tags: v5.5-rc5 |
|
| #
28336be5
|
| 30-Dec-2019 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'v5.5-rc4' into locking/kcsan, to resolve conflicts
Conflicts: init/main.c lib/Kconfig.debug
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Revision tags: v5.5-rc4, v5.5-rc3, v5.5-rc2 |
|
| #
023265ed
|
| 11-Dec-2019 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next-queued
Sync up with v5.5-rc1 to get the updated lock_release() API among other things. Fix the conflict reported by Stephen Rothwell [1].
[1] http://lore.kern
Merge drm/drm-next into drm-intel-next-queued
Sync up with v5.5-rc1 to get the updated lock_release() API among other things. Fix the conflict reported by Stephen Rothwell [1].
[1] http://lore.kernel.org/r/20191210093957.5120f717@canb.auug.org.au
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
show more ...
|
|
Revision tags: v5.5-rc1 |
|
| #
942e6f8a
|
| 05-Dec-2019 |
Olof Johansson <olof@lixom.net> |
Merge mainline/master into arm/fixes
This brings in the mainline tree right after armsoc contents was merged this release cycle, so that we can re-run savedefconfig, etc.
Signed-off-by: Olof Johans
Merge mainline/master into arm/fixes
This brings in the mainline tree right after armsoc contents was merged this release cycle, so that we can re-run savedefconfig, etc.
Signed-off-by: Olof Johansson <olof@lixom.net>
show more ...
|
| #
3f59dbca
|
| 26-Nov-2019 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "The main kernel side changes in this cycle were:
- Various Intel
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "The main kernel side changes in this cycle were:
- Various Intel-PT updates and optimizations (Alexander Shishkin)
- Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu)
- Add support for LSM and SELinux checks to control access to the perf syscall (Joel Fernandes)
- Misc other changes, optimizations, fixes and cleanups - see the shortlog for details.
There were numerous tooling changes as well - 254 non-merge commits. Here are the main changes - too many to list in detail:
- Enhancements to core tooling infrastructure, perf.data, libperf, libtraceevent, event parsing, vendor events, Intel PT, callchains, BPF support and instruction decoding.
- There were updates to the following tools:
perf annotate perf diff perf inject perf kvm perf list perf maps perf parse perf probe perf record perf report perf script perf stat perf test perf trace
- And a lot of other changes: please see the shortlog and Git log for more details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits) perf parse: Fix potential memory leak when handling tracepoint errors perf probe: Fix spelling mistake "addrees" -> "address" libtraceevent: Fix memory leakage in copy_filter_type libtraceevent: Fix header installation perf intel-bts: Does not support AUX area sampling perf intel-pt: Add support for decoding AUX area samples perf intel-pt: Add support for recording AUX area samples perf pmu: When using default config, record which bits of config were changed by the user perf auxtrace: Add support for queuing AUX area samples perf session: Add facility to peek at all events perf auxtrace: Add support for dumping AUX area samples perf inject: Cut AUX area samples perf record: Add aux-sample-size config term perf record: Add support for AUX area sampling perf auxtrace: Add support for AUX area sample recording perf auxtrace: Move perf_evsel__find_pmu() perf record: Add a function to test for kernel support for AUX area sampling perf tools: Add kernel AUX area sampling definitions perf/core: Make the mlock accounting simple again perf report: Jump to symbol source view from total cycles view ...
show more ...
|
|
Revision tags: v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4 |
|
| #
39b656ee
|
| 15-Oct-2019 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'perf-core-for-mingo-5.5-20191011' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf tra
Merge tag 'perf-core-for-mingo-5.5-20191011' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf trace:
Arnaldo Carvalho de Melo:
- Reuse the strace-like syscall_arg_fmt->scnprintf() beautification routines (convert integer arguments into strings, like open flags, etc) in tracepoint arguments.
For now the type based scnprintf routines (pid_t, umode_t, etc) and the ones based in well known arg name based ("fd", etc) gets associated with tracepoint args of that type.
A tracepoint only arg, "msr", for the msr:{write,read}_msr gets added as an initial step.
- Introduce syscall_arg_fmt->strtoul() methods to be the reverse operation of ->scnprintf(), i.e. to go from a string to an integer.
- Implement --filter, just like in 'perf record', that affects the tracepoint events specied thus far in the command line, use the ->strtoul() methods to allow strings in tables associated with beautifiers to the integers the in-kernel tracepoint (eBPF later) filters expect, e.g.:
# perf trace --max-events 1 -e sched:*ipi --filter="cpu==1 || cpu==2" 0.000 as/24630 sched:sched_wake_idle_without_ipi(cpu: 1) #
# perf trace --max-events 1 --max-stack=32 -e msr:* --filter="msr==IA32_TSC_DEADLINE" 207.000 cc1/19963 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 5442316760822) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) lapic_next_deadline ([kernel.kallsyms]) clockevents_program_event ([kernel.kallsyms]) hrtimer_interrupt ([kernel.kallsyms]) smp_apic_timer_interrupt ([kernel.kallsyms]) apic_timer_interrupt ([kernel.kallsyms]) [0x6ff66c] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) [0x7047c3] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) [0x707708] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) execute_one_pass (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) [0x4f3d37] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) [0x4f3d49] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) execute_pass_list (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) cgraph_node::expand (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) [0x2625b4] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) symbol_table::finalize_compilation_unit (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) [0x5ae8b9] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) toplev::main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1) [0x26b6a] (/usr/lib/x86_64-linux-gnu/libc-2.29.so) # # perf trace --max-events 8 -e msr:* --filter="msr==IA32_SPEC_CTRL" 0.000 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 0.063 migration/3/25 msr:write_msr(msr: IA32_SPEC_CTRL) 0.217 kworker/u16:1-/4826 msr:write_msr(msr: IA32_SPEC_CTRL) 0.687 rcu_sched/11 msr:write_msr(msr: IA32_SPEC_CTRL) 0.696 :13280/13280 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 0.305 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 0.355 :13274/13274 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 2.743 kworker/u16:0-/6711 msr:write_msr(msr: IA32_SPEC_CTRL) # # perf trace --max-events 8 --cpu 1 -e msr:* --filter="msr!=IA32_SPEC_CTRL && msr!=IA32_TSC_DEADLINE && msr != FS_BASE" 0.000 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 68719479037) 0.096 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 238.925 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 8589936893) 511.010 :0/0 msr:write_msr(msr: 0x830, val: 68719479037) 1005.052 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 1235.131 CPU 0/KVM/3750 msr:write_msr(msr: 0x830, val: 4294969595) 1235.195 CPU 0/KVM/3750 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199023037952) 1235.201 CPU 0/KVM/3750 msr:read_msr(msr: IA32_APICBASE, val: 4276096000) #
- Default to not using libtraceevent and its plugins for beautifying tracepoint arguments, since now we're reusing the strace-like beatufiers. Use --libtraceevent_print (using just --libtrace is unambiguous and can be used as a short hand) to go back to those beautifiers.
This will help in the transition, as can be seen in some of the sched tracepoints that still need some work in the libbeauty based mode:
# trace --no-inherit -e msr:*,*sleep,sched:* sleep 1 0.000 ( ): sched:sched_waking(comm: "trace", pid: 3319 (trace), prio: 120, success: 1) 0.006 ( ): sched:sched_wakeup(comm: "trace", pid: 3319 (trace), prio: 120, success: 1) 0.348 ( ): sched:sched_process_exec(filename: 140212596720100, pid: 3319 (sleep), old_pid: 3319 (sleep)) 0.490 ( ): msr:write_msr(msr: FS_BASE, val: 139631189321088) 0.670 ( ): nanosleep(rqtp: 0x7ffc52c23bc0) ... 0.674 ( ): sched:sched_stat_runtime(comm: "sleep", pid: 3319 (sleep), runtime: 659259, vruntime: 78942418342) 0.675 ( ): sched:sched_switch(prev_comm: "sleep", prev_pid: 3319 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/0", next_prio: 120) 1001.059 ( ): sched:sched_waking(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1) 1001.098 ( ): sched:sched_wakeup(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1) 0.670 (1000.504 ms): ... [continued]: nanosleep()) = 0 1001.456 ( ): sched:sched_process_exit(comm: "sleep", pid: 3319 (sleep), prio: 120) # trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1 # trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1 0.000 ( ): sched:sched_waking(comm=trace pid=3323 prio=120 target_cpu=000) 0.007 ( ): sched:sched_wakeup(comm=trace pid=3323 prio=120 target_cpu=000) 0.382 ( ): sched:sched_process_exec(filename=/usr/bin/sleep pid=3323 old_pid=3323) 0.525 ( ): msr:write_msr(c0000100, value 7f5d508a0580) 0.713 ( ): nanosleep(rqtp: 0x7fff487fb4a0) ... 0.717 ( ): sched:sched_stat_runtime(comm=sleep pid=3323 runtime=617722 [ns] vruntime=78957731636 [ns]) 0.719 ( ): sched:sched_switch(prev_comm=sleep prev_pid=3323 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120) 1001.117 ( ): sched:sched_waking(comm=sleep pid=3323 prio=120 target_cpu=000) 1001.157 ( ): sched:sched_wakeup(comm=sleep pid=3323 prio=120 target_cpu=000) 0.713 (1000.522 ms): ... [continued]: nanosleep()) = 0 1001.538 ( ): sched:sched_process_exit(comm=sleep pid=3323 prio=120) #
- Make -v (verbose) mode be honoured for .perfconfig based trace.add_events, to help in diagnosing problems with building eBPF events (-e source.c).
- When using eBPF syscall payload augmentation do not show strace-like syscalls when all the user specified was some tracepoint event, bringing the behaviour in line with that of when not using eBPF augmentation.
Intel PT:
exported-sql-viewer GUI:
Adrian Hunter:
- Add LookupModel, HBoxLayout, VBoxLayout, global time range calculations so as to add a time chart by CPU.
perf script:
Andi Kleen:
- Allow --time (to specify a time span of interest) with --reltime
perf diff:
Jin Yao:
- Report noise for cycles diff, i.e. a histogram + stddev. (timestamps relative to start).
perf annotate:
Arnaldo Carvalho de Melo:
- Initialize env->cpuid when running in live mode (perf top), as it is used in some of the per arch annotation init routines.
samples bpf:
Björn Töpel:
- Fixup fallout of using tools/perf/perf-sys. from outside tools/perf.
Core:
Ian Rogers:
- Avoid 'sample_reg_masks' being const + weak, as this breaks with some compilers that constant-propagate from the weak symbol.
libperf:
- First part of moving the perf_mmap class from tools/perf to libperf.
- Propagate CFLAGS to libperf from the tools/perf Makefile.
Vendor events:
John Garry:
- Add entry in MAINTAINERS with reviewers for the for perf tool arm64 pmu-events files.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
|
Revision tags: v5.4-rc3, v5.4-rc2, v5.4-rc1 |
|
| #
cebf7d51
|
| 25-Sep-2019 |
Jin Yao <yao.jin@linux.intel.com> |
perf diff: Report noisy for cycles diff
This patch prints the stddev and hist for the cycles diff of program block. It can help us to understand if the cycles is noisy or not.
This patch is inspire
perf diff: Report noisy for cycles diff
This patch prints the stddev and hist for the cycles diff of program block. It can help us to understand if the cycles is noisy or not.
This patch is inspired by Andi Kleen's patch:
https://lwn.net/Articles/600471/
We create new option '--cycles-hist'.
Example:
perf record -b ./div perf record -b ./div perf diff -c cycles
# Baseline [Program Block Range] Cycles Diff Shared Object Symbol # ........ .......................................................... .... ................. ............................ # 46.72% [div.c:40 -> div.c:40] 0 div [.] main 46.72% [div.c:42 -> div.c:44] 0 div [.] main 46.72% [div.c:42 -> div.c:39] 0 div [.] main 20.54% [random_r.c:357 -> random_r.c:394] 1 libc-2.27.so [.] __random_r 20.54% [random_r.c:357 -> random_r.c:380] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:391] 0 libc-2.27.so [.] __random_r 17.04% [random.c:288 -> random.c:291] 0 libc-2.27.so [.] __random 17.04% [random.c:291 -> random.c:291] 0 libc-2.27.so [.] __random 17.04% [random.c:293 -> random.c:293] 0 libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:298 -> random.c:298] 0 libc-2.27.so [.] __random 8.40% [div.c:22 -> div.c:25] 0 div [.] compute_flag 8.40% [div.c:27 -> div.c:28] 0 div [.] compute_flag 5.14% [rand.c:26 -> rand.c:27] 0 libc-2.27.so [.] rand 5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand 2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt 0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax 0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap 0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15 0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 [kernel.kallsyms] [k] native_sched_clock 0.00% [native_write_msr+0 -> native_write_msr+16] -13 [kernel.kallsyms] [k] native_write_msr
When we enable the option '--cycles-hist', the output is
perf diff -c cycles --cycles-hist
# Baseline [Program Block Range] Cycles Diff stddev/Hist Shared Object Symbol # ........ .......................................................... .... ................. ................. ............................ # 46.72% [div.c:40 -> div.c:40] 0 ± 37.8% ▁█▁▁██▁█ div [.] main 46.72% [div.c:42 -> div.c:44] 0 ± 49.4% ▁▁▂█▂▂▂▂ div [.] main 46.72% [div.c:42 -> div.c:39] 0 ± 24.1% ▃█▂▄▁▃▂▁ div [.] main 20.54% [random_r.c:357 -> random_r.c:394] 1 ± 33.5% ▅▂▁█▃▁▂▁ libc-2.27.so [.] __random_r 20.54% [random_r.c:357 -> random_r.c:380] 0 ± 39.4% ▁▁█▁██▅▁ libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:391] 0 ± 41.2% ▁▃▁▂█▄▃▁ libc-2.27.so [.] __random_r 17.04% [random.c:288 -> random.c:291] 0 ± 48.8% ▁▁▁▁███▁ libc-2.27.so [.] __random 17.04% [random.c:291 -> random.c:291] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:293 -> random.c:293] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:298 -> random.c:298] 0 ± 75.6% ▃█▁▁▁▁▁▁ libc-2.27.so [.] __random 8.40% [div.c:22 -> div.c:25] 0 ± 42.1% ▁▃▁▁███▁ div [.] compute_flag 8.40% [div.c:27 -> div.c:28] 0 ± 41.8% ██▁▁▄▁▁▄ div [.] compute_flag 5.14% [rand.c:26 -> rand.c:27] 0 ± 37.8% ▁▁▁████▁ libc-2.27.so [.] rand 5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand 2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt 0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax 0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap 0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15 0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 ± 38.5% ▄█▁ [kernel.kallsyms] [k] native_sched_clock 0.00% [native_write_msr+0 -> native_write_msr+16] -13 ± 47.1% ▁█▇▃▁▁ [kernel.kallsyms] [k] native_write_msr
v8: --- Rebase to perf/core branch
v7: --- 1. v6 got Jiri's ACK. 2. Rebase to latest perf/core branch.
v6: --- 1. Jiri provides better code for using data__hpp_register() in ui_init(). Use this code in v6.
v5: --- 1. Refine the use of data__hpp_register() in ui_init() according to Jiri's suggestion.
v4: --- 1. Rename the new option from '--noisy' to '--cycles-hist' 2. Remove the option '-n'. 3. Only update the spark value and stats when '--cycles-hist' is enabled. 4. Remove the code of printing '..'.
v3: --- 1. Move the histogram to a separate column 2. Move the svals[] out of struct stats
v2: --- Jiri got a compile error,
CC builtin-diff.o builtin-diff.c: In function ‘compute_cycles_diff’: builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value] 712 | labs(pair->block_info->cycles_spark[i] - | ^~~~
Because the result of u64 - u64 is still u64. Now we change the type of cycles_spark[] to s64.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|