History log of /linux/tools/perf/util/record.h (Results 251 – 275 of 285)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# d99c22ea 22-Apr-2020 Stephane Eranian <eranian@google.com>

perf record: Add num-synthesize-threads option

To control degree of parallelism of the synthesize_mmap() code which
is scanning /proc/PID/task/PID/maps and can be time consuming.
Mimic perf top way

perf record: Add num-synthesize-threads option

To control degree of parallelism of the synthesize_mmap() code which
is scanning /proc/PID/task/PID/maps and can be time consuming.
Mimic perf top way of handling the option.
If not specified will default to 1 thread, i.e. default behavior before
this option.

On a desktop computer the processing of /proc/PID/task/PID/maps isn't
slow enough to warrant parallel processing and the thread creation has
some cost - hence the default of 1. On a loaded server with
>100 cores it is possible to see synthesis times in the order of
seconds and in this case having the option is desirable.

As the processing is a synchronization point, it is legitimate to worry if
Amdahl's law will apply to this patch. Profiling with this patch in
place:
https://lore.kernel.org/lkml/20200415054050.31645-4-irogers@google.com/
shows:
...
- 32.59% __perf_event__synthesize_threads
- 32.54% __event__synthesize_thread
+ 22.13% perf_event__synthesize_mmap_events
+ 6.68% perf_event__get_comm_ids.constprop.0
+ 1.49% process_synthesized_event
+ 1.29% __GI___readdir64
+ 0.60% __opendir
...
That is the processing is 1.49% of execution time and there is plenty to
make parallel. This is shown in the benchmark in this patch:

https://lore.kernel.org/lkml/20200415054050.31645-2-irogers@google.com/

Computing performance of multi threaded perf event synthesis by
synthesizing events on CPU 0:
Number of synthesis threads: 1
Average synthesis took: 127729.000 usec (+- 3372.880 usec)
Average num. events: 21548.600 (+- 0.306)
Average time per event 5.927 usec
Number of synthesis threads: 2
Average synthesis took: 88863.500 usec (+- 385.168 usec)
Average num. events: 21552.800 (+- 0.327)
Average time per event 4.123 usec
Number of synthesis threads: 3
Average synthesis took: 83257.400 usec (+- 348.617 usec)
Average num. events: 21553.200 (+- 0.327)
Average time per event 3.863 usec
Number of synthesis threads: 4
Average synthesis took: 75093.000 usec (+- 422.978 usec)
Average num. events: 21554.200 (+- 0.200)
Average time per event 3.484 usec
Number of synthesis threads: 5
Average synthesis took: 64896.600 usec (+- 353.348 usec)
Average num. events: 21558.000 (+- 0.000)
Average time per event 3.010 usec
Number of synthesis threads: 6
Average synthesis took: 59210.200 usec (+- 342.890 usec)
Average num. events: 21560.000 (+- 0.000)
Average time per event 2.746 usec
Number of synthesis threads: 7
Average synthesis took: 54093.900 usec (+- 306.247 usec)
Average num. events: 21562.000 (+- 0.000)
Average time per event 2.509 usec
Number of synthesis threads: 8
Average synthesis took: 48938.700 usec (+- 341.732 usec)
Average num. events: 21564.000 (+- 0.000)
Average time per event 2.269 usec

Where average time per synthesized event goes from 5.927 usec with 1
thread to 2.269 usec with 8. This isn't a linear speed up as not all of
synthesize code has been made parallel. If the synthesis time was about
10 seconds then using 8 threads may bring this down to less than 4.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Jones <tonyj@suse.de>
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Link: http://lore.kernel.org/lkml/20200422155038.9380-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

show more ...


# 41d91ec3 22-Apr-2020 Mark Brown <broonie@kernel.org>

Merge tag 'tegra-for-5.7-asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into asoc-5.7

ASoC: tegra: Fixes for v5.7-rc3

This contains a couple of fixes that are needed to properly

Merge tag 'tegra-for-5.7-asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into asoc-5.7

ASoC: tegra: Fixes for v5.7-rc3

This contains a couple of fixes that are needed to properly reconfigure
the audio clocks on older Tegra devices.

show more ...


# 175ae3ad 21-Apr-2020 Tony Lindgren <tony@atomide.com>

Merge branch 'fixes-v5.7' into fixes


# 3bda0386 21-Apr-2020 Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvm-s390-master-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master

KVM: s390: Fix for 5.7 and maintainer update

- Silence false positive lockdep warnin

Merge tag 'kvm-s390-master-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master

KVM: s390: Fix for 5.7 and maintainer update

- Silence false positive lockdep warning
- add Claudio as reviewer

show more ...


Revision tags: v5.7-rc2
# 08d99b2c 17-Apr-2020 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-next into drm-misc-next

Backmerging required to pull topic/phy-compliance.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# 2b703bbd 16-Apr-2020 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Merge drm/drm-next into drm-intel-next-queued

Backmerging in order to pull "topic/phy-compliance".

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# a4721ced 14-Apr-2020 Maxime Ripard <maxime@cerno.tech>

Merge v5.7-rc1 into drm-misc-fixes

Start the new drm-misc-fixes cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>


# 3b02a051 13-Apr-2020 Ingo Molnar <mingo@kernel.org>

Merge tag 'v5.7-rc1' into locking/kcsan, to resolve conflicts and refresh

Resolve these conflicts:

arch/x86/Kconfig
arch/x86/kernel/Makefile

Do a minor "evil merge" to move the KCSAN entry up a

Merge tag 'v5.7-rc1' into locking/kcsan, to resolve conflicts and refresh

Resolve these conflicts:

arch/x86/Kconfig
arch/x86/kernel/Makefile

Do a minor "evil merge" to move the KCSAN entry up a bit by a few lines
in the Kconfig to reduce the probability of future conflicts.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


Revision tags: v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5
# 28336be5 30-Dec-2019 Ingo Molnar <mingo@kernel.org>

Merge tag 'v5.5-rc4' into locking/kcsan, to resolve conflicts

Conflicts:
init/main.c
lib/Kconfig.debug

Signed-off-by: Ingo Molnar <mingo@kernel.org>


# c48b0722 05-Apr-2020 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'perf-urgent-2020-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more perf updates from Thomas Gleixner:
"Perf updates all over the place:

core:

- Support for

Merge tag 'perf-urgent-2020-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more perf updates from Thomas Gleixner:
"Perf updates all over the place:

core:

- Support for cgroup tracking in samples to allow cgroup based
analysis

tools:

- Support for cgroup analysis

- Commandline option and hotkey for perf top to change the sort order

- A set of fixes all over the place

- Various build system related improvements

- Updates of the X86 pmu event JSON data

- Documentation updates"

* tag 'perf-urgent-2020-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits)
perf python: Fix clang detection to strip out options passed in $CC
perf tools: Support Python 3.8+ in Makefile
perf script: Fix invalid read of directory entry after closedir()
perf script report: Fix SEGFAULT when using DWARF mode
perf script: add -S/--symbols documentation
perf pmu-events x86: Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metric
perf events parser: Add missing Intel CPU events to parser
perf script: Allow --symbol to accept hexadecimal addresses
perf report/top TUI: Fix title line formatting
perf top: Support hotkey to change sort order
perf top: Support --group-sort-idx to change the sort order
perf symbols: Fix arm64 gap between kernel start and module end
perf build-test: Honour JOBS to override detection of number of cores
perf script: Add --show-cgroup-events option
perf top: Add --all-cgroups option
perf record: Add --all-cgroups option
perf record: Support synthesizing cgroup events
perf report: Add 'cgroup' sort key
perf cgroup: Maintain cgroup hierarchy
perf tools: Basic support for CGROUP event
...

show more ...


# 7dc41b9b 04-Apr-2020 Ingo Molnar <mingo@kernel.org>

Merge tag 'perf-urgent-for-mingo-5.7-20200403' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes and improvements from Arnaldo Carvalho de Melo:

pe

Merge tag 'perf-urgent-for-mingo-5.7-20200403' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes and improvements from Arnaldo Carvalho de Melo:

perf python:

Arnaldo Carvalho de Melo:

- Fix clang detection to strip out options passed in $CC.

build:

He Zhe:

- Normalize gcc parameter when generating arch errno table, fixing
the build by removing options from $(CC).

Sam Lunt:

- Support Python 3.8+ in Makefile.

perf report/top:

Arnaldo Carvalho de Melo:

- Fix title line formatting.

perf script:

Andreas Gerstmayr:

- Fix SEGFAULT when using DWARF mode.

- Fix invalid read of directory entry after closedir(), found with valgrind.

Hagen Paul Pfeifer:

- Introduce --deltatime option.

Stephane Eranian:

- Allow --symbol to accept hexadecimal addresses.

Ian Rogers:

- Add -S/--symbols documentation

Namhyung Kim:

- Add --show-cgroup-events option.

perf python:

Arnaldo Carvalho de Melo:

- Include rwsem.c in the python binding, needed by the cgroups improvements.

build-test:

Arnaldo Carvalho de Melo:

- Honour JOBS to override detection of number of cores

perf top:

Jin Yao:

- Support --group-sort-idx to change the sort order

- perf top: Support hotkey to change sort order

perf pmu-events x86:

Jin Yao:

- Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metric

perf symbols arm64:

Kemeng Shi:

- Fix arm64 gap between kernel start and module end

kernel perf subsystem:

Namhyung Kim:

- Add PERF_RECORD_CGROUP event and Add PERF_SAMPLE_CGROUP feature,
to allow cgroup tracking, saving a link between cgroup path and
its id number.

perf cgroup:

Namhyung Kim:

- Maintain cgroup hierarchy.

perf report:

Namhyung Kim:

- Add 'cgroup' sort key.

perf record:

Namhyung Kim:

- Support synthesizing cgroup events for pre-existing cgroups.

- Add --all-cgroups option

Documentation:

Tony Jones:

- Update docs regarding kernel/user space unwinding.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


# 8fb4b679 25-Mar-2020 Namhyung Kim <namhyung@kernel.org>

perf record: Add --all-cgroups option

The --all-cgroups option is to enable cgroup profiling support. It
tells kernel to record CGROUP events in the ring buffer so that perf
report can identify tas

perf record: Add --all-cgroups option

The --all-cgroups option is to enable cgroup profiling support. It
tells kernel to record CGROUP events in the ring buffer so that perf
report can identify task/cgroup association later.

[root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ]
[root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 558 of event 'cycles'
# Event count (approx.): 458017341
#
# Overhead cgroup id (dev/inode) Cgroup Pid:Command
# ........ ..................... .......... ...............
#
33.15% 4/0xeffffffb /sub 9615:looper0
32.83% 4/0xf00002f5 /sub/cgrp2 9620:looper2
32.79% 4/0xf00002f4 /sub/cgrp1 9619:looper1
0.35% 4/0xf00002f5 /sub/cgrp2 9618:cgtest
0.34% 4/0xf00002f4 /sub/cgrp1 9617:cgtest
0.32% 4/0xeffffffb / 9615:looper0
0.11% 4/0xeffffffb /sub 9617:cgtest
0.10% 4/0xeffffffb /sub 9618:cgtest

#
# (Tip: Sample related events with: perf record -e '{cycles,instructions}:S')
#
[root@seventh ~]#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org
Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
[ Extracted the HAVE_FILE_HANDLE from the followup patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

show more ...


# c95baf12 20-Feb-2020 Zhenyu Wang <zhenyuw@linux.intel.com>

Merge drm-intel-next-queued into gvt-next

Backmerge to pull in
https://patchwork.freedesktop.org/patch/353621/?series=73544&rev=1

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>


# b19efcab 01-Feb-2020 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 5.6 merge window.


# 1bdd3e05 10-Jan-2020 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v5.5-rc5' into next

Sync up with mainline to get SPI "delay" API changes.


# 22164fbe 06-Jan-2020 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Merge drm/drm-next into drm-misc-next

Requested, and we need v5.5-rc1 backported as our current branch is still based on v5.4.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>


Revision tags: v5.5-rc4, v5.5-rc3, v5.5-rc2
# 023265ed 11-Dec-2019 Jani Nikula <jani.nikula@intel.com>

Merge drm/drm-next into drm-intel-next-queued

Sync up with v5.5-rc1 to get the updated lock_release() API among other
things. Fix the conflict reported by Stephen Rothwell [1].

[1] http://lore.kern

Merge drm/drm-next into drm-intel-next-queued

Sync up with v5.5-rc1 to get the updated lock_release() API among other
things. Fix the conflict reported by Stephen Rothwell [1].

[1] http://lore.kernel.org/r/20191210093957.5120f717@canb.auug.org.au

Signed-off-by: Jani Nikula <jani.nikula@intel.com>

show more ...


Revision tags: v5.5-rc1
# 942e6f8a 05-Dec-2019 Olof Johansson <olof@lixom.net>

Merge mainline/master into arm/fixes

This brings in the mainline tree right after armsoc contents was merged
this release cycle, so that we can re-run savedefconfig, etc.

Signed-off-by: Olof Johans

Merge mainline/master into arm/fixes

This brings in the mainline tree right after armsoc contents was merged
this release cycle, so that we can re-run savedefconfig, etc.

Signed-off-by: Olof Johansson <olof@lixom.net>

show more ...


# 3f59dbca 26-Nov-2019 Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
"The main kernel side changes in this cycle were:

- Various Intel

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
"The main kernel side changes in this cycle were:

- Various Intel-PT updates and optimizations (Alexander Shishkin)

- Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu)

- Add support for LSM and SELinux checks to control access to the
perf syscall (Joel Fernandes)

- Misc other changes, optimizations, fixes and cleanups - see the
shortlog for details.

There were numerous tooling changes as well - 254 non-merge commits.
Here are the main changes - too many to list in detail:

- Enhancements to core tooling infrastructure, perf.data, libperf,
libtraceevent, event parsing, vendor events, Intel PT, callchains,
BPF support and instruction decoding.

- There were updates to the following tools:

perf annotate
perf diff
perf inject
perf kvm
perf list
perf maps
perf parse
perf probe
perf record
perf report
perf script
perf stat
perf test
perf trace

- And a lot of other changes: please see the shortlog and Git log for
more details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits)
perf parse: Fix potential memory leak when handling tracepoint errors
perf probe: Fix spelling mistake "addrees" -> "address"
libtraceevent: Fix memory leakage in copy_filter_type
libtraceevent: Fix header installation
perf intel-bts: Does not support AUX area sampling
perf intel-pt: Add support for decoding AUX area samples
perf intel-pt: Add support for recording AUX area samples
perf pmu: When using default config, record which bits of config were changed by the user
perf auxtrace: Add support for queuing AUX area samples
perf session: Add facility to peek at all events
perf auxtrace: Add support for dumping AUX area samples
perf inject: Cut AUX area samples
perf record: Add aux-sample-size config term
perf record: Add support for AUX area sampling
perf auxtrace: Add support for AUX area sample recording
perf auxtrace: Move perf_evsel__find_pmu()
perf record: Add a function to test for kernel support for AUX area sampling
perf tools: Add kernel AUX area sampling definitions
perf/core: Make the mlock accounting simple again
perf report: Jump to symbol source view from total cycles view
...

show more ...


# 976e3645 25-Nov-2019 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 5.5 merge window.


Revision tags: v5.4
# 8cacac6e 23-Nov-2019 Ingo Molnar <mingo@kernel.org>

Merge tag 'perf-core-for-mingo-5.5-20191122' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf rep

Merge tag 'perf-core-for-mingo-5.5-20191122' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf report:

Jin Yao:

- Allow entering the annotation view (symbol source/assembly +
overhead/cycles/etc column) from the 'perf report --total-cycles'
interface.

E.g.:

# perf record --all-cpus --branch-any --all-kernel
^C[ perf record: Woken up 5 times to write data ]
#
# perf evlist -v
cycles: size: 120, { sample_period, sample_freq }: 4000,
sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK,
read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, mmap: 1, comm: 1, freq: 1, task: 1,
precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1,
bpf_event: 1, branch_sample_type: ANY
#
# perf report --total-cycles
#
# Samples: 78762 of event 'cycles'
Sampled Sampled Avg Avg
Cycles% Cycles Cycles% Cycles [Program Block Range] Shared Object
1.72% 95.8K 0.00% 254 [msr.h:105 -> msr.h:166] [kernel.vmlinux]
1.56% 107.6K 0.00% 618 [compiler.h:199 -> common.c:301] [kernel.vmlinux]
0.83% 46.3K 0.00% 409 [entry_64.S:153 -> entry_64.S:175] [kernel.vmlinux]
0.83% 46.1K 0.00% 83 [jump_label.h:41 -> tsc.c:230] [kernel.vmlinux]
0.64% 36.9K 0.01% 1.4K [hda_intel.c:904 -> hda_intel.c:916] [snd_hda_intel]
0.57% 30.2K 0.00% 282 [file.c:710 -> file.c:730] [kernel.vmlinux]
0.48% 25.8K 0.00% 82 [spinlock.c:158 -> spinlock.c:160] [kernel.vmlinux]
0.45% 23.7K 0.00% 369 [tick-broadcast.c:585 -> tick-broadcast.c:586] [kernel.vmlinux]
0.44% 24.4K 0.00% 73 [msr.h:236 -> tsc.c:1088] [kernel.vmlinux]
0.43% 22.7K 0.00% 144 [cpuidle.c:229 -> cpuidle.c:232] [kernel.vmlinux]

Then press 'A' or Enter on one of those lines, just like with 'perf top', say
the top one: [msr.h:105 -> msr.h:166], then this shows up:

Samples: 78K of event 'cycles', 4000 Hz, Event count (approx.): 78762
native_write_msr /lib/modules/5.4.0-rc8/build/vmlinux [Percent: local period]
Percent│ IPC Cycle (Average IPC: 0.02, IPC Coverage: 50.0%)

│ Disassembly of section .text:

│ ffffffff8106c480 <native_write_msr>:
│ __wrmsr():
│ return EAX_EDX_VAL(val, low, high);
│ }

│ static inline void notrace __wrmsr(unsigned int msr, u32 low, u32 high)
│ {
│ asm volatile("1: wrmsr\n"
49.16 │0.02 mov %edi,%ecx
│0.02 mov %esi,%eax
│0.02 wrmsr
│ arch_static_branch():
│ #include <linux/stringify.h>
│ #include <linux/types.h>

│ static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
│ {
│ asm_volatile_goto("1:"
0.79 │0.02 nop
│ native_write_msr():
│ {
│ __wrmsr(msr, low, high);

│ if (msr_tracepoint_active(__tracepoint_write_msr))
│ do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
│ }
50.05 │0.02 254 ← retq
│ do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
│ shl $0x20,%rdx
│ mov %esi,%esi
│ or %rdx,%rsi
│ xor %edx,%edx
│ → jmpq do_trace_write_msr

We need to improve this to show the source code line numbers in the
annotation view, so one can go from that program block to the annotation view
and see those source code line numbers straight away.

auxtrace/Intel PT:

Adrian Hunter:

- Add support for AUX area sampling, requires new functionality that
will land in 5.5, its already in tip.

This includes kernel capability querying so that it fails gracefully
with older kernels, duimping aux area samples in 'perf report -D' and
'perf script'.

perf.data:

Alexey Budankov:

- Fix decompression of PERF_RECORD_COMPRESSED records.

core:

Arnaldo Carvalho de Melo:

- Use the 'dcacheline' cmp routine to find the right DSOs taking into
account the 'maj', 'min', 'ino' and 'ino_generation', that got moved
from 'struct map' to 'struct dso', where it belongs.

This further reduces the size of 'struct map', there is still more
work to do to maybe get it to max one cacheline.

libtraceevent:

Hewenliang:

- Fix memory leakage in copy_filter_type().

Sudip Mukherjee:

- Fix header installation.

perf parse:

Ian Rogers :

- Fix potential memory leak when handling tracepoint errors, found using
LLVM's libFuzzer.

perf probe:

Colin Ian King:

- Fix spelling mistake "addrees" -> "address".

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


Revision tags: v5.4-rc8
# f0bb7ee8 15-Nov-2019 Adrian Hunter <adrian.hunter@intel.com>

perf auxtrace: Add support for AUX area sample recording

Add support for parsing and validating AUX area sample options. At
present, the only option is the sample size, but it is also necessary to
e

perf auxtrace: Add support for AUX area sample recording

Add support for parsing and validating AUX area sample options. At
present, the only option is the sample size, but it is also necessary to
ensure that events are in a group with an AUX area event as the leader.

Committer note:

Add missing 'static inline' in front of auxtrace_parse_sample_options()
for when we don't HAVE_AUXTRACE_SUPPORT.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

show more ...


# 9f4813b5 19-Nov-2019 Ingo Molnar <mingo@kernel.org>

Merge tag 'v5.4-rc8' into WIP.x86/mm, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>


# ac94be49 15-Nov-2019 Thomas Gleixner <tglx@linutronix.de>

Merge branch 'linus' into x86/hyperv

Pick up upstream fixes to avoid conflicts.


# 56b2147f 12-Nov-2019 Ingo Molnar <mingo@kernel.org>

Merge tag 'perf-core-for-mingo-5.5-20191107' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf rep

Merge tag 'perf-core-for-mingo-5.5-20191107' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf report:

Jin Yao:

- Introduce --total-cycles, for basic block profiling, further using data
obtained from LBR, an example should suffice:

# perf record -b
^C[ perf record: Woken up 595 times to write data ]
[ perf record: Captured and wrote 156.672 MB perf.data (196873 samples) ]

# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY

# perf report --total-cycles --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 6M of event 'cycles'
# Event count (approx.): 6299936
#
# Sampled Sampled Avg Avg
# Cycles% Cycles Cycles% Cycles [Program Block Range] Shared Object
# ....... ...... ....... ..... .................................... ................
#
2.17% 1.7M 0.08% 607 [compiler.h:199 -> common.c:221] [kernel.vmlinux]
0.72% 544.5K 0.03% 230 [entry_64.S:657 -> entry_64.S:662] [kernel.vmlinux]
0.56% 541.8K 0.09% 672 [compiler.h:199 -> common.c:300] [kernel.vmlinux]
0.39% 293.2K 0.01% 104 [list_debug.c:43 -> list_debug.c:61] [kernel.vmlinux]
0.36% 278.6K 0.03% 272 [entry_64.S:1289 -> entry_64.S:1308] [kernel.vmlinux]

perf record:

Adrian Hunter:

- Allow storing perf.data in a directory together with a copy of /proc/kcore.

Jiwei Sun:

- Add support for limit perf output file size, i.e.:

# perf record --all-cpus -F 10000 --max-size=4M sleep 10h
[ perf record: perf size limit reached (4097 KB), stopping session ]
[ perf record: Woken up 6 times to write data ]
[ perf record: Captured and wrote 4.048 MB perf.data (54094 samples) ]
Terminated
# ls -lah perf.data
-rw-------. 1 root root 4.1M Nov 7 15:27 perf.data
#

perf stat:

Jiri Olsa:

- Add --per-node agregation support:

In live mode:

# perf stat -a -I 1000 -e cycles --per-node
# time node cpus counts unit events
1.000542550 N0 20 6,202,097 cycles
1.000542550 N1 20 639,559 cycles
2.002040063 N0 20 7,412,495 cycles
2.002040063 N1 20 2,185,577 cycles
3.003451699 N0 20 6,508,917 cycles
3.003451699 N1 20 765,607 cycles
...

Or in the record/report stat session:

# perf stat record -a -I 1000 -e cycles
# time counts unit events
1.000536937 10,008,468 cycles
2.002090152 9,578,539 cycles
3.003625233 7,647,869 cycles
4.005135036 7,032,086 cycles
^C 4.340902364 3,923,893 cycles

# perf stat report --per-node
# time node cpus counts unit events
1.000536937 N0 20 9,355,086 cycles
1.000536937 N1 20 653,382 cycles
2.002090152 N0 20 7,712,838 cycles
2.002090152 N1 20 1,865,701 cycles
...

perf probe:

Masami Hiramatsu:

Various fixes related to recent additions to the DWARF format:

- Fix to find range-only function instance

- Walk function lines in lexical blocks

- Fix to show function entry line as probe-able

- Fix wrong address verification

- Fix to probe a function which has no entry pc

- Fix to probe an inline function which has no entry pc

- Fix to list probe event with correct line number

- Fix to show inlined function callsite without entry_pc

- Fix to show ranges of variables in functions without entry_pc

- Return a better scope DIE if there is no best scope

- Skip end-of-sequence and non statement lines

- Filter out instances except for inlined subroutine and subprogram

- Fix to show calling lines of inlined functions

- Skip overlapped location on searching variables

perf inject:

Adrian Hunter:

- Do not strip evsels with --strip, as they are needed for create_gcov
(see the autofdo example in tools/perf/Documentation/intel-pt.txt).

Intel PT:

Adrian Hunter:

- Intel PT uses an auxtrace_cache to store the results of code-walking, to avoid
repeated decoding. Add an auxtrace_cache__remove to handle text poke events.

core:

Andi Kleen:

- Always preserve errno while cleaning up perf_event_open failures.

llvm:

Arnaldo Carvalho de Melo:

- No need to tell that the request for saving a .o file for BPF events, as
expressed in ~/.perfconfig was satisfied, make that a debug message.

perf vendor events:

Intel:

Haiyan Song:

- Update CascadelakeX events to v1.05.

- Update all the Intel JSON metrics from TMAM 3.6.

Treewide:

Ian Rogers:

- Improve error paths, plugging leaks found using LLVM tools
such as libFuzzer.

jevents:

Yunfeng Ye:

- Fix resource leak in process_mapfile() and main()

perf kvm:

Igor Lubashev:

- Use evlist layer api when possible.

libsubcmd:

James Clark:

- Move EXTRA_FLAGS to the end to allow overriding existing flags.

- Use -O0 with DEBUG=1

perf diff:

Jin Yao:

- Don't use hack to skip column length calculation

CoreSight ETM:

Leo yan:

- Fix definition of macro TO_CS_QUEUE_NR

ARM64:

John Garry:

- Do not try to include libelf header files when its feature detection
failed, fixing the cross build for ARM64.

perf tests:

Leo Yan:

- Fix out of bounds memory access in the backward ring buffer test.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


1...<<1112