# Profiling

`perf` can be used to profile the `cloud-hypervisor` binary, but the build must first be modified to produce a binary that gives useful results.

## Building a suitable binary

In `Cargo.toml`, add `debug = 1` to the `[profile.release]` block so that it looks like this:

```
[profile.release]
lto = true
debug = 1
```

This adds symbol information to the release binary without otherwise affecting its runtime performance.
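
Before starting a long rebuild, it can be worth confirming that the setting landed in the right block. The following is a minimal sketch and not part of the cloud-hypervisor tooling: `has_release_debug` is a hypothetical helper name, and the check is a line-oriented `awk` match rather than a real TOML parse, so it assumes the simple one-key-per-line layout shown above.

```shell
# Hypothetical helper: succeed if the [profile.release] table of the given
# Cargo.toml contains a "debug = ..." line. This is a plain text match, not a
# TOML parser, so it assumes one key per line.
has_release_debug() {
    awk '
        # Any table header switches the "inside [profile.release]" state.
        /^\[/ { in_release = ($0 == "[profile.release]") }
        in_release && /^[[:space:]]*debug[[:space:]]*=/ { found = 1 }
        END { exit !found }
    ' "$1"
}

# Only report when run from a directory that contains a Cargo.toml.
if [ -f Cargo.toml ]; then
    if has_release_debug Cargo.toml; then
        echo "debug info is enabled for release builds"
    else
        echo "no debug setting found in [profile.release]"
    fi
fi
```

The helper only reports whether a `debug` key is present; it does not validate the value.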

The binary must also be built with frame pointers so that the profiler can capture the call graph:

```
$ cargo clean && RUSTFLAGS='-C force-frame-pointers=y' cargo build --release
```
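
Setting `RUSTFLAGS` on the command line replaces any value already exported in the environment. If other flags are in use, one option is to append the frame-pointer flag rather than overwrite them. A minimal sketch, where `profiling_rustflags` is a hypothetical helper name:

```shell
# Hypothetical helper: print the current RUSTFLAGS (if any) with the
# frame-pointer flag appended, separated by a single space.
profiling_rustflags() {
    printf '%s\n' "${RUSTFLAGS:+$RUSTFLAGS }-C force-frame-pointers=y"
}

# Show the build command that would be run, without running it.
echo "RUSTFLAGS='$(profiling_rustflags)' cargo build --release"
```

Printing the command first makes it easy to check the combined flags before committing to a clean rebuild.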

## Profiling

`perf` can then be used in the usual manner, for example:

```
$ perf record -g target/release/cloud-hypervisor \
        --kernel ~/src/linux/vmlinux \
        --pmem file=~/workloads/focal.raw \
        --cpus boot=1 --memory size=1G \
        --cmdline "root=/dev/pmem0p1 console=ttyS0" \
        --serial tty --console off \
        --api-socket=/tmp/api1
```

To analyse the recorded samples:

```
$ perf report -g
```
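
For a quick, non-interactive summary (for example when working over SSH or saving results to a file), the report can also be printed as text. The sketch below wraps that in a hypothetical `report_top` helper. The 40-line cut-off and the choice to sort by symbol are arbitrary, and `perf.data` is simply the default output file name of `perf record`.

```shell
# Hypothetical helper: print the start of a text-mode report for a recording.
# $1 is the recording to read; defaults to perf.data in the current directory.
report_top() {
    data="${1:-perf.data}"
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found in PATH" >&2
        return 127
    fi
    if [ ! -r "$data" ]; then
        echo "cannot read $data" >&2
        return 1
    fi
    # --stdio: plain text output instead of the interactive TUI
    # --no-children: show self cost only, not cost accumulated from callees
    perf report -i "$data" --stdio --no-children --sort symbol | head -n 40
}

# Usage (after a recording exists):
#   report_top            # reads ./perf.data
#   report_top other.data
```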

If profiling with a network device attached, either the TAP device must already be created and configured, or the profiling must be done as root so that the TAP device can be created.
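
One way to take the first route is to create the TAP device once, as root, and hand ownership to the user who will run `perf`. The sketch below only prints the commands rather than executing them, since they need root. It rests on several assumptions: the interface name `chtap0` is purely illustrative, the `iface_exists` helper is hypothetical, the guest is expected to be started with `--net tap=<name>`, and any address or bridge configuration the workload needs is left out.

```shell
# Hypothetical helper: succeed if a network interface with this name exists.
iface_exists() {
    [ -e "/sys/class/net/$1" ]
}

# Illustrative interface name; override by exporting TAP_NAME.
TAP_NAME="${TAP_NAME:-chtap0}"

if iface_exists "$TAP_NAME"; then
    echo "$TAP_NAME already exists; pass --net tap=$TAP_NAME to cloud-hypervisor"
else
    # Printed, not executed: creating the device requires root.
    echo "sudo ip tuntap add dev $TAP_NAME mode tap user $(id -un)"
    echo "sudo ip link set dev $TAP_NAME up"
fi
```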

## Userspace only profiling with LBR

LBR (Last Branch Record, available since Haswell) offers lower overhead if
only userspace profiling is required. The lower overhead can allow a higher
sampling frequency. It also removes the requirement to compile with custom
`RUSTFLAGS`; however, debug symbols should still be included. For example:

```
$ perf record --call-graph lbr --all-user --user-callchains -g target/release/cloud-hypervisor \
        --kernel ~/src/linux/vmlinux \
        --pmem file=~/workloads/focal.raw \
        --cpus boot=1 --memory size=1G \
        --cmdline "root=/dev/pmem0p1 console=ttyS0" \
        --serial tty --console off \
        --api-socket=/tmp/api1
```
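
Whether LBR call graphs can be recorded depends on the machine and on the permissions of the user running `perf`. A rough way to find out before starting a guest is to record a trivial command with the same call-graph mode. This is a sketch only: `lbr_supported` is a hypothetical helper name, and a failure does not distinguish between missing hardware support and insufficient permissions.

```shell
# Hypothetical helper: succeed if `perf` can record with LBR call graphs here.
lbr_supported() {
    command -v perf >/dev/null 2>&1 || return 1
    out=$(mktemp) || return 1
    # Record a trivial command. This fails if the event cannot be opened,
    # whether because LBR is unsupported or because of permissions.
    perf record --call-graph lbr --all-user -o "$out" -- true >/dev/null 2>&1
    rc=$?
    rm -f "$out"
    return "$rc"
}

if lbr_supported; then
    echo "LBR call graphs available"
else
    echo "LBR call graphs not available; fall back to frame pointers"
fi
```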