# Profiling

`perf` can be used to profile the `cloud-hypervisor` binary, but the build must be modified to produce a binary that gives useful results.

## Building a suitable binary

Building with the `profiling` profile (selected with `--profile profiling` in the command below) adds symbol information to the release binary but does not otherwise affect performance.

The binary must also be built with frame pointers included so that the call graph can be captured by the profiler.

```
$ cargo clean && RUSTFLAGS='-C force-frame-pointers=y' cargo build --profile profiling
```
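
Before recording, it can be worth confirming that the resulting binary really carries debug information. The following is a minimal sketch, assuming `readelf` (from binutils) is installed and that debug information is embedded in the binary rather than split into a separate file; `has_debug_info` is a hypothetical helper name, not part of the project. It does not check for frame pointers.

```shell
# Sketch: report whether an ELF binary carries DWARF debug information.
# has_debug_info is a hypothetical helper, not part of cloud-hypervisor.
has_debug_info() {
    # readelf -S lists the section headers; .debug_info holds the DWARF data.
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Usage:
#   has_debug_info target/profiling/cloud-hypervisor && echo "debug info present"
```

The function returns a non-zero status if the file is missing, is not an ELF binary, or has no `.debug_info` section.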

## Profiling

`perf` may then be used in the usual manner, e.g.:

```
$ perf record -g target/profiling/cloud-hypervisor \
        --kernel ~/src/linux/vmlinux \
        --pmem file=~/workloads/focal.raw \
        --cpus boot=1 --memory size=1G \
        --cmdline "root=/dev/pmem0p1 console=ttyS0" \
        --serial tty --console off \
        --api-socket=/tmp/api1
```

To analyse the samples:

```
$ perf report -g
```

If profiling with a network device attached, either the TAP device must already be created and configured, or the profiling must be done as root so that the TAP device can be created.
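
The first option, creating the TAP device ahead of time, might look like the sketch below. It assumes `iproute2` and `sudo` are available. The device name `chtap0`, the address `192.168.249.1/24`, and the helper name `setup_tap` are illustrative placeholders, not values taken from this document.

```shell
# Sketch: pre-create a TAP device owned by the current user so that the
# profiled cloud-hypervisor process does not need to create it as root.
# setup_tap, chtap0 and 192.168.249.1/24 are placeholders (assumptions).
setup_tap() {
    tap="${1:-chtap0}"
    addr="${2:-192.168.249.1/24}"
    # Create a persistent TAP device that the invoking user may open.
    sudo ip tuntap add dev "$tap" mode tap user "$(id -un)"
    # Give the host side an address and bring the link up.
    sudo ip addr add "$addr" dev "$tap"
    sudo ip link set dev "$tap" up
}

# Usage (once, before profiling):
#   setup_tap chtap0
# then add `--net tap=chtap0` to the cloud-hypervisor arguments.
```

Depending on the VM's network configuration, the TAP device may need additional flags (for example `multi_queue`) to match what the VMM requests; that has not been verified here.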

## Userspace only profiling with LBR

The use of LBR (Last Branch Record; available since Haswell) offers lower
overhead if only userspace profiling is required. This lower overhead can allow
a higher sampling frequency. It also removes the requirement to compile
with custom `RUSTFLAGS`; however, debug symbols should still be included, e.g.:

```
$ perf record --call-graph lbr --all-user --user-callchains -g target/release/cloud-hypervisor \
        --kernel ~/src/linux/vmlinux \
        --pmem file=~/workloads/focal.raw \
        --cpus boot=1 --memory size=1G \
        --cmdline "root=/dev/pmem0p1 console=ttyS0" \
        --serial tty --console off \
        --api-socket=/tmp/api1
```
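
To take advantage of the lower overhead, the sampling frequency can be raised with `perf record -F`. The sketch below is one way to choose a value that stays within the kernel's limit, which is exposed in `/proc/sys/kernel/perf_event_max_sample_rate`. The helper name `clamp_freq` and the 20000 Hz figure are illustrative assumptions, not recommendations from this document.

```shell
# Sketch: clamp a requested sampling frequency (Hz) to an upper limit.
# clamp_freq is a hypothetical helper. If no limit is passed as $2, it is
# read from the kernel; if that fails, the requested value is used as-is.
clamp_freq() {
    want="$1"
    max="$2"
    if [ -z "$max" ]; then
        max=$(cat /proc/sys/kernel/perf_event_max_sample_rate 2>/dev/null || echo "$want")
    fi
    if [ "$want" -gt "$max" ]; then
        echo "$max"
    else
        echo "$want"
    fi
}

# Usage (VM arguments as in the example above):
#   perf record -F "$(clamp_freq 20000)" --call-graph lbr --all-user \
#       --user-callchains target/release/cloud-hypervisor ...
```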