===========================
eBPF RSS virtio-net support
===========================

RSS (Receive Side Scaling) distributes network packets among guest virtqueues
by calculating a hash of each packet. Usually each queue is then processed by a specific guest CPU core.

For now there are 2 RSS implementations in qemu:

- 'in-qemu' RSS (functions if qemu receives network packets, i.e. vhost=off)
- eBPF RSS (can also function with vhost=on)

eBPF support (CONFIG_EBPF) is enabled by the 'configure' script.
To enable eBPF RSS support use './configure --enable-bpf'.

If no steering BPF program is set for the kernel's TUN module, TUN automatically selects
the rx virtqueue using a lookup table built from the calculated symmetric hash
of transmitted packets.
If a steering BPF program is set for TUN, the BPF code calculates the hash of the packet header and
returns the number of the virtqueue to place the packet in.

Simplified decision formula:

.. code:: C

    queue_index = indirection_table[hash(<packet data>) % <indirection_table size>]


The hash cannot or should not be calculated for every packet.

Note: currently, eBPF RSS does not support hash reporting.

eBPF RSS is turned on by different combinations of vhost-net, virtio-net and tap configurations:

- eBPF is used:

        tap,vhost=off & virtio-net-pci,rss=on,hash=off

- eBPF is used:

        tap,vhost=on & virtio-net-pci,rss=on,hash=off

- 'in-qemu' RSS is used:

        tap,vhost=off & virtio-net-pci,rss=on,hash=on

- eBPF is used, hash population feature is not reported to the guest:

        tap,vhost=on & virtio-net-pci,rss=on,hash=on

If CONFIG_EBPF is not set then only 'in-qemu' RSS is supported.
Also, 'in-qemu' RSS is used as a fallback if the eBPF program fails to load or cannot be set for TUN.

RSS eBPF program
----------------

The RSS program is located in ebpf/rss.bpf.skeleton.h, which is generated by bpftool,
so the program is part of the qemu binary.
Initially, the eBPF program was compiled by clang; its source code is located at tools/ebpf/rss.bpf.c.
Prerequisites to recompile the eBPF program (regenerate ebpf/rss.bpf.skeleton.h):

        llvm, clang, kernel source tree, bpftool
        Adjust Makefile.ebpf to reflect the location of the kernel source tree

        $ cd tools/ebpf
        $ make -f Makefile.ebpf

The current eBPF RSS implementation uses 'bounded loops' with 'backward jump instructions', which are present only in recent kernels.
Overall, eBPF RSS works on kernels 5.8+.

eBPF RSS implementation
-----------------------

The eBPF RSS loading functionality is located in ebpf/ebpf_rss.c and ebpf/ebpf_rss.h.

The `struct EBPFRSSContext` structure holds a pointer to the libbpf context and 4 file descriptors:

- ctx - pointer to the libbpf context.
- program_fd - file descriptor of the eBPF RSS program.
- map_configuration - file descriptor of the 'configuration' map. This map contains one element of 'struct EBPFRSSConfig'. This configuration determines eBPF program behavior.
- map_toeplitz_key - file descriptor of the 'Toeplitz key' map. It contains one element: the 40-byte key prepared for the hashing algorithm.
- map_indirections_table - file descriptor of the 'indirections table' map, which contains 128 elements of queue indexes.

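The role of the 40-byte key can be illustrated with a generic Toeplitz hash. The following is a minimal sketch of the textbook algorithm, not the code from tools/ebpf/rss.bpf.c; the function name `toeplitz_hash_sketch()` and its signature are hypothetical, and the key and input bytes are supplied by the caller.

.. code:: C

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Hypothetical helper, not part of qemu: generic Toeplitz hash.
     * For every set bit of the input (most significant bit first), XOR in
     * the 32-bit window of the key that starts at the same bit offset.
     */
    uint32_t toeplitz_hash_sketch(const uint8_t *key, size_t key_len,
                                  const uint8_t *data, size_t data_len)
    {
        uint32_t hash = 0;
        uint32_t window;
        size_t i;
        int bit;

        assert(key_len >= 4);

        /* The window starts as the first 32 bits of the key. */
        window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

        for (i = 0; i < data_len; i++) {
            for (bit = 7; bit >= 0; bit--) {
                if (data[i] & (1u << bit)) {
                    hash ^= window;
                }
                /* Slide the window by one bit and pull in the next key bit. */
                window <<= 1;
                if (i + 4 < key_len && (key[i + 4] & (1u << bit))) {
                    window |= 1;
                }
            }
        }
        return hash;
    }

For a TCP/IPv4 packet the input would typically be the source address, destination address, source port and destination port in network byte order (12 bytes). The longest common input, TCP over IPv6, is 36 bytes, which is why a 40-byte key is sufficient for a 32-bit window.
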
`struct EBPFRSSConfig` fields:

- redirect - "boolean" value: whether the hash should be calculated. If false, `default_queue` is used as the final decision.
- populate_hash - not used for now. eBPF RSS doesn't support hash reporting.
- hash_types - binary mask of the enabled hash types. See the `VIRTIO_NET_RSS_HASH_TYPE_*` defines. If the hash should not be calculated for a packet, `default_queue` is used.
- indirections_len - length of the indirections table, maximum 128.
- default_queue - the queue index used for packets that shouldn't be hashed. For some packets, the hash can't be calculated (e.g. ARP).

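The way these fields combine can be sketched as a plain C function. This is an illustration of the behavior described above, not the eBPF program itself: `struct rss_config_sketch`, `select_queue_sketch()`, the field widths and the `packet_hash_type` argument (the single hash-type bit that matches the packet, or 0 when no hash can be calculated) are assumptions made for the example.

.. code:: C

    #include <stdint.h>

    /* Hypothetical mirror of the documented fields; the real layout may differ. */
    struct rss_config_sketch {
        uint8_t redirect;
        uint8_t populate_hash;      /* unused, as in the description above */
        uint32_t hash_types;
        uint16_t indirections_len;
        uint16_t default_queue;
    };

    /*
     * Returns the queue index for a packet.
     * packet_hash_type: hash-type bit of the packet, 0 if it cannot be hashed.
     * hash: the already calculated packet hash (ignored when not applicable).
     */
    uint16_t select_queue_sketch(const struct rss_config_sketch *cfg,
                                 const uint16_t *indirections_table,
                                 uint32_t packet_hash_type, uint32_t hash)
    {
        /* Redirection disabled or empty table: always use the default queue. */
        if (!cfg->redirect || cfg->indirections_len == 0) {
            return cfg->default_queue;
        }
        /* Packet type is not hashable (e.g. ARP) or not enabled in the mask. */
        if ((packet_hash_type & cfg->hash_types) == 0) {
            return cfg->default_queue;
        }
        return indirections_table[hash % cfg->indirections_len];
    }

With `indirections_len` set to 4 and a table of `{0, 1, 2, 3}`, a packet of an enabled type with hash 10 would land on queue 2, while an ARP packet would land on `default_queue`.
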
Functions:

- `ebpf_rss_init()` - sets ctx to NULL, which indicates that EBPFRSSContext is not loaded.
- `ebpf_rss_load()` - creates 3 maps and loads the eBPF program from rss.bpf.skeleton.h. Returns 'true' on success. After that, program_fd can be used to set steering for TAP.
- `ebpf_rss_set_all()` - sets values for the eBPF maps. The `indirections_table` length is in EBPFRSSConfig. `toeplitz_key` is an array of VIRTIO_NET_RSS_MAX_KEY_SIZE (40) bytes.
- `ebpf_rss_unload()` - closes all file descriptors and sets ctx to NULL.

Simplified eBPF RSS workflow:

.. code:: C

    /* Describe how packets should be steered. */
    struct EBPFRSSConfig config;
    config.redirect = 1;
    config.hash_types = VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | VIRTIO_NET_RSS_HASH_TYPE_TCPv4;
    config.indirections_len = VIRTIO_NET_RSS_MAX_TABLE_LEN;
    config.default_queue = 0;

    uint16_t table[VIRTIO_NET_RSS_MAX_TABLE_LEN] = {...};
    uint8_t key[VIRTIO_NET_RSS_MAX_KEY_SIZE] = {...};

    /* Load the program and fill its maps. */
    struct EBPFRSSContext ctx;
    ebpf_rss_init(&ctx);
    ebpf_rss_load(&ctx);
    ebpf_rss_set_all(&ctx, &config, table, key);

    /* Attach the program to the backend if it supports steering. */
    if (net_client->info->set_steering_ebpf != NULL) {
        net_client->info->set_steering_ebpf(net_client, ctx.program_fd);
    }
    ...
    /* Close the file descriptors when RSS is no longer needed. */
    ebpf_rss_unload(&ctx);


NetClientState SetSteeringEBPF()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For now, the `set_steering_ebpf()` method is supported by the Linux TAP NetClientState. The method requires an eBPF program file descriptor as an argument.
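
On Linux, attaching a steering program to a TUN/TAP file descriptor is done with the `TUNSETSTEERINGEBPF` ioctl from `<linux/if_tun.h>`. The following is a minimal sketch under the assumption that the TAP backend ends up issuing this ioctl; the helper name `tap_attach_steering_ebpf_sketch()` and its return convention are hypothetical and do not describe qemu's actual function.

.. code:: C

    #include <errno.h>
    #include <sys/ioctl.h>
    #include <linux/if_tun.h>

    /*
     * Hypothetical helper: attach the eBPF program prog_fd as the steering
     * program of the TAP device tap_fd.
     * Returns 0 on success or a negative errno value on failure.
     */
    int tap_attach_steering_ebpf_sketch(int tap_fd, int prog_fd)
    {
        if (ioctl(tap_fd, TUNSETSTEERINGEBPF, &prog_fd) != 0) {
            return -errno;
        }
        return 0;
    }

A caller would pass the `program_fd` obtained after `ebpf_rss_load()` and fall back to 'in-qemu' RSS if the helper reports an error.
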