.. _ebpf-rss:

===========================
eBPF RSS virtio-net support
===========================

RSS (Receive Side Scaling) distributes network packets across guest virtqueues
by calculating a hash of each packet. Usually each queue is then processed by a specific guest CPU core.

QEMU currently has two RSS implementations:

- 'in-qemu' RSS (works when QEMU itself receives the network packets, i.e. vhost=off)
- eBPF RSS (also works with vhost=on)

eBPF support (CONFIG_EBPF) is enabled by the 'configure' script.
To enable eBPF RSS support, use './configure --enable-bpf'.

If no steering BPF program is set for the kernel's TUN module, TUN selects the
rx virtqueue automatically, based on a lookup table built from the calculated symmetric hash
of transmitted packets.
If a steering BPF program is set for TUN, the BPF code calculates the hash of the packet header and
returns the number of the virtqueue in which to place the packet.

Simplified decision formula:

.. code:: C

    queue_index = indirection_table[hash(<packet data>) % <indirection_table size>]
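The decision formula above can be sketched in plain C as follows. This is only an illustrative sketch, not the code of the eBPF program: ``toeplitz_hash()`` and ``select_queue()`` are hypothetical helper names, the hash is assumed to be a Toeplitz hash keyed with a 40-byte key, and ``data`` is assumed to hold the packet fields that were already extracted for hashing.

.. code:: C

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define KEY_SIZE 40 /* assumption: 40-byte Toeplitz key */

    /*
     * Toeplitz hash: for every set bit of the input (most significant bit
     * first), XOR in the 32-bit window of the key starting at that bit offset.
     */
    uint32_t toeplitz_hash(const uint8_t key[KEY_SIZE],
                           const uint8_t *data, size_t len)
    {
        /* The window initially holds key bits 0..31. */
        uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                          ((uint32_t)key[2] << 8) | (uint32_t)key[3];
        uint32_t hash = 0;

        for (size_t i = 0; i < len; i++) {
            for (int bit = 7; bit >= 0; bit--) {
                if (data[i] & (1u << bit)) {
                    hash ^= window;
                }
                /* Slide the window one bit further along the key. */
                window <<= 1;
                if (i + 4 < KEY_SIZE && (key[i + 4] & (1u << bit))) {
                    window |= 1u;
                }
            }
        }
        return hash;
    }

    /* queue_index = indirection_table[hash % table size] */
    uint16_t select_queue(const uint16_t *table, size_t table_len,
                          const uint8_t key[KEY_SIZE],
                          const uint8_t *data, size_t len)
    {
        assert(table_len > 0);
        return table[toeplitz_hash(key, data, len) % table_len];
    }

With a 4-entry table ``{0, 1, 2, 3}``, a hash value of ``0x6d5a56da`` selects queue 2, because ``0x6d5a56da % 4 == 2``.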

The hash cannot or should not be calculated for every packet (for example, ARP); such packets are placed in a default queue.

Note: currently, eBPF RSS does not support hash reporting.

eBPF RSS is turned on by different combinations of vhost-net, virtio-net and tap configurations:

- eBPF is used:

        tap,vhost=off & virtio-net-pci,rss=on,hash=off

- eBPF is used:

        tap,vhost=on & virtio-net-pci,rss=on,hash=off

- 'in-qemu' RSS is used:

        tap,vhost=off & virtio-net-pci,rss=on,hash=on

- eBPF is used, hash population feature is not reported to the guest:

        tap,vhost=on & virtio-net-pci,rss=on,hash=on

If CONFIG_EBPF is not set, only 'in-qemu' RSS is supported.
'in-qemu' RSS is also used as a fallback if the eBPF program fails to load or cannot be set for TUN.
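The combinations above can be condensed into a small decision function. The sketch below is illustrative only and is not taken from the QEMU sources: ``choose_rss_backend()`` and the enum are hypothetical names, ``rss=on`` is assumed, and ``ebpf_usable`` is assumed to mean that CONFIG_EBPF is set and the program was loaded and attached successfully. The case of vhost=on without a usable eBPF program is not spelled out above; the sketch simply applies the stated fallback rule.

.. code:: C

    #include <stdbool.h>

    enum rss_backend {
        RSS_BACKEND_IN_QEMU,
        RSS_BACKEND_EBPF,
    };

    /* Assumes rss=on; 'vhost' and 'hash' mirror the vhost= and hash= options. */
    enum rss_backend choose_rss_backend(bool vhost, bool hash, bool ebpf_usable)
    {
        if (!ebpf_usable) {
            /* Documented fallback: only 'in-qemu' RSS is available. */
            return RSS_BACKEND_IN_QEMU;
        }
        if (!vhost && hash) {
            /* eBPF RSS cannot report hashes, and QEMU sees the packets itself. */
            return RSS_BACKEND_IN_QEMU;
        }
        /*
         * All other combinations use eBPF. With vhost=on and hash=on the hash
         * population feature is not reported to the guest.
         */
        return RSS_BACKEND_EBPF;
    }

Keeping the fallback check first mirrors the text: whenever eBPF cannot be used, the remaining options no longer matter.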

RSS eBPF program
----------------

The RSS program is located in ebpf/rss.bpf.skeleton.h, which is generated by bpftool,
so the program is part of the QEMU binary.
The eBPF program was originally compiled by clang; its source code is located at tools/ebpf/rss.bpf.c.
Prerequisites to recompile the eBPF program (regenerate ebpf/rss.bpf.skeleton.h):

        llvm, clang, kernel source tree, bpftool
        Adjust Makefile.ebpf to reflect the location of the kernel source tree

        $ cd tools/ebpf
        $ make -f Makefile.ebpf

The current eBPF RSS implementation uses 'bounded loops' with 'backward jump instructions', which are present only in recent kernels.
Overall, eBPF RSS works on kernels 5.8+.

eBPF RSS implementation
-----------------------

The eBPF RSS loading functionality is located in ebpf/ebpf_rss.c and ebpf/ebpf_rss.h.

``struct EBPFRSSContext`` holds a pointer to the libbpf context and 4 file descriptors:

- ctx - pointer to the libbpf context.
- program_fd - file descriptor of the eBPF RSS program.
- map_configuration - file descriptor of the 'configuration' map. This map contains one element of 'struct EBPFRSSConfig'. This configuration determines the eBPF program behavior.
- map_toeplitz_key - file descriptor of the 'Toeplitz key' map. It contains one element: the 40-byte key prepared for the hashing algorithm.
- map_indirections_table - file descriptor of the 'indirections table' map, which contains 128 elements of queue indexes.

``struct EBPFRSSConfig`` fields:

- redirect - "boolean" value that tells whether the hash should be calculated. If false, ``default_queue`` is used as the final decision.
- populate_hash - not used for now. eBPF RSS doesn't support hash reporting.
- hash_types - binary mask of the enabled hash types. See the ``VIRTIO_NET_RSS_HASH_TYPE_*`` defines. If the hash should not be calculated for a packet, ``default_queue`` is used.
- indirections_len - length of the indirections table, maximum 128.
- default_queue - the queue index used for packets that shouldn't be hashed. For some packets, the hash can't be calculated (e.g. ARP).
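How these fields interact can be illustrated with a short C sketch. It is not the eBPF program itself: ``struct rss_config_sketch`` only mirrors the documented field names (the field widths are assumptions), ``steer_packet()`` is a hypothetical helper, and ``packet_hash_type`` is assumed to be the single hash-type bit that matches the packet, or 0 when the packet cannot be hashed.

.. code:: C

    #include <assert.h>
    #include <stdint.h>

    #define MAX_TABLE_LEN 128 /* maximum indirections table length */

    /* Illustrative mirror of the documented fields; widths are assumptions. */
    struct rss_config_sketch {
        uint8_t redirect;
        uint8_t populate_hash; /* not used for now */
        uint32_t hash_types;
        uint16_t indirections_len;
        uint16_t default_queue;
    };

    uint16_t steer_packet(const struct rss_config_sketch *cfg,
                          const uint16_t *table,
                          uint32_t packet_hash_type, uint32_t hash)
    {
        if (!cfg->redirect) {
            /* Hashing is disabled: everything goes to the default queue. */
            return cfg->default_queue;
        }
        if ((cfg->hash_types & packet_hash_type) == 0) {
            /* Unhashable packet (e.g. ARP) or hash type not enabled. */
            return cfg->default_queue;
        }
        assert(cfg->indirections_len > 0 &&
               cfg->indirections_len <= MAX_TABLE_LEN);
        return table[hash % cfg->indirections_len];
    }

For instance, with ``indirections_len = 4`` and a table of ``{3, 2, 1, 0}``, a packet with an enabled hash type and hash value 6 lands in queue ``table[6 % 4]``, i.e. queue 1, while a packet whose hash type is not enabled goes to ``default_queue``.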

Functions:

- ``ebpf_rss_init()`` - sets ctx to NULL, which indicates that EBPFRSSContext is not loaded.
- ``ebpf_rss_load()`` - creates 3 maps and loads the eBPF program from rss.bpf.skeleton.h. Returns 'true' on success. After that, program_fd can be used to set steering for TAP.
- ``ebpf_rss_set_all()`` - sets values for the eBPF maps. The ``indirections_table`` length is taken from EBPFRSSConfig. ``toeplitz_key`` is an array of VIRTIO_NET_RSS_MAX_KEY_SIZE (40) bytes.
- ``ebpf_rss_unload()`` - closes all file descriptors and sets ctx to NULL.

Simplified eBPF RSS workflow:

.. code:: C

    struct EBPFRSSConfig config;
    config.redirect = 1;
    config.populate_hash = 0; /* not used for now */
    config.hash_types = VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | VIRTIO_NET_RSS_HASH_TYPE_TCPv4;
    config.indirections_len = VIRTIO_NET_RSS_MAX_TABLE_LEN;
    config.default_queue = 0;

    uint16_t table[VIRTIO_NET_RSS_MAX_TABLE_LEN] = {...};
    uint8_t key[VIRTIO_NET_RSS_MAX_KEY_SIZE] = {...};

    struct EBPFRSSContext ctx;
    ebpf_rss_init(&ctx);
    /* ebpf_rss_load() returns 'true' on success */
    if (ebpf_rss_load(&ctx)) {
        ebpf_rss_set_all(&ctx, &config, table, key);
        /* attach the program if the backend supports steering */
        if (net_client->info->set_steering_ebpf != NULL) {
            net_client->info->set_steering_ebpf(net_client, ctx.program_fd);
        }
        ...
        ebpf_rss_unload(&ctx);
    }


NetClientState SetSteeringEBPF()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For now, the ``set_steering_ebpf()`` method is supported by the Linux TAP NetClientState. The method requires an eBPF program file descriptor as an argument.
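On Linux, a steering program is attached to a TUN/TAP device with the ``TUNSETSTEERINGEBPF`` ioctl from ``<linux/if_tun.h>``. The sketch below shows the idea under stated assumptions: ``tap_attach_steering()`` is a hypothetical helper (not QEMU's function), and ``tap_fd`` is assumed to be an already opened and configured TAP file descriptor.

.. code:: C

    #include <errno.h>
    #include <sys/ioctl.h>
    #include <linux/if_tun.h>

    /* Returns 0 on success or a negative errno value on failure. */
    int tap_attach_steering(int tap_fd, int prog_fd)
    {
        /* The ioctl takes a pointer to the program file descriptor. */
        if (ioctl(tap_fd, TUNSETSTEERINGEBPF, &prog_fd) != 0) {
            return -errno;
        }
        return 0;
    }

A caller would typically pass ``program_fd`` from a successfully loaded ``EBPFRSSContext`` and fall back to 'in-qemu' RSS if the call fails.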