===========================
eBPF RSS virtio-net support
===========================

RSS (Receive Side Scaling) is used to distribute network packets to guest virtqueues
by calculating a packet hash. Usually each queue is then processed by a specific guest CPU core.

For now there are 2 RSS implementations in qemu:

- 'in-qemu' RSS (functions if qemu receives network packets, i.e. vhost=off)
- eBPF RSS (can also function with vhost=on)

eBPF support (CONFIG_EBPF) is enabled by the 'configure' script.
To enable eBPF RSS support use './configure --enable-bpf'.

If steering BPF is not set for the kernel's TUN module, the TUN uses automatic selection
of the rx virtqueue based on a lookup table built according to the calculated symmetric hash
of transmitted packets.
If steering BPF is set for TUN, the BPF code calculates the hash of the packet header and
returns the number of the virtqueue to place the packet in.

Simplified decision formula:

.. code:: C

    queue_index = indirection_table[hash(<packet data>)%<indirection_table size>]


The hash cannot or should not be calculated for every packet.
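
As a rough illustration of the decision formula above, the following user-space C sketch picks a queue from an indirection table. `select_queue()` and its parameters are hypothetical names chosen for this example and are not taken from the qemu sources. The `hashable` flag stands in for the check of whether a hash can be calculated for the packet at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of qemu: choose the rx queue for a packet.
 * 'hash' is the already calculated packet hash, 'hashable' is non-zero
 * when a hash could be calculated for this packet.
 */
static uint16_t select_queue(const uint16_t *indirection_table,
                             size_t table_len, uint32_t hash,
                             int hashable, uint16_t default_queue)
{
    /* An empty table would make the modulo below undefined. */
    assert(table_len > 0);

    if (!hashable) {
        /* No usable hash: fall back to a fixed queue. */
        return default_queue;
    }
    return indirection_table[hash % table_len];
}
```

With a 4-entry table `{0, 1, 2, 3}`, a hash of 6 selects entry `6 % 4 = 2`, while a packet that is not hashable always lands in the default queue.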

Note: currently, eBPF RSS does not support hash reporting.

eBPF RSS is turned on by different combinations of vhost-net, virtio-net and tap configurations:

- eBPF is used:

        tap,vhost=off & virtio-net-pci,rss=on,hash=off

- eBPF is used:

        tap,vhost=on & virtio-net-pci,rss=on,hash=off

- 'in-qemu' RSS is used:

        tap,vhost=off & virtio-net-pci,rss=on,hash=on

- eBPF is used, hash population feature is not reported to the guest:

        tap,vhost=on & virtio-net-pci,rss=on,hash=on

If CONFIG_EBPF is not set then only 'in-qemu' RSS is supported.
Also, 'in-qemu' RSS is used as a fallback if the eBPF program fails to load or cannot be set on the TUN device.

RSS eBPF program
----------------

The RSS program is located in ebpf/rss.bpf.skeleton.h, which is generated by bpftool,
so the program is part of the qemu binary.
Initially, the eBPF program was compiled by clang; its source code is located at tools/ebpf/rss.bpf.c.
Prerequisites to recompile the eBPF program (regenerate ebpf/rss.bpf.skeleton.h):

        llvm, clang, kernel source tree, bpftool
        Adjust Makefile.ebpf to reflect the location of the kernel source tree

        $ cd tools/ebpf
        $ make -f Makefile.ebpf

The current eBPF RSS implementation uses 'bounded loops' with 'backward jump instructions', which are present only in recent kernels.
Overall eBPF RSS works on kernels 5.8+.

eBPF RSS implementation
-----------------------

The eBPF RSS loading functionality is located in ebpf/ebpf_rss.c and ebpf/ebpf_rss.h.

The `struct EBPFRSSContext` structure holds the libbpf context and 4 file descriptors:

- ctx - pointer to the libbpf context.
- program_fd - file descriptor of the eBPF RSS program.
- map_configuration - file descriptor of the 'configuration' map. This map contains one element of 'struct EBPFRSSConfig'. This configuration determines eBPF program behavior.
- map_toeplitz_key - file descriptor of the 'Toeplitz key' map. It contains one element: the 40-byte key prepared for the hashing algorithm.
- map_indirections_table - file descriptor of the 'indirections table' map. It contains 128 elements of queue indexes.
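
To show how a 40-byte key like the one in the 'Toeplitz key' map is consumed, here is a minimal user-space sketch of a Toeplitz hash, assuming the standard construction commonly used for RSS. It is not the code of the eBPF program itself. `toeplitz_hash()` and `KEY_SIZE` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_SIZE 40 /* illustrative constant: the 40-byte key size */

/*
 * Hypothetical helper, not a function from the qemu tree.
 * For every set bit of the input, XOR in the 32-bit window of the key
 * that starts at the same bit offset.
 */
static uint32_t toeplitz_hash(const uint8_t key[KEY_SIZE],
                              const uint8_t *data, size_t len)
{
    /* The window starts at the first four bytes of the key. */
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    size_t next_key_byte = 4;
    uint32_t hash = 0;

    /* The key must cover every input bit plus the 32-bit window. */
    assert(len + 4 <= KEY_SIZE);

    for (size_t i = 0; i < len; i++) {
        uint8_t key_byte = key[next_key_byte++];

        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit)) {
                hash ^= window;
            }
            /* Slide the window one bit to the left along the key. */
            window = (window << 1) | ((key_byte >> bit) & 1u);
        }
    }
    return hash;
}
```

With a 40-byte key the sketch accepts up to 36 bytes of input, which is enough for two IPv6 addresses plus two 16-bit ports. An input whose only set bit is the very first one hashes to the first 32 bits of the key, which makes for an easy sanity check.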

`struct EBPFRSSConfig` fields:

- redirect - "boolean" value that says whether the hash should be calculated. If false, `default_queue` is used as the final decision.
- populate_hash - for now, not used. eBPF RSS doesn't support hash reporting.
- hash_types - binary mask of different hash types. See `VIRTIO_NET_RSS_HASH_TYPE_*` defines. If the hash should not be calculated for a packet, `default_queue` is used.
- indirections_len - length of the indirections table, maximum 128.
- default_queue - the queue index used for packets that shouldn't be hashed. For some packets, the hash can't be calculated (e.g. ARP).

Functions:

- `ebpf_rss_init()` - sets ctx to NULL, which indicates that EBPFRSSContext is not loaded.
- `ebpf_rss_load()` - creates 3 maps and loads the eBPF program from rss.bpf.skeleton.h. Returns 'true' on success. After that, program_fd can be used to set steering for TAP.
- `ebpf_rss_set_all()` - sets values for eBPF maps. The `indirections_table` length is in EBPFRSSConfig. `toeplitz_key` is an array of VIRTIO_NET_RSS_MAX_KEY_SIZE (40) bytes.
- `ebpf_rss_unload()` - closes all file descriptors and sets ctx to NULL.

Simplified eBPF RSS workflow:

.. code:: C

    /* Describe how packets should be hashed and steered. */
    struct EBPFRSSConfig config;
    config.redirect = 1;
    config.hash_types = VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | VIRTIO_NET_RSS_HASH_TYPE_TCPv4;
    config.indirections_len = VIRTIO_NET_RSS_MAX_TABLE_LEN;
    config.default_queue = 0;

    uint16_t table[VIRTIO_NET_RSS_MAX_TABLE_LEN] = {...};
    uint8_t key[VIRTIO_NET_RSS_MAX_KEY_SIZE] = {...};

    /* Load the program, fill the maps, then attach it to the TAP. */
    struct EBPFRSSContext ctx;
    ebpf_rss_init(&ctx);
    ebpf_rss_load(&ctx);
    ebpf_rss_set_all(&ctx, &config, table, key);
    if (net_client->info->set_steering_ebpf != NULL) {
        net_client->info->set_steering_ebpf(net_client, ctx.program_fd);
    }
    ...
    ebpf_rss_unload(&ctx);


NetClientState SetSteeringEBPF()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For now, the `set_steering_ebpf()` method is supported only by the Linux TAP NetClientState. The method requires an eBPF program file descriptor as an argument.
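
On Linux, a steering program is attached to a TUN/TAP device with the `TUNSETSTEERINGEBPF` ioctl from `<linux/if_tun.h>`. The sketch below shows what passing the program file descriptor to the device could look like. `tap_attach_steering()` is a hypothetical helper written for this example, and the actual TAP code in qemu may be structured differently.

```c
#include <sys/ioctl.h>
#include <linux/if_tun.h>

/*
 * Hypothetical helper, not a function from the qemu tree.
 * Asks the kernel to use the eBPF program 'prog_fd' for rx queue
 * selection on the TAP device 'tap_fd'.
 * Returns 0 on success and -1 on failure (errno is set by ioctl()).
 */
static int tap_attach_steering(int tap_fd, int prog_fd)
{
    /* The ioctl takes a pointer to the program file descriptor. */
    if (ioctl(tap_fd, TUNSETSTEERINGEBPF, &prog_fd) != 0) {
        return -1;
    }
    return 0;
}
```

A caller that gets -1 back can keep using another RSS implementation, in line with the 'in-qemu' fallback described earlier in this document.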