==============
Page fragments
==============

A page fragment is an arbitrary-length, arbitrary-offset area of memory
which resides within a 0 or higher order compound page. Multiple
fragments within that page are individually refcounted, in the page's
reference counter.

The page_frag functions, page_frag_alloc and page_frag_free, provide a
simple allocation framework for page fragments. This is used by the
network stack and network device drivers to provide a backing region of
memory for use as either an sk_buff->head, or to be used in the "frags"
portion of skb_shared_info.

In order to make use of the page fragment APIs, a backing page fragment
cache is needed. This provides a central point for the fragment allocation
and allows multiple calls to make use of a cached page. The advantage to
doing this is that multiple calls to get_page, which can be expensive at
allocation time, can be avoided. However, due to the nature of this caching,
it is required that any calls to the cache be protected by either a per-cpu
limitation, or a per-cpu limitation and forcing interrupts to be disabled
when executing the fragment allocation.

The network stack uses two separate caches per CPU to handle fragment
allocation. The netdev_alloc_cache is used by callers making use of the
netdev_alloc_frag and __netdev_alloc_skb calls. The napi_alloc_cache is
used by callers of the __napi_alloc_frag and napi_alloc_skb calls. The
main difference between these two sets of calls is the context in which
they may be called. The "netdev" prefixed functions are usable in any
context as these functions will disable interrupts, while the "napi"
prefixed functions are only usable within the softirq context.

Many network device drivers use a similar methodology for allocating page
fragments, but the page fragments are cached at the ring or descriptor
level. In order to enable these cases, it is necessary to provide a generic
way of tearing down a page cache. For this reason __page_frag_cache_drain
was implemented. It allows for freeing multiple references from a single
page via a single call. The advantage to doing this is that it allows for
cleaning up the multiple references that were added to a page in order to
avoid calling get_page per allocation.

Alexander Duyck, Nov 29, 2016.
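
The reference batching described above can be illustrated with a minimal
userspace sketch. This is not kernel code: every name in it (model_page,
model_frag_cache, model_frag_alloc, model_frag_free, model_cache_drain,
MODEL_BIAS) is hypothetical, the page size and bias value are assumptions,
and the real allocator differs in detail (for example, it can reuse an
exhausted page once all of its fragments have been freed). The sketch only
shows the idea that a cache takes many references on a page in one step,
hands one out per fragment, and drops the unused ones in a single call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u  /* assumed page size for the model */
#define MODEL_BIAS      1024u  /* assumed number of references taken up front */

/* Hypothetical stand-in for a backing page and its reference counter. */
struct model_page {
    unsigned int refcount;
    unsigned char data[MODEL_PAGE_SIZE];
};

/* Hypothetical stand-in for a page fragment cache. */
struct model_frag_cache {
    struct model_page *page; /* currently cached page, or NULL */
    unsigned int offset;     /* bytes still unused; fragments are carved downwards */
    unsigned int bias;       /* pre-taken references not yet handed out */
};

static unsigned int model_pages_freed; /* lets callers observe page release */

/* Drop 'count' references in one step; free the page when none remain. */
static void model_page_put(struct model_page *page, unsigned int count)
{
    assert(page->refcount >= count);
    page->refcount -= count;
    if (page->refcount == 0) {
        free(page);
        model_pages_freed++;
    }
}

/* Tear down the cache: return every unused reference with a single call. */
static void model_cache_drain(struct model_frag_cache *cache)
{
    if (!cache->page)
        return;
    model_page_put(cache->page, cache->bias);
    cache->page = NULL;
    cache->offset = 0;
    cache->bias = 0;
}

/* Carve a fragment out of the cached page, refilling the cache if needed. */
static void *model_frag_alloc(struct model_frag_cache *cache, unsigned int size,
                              struct model_page **page_out)
{
    if (size == 0 || size > MODEL_PAGE_SIZE)
        return NULL;

    /* Simplification: an exhausted page is abandoned rather than reused. */
    if (cache->page && (cache->offset < size || cache->bias == 0))
        model_cache_drain(cache);

    if (!cache->page) {
        cache->page = calloc(1, sizeof(*cache->page));
        if (!cache->page)
            return NULL;
        /* One bulk refcount update instead of one update per fragment. */
        cache->page->refcount = MODEL_BIAS;
        cache->bias = MODEL_BIAS;
        cache->offset = MODEL_PAGE_SIZE;
    }

    cache->offset -= size;
    cache->bias--; /* this fragment now owns one of the pre-taken references */
    *page_out = cache->page;
    return cache->page->data + cache->offset;
}

/* Release one fragment; the page is freed with its last reference. */
static void model_frag_free(struct model_page *page)
{
    model_page_put(page, 1);
}
```

In this model, two allocations from a fresh cache share one page and leave
its reference count untouched after the initial bulk update. Draining the
cache then leaves exactly one reference per outstanding fragment, and the
page is released when the last fragment is freed. The model has no locking,
so, as with the real cache, a caller would have to guarantee exclusive
access to a given cache.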