.. SPDX-License-Identifier: GPL-2.0

===========================
mmap_prepare callback HOWTO
===========================

Introduction
============

The ``struct file->f_op->mmap()`` callback has been deprecated as it is both a
stability and security risk, and doesn't always permit the merging of adjacent
mappings, resulting in unnecessary memory fragmentation.

It has been replaced with the ``file->f_op->mmap_prepare()`` callback which
solves these problems.

This hook is called right at the beginning of setting up the mapping, and
importantly it is invoked *before* any merging of adjacent mappings has taken
place.

If an error arises upon mapping, it might arise after this callback has been
invoked, therefore it should be treated as effectively stateless.

That is - no resources should be allocated nor state updated to reflect that a
mapping has been established, as the mapping may either be merged, or fail to be
mapped after the callback is complete.

Mapped callback
---------------

If resources need to be allocated per-mapping, or state such as a reference
count needs to be manipulated, this should be done using the ``vm_ops->mapped``
hook, which itself should be set by the ``mmap_prepare`` hook.

This callback is only invoked if a new mapping has been established and was not
merged with any other, and is invoked at a point where no error may occur before
the mapping is established.

The callback itself may return an error, which will cause the mapping to be
unmapped and an error to be returned to the mmap() caller. This is useful if
resources need to be allocated, and that allocation might fail.

How To Use
==========

In your driver's struct file_operations, specify an ``mmap_prepare``
callback rather than an ``mmap`` one, e.g. for ext4:

.. code-block:: C

    const struct file_operations ext4_file_operations = {
        ...
        .mmap_prepare = ext4_file_mmap_prepare,
    };

This has a signature of ``int (*mmap_prepare)(struct vm_area_desc *)``.

Examining the struct vm_area_desc type:

.. code-block:: C

    struct vm_area_desc {
        /* Immutable state. */
        const struct mm_struct *const mm;
        struct file *const file; /* May vary from vm_file in stacked callers. */
        unsigned long start;
        unsigned long end;

        /* Mutable fields. Populated with initial state. */
        pgoff_t pgoff;
        struct file *vm_file;
        vma_flags_t vma_flags;
        pgprot_t page_prot;

        /* Write-only fields. */
        const struct vm_operations_struct *vm_ops;
        void *private_data;

        /* Take further action? */
        struct mmap_action action;
    };

This is straightforward - you have all the fields you need to set up the
mapping, and you can update the mutable and writable fields, for instance:

.. code-block:: C

    static int ext4_file_mmap_prepare(struct vm_area_desc *desc)
    {
        int ret;
        struct file *file = desc->file;
        struct inode *inode = file->f_mapping->host;

        ...

        file_accessed(file);
        if (IS_DAX(file_inode(file))) {
            desc->vm_ops = &ext4_dax_vm_ops;
            vma_desc_set_flags(desc, VMA_HUGEPAGE_BIT);
        } else {
            desc->vm_ops = &ext4_file_vm_ops;
        }
        return 0;
    }

Importantly, you no longer have to dance around with reference counts or locks
when updating these fields - **you can simply go ahead and change them**.

Everything is taken care of by the mapping code.

VMA Flags
---------

Along with ``mmap_prepare``, VMA flags have undergone an overhaul. Where before
you would invoke one of vm_flags_init(), vm_flags_reset(), vm_flags_set(),
vm_flags_clear(), and vm_flags_mod() to modify flags (and to have the
locking done correctly for you), this is no longer necessary.

Also, the legacy approach of specifying VMA flags via ``VM_READ``, ``VM_WRITE``,
etc. - i.e. using a ``VM_xxx`` macro - has changed too.

When implementing mmap_prepare(), reference flags by their bit number, defined
as a ``VMA_xxx_BIT`` macro, e.g. ``VMA_READ_BIT``, ``VMA_WRITE_BIT`` etc.,
and use one of (where ``desc`` is a pointer to struct vm_area_desc):

* ``vma_desc_test_any(desc, ...)`` - Specify a comma-separated list of flags
  you wish to test for (whether *any* are set), e.g. -
  ``vma_desc_test_any(desc, VMA_WRITE_BIT, VMA_MAYWRITE_BIT)`` - returns
  ``true`` if either is set, otherwise ``false``.
* ``vma_desc_set_flags(desc, ...)`` - Update the VMA descriptor flags to set
  additional flags specified by a comma-separated list,
  e.g. - ``vma_desc_set_flags(desc, VMA_PFNMAP_BIT, VMA_IO_BIT)``.
* ``vma_desc_clear_flags(desc, ...)`` - Update the VMA descriptor flags to clear
  flags specified by a comma-separated list, e.g. -
  ``vma_desc_clear_flags(desc, VMA_WRITE_BIT, VMA_MAYWRITE_BIT)``.

Actions
=======

You can now very easily have actions be performed upon a mapping once set up by
utilising simple helper functions invoked upon the struct vm_area_desc
pointer. These are:

* mmap_action_remap() - Remaps a range consisting only of PFNs for a specific
  range, starting at a given virtual address and PFN and of a set size.

* mmap_action_remap_full() - Same as mmap_action_remap(), only remaps the
  entire mapping from ``start_pfn`` onward.

* mmap_action_ioremap() - Same as mmap_action_remap(), only performs an I/O
  remap.

* mmap_action_ioremap_full() - Same as mmap_action_ioremap(), only remaps
  the entire mapping from ``start_pfn`` onward.

* mmap_action_simple_ioremap() - Sets up an I/O remap from a specified
  physical address and over a specified length.

* mmap_action_map_kernel_pages() - Maps a specified array of ``struct page``
  pointers in the VMA from a specific offset.

* mmap_action_map_kernel_pages_full() - Maps a specified array of ``struct
  page`` pointers over the entire VMA. The caller must ensure there are
  sufficient entries in the page array to cover the entire range of the
  described VMA.

**NOTE:** The ``action`` field should never normally be manipulated directly,
rather you ought to use one of these helpers.
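
Example: flag helper calling convention
=======================================

The flag helpers described under "VMA Flags" take a descriptor followed by a
comma-separated list of bit numbers. The following is a minimal userspace
sketch of that calling convention only. It is **not** the kernel
implementation: every ``toy_``/``TOY_`` name, the bit numbering, the
single-word flags representation and the ``-EACCES`` return value are
assumptions made purely for illustration.

.. code-block:: C

    #include <errno.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical bit numbers; the real VMA_xxx_BIT values may differ. */
    enum toy_vma_bit {
        TOY_VMA_READ_BIT,
        TOY_VMA_WRITE_BIT,
        TOY_VMA_MAYWRITE_BIT,
        TOY_VMA_IO_BIT,
        TOY_VMA_PFNMAP_BIT,
    };

    /* Hypothetical descriptor that models only a flags word. */
    struct toy_desc {
        unsigned long flags;
    };

    /* OR together the single-bit masks for an array of bit numbers. */
    static unsigned long toy_mask(const int *bits, size_t n)
    {
        unsigned long mask = 0;

        for (size_t i = 0; i < n; i++)
            mask |= 1UL << bits[i];
        return mask;
    }

    /* Turn a comma-separated list of bit numbers into a single mask. */
    #define TOY_MASK(...) \
        toy_mask((const int[]){ __VA_ARGS__ }, \
                 sizeof((const int[]){ __VA_ARGS__ }) / sizeof(int))

    /* True if any of the listed bits is set in the descriptor. */
    #define toy_desc_test_any(desc, ...) \
        (((desc)->flags & TOY_MASK(__VA_ARGS__)) != 0)

    /* Set every listed bit in the descriptor. */
    #define toy_desc_set_flags(desc, ...) \
        ((void)((desc)->flags |= TOY_MASK(__VA_ARGS__)))

    /* Clear every listed bit in the descriptor. */
    #define toy_desc_clear_flags(desc, ...) \
        ((void)((desc)->flags &= ~TOY_MASK(__VA_ARGS__)))

    /*
     * Toy prepare step: reject writable mappings, forbid a later upgrade to
     * writable, and mark the mapping as an I/O PFN map. It touches only the
     * descriptor, in keeping with the "effectively stateless" rule.
     */
    int toy_mmap_prepare(struct toy_desc *desc)
    {
        if (toy_desc_test_any(desc, TOY_VMA_WRITE_BIT))
            return -EACCES;

        toy_desc_clear_flags(desc, TOY_VMA_MAYWRITE_BIT);
        toy_desc_set_flags(desc, TOY_VMA_PFNMAP_BIT, TOY_VMA_IO_BIT);
        return 0;
    }

Given a descriptor with only the read and may-write bits set,
``toy_mmap_prepare()`` returns 0, clears the may-write bit and sets the PFN-map
and I/O bits, leaving the read bit untouched. A descriptor with the write bit
set is rejected with ``-EACCES``.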