==============================
General notification mechanism
==============================

The general notification mechanism is built on top of the standard pipe driver
whereby it effectively splices notification messages from the kernel into pipes
opened by userspace. This can be used in conjunction with::

  * Key/keyring notifications


The notification buffers can be enabled by:

    "General setup"/"General notification queue"
    (CONFIG_WATCH_QUEUE)

This document has the following sections:

.. contents:: :local:


Overview
========

This facility appears as a pipe that is opened in a special mode. The pipe's
internal ring buffer is used to hold messages that are generated by the kernel.
These messages are then read out by read(). Splice and similar are disabled on
such pipes because, under some circumstances, they need to revert their
additions to the ring, which might by then be interleaved with notification
messages.

The owner of the pipe has to tell the kernel which sources it would like to
watch through that pipe. Only sources that have been connected to a pipe will
insert messages into it.
Note that a source may be bound to multiple pipes and
insert messages into all of them simultaneously.

Filters may also be emplaced on a pipe so that certain source types and
subevents can be ignored if they're not of interest.

A message will be discarded if there isn't a slot available in the ring or if
no preallocated message buffer is available. In both of these cases, read()
will insert a WATCH_META_LOSS_NOTIFICATION message into the output buffer after
the last message currently in the buffer has been read.

Note that when producing a notification, the kernel does not wait for the
consumers to collect it, but rather just continues on. This means that
notifications can be generated whilst spinlocks are held and also protects the
kernel from being held up indefinitely by a userspace malfunction.


Message Structure
=================

Notification messages begin with a short header::

    struct watch_notification {
        __u32   type:24;
        __u32   subtype:8;
        __u32   info;
    };

"type" indicates the source of the notification record and "subtype" indicates
the type of record from that source (see the Watch Sources section below). The
type may also be "WATCH_TYPE_META". This is a special record type generated
internally by the watch queue itself.
There are two subtypes:

  * WATCH_META_REMOVAL_NOTIFICATION
  * WATCH_META_LOSS_NOTIFICATION

The first indicates that an object on which a watch was installed was removed
or destroyed and the second indicates that some messages have been lost.

"info" encodes several fields, including:

  * The length of the message in bytes, including the header (mask with
    WATCH_INFO_LENGTH and shift by WATCH_INFO_LENGTH__SHIFT). This indicates
    the size of the record, which may be between 8 and 127 bytes.

  * The watch ID (mask with WATCH_INFO_ID and shift by WATCH_INFO_ID__SHIFT).
    This indicates the caller's ID of the watch, which may be between 0
    and 255. Multiple watches may share a queue, and this provides a means to
    distinguish them.

  * A type-specific field (WATCH_INFO_TYPE_INFO). This is set by the
    notification producer to indicate some meaning specific to the type and
    subtype.

Everything in info apart from the length can be used for filtering.

The header can be followed by supplementary information. The format of this is
defined by the type and subtype.
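
The mask-and-shift decoding of "info" described above can be sketched in
userspace as follows. This is an illustration only: the ``EX_`` constants are
assumed to mirror the WATCH_INFO_* masks in ``<linux/watch_queue.h>``, and all
of the ``ex_`` helper names are hypothetical. Real code should include the UAPI
header and use its definitions rather than these local copies.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the WATCH_INFO_* masks; prefer <linux/watch_queue.h>. */
#define EX_INFO_LENGTH          0x0000007fu
#define EX_INFO_LENGTH__SHIFT   0
#define EX_INFO_ID              0x0000ff00u
#define EX_INFO_ID__SHIFT       8
#define EX_INFO_TYPE_INFO       0xffff0000u

/* Hypothetical helper: record length in bytes, header included. */
static inline unsigned int ex_info_length(uint32_t info)
{
	return (info & EX_INFO_LENGTH) >> EX_INFO_LENGTH__SHIFT;
}

/* Hypothetical helper: the caller-chosen watch ID (0-255). */
static inline unsigned int ex_info_watch_id(uint32_t info)
{
	return (info & EX_INFO_ID) >> EX_INFO_ID__SHIFT;
}

/* Hypothetical helper: the type-specific bits, left unshifted. */
static inline uint32_t ex_info_type_bits(uint32_t info)
{
	return info & EX_INFO_TYPE_INFO;
}

/* Hypothetical helper: build an info word from a length and a watch ID. */
static inline uint32_t ex_pack_info(unsigned int len, unsigned int id)
{
	assert(len >= 8 && len <= 127);	/* record size range stated above */
	assert(id <= 255);		/* watch ID range stated above */
	return ((uint32_t)len << EX_INFO_LENGTH__SHIFT) |
	       ((uint32_t)id << EX_INFO_ID__SHIFT);
}
```

A consumer would typically call ``ex_info_length()`` first to find the next
record boundary and then use ``ex_info_watch_id()`` to work out which of its
watches produced the record.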


Watch List (Notification Source) API
====================================

A "watch list" is a list of watchers that are subscribed to a source of
notifications. A list may be attached to an object (say a key or a superblock)
or may be global (say for device events). From a userspace perspective, a
non-global watch list is typically referred to by reference to the object it
belongs to (such as using KEYCTL_NOTIFY and giving it a key serial number to
watch that specific key).

To manage a watch list, the following functions are provided:

  * ::

      void init_watch_list(struct watch_list *wlist,
                           void (*release_watch)(struct watch *watch));

    Initialise a watch list. If ``release_watch`` is not NULL, then this
    indicates a function that should be called when the watch_list object is
    destroyed to discard any references the watch list holds on the watched
    object.

  * ``void remove_watch_list(struct watch_list *wlist);``

    This removes all of the watches subscribed to a watch_list and frees them
    and then destroys the watch_list object itself.


Watch Queue (Notification Output) API
=====================================

A "watch queue" is the buffer allocated by an application that notification
records will be written into. The workings of this are hidden entirely inside
of the pipe device driver, but it is necessary to gain a reference to it to set
a watch. These can be managed with:

  * ``struct watch_queue *get_watch_queue(int fd);``

    Since watch queues are indicated to the kernel by the fd of the pipe that
    implements the buffer, userspace must hand that fd through a system call.
    This can be used to look up an opaque pointer to the watch queue from the
    system call.

  * ``void put_watch_queue(struct watch_queue *wqueue);``

    This discards the reference obtained from ``get_watch_queue()``.


Watch Subscription API
======================

A "watch" is a subscription on a watch list, indicating the watch queue, and
thus the buffer, into which notification records should be written. The watch
queue object may also carry filtering rules for that object, as set by
userspace.
Some parts of the watch struct can be set by the driver::

    struct watch {
        union {
            u32     info_id;    /* ID to be OR'd in to info field */
            ...
        };
        void        *private;   /* Private data for the watched object */
        u64         id;         /* Internal identifier */
        ...
    };

The ``info_id`` value should be an 8-bit number obtained from userspace and
shifted by WATCH_INFO_ID__SHIFT. This is OR'd into the WATCH_INFO_ID field of
struct watch_notification::info when and if the notification is written into
the associated watch queue buffer.

The ``private`` field is the driver's data associated with the watch_list and
is cleaned up by the ``watch_list::release_watch()`` method.

The ``id`` field is the source's ID. Notifications that are posted with a
different ID are ignored.

The following functions are provided to manage watches:

  * ``void init_watch(struct watch *watch, struct watch_queue *wqueue);``

    Initialise a watch object, setting its pointer to the watch queue, using
    appropriate barriering to avoid lockdep complaints.

  * ``int add_watch_to_object(struct watch *watch, struct watch_list *wlist);``

    Subscribe a watch to a watch list (notification source). The
    driver-settable fields in the watch struct must have been set before this
    is called.

  * ::

      int remove_watch_from_object(struct watch_list *wlist,
                                   struct watch_queue *wqueue,
                                   u64 id, false);

    Remove a watch from a watch list, where the watch must match the specified
    watch queue (``wqueue``) and object identifier (``id``). A notification
    (``WATCH_META_REMOVAL_NOTIFICATION``) is sent to the watch queue to
    indicate that the watch got removed.

  * ``int remove_watch_from_object(struct watch_list *wlist, NULL, 0, true);``

    Remove all the watches from a watch list. It is expected that this will be
    called preparatory to destruction and that the watch list will be
    inaccessible to new watches by this point. A notification
    (``WATCH_META_REMOVAL_NOTIFICATION``) is sent to the watch queue of each
    subscribed watch to indicate that the watch got removed.


Notification Posting API
========================

To post a notification to a watch list so that the subscribed watches can see
it, the following function should be used::

    void post_watch_notification(struct watch_list *wlist,
                                 struct watch_notification *n,
                                 const struct cred *cred,
                                 u64 id);

The notification should be preformatted and a pointer to the header (``n``)
should be passed in. The notification may be larger than this and the size in
bytes is noted in ``n->info & WATCH_INFO_LENGTH``.

The ``cred`` struct indicates the credentials of the source (subject) and is
passed to the LSMs, such as SELinux, to allow or suppress the recording of the
note in each individual queue according to the credentials of that queue
(object).

The ``id`` is the ID of the source object (such as the serial number on a key).
Only watches that have the same ID set in them will see this notification.


Watch Sources
=============

Any particular buffer can be fed from multiple sources.
Sources include:

  * WATCH_TYPE_KEY_NOTIFY

    Notifications of this type indicate changes to keys and keyrings, including
    the changes of keyring contents or the attributes of keys.

    See Documentation/security/keys/core.rst for more information.


Event Filtering
===============

Once a watch queue has been created, a set of filters can be applied to limit
the events that are received using::

    struct watch_notification_filter filter = {
        ...
    };
    ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter);

The filter description is a variable of type::

    struct watch_notification_filter {
        __u32   nr_filters;
        __u32   __reserved;
        struct watch_notification_type_filter filters[];
    };

Where "nr_filters" is the number of filters in filters[] and "__reserved"
should be 0.
The "filters" array has elements of the following type::

    struct watch_notification_type_filter {
        __u32   type;
        __u32   info_filter;
        __u32   info_mask;
        __u32   subtype_filter[8];
    };

Where:

  * ``type`` is the event type to filter for and should be something like
    "WATCH_TYPE_KEY_NOTIFY"

  * ``info_filter`` and ``info_mask`` act as a filter on the info field of the
    notification record. The notification is only written into the buffer if::

      (watch.info & info_mask) == info_filter

    This could be used, for example, to ignore events that are not exactly on
    the watched point in a mount tree.

  * ``subtype_filter`` is a bitmask indicating the subtypes that are of
    interest. Bit 0 of subtype_filter[0] corresponds to subtype 0, bit 1 to
    subtype 1, and so on.

If the argument to the ioctl() is NULL, then the filters will be removed and
all events from the watched sources will come through.
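
To make the matching rules above concrete, the following userspace sketch
evaluates a single filter element against a record's type, subtype and info
word. It models the rules as described in this section and is not the kernel's
implementation. ``struct ex_type_filter`` is a local stand-in with the same
fields as ``struct watch_notification_type_filter``, and the ``ex_`` function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for struct watch_notification_type_filter. */
struct ex_type_filter {
	uint32_t type;
	uint32_t info_filter;
	uint32_t info_mask;
	uint32_t subtype_filter[8];	/* 8 * 32 = 256 bits, one per subtype */
};

/* Mark a subtype (0-255) as being of interest. */
static inline void ex_filter_add_subtype(struct ex_type_filter *f,
					 unsigned int subtype)
{
	assert(subtype < 256);
	f->subtype_filter[subtype / 32] |= UINT32_C(1) << (subtype % 32);
}

/* Would a record with this type, subtype and info word pass the element? */
static inline bool ex_filter_matches(const struct ex_type_filter *f,
				     uint32_t type, unsigned int subtype,
				     uint32_t info)
{
	if (type != f->type || subtype >= 256)
		return false;
	if (!(f->subtype_filter[subtype / 32] &
	      (UINT32_C(1) << (subtype % 32))))
		return false;
	return (info & f->info_mask) == f->info_filter;
}
```

Leaving ``info_mask`` and ``info_filter`` both at 0 makes the info test always
pass, so only the type and subtype bits decide. Subtype 33, for instance, lands
in bit 1 of ``subtype_filter[1]``.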


Userspace Code Example
======================

A buffer is created with something like the following::

    pipe2(fds, O_NOTIFICATION_PIPE);
    ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, 256);

It can then be set to receive keyring change notifications::

    keyctl(KEYCTL_WATCH_KEY, KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);

The notifications can then be consumed by something like the following::

    static void consumer(int rfd, struct watch_queue_buffer *buf)
    {
        unsigned char buffer[128];
        ssize_t buf_len;

        while (buf_len = read(rfd, buffer, sizeof(buffer)),
               buf_len > 0
               ) {
            /* Byte pointers, so the arithmetic below is standard C */
            unsigned char *p = buffer;
            unsigned char *end = buffer + buf_len;

            while (p < end) {
                union {
                    struct watch_notification n;
                    unsigned char buf1[128];
                } n;
                size_t largest, len;

                /* Copy out at most one maximum-sized record */
                largest = end - p;
                if (largest > 128)
                    largest = 128;
                memcpy(&n, p, largest);

                len = (n.n.info & WATCH_INFO_LENGTH) >>
                    WATCH_INFO_LENGTH__SHIFT;
                if (len == 0 || len > largest)
                    return;

                switch (n.n.type) {
                case WATCH_TYPE_META:
                    got_meta(&n.n);
                    break;
                case WATCH_TYPE_KEY_NOTIFY:
                    saw_key_change(&n.n);
                    break;
                }

                /* Advance to the next record */
                p += len;
            }
        }
    }