
.. SPDX-License-Identifier: GPL-2.0
This document contains a checklist for producing and reviewing patches that make use of RCU. Violating any of the rules listed below will result in the same sorts of problems that leaving out a locking primitive would. This list is based on experiences reviewing such patches over a rather long period of time, but improvements are always welcome!
0. Is RCU being applied to a read-mostly situation? If the data structure is updated more than about 10% of the time, then you should strongly consider some other approach, unless detailed performance measurements show that RCU is nonetheless the right tool for the job. Yes, RCU does reduce read-side overhead by increasing write-side overhead, which is exactly why normal uses of RCU will do much more reading than updating.

Another exception is where performance is not an issue, and RCU provides a simpler implementation. An example of this situation is the dynamic NMI code in the Linux 2.6 kernel, particularly on architectures allowing NMIs to be nested.

Yet another exception is where the low real-time latency of RCU's read-side primitives is critically important.

One final exception is where RCU readers are used to prevent the ABA problem for lockless updates. This does result in the mildly counter-intuitive situation where rcu_read_lock() and rcu_read_unlock() are used to protect updates.
1. Does the update code have proper mutual exclusion? RCU does allow *readers* to run (almost) naked, but *writers* must still use some sort of mutual exclusion, such as:

a. locking,
b. atomic operations, or
c. restricting updates to a single task.

If you choose #b, be prepared to describe how you have handled memory barriers on weakly ordered machines (pretty much all of them -- even x86 allows later loads to be reordered to precede earlier stores). If you choose #c, be prepared to explain how this single task does not become a major bottleneck on machines with many CPUs. Note that the definition of "many" has changed significantly: eight CPUs was large in the year 2000, but a hundred CPUs was unremarkable in 2017.
2. Do the RCU read-side critical sections make proper use of rcu_read_lock() and friends? These primitives are needed to prevent grace periods from ending prematurely, which could result in data being unceremoniously freed out from under your read-side code, which can greatly increase the actuarial risk of your kernel.

As a rough rule of thumb, any dereference of an RCU-protected pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(), rcu_read_lock_sched(), or by the appropriate update-side lock.

Please note that you *cannot* rely on code that is known to be built only in non-preemptible kernels. Such code can and will break, especially in kernels built with CONFIG_PREEMPT_COUNT=y.
Letting RCU-protected pointers "leak" out of an RCU read-side critical section is every bit as bad as letting them leak out from under a lock. Unless, of course, you have arranged some other means of protection, such as a lock or a reference count, *before* letting them out of the RCU read-side critical section.
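The safe reader pattern keeps both the rcu_dereference() and every use of the resulting pointer inside the critical section. A minimal sketch, in which the global pointer ``gp``, ``struct foo``, and its ``->a`` field are purely illustrative::

	struct foo {
		int a;
	};

	struct foo __rcu *gp;	/* Illustrative RCU-protected pointer. */

	int read_foo_a(void)
	{
		struct foo *p;
		int ret = -1;

		rcu_read_lock();
		p = rcu_dereference(gp);	/* Ordered pointer fetch. */
		if (p)
			ret = p->a;	/* Use p only inside the section. */
		rcu_read_unlock();
		/* p must not be dereferenced here: a grace period may
		 * already have elapsed and the structure been freed. */
		return ret;
	}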
3. Does the update code tolerate concurrent accesses? The whole point of RCU is to permit readers to run without any locks or atomic operations, which means that readers will be running while updates are in progress. There are a number of ways to handle this concurrency, depending on the situation:

a. Use the RCU variants of the list and hlist update primitives to add, remove, and replace elements on an RCU-protected list. Alternatively, use the other RCU-protected data structures that have been added to the Linux kernel. This is almost always the best approach.

b. Proceed as in (a) above, but also maintain per-element locks (that are acquired by both readers and updaters) that guard per-element state. Fields that the readers refrain from accessing can be guarded by some other lock acquired only by updaters, if desired. This also works quite well.

c. Make updates appear atomic to readers. Sequences of operations performed under a lock will *not* appear to be atomic to RCU readers, nor will sequences of multiple atomic primitives. One alternative is to move multiple individual fields to a separate structure, thus solving the multiple-field problem by imposing an additional level of indirection. This can work, but is starting to get a bit tricky.
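As a sketch of approach (a), an updater can unlink an element from an RCU-protected list under a lock and defer the free until a grace period has elapsed. The names ``mylist``, ``my_lock``, and ``struct foo`` are illustrative, not from this checklist::

	struct foo {
		struct list_head list;
		struct rcu_head rcu;
		int key;
	};

	static LIST_HEAD(mylist);
	static DEFINE_SPINLOCK(my_lock);

	static void foo_reclaim(struct rcu_head *rp)
	{
		kfree(container_of(rp, struct foo, rcu));
	}

	void foo_del(int key)
	{
		struct foo *p;

		spin_lock(&my_lock);
		list_for_each_entry(p, &mylist, list) {
			if (p->key == key) {
				list_del_rcu(&p->list);	/* Readers may still hold p... */
				call_rcu(&p->rcu, foo_reclaim); /* ...so defer the free. */
				break;
			}
		}
		spin_unlock(&my_lock);
	}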
d. Carefully order the updates and the reads so that readers see valid data at all phases of the update. This is often more difficult than it sounds, and one must usually liberally sprinkle memory-ordering operations through the code, making it difficult to understand and to test.

As noted earlier, it is usually better to group the changing data into a separate structure, so that the change may be made to appear atomic by updating a pointer to reference a new structure containing the updated values.
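The grouped-structure technique can be sketched as follows: the updater copies the current structure, modifies the copy, and then publishes it with rcu_assign_pointer(), so that readers see either the old or the new set of values, never a mixture. All identifiers here are illustrative::

	struct config {
		int a;
		int b;
	};

	static struct config __rcu *cur_config;
	static DEFINE_SPINLOCK(config_lock);

	int config_update_b(int new_b)
	{
		struct config *newp, *oldp;

		newp = kmalloc(sizeof(*newp), GFP_KERNEL);
		if (!newp)
			return -ENOMEM;

		spin_lock(&config_lock);
		oldp = rcu_dereference_protected(cur_config,
						 lockdep_is_held(&config_lock));
		*newp = *oldp;		/* Copy... */
		newp->b = new_b;	/* ...modify the copy... */
		rcu_assign_pointer(cur_config, newp); /* ...publish atomically. */
		spin_unlock(&config_lock);

		synchronize_rcu();	/* Wait for pre-existing readers... */
		kfree(oldp);		/* ...then free the old version. */
		return 0;
	}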
4. Weakly ordered CPUs pose special challenges. Almost all CPUs are weakly ordered -- even x86 CPUs allow later loads to be reordered to precede earlier stores. RCU code must take all of the following measures to prevent memory-corruption problems:

a. Readers must maintain proper ordering of their memory accesses. The rcu_dereference() primitive ensures that the CPU picks up the pointer before it picks up the data that the pointer points to. Please note that, with a bit of devious creativity, it is possible to mishandle the return value from rcu_dereference(). Please see rcu_dereference.rst for more information.

The rcu_dereference() primitive is used by the various "_rcu()" list-traversal primitives, such as list_for_each_entry_rcu(). Note that it is perfectly legal (if redundant) for update-side code to use rcu_dereference() and the "_rcu()" list-traversal primitives, but lockdep will complain if you use rcu_dereference() outside of an RCU read-side critical section. See lockdep.rst to learn what to do about this.

Of course, neither rcu_dereference() nor the "_rcu()" list-traversal primitives can substitute for a good concurrency design coordinating among multiple updaters.
The list_replace_rcu() and hlist_replace_rcu() primitives may be used to replace an old structure with a new one in their respective types of RCU-protected lists.

d. Similar rules apply to the "hlist_nulls" type of RCU-protected linked lists.

e. Updates must ensure that initialization of a given structure happens before pointers to that structure are publicized. Use the rcu_assign_pointer() primitive when publicizing a pointer to a structure that can be traversed by an RCU read-side critical section.
5. If call_rcu() or call_srcu() is used, the callback function will be called from softirq context, and in any case it cannot block. If you need the callback to block, run that code in a workqueue handler scheduled from the callback.
6. The expedited grace-period primitives (for example, synchronize_rcu_expedited()) have the same semantics as the non-expedited forms, but expediting is more CPU intensive. Use of the expedited primitives should be restricted to rare configuration-change operations that would not normally be undertaken while a real-time workload is running. Note that IPI-sensitive real-time workloads can use the rcupdate.rcu_normal kernel boot parameter to completely disable expedited grace periods, though this might have performance implications.

In particular, if you find yourself invoking one of the expedited primitives repeatedly in a loop, please do everyone a favor: restructure your code so that it batches the updates, allowing a single non-expedited primitive to cover the entire batch. This will very likely be faster than the loop containing the expedited primitive, and it will be much easier on the rest of the system, especially to real-time workloads running on the rest of the system.
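A sketch of this batching transformation, reusing the illustrative ``mylist``/``my_lock`` list of ``struct foo`` from earlier (the second ``doomed`` list_head field is an assumption added so that concurrent readers still traversing via ``->list`` are unaffected)::

	struct foo {
		struct list_head list;		/* On the RCU-protected mylist. */
		struct list_head doomed;	/* Private to the updater. */
		int key;
	};

	/* Wasteful: one expedited grace period per element. */
	void remove_all_slow(void)
	{
		struct foo *p, *q;

		list_for_each_entry_safe(p, q, &mylist, list) {
			spin_lock(&my_lock);
			list_del_rcu(&p->list);
			spin_unlock(&my_lock);
			synchronize_rcu_expedited();	/* One per element! */
			kfree(p);
		}
	}

	/* Better: unlink the whole batch, then wait just once. */
	void remove_all_batched(void)
	{
		struct foo *p, *q;
		LIST_HEAD(doomed);

		spin_lock(&my_lock);
		list_for_each_entry_safe(p, q, &mylist, list) {
			list_del_rcu(&p->list);
			list_add(&p->doomed, &doomed);
		}
		spin_unlock(&my_lock);

		synchronize_rcu();	/* One grace period covers the batch. */

		list_for_each_entry_safe(p, q, &doomed, doomed)
			kfree(p);
	}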
7. As of v4.20, a given kernel implements only one RCU flavor, which is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y. If the updater uses call_rcu() or synchronize_rcu(), then the corresponding readers may use (1) rcu_read_lock() and rcu_read_unlock(), (2) any pair of primitives that disables and re-enables softirq, for example, rcu_read_lock_bh() and rcu_read_unlock_bh(), or (3) any pair of primitives that disables and re-enables preemption, for example, rcu_read_lock_sched() and rcu_read_unlock_sched(). If the updater uses synchronize_srcu() or call_srcu(), then the corresponding readers must use srcu_read_lock() and srcu_read_unlock(), and with the same srcu_struct. The rules for the expedited RCU grace-period-wait primitives are the same as for their non-expedited counterparts.
a. If the updater uses synchronize_rcu_tasks() or call_rcu_tasks(), then the corresponding readers must refrain from executing voluntary context switches, that is, from blocking.
In addition, when using non-obvious pairs of primitives, commenting is of course a must. One example of non-obvious pairing is the XDP feature in networking, which calls BPF programs from network-driver NAPI (softirq) context. BPF relies heavily on RCU protection for its data structures, but because the BPF program invocation happens entirely within a single local_bh_disable() section in a NAPI poll cycle, this usage is safe. The reason that this usage is safe is that readers can use anything that disables BH when updaters use call_rcu() or synchronize_rcu().
8. Some updaters cannot tolerate synchronize_rcu()'s multi-millisecond latency. So please take advantage of call_rcu()'s "fire and forget" memory-freeing capabilities where it applies.

An especially important property of the synchronize_rcu() primitive is that it automatically self-limits: if grace periods are delayed for whatever reason, then synchronize_rcu() will correspondingly delay updates. In contrast, code using call_rcu() should explicitly limit the update rate in cases where grace periods are delayed, as failing to do so can result in excessive realtime latencies or even OOM conditions.

Ways of gaining this self-limiting property when using call_rcu(), kfree_rcu(), or kvfree_rcu():
a. Keeping a count of the number of data-structure elements used by the RCU-protected data structure, including those waiting for a grace period to elapse. Enforce a limit on this number, stalling updates as needed to allow previously deferred frees to complete.

One way to stall the updates is to acquire the update-side mutex. (Don't try this with a spinlock -- other CPUs spinning on the lock could prevent the grace period from ever ending.) Another way to stall the updates is for the updates to use a wrapper function around the memory allocator, so that this wrapper function simulates OOM when there is too much memory awaiting an RCU grace period.

b. Limiting update rate. For example, if updates occur only once per hour, then no explicit rate limiting is required, unless your system is already badly broken. Older versions of the dcache subsystem take this approach, guarding updates with a global lock, limiting their rate.

c. Trusted update -- if updates can only be done manually by superuser or some other trusted user, then it might not be necessary to automatically limit them. The theory here is that superuser already has lots of ways to crash the machine.

d. Periodically invoke rcu_barrier(), permitting a limited number of updates per grace period.
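Strategy (a) can be sketched as follows, again reusing the illustrative ``struct foo`` list; the limit, the counter, and ``foo_update_mutex`` are all assumptions for the sake of the example::

	#define FOO_MAX_PENDING 1000	/* Illustrative limit. */

	static atomic_t foo_pending;	/* Elements awaiting a grace period. */
	static DEFINE_MUTEX(foo_update_mutex);

	static void foo_reclaim(struct rcu_head *rp)
	{
		kfree(container_of(rp, struct foo, rcu));
		atomic_dec(&foo_pending);
	}

	void foo_remove(struct foo *p)
	{
		mutex_lock(&foo_update_mutex);
		list_del_rcu(&p->list);
		if (atomic_inc_return(&foo_pending) > FOO_MAX_PENDING) {
			/* Too much memory awaiting grace periods, so
			 * self-limit by waiting synchronously, just as
			 * synchronize_rcu() users implicitly do. */
			synchronize_rcu();
			kfree(p);
			atomic_dec(&foo_pending);
		} else {
			call_rcu(&p->rcu, foo_reclaim);
		}
		mutex_unlock(&foo_update_mutex);
	}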
Note that although these primitives do take action to avoid memory exhaustion when any given CPU has too many callbacks, a determined user or administrator can still exhaust memory. This is especially the case if a system with a large number of CPUs has been configured to offload all of its RCU callbacks onto a single CPU, or if the system has relatively little free memory.
9. All RCU list-traversal primitives, which include rcu_dereference(), list_for_each_entry_rcu(), and list_for_each_safe_rcu(), must be either within an RCU read-side critical section or must be protected by appropriate update-side locks. RCU read-side critical sections are delimited by rcu_read_lock() and rcu_read_unlock(), or by similar primitives such as rcu_read_lock_bh() and rcu_read_unlock_bh(), in which case the matching rcu_dereference() primitive must be used in order to keep lockdep happy, in this case, rcu_dereference_bh().

The reason that it is permissible to use RCU list-traversal primitives when the update-side lock is held is that doing so can be quite helpful in reducing code bloat when common code is shared between readers and updaters. Additional primitives are provided for this case, as discussed in lockdep.rst.

One exception to this rule is when data is only ever added to the linked data structure, and is never removed during any time that readers might be accessing that structure. In such cases, READ_ONCE() may be used in place of rcu_dereference() and the read-side markers (rcu_read_lock() and rcu_read_unlock(), for example) may be omitted.
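A sketch of such shared reader/updater code, using the optional lockdep-condition argument that recent kernels' list_for_each_entry_rcu() accepts (the ``mylist``/``my_lock`` names are again illustrative)::

	/* Callable both from readers (under rcu_read_lock()) and from
	 * updaters (holding my_lock); the lockdep condition keeps
	 * lockdep happy in the update-side case. */
	struct foo *foo_lookup(int key)
	{
		struct foo *p;

		list_for_each_entry_rcu(p, &mylist, list,
					lockdep_is_held(&my_lock)) {
			if (p->key == key)
				return p;
		}
		return NULL;
	}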
10. Conversely, if you are in an RCU read-side critical section, and you don't hold the appropriate update-side lock, you *must* use the "_rcu()" variants of the list macros. Failing to do so will break Alpha, cause aggressive compilers to generate bad code, and confuse people trying to understand your code.
11. Any lock acquired by an RCU callback must be acquired elsewhere with softirq disabled, e.g., via spin_lock_bh(). Failing to disable softirq on a given acquisition of that lock will result in deadlock as soon as the RCU softirq handler happens to run your RCU callback while interrupting that acquisition's critical section.
12. RCU callbacks can be and are executed in parallel. In many cases, the callback code simply wraps around kfree(), so that this is not an issue (or, more accurately, to the extent that it is an issue, the memory-allocator locking handles it). However, if the callbacks do manipulate a shared data structure, they must use whatever locking or other synchronization is required to safely access and/or modify that data structure.
Do not assume that RCU callbacks will execute on the CPU that queued them. For example, if a given CPU goes offline while having an RCU callback pending, then that RCU callback will execute on some surviving CPU. (If this was not the case, a self-spawning RCU callback would prevent the victim CPU from ever going offline.) Furthermore, CPUs designated by rcu_nocbs= might well *always* have their RCU callbacks executed on some other CPUs; in fact, for some real-time workloads, this is the whole point of using the rcu_nocbs= kernel boot parameter.
In addition, do not assume that callbacks queued in a given order will be invoked in that order, even if they all are queued on the same CPU. Furthermore, do not assume that same-CPU callbacks will be invoked serially. For example, in recent kernels, CPUs can be switched between offloaded and de-offloaded callback invocation, and while a given CPU is undergoing such a switch, its callbacks might be concurrently invoked by that CPU's softirq handler and that CPU's rcuo kthread. At such times, that CPU's callbacks will be executed both concurrently and out of order.
13. Unlike most flavors of RCU, it *is* permissible to block in an SRCU read-side critical section (demarked by srcu_read_lock() and srcu_read_unlock()), hence the "SRCU": "sleepable RCU". Please note that if you don't need to sleep in read-side critical sections, you should be using RCU rather than SRCU, because RCU is almost always faster and easier to use than is SRCU.

Also unlike other forms of RCU, explicit initialization and cleanup is required either at build time via DEFINE_SRCU() or DEFINE_STATIC_SRCU(), or at runtime via init_srcu_struct() and cleanup_srcu_struct(). These last two are passed a "struct srcu_struct" that defines the scope of a given SRCU domain. Once initialized, the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock(), synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu(). A given synchronize_srcu() waits only for SRCU read-side critical sections governed by srcu_read_lock() and srcu_read_unlock() calls that have been passed the same srcu_struct. This property is what makes sleeping read-side critical sections tolerable -- a given subsystem delays only its own updates, not those of other subsystems using SRCU. Therefore, SRCU is less prone to OOM the system than RCU would be if RCU's read-side critical sections were permitted to sleep.
The ability to sleep in read-side critical sections does not come for free. First, corresponding srcu_read_lock() and srcu_read_unlock() calls must be passed the same srcu_struct. Second, grace-period-detection overhead is amortized only over those updates sharing a given srcu_struct, rather than being globally amortized as it is for other forms of RCU. Therefore, SRCU should be used in preference to rw_semaphore only in extremely read-intensive situations, or in situations requiring SRCU's read-side deadlock immunity or low read-side realtime latency.

SRCU's expedited primitive (synchronize_srcu_expedited()) never sends IPIs to other CPUs, so it is easier on real-time workloads than is synchronize_rcu_expedited().
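A minimal SRCU sketch showing the per-domain srcu_struct, the index token that srcu_read_unlock() requires, and an update that waits only for this domain's readers (``my_srcu``, ``my_ptr``, and ``struct foo`` are illustrative)::

	DEFINE_STATIC_SRCU(my_srcu);	/* Build-time initialization. */

	static struct foo __rcu *my_ptr;

	int foo_read(void)
	{
		int idx, val = -1;
		struct foo *p;

		idx = srcu_read_lock(&my_srcu);
		p = srcu_dereference(my_ptr, &my_srcu);
		if (p)
			val = p->a;	/* Sleeping is legal here. */
		srcu_read_unlock(&my_srcu, idx); /* Must pass back idx. */
		return val;
	}

	void foo_update(struct foo *newp)
	{
		struct foo *oldp;

		oldp = rcu_dereference_protected(my_ptr, 1);
		rcu_assign_pointer(my_ptr, newp);
		synchronize_srcu(&my_srcu); /* Waits only for my_srcu readers. */
		kfree(oldp);
	}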
It is also permissible to sleep in RCU Tasks Trace read-side critical sections, which are delimited by rcu_read_lock_trace() and rcu_read_unlock_trace(). However, this is a specialized flavor of RCU, and you should not use it without first checking with its current users. In most cases, you should instead use SRCU.
14. The whole point of call_rcu(), synchronize_rcu(), and friends is to wait until all pre-existing readers have finished before carrying out some otherwise-destructive operation. It is therefore critically important to *first* remove any path that readers can follow that could be affected by the destructive operation, and *only then* invoke call_rcu(), synchronize_rcu(), or friends.

Because these primitives only wait for pre-existing readers, it is the caller's responsibility to guarantee that any subsequent readers will execute safely.
15. The various RCU read-side primitives do *not* necessarily contain memory barriers. You should therefore plan for the CPU to be able to execute code in any order with respect to the RCU read-side critical sections. It is the responsibility of the RCU update-side primitives to deal with this.

For SRCU readers, you can use smp_mb__after_srcu_read_unlock() immediately after an srcu_read_unlock() to get a full barrier.
16. Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the __rcu sparse checks to validate your RCU code. CONFIG_PROVE_LOCKING can check that accesses to RCU-protected data structures are carried out under the proper RCU read-side critical section, while holding the right combination of locks, or whatever other conditions are appropriate. The __rcu sparse checks tag the pointer to the RCU-protected data structure with __rcu, so that sparse will warn you if you access that pointer without the services of one of the variants of rcu_dereference().
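A small sketch of the __rcu tagging in action (run sparse via ``make C=1``; the exact warning text can vary between sparse versions)::

	/* Tag the pointer so sparse can check accesses. */
	struct foo __rcu *gp;

	/* sparse is happy: rcu_dereference() handles the __rcu
	 * address space. */
	struct foo *ok(void)
	{
		return rcu_dereference(gp);
	}

	/* sparse warns about this plain access to an __rcu pointer
	 * (different address spaces). */
	struct foo *bad(void)
	{
		return gp;
	}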
17. If you pass a callback function defined within a module to one of call_rcu(), call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), or call_rcu_tasks_trace(), then it is necessary to wait for all pending callbacks to be invoked before unloading that module. Note that it is absolutely *not* sufficient to wait for a grace period! For example, synchronize_rcu() is *not* guaranteed to wait for callbacks registered on other CPUs via call_rcu(). You instead need to use one of the barrier functions:
- call_rcu() -> rcu_barrier()
- call_srcu() -> srcu_barrier()
- call_rcu_tasks() -> rcu_barrier_tasks()
- call_rcu_tasks_rude() -> rcu_barrier_tasks_rude()
- call_rcu_tasks_trace() -> rcu_barrier_tasks_trace()
However, these barrier functions are absolutely *not* guaranteed to wait for a grace period. For example, if there are no call_rcu() callbacks queued anywhere in the system, rcu_barrier() can and will return immediately.

So if you need to wait for both a grace period and for all pre-existing callbacks, you will need to invoke both functions, with the pair depending on the flavor of RCU:
- Either synchronize_rcu() or synchronize_rcu_expedited(), together with rcu_barrier()
- Either synchronize_srcu() or synchronize_srcu_expedited(), together with srcu_barrier()
- synchronize_rcu_tasks() and rcu_barrier_tasks()
- synchronize_rcu_tasks_rude() and rcu_barrier_tasks_rude()
- synchronize_rcu_tasks_trace() and rcu_barrier_tasks_trace()
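For the plain call_rcu() case, a module-exit sketch might therefore look as follows (``foo_stop_updates()`` is a hypothetical function that prevents any further callbacks from being posted)::

	static void __exit foo_exit(void)
	{
		/* First make sure no new callbacks will be posted... */
		foo_stop_updates();	/* Hypothetical. */

		/* ...then wait for a grace period so that no reader can
		 * still be using module data, and for all pending
		 * callbacks, so that none can run after the module's
		 * text and data are gone. */
		synchronize_rcu();
		rcu_barrier();
	}
	module_exit(foo_exit);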