Using RCU (Read-Copy-Update) for synchronization
================================================

Read-copy update (RCU) is a synchronization mechanism that is used to
protect read-mostly data structures.  RCU is very efficient and scalable
on the read side (it is wait-free), and thus can make the read paths
extremely fast.

RCU supports concurrency between a single writer and multiple readers;
thus it is not used alone.  Typically, the write side will use a lock to
serialize multiple updates, but other approaches are possible (e.g.,
restricting updates to a single task).  In QEMU, when a lock is used,
this will often be the "iothread mutex", also known as the "big QEMU
lock" (BQL).  Also, restricting updates to a single task is done in
QEMU using the "bottom half" API.

RCU is fundamentally a "wait-to-finish" mechanism.  The read side marks
sections of code as "critical sections", and the update side will wait
for the execution of all *currently running* critical sections before
proceeding, or before asynchronously executing a callback.

The key point here is that only the currently running critical sections
are waited for; critical sections that are started **after** the beginning
of the wait do not extend the wait, despite running concurrently with
the updater.  This is the reason why RCU is more scalable than,
for example, reader-writer locks.  It is so much more scalable that
the system will have a single instance of the RCU mechanism; a single
mechanism can be used for an arbitrary number of "things", without
having to worry about things such as contention or deadlocks.

How is this possible?  The basic idea is to split updates into two phases,
"removal" and "reclamation".  During removal, we ensure that subsequent
readers will not be able to get a reference to the old data.  After
removal has completed, a critical section will not be able to access
the old data.  Therefore, critical sections that begin after removal
do not matter; as soon as all previous critical sections have finished,
there cannot be any readers who hold references to the data structure,
and the old data can now be safely reclaimed (e.g., freed or unref'ed).

Here is a picture::

        thread 1                  thread 2                  thread 3
    -------------------    ------------------------    -------------------
    enter RCU crit.sec.
           |                finish removal phase
           |                begin wait
           |                      |                    enter RCU crit.sec.
    exit RCU crit.sec             |                           |
                            complete wait                     |
                            begin reclamation phase           |
                                                       exit RCU crit.sec.


Note how thread 3 is still executing its critical section when thread 2
starts reclaiming data.  This is possible, because the old version of the
data structure was not accessible at the time thread 3 began executing
that critical section.


RCU API
-------

The core RCU API is small:

``void rcu_read_lock(void);``
        Used by a reader to inform the reclaimer that the reader is
        entering an RCU read-side critical section.

``void rcu_read_unlock(void);``
        Used by a reader to inform the reclaimer that the reader is
        exiting an RCU read-side critical section.  Note that RCU
        read-side critical sections may be nested and/or overlapping.
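
        For example, critical sections may nest (a minimal sketch)::

            rcu_read_lock();
            rcu_read_lock();      /* nesting is allowed */
            /* ... read RCU-protected data ... */
            rcu_read_unlock();
            rcu_read_unlock();    /* the reader is quiescent only here */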

``void synchronize_rcu(void);``
        Blocks until all pre-existing RCU read-side critical sections
        on all threads have completed.  This marks the end of the removal
        phase and the beginning of the reclamation phase.

        Note that it would be valid for another update to come while
        ``synchronize_rcu`` is running.  Because of this, it is better that
        the updater releases any locks it may hold before calling
        ``synchronize_rcu``.  If this is not possible (for example, because
        the updater is protected by the BQL), you can use ``call_rcu``.

``void call_rcu1(struct rcu_head *head, void (*func)(struct rcu_head *head));``
        This function invokes ``func(head)`` after all pre-existing RCU
        read-side critical sections on all threads have completed.  This
        marks the end of the removal phase, with ``func`` asynchronously
        taking care of the reclamation phase.

        The ``foo`` struct needs to have an ``rcu_head`` structure added,
        perhaps as follows::

            struct foo {
                struct rcu_head rcu;
                int a;
                char b;
                long c;
            };

        so that the reclaimer function can fetch the ``struct foo`` address
        and free it::

            call_rcu1(&foo.rcu, foo_reclaim);

            void foo_reclaim(struct rcu_head *rp)
            {
                struct foo *fp = container_of(rp, struct foo, rcu);
                g_free(fp);
            }

        ``call_rcu1`` is typically used via either the ``call_rcu`` or
        ``g_free_rcu`` macros, which handle the common case where the
        ``rcu_head`` member is the first of the struct.

``void call_rcu(T *p, void (*func)(T *p), field-name);``
        If the ``struct rcu_head`` is the first field in the struct, you can
        use this macro instead of ``call_rcu1``.
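
        For example (a sketch reusing ``struct foo`` from above, with a
        hypothetical ``foo_free`` callback that now takes the ``struct foo``
        directly)::

            void foo_free(struct foo *fp)
            {
                g_free(fp);
            }

            call_rcu(&foo, foo_free, rcu);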

``void g_free_rcu(T *p, field-name);``
        This is a special-case version of ``call_rcu`` where the callback
        function is ``g_free``.
        In the example given in ``call_rcu1``, one could have written simply::

            g_free_rcu(&foo, rcu);

``typeof(*p) qatomic_rcu_read(p);``
        ``qatomic_rcu_read()`` is similar to ``qatomic_load_acquire()``, but
        it makes some assumptions on the code that calls it.  This allows a
        more optimized implementation.

        ``qatomic_rcu_read`` assumes that whenever a single RCU critical
        section reads multiple shared data, these reads are either
        data-dependent or need no ordering.  This is almost always the
        case when using RCU, because read-side critical sections typically
        navigate one or more pointers (the pointers that are changed on
        every update) until reaching a data structure of interest,
        and then read from there.

        RCU read-side critical sections must use ``qatomic_rcu_read()`` to
        read data, unless concurrent writes are prevented by another
        synchronization mechanism.

        Furthermore, RCU read-side critical sections should traverse the
        data structure in a single direction, opposite to the direction
        in which the updater initializes it.
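
        For example, a reader might navigate a hypothetical two-level
        structure like this (a sketch; the names are illustrative)::

            rcu_read_lock();
            struct dir *d = qatomic_rcu_read(&root_dir);
            /* data-dependent read: no extra ordering is needed */
            struct entry *e = qatomic_rcu_read(&d->first);
            /* ... read from e ... */
            rcu_read_unlock();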

``void qatomic_rcu_set(p, typeof(*p) v);``
        ``qatomic_rcu_set()`` is similar to ``qatomic_store_release()``,
        though it also makes assumptions on the code that calls it in
        order to allow a more optimized implementation.

        In particular, ``qatomic_rcu_set()`` suffices for synchronization
        with readers, if the updater never mutates a field within a
        data item that is already accessible to readers.  This is the
        case when initializing a new copy of the RCU-protected data
        structure; just ensure that initialization of ``*p`` is carried out
        before ``qatomic_rcu_set()`` makes the data item visible to readers.
        If this rule is observed, writes will happen in the opposite
        order to the reads in the RCU read-side critical sections (or
        there will be just one update), and there will be no need for
        another synchronization mechanism to coordinate the accesses.
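
        For example, an updater can initialize a new item completely and
        only then publish it (a sketch, reusing ``struct foo`` and the
        global pointer from the examples below)::

            struct foo *new = g_new0(struct foo, 1);
            new->a = 1;                  /* initialize before publication */
            qatomic_rcu_set(&foo, new);  /* now visible to readers */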

The following APIs must be used before RCU is used in a thread:

``void rcu_register_thread(void);``
        Mark a thread as taking part in the RCU mechanism.  Such a thread
        will have to report quiescent points regularly, either manually
        or through the ``QemuCond``/``QemuSemaphore``/``QemuEvent`` APIs.

``void rcu_unregister_thread(void);``
        Mark a thread as not taking part anymore in the RCU mechanism.
        It is not a problem if such a thread reports quiescent points,
        either manually or by using the
        ``QemuCond``/``QemuSemaphore``/``QemuEvent`` APIs.

Note that these APIs are relatively heavyweight, and should **not** be
nested.
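
For example, the body of a thread that takes part in RCU might look
like this (a minimal sketch; the function name is illustrative)::

    static void *worker_thread_fn(void *opaque)
    {
        rcu_register_thread();

        /* ... rcu_read_lock()/rcu_read_unlock() may be used here ... */

        rcu_unregister_thread();
        return NULL;
    }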
1797911747bSPaolo Bonzini
1805626f8c6SDr. David Alan GilbertConvenience macros
181*90655d81SPeter Maydell------------------
1825626f8c6SDr. David Alan Gilbert
1835626f8c6SDr. David Alan GilbertTwo macros are provided that automatically release the read lock at the
1845626f8c6SDr. David Alan Gilbertend of the scope.
1855626f8c6SDr. David Alan Gilbert
186*90655d81SPeter Maydell``RCU_READ_LOCK_GUARD()``
1875626f8c6SDr. David Alan Gilbert         Takes the lock and will release it at the end of the block it's
1885626f8c6SDr. David Alan Gilbert         used in.
1895626f8c6SDr. David Alan Gilbert
190*90655d81SPeter Maydell``WITH_RCU_READ_LOCK_GUARD()  { code }``
1915626f8c6SDr. David Alan Gilbert         Is used at the head of a block to protect the code within the block.
1925626f8c6SDr. David Alan Gilbert
193*90655d81SPeter MaydellNote that a ``goto`` out of the guarded block will also drop the lock.
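
For example (a sketch, assuming the ``foo`` global pointer from the
examples below)::

    int get_a(void)
    {
        RCU_READ_LOCK_GUARD();
        return qatomic_rcu_read(&foo)->a;    /* lock dropped on return */
    }

    void copy_foo(struct foo *dst)
    {
        WITH_RCU_READ_LOCK_GUARD() {
            *dst = *qatomic_rcu_read(&foo);  /* lock dropped at block end */
        }
    }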

Differences with Linux
----------------------

- Waiting on a mutex is possible, though discouraged, within an RCU critical
  section.  This is because spinlocks are rarely (if ever) used in userspace
  programming; not allowing this would prevent upgrading an RCU read-side
  critical section to become an updater.

- ``qatomic_rcu_read`` and ``qatomic_rcu_set`` replace ``rcu_dereference`` and
  ``rcu_assign_pointer``.  They take a **pointer** to the variable being
  accessed; see the comparison after this list.

- ``call_rcu`` is a macro that has an extra argument (the name of the first
  field in the struct, which must be a struct ``rcu_head``), and expects the
  type of the callback's argument to be the type of the first argument.
  ``call_rcu1`` is the same as Linux's ``call_rcu``.
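
For example (the Linux side is shown only for comparison)::

    /* QEMU */                         /* Linux */
    p = qatomic_rcu_read(&foo);        p = rcu_dereference(foo);
    qatomic_rcu_set(&foo, new);        rcu_assign_pointer(foo, new);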


RCU Patterns
------------

Many patterns using reader-writer locks translate directly to RCU, with
the advantages of higher scalability and deadlock immunity.

In general, RCU can be used whenever it is possible to create a new
"version" of a data structure every time the updater runs.  This may
sound like a very strict restriction, however:

- the updater does not mean "everything that writes to a data structure",
  but rather "everything that involves a reclamation step".  See the
  array example below.

- in some cases, creating a new version of a data structure may actually
  be very cheap.  For example, modifying the "next" pointer of a singly
  linked list is effectively creating a new version of the list (see the
  sketch below).

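As a sketch of the second point, prepending to a hypothetical singly
linked list only has to publish a single pointer::

    struct item {
        struct item *next;
        int value;
    };

    struct item *list_head;

    /* updater, running under a lock */
    void list_prepend(struct item *new_item)
    {
        new_item->next = list_head;             /* not yet visible to readers */
        qatomic_rcu_set(&list_head, new_item);  /* publish the new version */
    }
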
Here are some frequently-used RCU idioms that are worth noting.


RCU list processing
^^^^^^^^^^^^^^^^^^^

TBD (not yet used in QEMU)


RCU reference counting
^^^^^^^^^^^^^^^^^^^^^^

Because grace periods are not allowed to complete while there is an RCU
read-side critical section in progress, the RCU read-side primitives
may be used as a restricted reference-counting mechanism.  For example,
consider the following code fragment::

    rcu_read_lock();
    p = qatomic_rcu_read(&foo);
    /* do something with p. */
    rcu_read_unlock();

The RCU read-side critical section ensures that the value of ``p`` remains
valid until after the ``rcu_read_unlock()``.  In some sense, it is acquiring
a reference to ``p`` that is later released when the critical section ends.
The write side looks simply like this (with appropriate locking)::

    qemu_mutex_lock(&foo_mutex);
    old = foo;
    qatomic_rcu_set(&foo, new);     /* removal: readers now only see new */
    qemu_mutex_unlock(&foo_mutex);
    synchronize_rcu();              /* wait for pre-existing readers */
    free(old);                      /* reclamation */
2637911747bSPaolo Bonzini
26426387f86SPaolo BonziniIf the processing cannot be done purely within the critical section, it
265*90655d81SPeter Maydellis possible to combine this idiom with a "real" reference count::
26626387f86SPaolo Bonzini
26726387f86SPaolo Bonzini    rcu_read_lock();
268d73415a3SStefan Hajnoczi    p = qatomic_rcu_read(&foo);
26926387f86SPaolo Bonzini    foo_ref(p);
27026387f86SPaolo Bonzini    rcu_read_unlock();
27126387f86SPaolo Bonzini    /* do something with p. */
27226387f86SPaolo Bonzini    foo_unref(p);
27326387f86SPaolo Bonzini
274*90655d81SPeter MaydellThe write side can be like this::
27526387f86SPaolo Bonzini
27626387f86SPaolo Bonzini    qemu_mutex_lock(&foo_mutex);
27726387f86SPaolo Bonzini    old = foo;
278d73415a3SStefan Hajnoczi    qatomic_rcu_set(&foo, new);
27926387f86SPaolo Bonzini    qemu_mutex_unlock(&foo_mutex);
28026387f86SPaolo Bonzini    synchronize_rcu();
28126387f86SPaolo Bonzini    foo_unref(old);
28226387f86SPaolo Bonzini
283*90655d81SPeter Maydellor with ``call_rcu``::
28426387f86SPaolo Bonzini
28526387f86SPaolo Bonzini    qemu_mutex_lock(&foo_mutex);
28626387f86SPaolo Bonzini    old = foo;
287d73415a3SStefan Hajnoczi    qatomic_rcu_set(&foo, new);
28826387f86SPaolo Bonzini    qemu_mutex_unlock(&foo_mutex);
    call_rcu(old, foo_unref, rcu);

In both cases, the write side only performs removal.  Reclamation
happens when the last reference to a ``foo`` object is dropped.
Using ``synchronize_rcu()`` is undesirably expensive, because the
last reference may be dropped on the read side.  Hence you can
use ``call_rcu()`` instead::

    void foo_unref(struct foo *p)
    {
        if (qatomic_fetch_dec(&p->refcount) == 1) {
            call_rcu(p, foo_destroy, rcu);
        }
    }
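
``foo_destroy`` is then a plain destructor that runs only after the
grace period, for example (a sketch)::

    void foo_destroy(struct foo *p)
    {
        g_free(p);
    }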

Note that the same idioms would be possible with reader/writer
locks::

    read_lock(&foo_rwlock);         write_mutex_lock(&foo_rwlock);
    p = foo;                        p = foo;
    /* do something with p. */      foo = new;
    read_unlock(&foo_rwlock);       free(p);
                                    write_mutex_unlock(&foo_rwlock);
3137911747bSPaolo Bonzini
31426387f86SPaolo Bonzini    ------------------------------------------------------------------
31526387f86SPaolo Bonzini
31626387f86SPaolo Bonzini    read_lock(&foo_rwlock);         write_mutex_lock(&foo_rwlock);
31726387f86SPaolo Bonzini    p = foo;                        old = foo;
31826387f86SPaolo Bonzini    foo_ref(p);                     foo = new;
31926387f86SPaolo Bonzini    read_unlock(&foo_rwlock);       foo_unref(old);
32026387f86SPaolo Bonzini    /* do something with p. */      write_mutex_unlock(&foo_rwlock);
32126387f86SPaolo Bonzini    read_lock(&foo_rwlock);
32226387f86SPaolo Bonzini    foo_unref(p);
32326387f86SPaolo Bonzini    read_unlock(&foo_rwlock);
32426387f86SPaolo Bonzini
325*90655d81SPeter Maydell``foo_unref`` could use a mechanism such as bottom halves to move deallocation
32626387f86SPaolo Bonziniout of the write-side critical section.
32726387f86SPaolo Bonzini
3287911747bSPaolo Bonzini
3297911747bSPaolo BonziniRCU resizable arrays
330*90655d81SPeter Maydell^^^^^^^^^^^^^^^^^^^^
3317911747bSPaolo Bonzini
3327911747bSPaolo BonziniResizable arrays can be used with RCU.  The expensive RCU synchronization
333*90655d81SPeter Maydell(or ``call_rcu``) only needs to take place when the array is resized.
33426387f86SPaolo BonziniThe two items to take care of are:
3357911747bSPaolo Bonzini
3367911747bSPaolo Bonzini- ensuring that the old version of the array is available between removal
3377911747bSPaolo Bonzini  and reclamation;
3387911747bSPaolo Bonzini
3397911747bSPaolo Bonzini- avoiding mismatches in the read side between the array data and the
3407911747bSPaolo Bonzini  array size.
3417911747bSPaolo Bonzini
342*90655d81SPeter MaydellThe first problem is avoided simply by not using ``realloc``.  Instead,
3437911747bSPaolo Bonzinieach resize will allocate a new array and copy the old data into it.
3447911747bSPaolo BonziniThe second problem would arise if the size and the data pointers were
345*90655d81SPeter Maydelltwo members of a larger struct::
3467911747bSPaolo Bonzini
3477911747bSPaolo Bonzini    struct mystuff {
3487911747bSPaolo Bonzini        ...
3497911747bSPaolo Bonzini        int data_size;
3507911747bSPaolo Bonzini        int data_alloc;
3517911747bSPaolo Bonzini        T   *data;
3527911747bSPaolo Bonzini        ...
3537911747bSPaolo Bonzini    };
3547911747bSPaolo Bonzini
355*90655d81SPeter MaydellInstead, we store the size of the array with the array itself::
3567911747bSPaolo Bonzini
3577911747bSPaolo Bonzini    struct arr {
3587911747bSPaolo Bonzini        int size;
3597911747bSPaolo Bonzini        int alloc;
3607911747bSPaolo Bonzini        T   data[];
3617911747bSPaolo Bonzini    };
3627911747bSPaolo Bonzini    struct arr *global_array;
3637911747bSPaolo Bonzini
3647911747bSPaolo Bonzini    read side:
3657911747bSPaolo Bonzini        rcu_read_lock();
366d73415a3SStefan Hajnoczi        struct arr *array = qatomic_rcu_read(&global_array);
3677911747bSPaolo Bonzini        x = i < array->size ? array->data[i] : -1;
3687911747bSPaolo Bonzini        rcu_read_unlock();
3697911747bSPaolo Bonzini        return x;
3707911747bSPaolo Bonzini
3717911747bSPaolo Bonzini    write side (running under a lock):
3727911747bSPaolo Bonzini        if (global_array->size == global_array->alloc) {
3737911747bSPaolo Bonzini            /* Creating a new version.  */
3747911747bSPaolo Bonzini            new_array = g_malloc(sizeof(struct arr) +
3757911747bSPaolo Bonzini                                 global_array->alloc * 2 * sizeof(T));
3767911747bSPaolo Bonzini            new_array->size = global_array->size;
3777911747bSPaolo Bonzini            new_array->alloc = global_array->alloc * 2;
3787911747bSPaolo Bonzini            memcpy(new_array->data, global_array->data,
3797911747bSPaolo Bonzini                   global_array->alloc * sizeof(T));
3807911747bSPaolo Bonzini
3817911747bSPaolo Bonzini            /* Removal phase.  */
3827911747bSPaolo Bonzini            old_array = global_array;
383d533d635SKeqian Zhu            qatomic_rcu_set(&global_array, new_array);
3847911747bSPaolo Bonzini            synchronize_rcu();
3857911747bSPaolo Bonzini
3867911747bSPaolo Bonzini            /* Reclamation phase.  */
3877911747bSPaolo Bonzini            free(old_array);
3887911747bSPaolo Bonzini        }
3897911747bSPaolo Bonzini
3907911747bSPaolo Bonzini
391*90655d81SPeter MaydellReferences
392*90655d81SPeter Maydell----------
3937911747bSPaolo Bonzini
394*90655d81SPeter Maydell* The `Linux kernel RCU documentation <https://docs.kernel.org/RCU/>`__
395