Lines Matching full:online

18 XFS Online Fsck Design
21 This document captures the design of the online filesystem check feature for
25 - To help kernel distributors understand exactly what the XFS online fsck
34 As the online fsck code is merged, the links in this document to topic branches
42 Parts 2 and 3 present a high level overview of how the online fsck process works
49 might be built atop online fsck.
112 Each kernel patchset adding an online repair function will use the same branch
118 The online fsck tool described here will be the third tool in the history of
162 while the filesystem is online.
188 | The userspace driver program for the new online fsck tool can be |
190 | The kernel portion of online fsck that validates metadata is called |
191 | "online scrub", and the portion of the kernel that fixes metadata is called |
192 | "online repair". |
208 In summary, online fsck takes advantage of resource sharding and redundant
217 Because it is necessary for online fsck to lock and scan live metadata objects,
218 online fsck consists of three separate code components.
224 and repair each type of online fsck work item.
229 | For brevity, this document shortens the phrase "online fsck work |
240 In principle, online fsck should be able to check and to repair everything that
242 However, online fsck cannot be running 100% of the time, which means that
247 A second limitation of online fsck is that it must follow the same resource
251 In other words, online fsck is not a complete replacement for offline fsck, and
252 a complete run of online fsck may take longer than offline fsck.
254 different motivations of online fsck, which are to **minimize system downtime**
269 discover the online fsck capabilities of the kernel, and open the
312 to online fsck; neither of the previous tools has this capability.
403 Despite these limitations, the advantage that online repair holds is clear:
425 but are only needed for online fsck or for reorganization of the filesystem.
460 Introducing concurrency helps online repair avoid various locking problems, but
488 the new index already online.
498 To minimize changes to the rest of the codebase, XFS online repair keeps the
544 sections 2.12 ("Online Index Operations") through 2.14 ("Incremental View
549 Since quotas are non-negative integer counts of resource usage, online
558 Each online fsck function will be discussed as case studies later in this
564 During the development of online fsck, several risk factors were identified
574 reduces the ability of online fsck to find inconsistencies and repair them.
593 render the filesystem unusable, the online repair functions have been
597 - **Misbehavior**: Online fsck requires many privileges -- raw IO to block
639 With ample hardware availability in mind, the testing strategy for the online
652 This improves code quality by enabling the authors of online fsck to find and
659 Even before development work began on online fsck, fstests (when run on XFS)
664 During development of the online checking code, fstests was modified to run
668 To start development of online repair, fstests was modified to run
671 after it exists, or trigger complaints from the online check.
673 To complete the first phase of development of online repair, fstests was
675 This enables a comparison of the effectiveness of online repair as compared to
683 Before development of online fsck even began, a set of fstests were created
695 This part of the test suite was extended to cover online fsck in exactly the
708 3. Online repair (``xfs_scrub``) to detect and fix
713 The testing plan for online fsck includes extending the existing fs testing
747 4. Online checking (``xfs_scrub -n``)
748 5. Online repair (``xfs_scrub``)
749 … 6. Both repair tools (``xfs_scrub`` and then ``xfs_repair`` if online repair doesn't succeed)
763 allow the online fsck developers to compare online fsck against offline fsck,
777 A unique requirement to online fsck is the ability to operate on a filesystem
780 impact on the running system, the online repair code should never introduce
812 The primary user of online fsck is the system administrator, just like offline
814 Online fsck presents two modes of operation to administrators:
815 A foreground CLI process for online fsck on demand, and a background service
847 run online fsck automatically on weekends by default.
903 service window to run the online repair tool to correct the problem.
990 enabling online fsck and other requested functionality such as free space
1035 Online filesystem checking judges the consistency of each primary metadata
1041 what online checking can consult.
1131 Every online fsck scrubbing function is expected to read every ondisk metadata
1207 The XFS btree code has keyspace scanning functions that online fsck uses to
1507 and correction in the online and offline checking tools.
1509 Eventual Consistency vs. Online Fsck
1517 online checking must coordinate with chained operations that are in progress to
1519 Furthermore, online repair must not run when operations are pending because
1523 Only online fsck has this requirement of total consistency of AG metadata, and
1525 Online fsck coordinates with transaction chains as follows:
1532 * When online fsck wants to examine an AG, it should lock the AG header
1538 This may lead to online fsck taking a long time to complete, but regular
1549 Midway through the development of online scrubbing, the fsstress tests
1550 uncovered a misinteraction between online fsck and compound transaction chains
1665 However, online fsck changes the rules -- remember that although physical
1698 3. Teach online fsck to walk all transactions waiting for whichever lock(s)
1710 Online fsck uses an atomic intent item counter and lock cycling to coordinate
1721 is an explicit deprioritization of online fsck to benefit file operations.
1768 Online fsck for XFS separates the regular filesystem from the checking and
1770 However, there are a few parts of online fsck (such as the intent drains, and
1771 later, live update hooks) where it is useful for the online fsck code to know
1773 Since it is not expected that online fsck will be constantly running in the
1775 these hooks when online fsck is compiled into the kernel but not actively
1781 replace a static branch to hook code with ``nop`` sleds when online fsck isn't
1787 When online fsck enables the static key, the sled is replaced with an
1790 program that invoked online fsck, and can be amortized if multiple threads
1791 enter online fsck at the same time, or if multiple filesystems are being
1794 CPU initialization requires memory allocation, online fsck must be careful not
1817 distributor turns off online fsck at build time.
1827 Online scrub has resource acquisition helpers (e.g. ``xchk_perag_lock``) to
1846 Some online checking functions work by scanning the filesystem to build a
1849 For online repair to rebuild a metadata structure, it must compute the record
1869 At any given time, online fsck does not need to keep the entire record set in
1871 Continued development of online fsck demonstrated that the ability to perform
1883 | The first edition of online repair inserted records into a new btree as |
1912 to share functionality between online fsck functions.
1928 For online repair, squashing error conditions in this manner is an acceptable
1936 Online fsck must not drive the system into OOM conditions, which means that
2009 Array access patterns in online fsck tend to fall into three categories.
2087 During the fourth demonstration of online repair, a community reviewer remarked
2088 that for performance reasons, online repair ought to load batches of records
2206 Given that indexed lookups of scan data are required for both strategies, online
2271 An online fsck function that wants to create an xfbtree should proceed as
2344 As mentioned previously, early iterations of online repair built new btree
2357 To prepare for online fsck, each of the four bulk loaders was studied, notes
2569 Online repair functions minimize the chances of this occurring by using very
2778 Whenever online fsck builds a new data structure to replace one that is
2789 As part of a repair, online fsck relies heavily on the reverse mapping records
2850 As stated earlier, online repair functions use very large transactions to
3011 There is a very high potential for cache coherency issues if online fsck is not
3014 When online fsck wants to open a damaged file for scrubbing, it must use
3087 online fsck to check them, since there is no way to quiesce a percpu counter
3089 Although online fsck can read the filesystem metadata to compute the correct
3093 Earlier versions of online scrub would return to userspace with an incomplete
3099 To satisfy this requirement, online fsck must prevent other programs in the
3166 Like every other type of online repair, repairs are made by writing those
3170 Therefore, online fsck must build the infrastructure to manage a live scan of
3261 Online fsck functions scan all files in the filesystem as follows:
3292 `online quotacheck
3358 However, online fsck differs from regular XFS operations because it may examine
3362 The next few sections detail the specific ways in which online fsck takes care
3387 To capture these nuances, the online fsck code has a separate ``xchk_irele``
3414 Online fsck cannot abide these conventions, because for a directory tree
3421 Solving both of these problems is straightforward -- any time online fsck
3428 However, trylock loops mean that online fsck must be prepared to measure the
3438 Online fsck must verify that the dotdot dirent of a directory points up to a
3463 The second piece of support that online fsck functions need during a full
3467 Two pieces of Linux kernel infrastructure enable online fsck to monitor regular
3472 In this case, the downstream consumer is always an online fsck function.
3473 Because multiple fsck functions can run in parallel, online fsck uses the Linux
3510 - The online fsck function should define a structure to hold scan data, a lock
3515 - The online fsck code must contain a C function to catch the hook action code
3520 - Prior to unlocking inodes to start the scan, online fsck must call
3524 - Online fsck must call ``xfs_hooks_del`` to disable the hook once the scan is
3529 zero when online fsck is not running.
3536 The code paths of the online fsck scanning code and the :ref:`hooked<fshooks>`
3614 It is useful to compare the mount time quotacheck code to the online repair
3632 Like most online fsck functions, online quotacheck can't write to regular
3635 Therefore, online quotacheck records file resource usage to a shadow dquot
3663 For online quotacheck, hooks are placed in steps 2 and 4.
3697 `online quotacheck
3860 Therefore, online repair of file-based metadata creates a temporary file in
3870 This dependency is the reason why online repair can only use pageable kernel
3916 | requirement means that online repair would have to be able to perform |
3949 Online repair code should use the ``xrep_tempfile_create`` function to create a
3994 for online repair because:
4001 b. Reverse-mapping is critical for the operation of online fsck, so the old
4010 d. Online repair needs to swap the contents of two files that are by definition
4250 referential integrity, so prior to performing the extent swap, online repair
4260 However, this iunlink processing omits the cross-link detection of online
4266 To repair a metadata file, online repair proceeds as follows:
4388 The best that online repair can do at this time is to read directory data
4476 Both online and offline repair can use this strategy.
4568 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=pptrs-online-dir-repai…
4610 Online reconstruction of a file's parent pointer information works similarly to
4651 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=pptrs-online-parent-re…
4746 Without parent pointers, the directory parent pointer online scrub code can
4841 for online fsck functionality.
5111 in this document and now has some familiarity with how XFS performs online
5126 necessary refinements to online repair and lack of customer demand mean that
5205 online fsck can use that instead of adding a separate vectored scrub system
5219 One serious shortcoming of the online fsck code is that the amount of time that
5246 The third piece is the ability to force an online repair.