|
Revision tags: v5.19-rc8, v5.19-rc7 |
|
| #
0d8869fb
|
| 12-Jul-2022 |
Filipe Manana <fdmanana@suse.com> |
btrfs: send: always use the rbtree based inode ref management infrastructure
After the patch "btrfs: send: fix sending link commands for existing file paths", we now have two infrastructures to dete
btrfs: send: always use the rbtree based inode ref management infrastructure
After the patch "btrfs: send: fix sending link commands for existing file paths", we now have two infrastructures to detect and eliminate duplicated inode references (due to names that got removed and re-added between the send and parent snapshots):
1) One that works on a single inode ref/extref item;
2) A new one that works acrosss all ref/extref items for an inode, and it's also more efficient because even in the single ref/extref item case, it does not do a linear search for all the names encoded in the ref/extref item, it uses red black trees to speedup up the search.
There's no good reason to keep both infrastructures, we can use the new one everywhere, and it's always more efficient.
So remove the old infrastructure and change all sites that are using it to use the new one.
Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
3aa5bd36
|
| 12-Jul-2022 |
BingJing Chang <bingjingc@synology.com> |
btrfs: send: fix sending link commands for existing file paths
There is a bug sending link commands for existing file paths. When we're processing an inode, we go over all references. All the new fi
btrfs: send: fix sending link commands for existing file paths
There is a bug sending link commands for existing file paths. When we're processing an inode, we go over all references. All the new file paths are added to the "new_refs" list. And all the deleted file paths are added to the "deleted_refs" list. In the end, when we finish processing the inode, we iterate over all the items in the "new_refs" list and send link commands for those file paths. After that, we go over all the items in the "deleted_refs" list and send unlink commands for them. If there are duplicated file paths in both lists, we will try to create them before we remove them. Then the receiver gets an -EEXIST error when trying the link operations.
Example for having duplicated file paths in both list:
$ btrfs subvolume create vol
# create a file and 2000 hard links to the same inode $ touch vol/foo $ for i in {1..2000}; do link vol/foo vol/$i ; done
# take a snapshot for a parent snapshot $ btrfs subvolume snapshot -r vol snap1
# remove 2000 hard links and re-create the last 1000 links $ for i in {1..2000}; do rm vol/$i; done; $ for i in {1001..2000}; do link vol/foo vol/$i; done
# take another one for a send snapshot $ btrfs subvolume snapshot -r vol snap2
$ mkdir receive_dir $ btrfs send snap2 -p snap1 | btrfs receive receive_dir/ At subvol snap2 link 1238 -> foo ERROR: link 1238 -> foo failed: File exists
In this case, we will have the same file paths added to both lists. In the parent snapshot, reference paths {1..1237} are stored in inode references, but reference paths {1238..2000} are stored in inode extended references. In the send snapshot, all reference paths {1001..2000} are stored in inode references. During the incremental send, we process their inode references first. In record_changed_ref(), we iterate all its inode references in the send/parent snapshot. For every inode reference, we also use find_iref() to check whether the same file path also appears in the parent/send snapshot or not. Inode references {1238..2000} which appear in the send snapshot but not in the parent snapshot are added to the "new_refs" list. On the other hand, Inode references {1..1000} which appear in the parent snapshot but not in the send snapshot are added to the "deleted_refs" list. Next, when we process their inode extended references, reference paths {1238..2000} are added to the "deleted_refs" list because all of them only appear in the parent snapshot. Now two lists contain items as below: "new_refs" list: {1238..2000} "deleted_refs" list: {1..1000}, {1238..2000}
Reference paths {1238..2000} appear in both lists. And as the processing order mentioned about before, the receiver gets an -EEXIST error when trying the link operations.
To fix the bug, the idea is to process the "deleted_refs" list before the "new_refs" list. However, it's not easy to reshuffle the processing order. For one reason, if we do so, we may unlink all the existing paths first, there's no valid path anymore for links. And it's inefficient because we do a bunch of unlinks followed by links for the same paths. Moreover, it makes less sense to have duplications in both lists. A reference path cannot not only be regarded as new but also has been seen in the past, or we won't call it a new path. However, it's also not a good idea to make find_iref() check a reference against all inode references and all inode extended references because it may result in large disk reads.
So we introduce two rbtrees to make the references easier for lookups. And we also introduce record_new_ref_if_needed() and record_deleted_ref_if_needed() for changed_ref() to check and remove duplicated references early.
Reviewed-by: Robbie Ko <robbieko@synology.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: BingJing Chang <bingjingc@synology.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
71ecfc13
|
| 12-Jul-2022 |
BingJing Chang <bingjingc@synology.com> |
btrfs: send: introduce recorded_ref_alloc and recorded_ref_free
Introduce wrappers to allocate and free recorded_ref structures.
Reviewed-by: Robbie Ko <robbieko@synology.com> Reviewed-by: Filipe M
btrfs: send: introduce recorded_ref_alloc and recorded_ref_free
Introduce wrappers to allocate and free recorded_ref structures.
Reviewed-by: Robbie Ko <robbieko@synology.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: BingJing Chang <bingjingc@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
|
Revision tags: v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18 |
|
| #
48247359
|
| 18-May-2022 |
David Sterba <dsterba@suse.com> |
btrfs: send: add new command FILEATTR for file attributes
There are file attributes inherited from previous ext2 SETFLAGS/GETFLAGS and later from XFLAGS interfaces, now commonly found under the 'fil
btrfs: send: add new command FILEATTR for file attributes
There are file attributes inherited from previous ext2 SETFLAGS/GETFLAGS and later from XFLAGS interfaces, now commonly found under the 'fileattr' API. This corresponds to the individual inode bits and that's part of the on-disk format, so this is suitable for the protocol. The other interfaces contain a lot of cruft or bits that btrfs does not support yet.
Currently the value is u64 and matches btrfs_inode_item. Not all the bits can be set by ioctls (like NODATASUM or READONLY), but we can send them over the protocol and leave it up to the receiving side what and how to apply.
As some of the flags, eg. IMMUTABLE, can prevent any further changes, the receiving side needs to understand that and apply the changes in the right order, or possibly with some intermediate steps. This should be easier, future proof and simpler on the protocol layer than implementing in kernel.
Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
22a5b2ab
|
| 17-May-2022 |
David Sterba <dsterba@suse.com> |
btrfs: send: add OTIME as utimes attribute for proto 2+ by default
When send v1 was introduced the otime (inode creation time) was not available, however the attribute in btrfs send protocol exists.
btrfs: send: add OTIME as utimes attribute for proto 2+ by default
When send v1 was introduced the otime (inode creation time) was not available, however the attribute in btrfs send protocol exists. Though it would be possible to add it for v1 too as the attribute would be ignored by v1 receive, let's not change the layout of v1 and only add that to v2+. The otime cannot be changed and is only informative.
Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
9555e1f1
|
| 02-Jun-2022 |
David Sterba <dsterba@suse.com> |
btrfs: send: use boolean types for current inode status
The new, new_gen and deleted indicate a status, use boolean type instead of int.
Signed-off-by: David Sterba <dsterba@suse.com>
|
| #
cec3dad9
|
| 02-Jun-2022 |
David Sterba <dsterba@suse.com> |
btrfs: send: remove old TODO regarding ERESTARTSYS
The whole send operation is restartable and handling properly a buffer write may not be easy. We can't know what caused that and if a short delay a
btrfs: send: remove old TODO regarding ERESTARTSYS
The whole send operation is restartable and handling properly a buffer write may not be easy. We can't know what caused that and if a short delay and retry will fix it or how many retries should be performed in case it's a temporary condition.
The error value is returned to the ioctl caller so in case it's transient problem, the user would be notified about the reason. Remove the TODO note as there's no plan to handle ERESTARTSYS.
Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
8234d3f6
|
| 02-Jun-2022 |
David Sterba <dsterba@suse.com> |
btrfs: send: simplify includes
We don't need the whole ctree.h in send.h, none of the data types defined there are used.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Revision tags: v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17 |
|
| #
d6815592
|
| 17-Mar-2022 |
Omar Sandoval <osandov@fb.com> |
btrfs: send: enable support for stream v2 and compressed writes
Now that the new support is implemented, allow the ioctl to accept v2 and the compressed flag, and update the version in sysfs.
Signe
btrfs: send: enable support for stream v2 and compressed writes
Now that the new support is implemented, allow the ioctl to accept v2 and the compressed flag, and update the version in sysfs.
Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
3ea4dc5b
|
| 17-Mar-2022 |
Omar Sandoval <osandov@fb.com> |
btrfs: send: send compressed extents with encoded writes
Now that all of the pieces are in place, we can use the ENCODED_WRITE command to send compressed extents when appropriate.
Signed-off-by: Om
btrfs: send: send compressed extents with encoded writes
Now that all of the pieces are in place, we can use the ENCODED_WRITE command to send compressed extents when appropriate.
Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
a4b333f2
|
| 04-Apr-2022 |
Omar Sandoval <osandov@fb.com> |
btrfs: send: get send buffer pages for protocol v2
For encoded writes in send v2, we will get the encoded data with btrfs_encoded_read_regular_fill_pages(), which expects a list of raw pages. To avo
btrfs: send: get send buffer pages for protocol v2
For encoded writes in send v2, we will get the encoded data with btrfs_encoded_read_regular_fill_pages(), which expects a list of raw pages. To avoid extra buffers and copies, we should read directly into the send buffer. Therefore, we need the raw pages for the send buffer.
We currently allocate the send buffer with kvmalloc(), which may return a kmalloc'd buffer or a vmalloc'd buffer. For vmalloc, we can get the pages with vmalloc_to_page(). For kmalloc, we could use virt_to_page(). However, the buffer size we use (144K) is not a power of two, which in theory is not guaranteed to return a page-aligned buffer, and in practice would waste a lot of memory due to rounding up to the next power of two. 144K is large enough that it usually gets allocated with vmalloc(), anyways. So, for send v2, replace kvmalloc() with vmalloc() and save the pages in an array.
Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
356bbbb6
|
| 17-Mar-2022 |
Omar Sandoval <osandov@fb.com> |
btrfs: send: write larger chunks when using stream v2
The length field of the send stream TLV header is 16 bits. This means that the maximum amount of data that can be sent for one write is 64K minu
btrfs: send: write larger chunks when using stream v2
The length field of the send stream TLV header is 16 bits. This means that the maximum amount of data that can be sent for one write is 64K minus one. However, encoded writes must be able to send the maximum compressed extent (128K) in one command, or more. To support this, send stream version 2 encodes the DATA attribute differently: it has no length field, and the length is implicitly up to the end of containing command (which has a 32bit length field). Although this is necessary for encoded writes, normal writes can benefit from it, too.
Also add a check to enforce that the DATA attribute is last. It is only strictly necessary for v2, but we might as well make v1 consistent with it.
For v2, let's bump up the send buffer to the maximum compressed extent size plus 16K for the other metadata (144K total). Since this will most likely be vmalloc'd (and always will be after the next commit), we round it up to the next page since we might as well use the rest of the page on systems with >16K pages.
Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
b7c14f23
|
| 17-Mar-2022 |
Omar Sandoval <osandov@fb.com> |
btrfs: send: add stream v2 definitions
This adds the definitions of the new commands for send stream version 2 and their respective attributes: fallocate, FS_IOC_SETFLAGS (a.k.a. chattr), and encode
btrfs: send: add stream v2 definitions
This adds the definitions of the new commands for send stream version 2 and their respective attributes: fallocate, FS_IOC_SETFLAGS (a.k.a. chattr), and encoded writes. It also documents two changes to the send stream format in v2: the receiver shouldn't assume a maximum command size, and the DATA attribute is encoded differently to allow for writes larger than 64k. These will be implemented in subsequent changes, and then the ioctl will accept the new version and flag.
Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
54cab6af
|
| 17-Mar-2022 |
Omar Sandoval <osandov@fb.com> |
btrfs: send: explicitly number commands and attributes
Commit e77fbf990316 ("btrfs: send: prepare for v2 protocol") added _BTRFS_SEND_C_MAX_V* macros equal to the maximum command number for the vers
btrfs: send: explicitly number commands and attributes
Commit e77fbf990316 ("btrfs: send: prepare for v2 protocol") added _BTRFS_SEND_C_MAX_V* macros equal to the maximum command number for the version plus 1, but as written this creates gaps in the number space.
The maximum command number is currently 22, and __BTRFS_SEND_C_MAX_V1 is accordingly 23. But then __BTRFS_SEND_C_MAX_V2 is 24, suggesting that v2 has a command numbered 23, and __BTRFS_SEND_C_MAX is 25, suggesting that 23 and 24 are valid commands.
Instead, let's explicitly number all of the commands, attributes, and sentinel MAX constants.
Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
ca182acc
|
| 17-Mar-2022 |
Omar Sandoval <osandov@fb.com> |
btrfs: send: remove unused send_ctx::{total,cmd}_send_size
We collect these statistics but have never exposed them in any way. I also didn't find any patches that ever attempted to make use of them.
btrfs: send: remove unused send_ctx::{total,cmd}_send_size
We collect these statistics but have never exposed them in any way. I also didn't find any patches that ever attempted to make use of them.
Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
3c69a99b
|
| 25-Jul-2022 |
Michael Ellerman <mpe@ellerman.id.au> |
Merge tag 'v5.19-rc7' into fixes
Merge v5.19-rc7 into fixes to bring in: d11219ad53dc ("amdgpu: disable powerpc support for the newer display engine")
|
| #
4effe18f
|
| 25-Jul-2022 |
Jens Axboe <axboe@kernel.dk> |
Merge branch 'for-5.20/io_uring' into for-5.20/io_uring-zerocopy-send
* for-5.20/io_uring: (716 commits) io_uring: ensure REQ_F_ISREG is set async offload net: fix compat pointer in get_compat_m
Merge branch 'for-5.20/io_uring' into for-5.20/io_uring-zerocopy-send
* for-5.20/io_uring: (716 commits) io_uring: ensure REQ_F_ISREG is set async offload net: fix compat pointer in get_compat_msghdr() io_uring: Don't require reinitable percpu_ref io_uring: fix types in io_recvmsg_multishot_overflow io_uring: Use atomic_long_try_cmpxchg in __io_account_mem io_uring: support multishot in recvmsg net: copy from user before calling __get_compat_msghdr net: copy from user before calling __copy_msghdr io_uring: support 0 length iov in buffer select in compat io_uring: fix multishot ending when not polled io_uring: add netmsg cache io_uring: impose max limit on apoll cache io_uring: add abstraction around apoll cache io_uring: move apoll cache to poll.c io_uring: consolidate hash_locked io-wq handling io_uring: clear REQ_F_HASH_LOCKED on hash removal io_uring: don't race double poll setting REQ_F_ASYNC_DATA io_uring: don't miss setting REQ_F_DOUBLE_POLL io_uring: disable multishot recvmsg io_uring: only trace one of complete or overflow ...
Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
| #
6e0e846e
|
| 21-Jul-2022 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
| #
dc14036f
|
| 18-Jul-2022 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 5.19-rc7 into usb-next
We need the USB fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| #
0698461a
|
| 18-Jul-2022 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
Merge remote-tracking branch 'torvalds/master' into perf/core
To update the perf/core codebase.
Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end of evsel__config(), after what
Merge remote-tracking branch 'torvalds/master' into perf/core
To update the perf/core codebase.
Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end of evsel__config(), after what was added in:
49c692b7dfc9b6c0 ("perf offcpu: Accept allowed sample types only")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
| #
972a278f
|
| 16-Jul-2022 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'for-5.19-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs reverts from David Sterba: "Due to a recent report [1] we need to revert the radix tree to xarra
Merge tag 'for-5.19-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs reverts from David Sterba: "Due to a recent report [1] we need to revert the radix tree to xarray conversion patches.
There's a problem with sleeping under spinlock, when xa_insert could allocate memory under pressure. We use GFP_NOFS so this is a real problem that we unfortunately did not discover during review.
I'm sorry to do such change at rc6 time but the revert is IMO the safer option, there are patches to use mutex instead of the spin locks but that would need more testing. The revert branch has been tested on a few setups, all seem ok.
The conversion to xarray will be revisited in the future"
Link: https://lore.kernel.org/linux-btrfs/cover.1657097693.git.fdmanana@suse.com/ [1]
* tag 'for-5.19-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Revert "btrfs: turn delayed_nodes_tree into an XArray" Revert "btrfs: turn name_cache radix tree into XArray in send_ctx" Revert "btrfs: turn fs_info member buffer_radix into XArray" Revert "btrfs: turn fs_roots_radix in btrfs_fs_info into an XArray"
show more ...
|
| #
5b8418b8
|
| 15-Jul-2022 |
David Sterba <dsterba@suse.com> |
Revert "btrfs: turn name_cache radix tree into XArray in send_ctx"
This reverts commit 4076942021fe14efecae33bf98566df6dd5ae6f7.
Revert the xarray conversion, there's a problem with potential sleep
Revert "btrfs: turn name_cache radix tree into XArray in send_ctx"
This reverts commit 4076942021fe14efecae33bf98566df6dd5ae6f7.
Revert the xarray conversion, there's a problem with potential sleep-inside-spinlock [1] when calling xa_insert that triggers GFP_NOFS allocation. The radix tree used the preloading mechanism to avoid sleeping but this is not available in xarray.
Conversion from spin lock to mutex is possible but at time of rc6 is riskier than a clean revert.
[1] https://lore.kernel.org/linux-btrfs/cover.1657097693.git.fdmanana@suse.com/
Reported-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
show more ...
|
| #
f83d9396
|
| 14-Jul-2022 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next-fixes
Backmerging from drm/drm-next for the final fixes that will go into v5.20.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
| #
a63f7778
|
| 08-Jul-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v5.19-rc5' into next
Merge with mainline to bring up the latest definition from MFD subsystem needed for Mediatek keypad driver.
|
| #
dd84cfff
|
| 04-Jul-2022 |
Takashi Iwai <tiwai@suse.de> |
Merge tag 'asoc-fix-v5.19-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.19
A collection of fixes for v5.19, quite large but nothing major -
Merge tag 'asoc-fix-v5.19-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.19
A collection of fixes for v5.19, quite large but nothing major - a good chunk of it is more stuff that was identified by mixer-test regarding event generation.
show more ...
|