- 17 12月, 2018 4 次提交
-
-
由 Filipe Manana 提交于
Since cloning and deduplication are no longer Btrfs specific operations, we now have generic code to handle parameter validation, compare file ranges used for deduplication, clear capabilities when cloning, etc. This change makes Btrfs use it, eliminating a lot of code in Btrfs and also fixing a few bugs, such as: 1) When cloning, the destination file's capabilities were not dropped (the fstest generic/513 tests this); 2) We were not checking if the destination file is immutable; 3) Not checking if either the source or destination files are swap files (swap file support is coming soon for Btrfs); 4) System limits were not checked (resource limits and O_LARGEFILE). Note that the generic helper generic_remap_file_range_prep() does start and waits for writeback by calling filemap_write_and_wait_range(), however that is not enough for Btrfs for two reasons: 1) With compression, we need to start writeback twice in order to get the pages marked for writeback and ordered extents created; 2) filemap_write_and_wait_range() (and all its other variants) only waits for the IO to complete, but we need to wait for the ordered extents to finish, so that when we do the actual reflinking operations the file extent items are in the fs tree. This is also important due to the fact that the generic helper, for the deduplication case, compares the contents of the pages in the requested range, which might require reading extents from disk in the very unlikely case that pages get invalidated after writeback finishes (so the file extent items must be up to date in the fs tree). Since these reasons are specific to Btrfs we have to do it in the Btrfs code before calling generic_remap_file_range_prep(). This also results in a simpler way of dealing with existing delalloc in the source/target ranges, specially for the deduplication case where we used to lock all the pages first and then if we found any dealloc for the range, or ordered extent, we would unlock the pages trigger writeback and wait for ordered extents to complete, then lock all the pages again and check if deduplication can be done. So now we get a simpler approach: lock the inodes, then trigger writeback and then wait for ordered extents to complete. So make btrfs use generic_remap_file_range_prep() (XFS and OCFS2 use it) to eliminate duplicated code, fix a few bugs and benefit from future bug fixes done there - for example the recent clone and dedupe bugs involving reflinking a partial EOF block got a counterpart fix in the generic helper, since it affected all filesystems supporting these operations, so we no longer need special checks in Btrfs for them. Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
Currently btrfs_fs_info structure contains a copy of the fsid/metadata_uuid fields. Same values are also contained in the btrfs_fs_devices structure which fs_info has a reference to. Let's reduce duplication by removing the fields from fs_info and always refer to the ones in fs_devices. No functional changes. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Ethan Lien 提交于
Snapshot is expected to be fast. But if there are writers steadily creating dirty pages in our subvolume, the snapshot may take a very long time to complete. To fix the problem, we use tagged writepage for snapshot flusher as we do in the generic write_cache_pages(), so we can omit pages dirtied after the snapshot command. This does not change the semantics regarding which data get to the snapshot, if there are pages being dirtied during the snapshotting operation. There's a sync called before snapshot is taken in old/new case, any IO in flight just after that may be in the snapshot but this depends on other system effects that might still sync the IO. We do a simple snapshot speed test on a Intel D-1531 box: fio --ioengine=libaio --iodepth=32 --bs=4k --rw=write --size=64G --direct=0 --thread=1 --numjobs=1 --time_based --runtime=120 --filename=/mnt/sub/testfile --name=job1 --group_reporting & sleep 5; time btrfs sub snap -r /mnt/sub /mnt/snap; killall fio original: 1m58sec patched: 6.54sec This is the best case for this patch since for a sequential write case, we omit nearly all pages dirtied after the snapshot command. For a multi writers, random write test: fio --ioengine=libaio --iodepth=32 --bs=4k --rw=randwrite --size=64G --direct=0 --thread=1 --numjobs=4 --time_based --runtime=120 --filename=/mnt/sub/testfile --name=job1 --group_reporting & sleep 5; time btrfs sub snap -r /mnt/sub /mnt/snap; killall fio original: 15.83sec patched: 10.35sec The improvement is smaller compared to the sequential write case, since we omit only half of the pages dirtied after snapshot command. Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NEthan Lien <ethanlien@synology.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Omar Sandoval 提交于
A later patch will implement swap file support for Btrfs, but before we do that, we need to make sure that the various Btrfs ioctls cannot change a swap file. When a swap file is active, we must make sure that the extents of the file are not moved and that they don't become shared. That means that the following are not safe: - chattr +c (enable compression) - reflink - dedupe - snapshot - defrag Don't allow those to happen on an active swap file. Additionally, balance, resize, device remove, and device replace are also unsafe if they affect an active swapfile. Add a red-black tree of block groups and devices which contain an active swapfile. Relocation checks each block group against this tree and skips it or errors out for balance or resize, respectively. Device remove and device replace check the tree for the device they will operate on. Note that we don't have to worry about chattr -C (disable nocow), which we ignore for non-empty files, because an active swapfile must be non-empty and can't be truncated. We also don't have to worry about autodefrag because it's only done on COW files. Truncate and fallocate are already taken care of by the generic code. Device add doesn't do relocation so it's not an issue, either. Signed-off-by: NOmar Sandoval <osandov@fb.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 06 11月, 2018 2 次提交
-
-
由 Filipe Manana 提交于
We currently allow cloning a range from a file which includes the last block of the file even if the file's size is not aligned to the block size. This is fine and useful when the destination file has the same size, but when it does not and the range ends somewhere in the middle of the destination file, it leads to corruption because the bytes between the EOF and the end of the block have undefined data (when there is support for discard/trimming they have a value of 0x00). Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ export foo_size=$((256 * 1024 + 100)) $ xfs_io -f -c "pwrite -S 0x3c 0 $foo_size" /mnt/foo $ xfs_io -f -c "pwrite -S 0xb5 0 1M" /mnt/bar $ xfs_io -c "reflink /mnt/foo 0 512K $foo_size" /mnt/bar $ od -A d -t x1 /mnt/bar 0000000 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 * 0524288 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c * 0786528 3c 3c 3c 3c 00 00 00 00 00 00 00 00 00 00 00 00 0786544 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0790528 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 * 1048576 The bytes in the range from 786532 (512Kb + 256Kb + 100 bytes) to 790527 (512Kb + 256Kb + 4Kb - 1) got corrupted, having now a value of 0x00 instead of 0xb5. This is similar to the problem we had for deduplication that got recently fixed by commit de02b9f6 ("Btrfs: fix data corruption when deduplicating between different files"). Fix this by not allowing such operations to be performed and return the errno -EINVAL to user space. This is what XFS is doing as well at the VFS level. This change however now makes us return -EINVAL instead of -EOPNOTSUPP for cases where the source range maps to an inline extent and the destination range's end is smaller then the destination file's size, since the detection of inline extents is done during the actual process of dropping file extent items (at __btrfs_drop_extents()). Returning the -EINVAL error is done early on and solely based on the input parameters (offsets and length) and destination file's size. This makes us consistent with XFS and anyone else supporting cloning since this case is now checked at a higher level in the VFS and is where the -EINVAL will be returned from starting with kernel 4.20 (the VFS changed was introduced in 4.20-rc1 by commit 07d19dc9 ("vfs: avoid problematic remapping requests into partial EOF block"). So this change is more geared towards stable kernels, as it's unlikely the new VFS checks get removed intentionally. A test case for fstests follows soon, as well as an update to filter existing tests that expect -EOPNOTSUPP to accept -EINVAL as well. CC: <stable@vger.kernel.org> # 4.4+ Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Filipe Manana 提交于
If we attempt to deduplicate the last block of a file A into the middle of a file B, and file A's size is not a multiple of the block size, we end rounding the deduplication length to 0 bytes, to avoid the data corruption issue fixed by commit de02b9f6 ("Btrfs: fix data corruption when deduplicating between different files"). However a length of zero will cause the insertion of an extent state with a start value greater (by 1) then the end value, leading to a corrupt extent state that will trigger a warning and cause chaos such as an infinite loop during inode eviction. Example trace: [96049.833585] ------------[ cut here ]------------ [96049.833714] WARNING: CPU: 0 PID: 24448 at fs/btrfs/extent_io.c:436 insert_state+0x101/0x120 [btrfs] [96049.833767] CPU: 0 PID: 24448 Comm: xfs_io Not tainted 4.19.0-rc7-btrfs-next-39 #1 [96049.833768] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [96049.833780] RIP: 0010:insert_state+0x101/0x120 [btrfs] [96049.833783] RSP: 0018:ffffafd2c3707af0 EFLAGS: 00010282 [96049.833785] RAX: 0000000000000000 RBX: 000000000004dfff RCX: 0000000000000006 [96049.833786] RDX: 0000000000000007 RSI: ffff99045c143230 RDI: ffff99047b2168a0 [96049.833787] RBP: ffff990457851cd0 R08: 0000000000000001 R09: 0000000000000000 [96049.833787] R10: ffffafd2c3707ab8 R11: 0000000000000000 R12: ffff9903b93b12c8 [96049.833788] R13: 000000000004e000 R14: ffffafd2c3707b80 R15: ffffafd2c3707b78 [96049.833790] FS: 00007f5c14e7d700(0000) GS:ffff99047b200000(0000) knlGS:0000000000000000 [96049.833791] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [96049.833792] CR2: 00007f5c146abff8 CR3: 0000000115f4c004 CR4: 00000000003606f0 [96049.833795] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [96049.833796] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [96049.833796] Call Trace: [96049.833809] __set_extent_bit+0x46c/0x6a0 [btrfs] [96049.833823] lock_extent_bits+0x6b/0x210 [btrfs] [96049.833831] ? _raw_spin_unlock+0x24/0x30 [96049.833841] ? test_range_bit+0xdf/0x130 [btrfs] [96049.833853] lock_extent_range+0x8e/0x150 [btrfs] [96049.833864] btrfs_double_extent_lock+0x78/0xb0 [btrfs] [96049.833875] btrfs_extent_same_range+0x14e/0x550 [btrfs] [96049.833885] ? rcu_read_lock_sched_held+0x3f/0x70 [96049.833890] ? __kmalloc_node+0x2b0/0x2f0 [96049.833899] ? btrfs_dedupe_file_range+0x19a/0x280 [btrfs] [96049.833909] btrfs_dedupe_file_range+0x270/0x280 [btrfs] [96049.833916] vfs_dedupe_file_range_one+0xd9/0xe0 [96049.833919] vfs_dedupe_file_range+0x131/0x1b0 [96049.833924] do_vfs_ioctl+0x272/0x6e0 [96049.833927] ? __fget+0x113/0x200 [96049.833931] ksys_ioctl+0x70/0x80 [96049.833933] __x64_sys_ioctl+0x16/0x20 [96049.833937] do_syscall_64+0x60/0x1b0 [96049.833939] entry_SYSCALL_64_after_hwframe+0x49/0xbe [96049.833941] RIP: 0033:0x7f5c1478ddd7 [96049.833943] RSP: 002b:00007ffe15b196a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [96049.833945] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5c1478ddd7 [96049.833946] RDX: 00005625ece322d0 RSI: 00000000c0189436 RDI: 0000000000000004 [96049.833947] RBP: 0000000000000000 R08: 00007f5c14a46f48 R09: 0000000000000040 [96049.833948] R10: 0000000000000541 R11: 0000000000000202 R12: 0000000000000000 [96049.833949] R13: 0000000000000000 R14: 0000000000000004 R15: 00005625ece322d0 [96049.833954] irq event stamp: 6196 [96049.833956] hardirqs last enabled at (6195): [<ffffffff91b00663>] console_unlock+0x503/0x640 [96049.833958] hardirqs last disabled at (6196): [<ffffffff91a037dd>] trace_hardirqs_off_thunk+0x1a/0x1c [96049.833959] softirqs last enabled at (6114): [<ffffffff92600370>] __do_softirq+0x370/0x421 [96049.833964] softirqs last disabled at (6095): [<ffffffff91a8dd4d>] irq_exit+0xcd/0xe0 [96049.833965] ---[ end trace db7b05f01b7fa10c ]--- [96049.935816] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910 [96049.935822] irq event stamp: 6584 [96049.935823] hardirqs last enabled at (6583): [<ffffffff91b00663>] console_unlock+0x503/0x640 [96049.935825] hardirqs last disabled at (6584): [<ffffffff91a037dd>] trace_hardirqs_off_thunk+0x1a/0x1c [96049.935827] softirqs last enabled at (6328): [<ffffffff92600370>] __do_softirq+0x370/0x421 [96049.935828] softirqs last disabled at (6313): [<ffffffff91a8dd4d>] irq_exit+0xcd/0xe0 [96049.935829] ---[ end trace db7b05f01b7fa123 ]--- [96049.935840] ------------[ cut here ]------------ [96049.936065] WARNING: CPU: 1 PID: 24463 at fs/btrfs/extent_io.c:436 insert_state+0x101/0x120 [btrfs] [96049.936107] CPU: 1 PID: 24463 Comm: umount Tainted: G W 4.19.0-rc7-btrfs-next-39 #1 [96049.936108] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [96049.936117] RIP: 0010:insert_state+0x101/0x120 [btrfs] [96049.936119] RSP: 0018:ffffafd2c3637bc0 EFLAGS: 00010282 [96049.936120] RAX: 0000000000000000 RBX: 000000000004dfff RCX: 0000000000000006 [96049.936121] RDX: 0000000000000007 RSI: ffff990445cf88e0 RDI: ffff99047b2968a0 [96049.936122] RBP: ffff990457851cd0 R08: 0000000000000001 R09: 0000000000000000 [96049.936123] R10: ffffafd2c3637b88 R11: 0000000000000000 R12: ffff9904574301e8 [96049.936124] R13: 000000000004e000 R14: ffffafd2c3637c50 R15: ffffafd2c3637c48 [96049.936125] FS: 00007fe4b87e72c0(0000) GS:ffff99047b280000(0000) knlGS:0000000000000000 [96049.936126] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [96049.936128] CR2: 00005562e52618d8 CR3: 00000001151c8005 CR4: 00000000003606e0 [96049.936129] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [96049.936131] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [96049.936131] Call Trace: [96049.936141] __set_extent_bit+0x46c/0x6a0 [btrfs] [96049.936154] lock_extent_bits+0x6b/0x210 [btrfs] [96049.936167] btrfs_evict_inode+0x1e1/0x5a0 [btrfs] [96049.936172] evict+0xbf/0x1c0 [96049.936174] dispose_list+0x51/0x80 [96049.936176] evict_inodes+0x193/0x1c0 [96049.936180] generic_shutdown_super+0x3f/0x110 [96049.936182] kill_anon_super+0xe/0x30 [96049.936189] btrfs_kill_super+0x13/0x100 [btrfs] [96049.936191] deactivate_locked_super+0x3a/0x70 [96049.936193] cleanup_mnt+0x3b/0x80 [96049.936195] task_work_run+0x93/0xc0 [96049.936198] exit_to_usermode_loop+0xfa/0x100 [96049.936201] do_syscall_64+0x17f/0x1b0 [96049.936202] entry_SYSCALL_64_after_hwframe+0x49/0xbe [96049.936204] RIP: 0033:0x7fe4b80cfb37 [96049.936206] RSP: 002b:00007ffff092b688 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [96049.936207] RAX: 0000000000000000 RBX: 00005562e5259060 RCX: 00007fe4b80cfb37 [96049.936208] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00005562e525faa0 [96049.936209] RBP: 00005562e525faa0 R08: 00005562e525f770 R09: 0000000000000015 [96049.936210] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fe4b85d1e64 [96049.936211] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910 [96049.936211] R13: 0000000000000000 R14: 00005562e5259240 R15: 00007ffff092b910 [96049.936216] irq event stamp: 6616 [96049.936219] hardirqs last enabled at (6615): [<ffffffff91b00663>] console_unlock+0x503/0x640 [96049.936219] hardirqs last disabled at (6616): [<ffffffff91a037dd>] trace_hardirqs_off_thunk+0x1a/0x1c [96049.936222] softirqs last enabled at (6328): [<ffffffff92600370>] __do_softirq+0x370/0x421 [96049.936222] softirqs last disabled at (6313): [<ffffffff91a8dd4d>] irq_exit+0xcd/0xe0 [96049.936223] ---[ end trace db7b05f01b7fa124 ]--- The second stack trace, from inode eviction, is repeated forever due to the infinite loop during eviction. This is the same type of problem fixed way back in 2015 by commit 113e8283 ("Btrfs: fix inode eviction infinite loop after extent_same ioctl") and commit ccccf3d6 ("Btrfs: fix inode eviction infinite loop after cloning into it"). So fix this by returning immediately if the deduplication range length gets rounded down to 0 bytes, as there is nothing that needs to be done in such case. Example reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ xfs_io -f -c "pwrite -S 0xe6 0 100" /mnt/foo $ xfs_io -f -c "pwrite -S 0xe6 0 1M" /mnt/bar # Unmount the filesystem and mount it again so that we start without any # extent state records when we ask for the deduplication. $ umount /mnt $ mount /dev/sdb /mnt $ xfs_io -c "dedupe /mnt/foo 0 500K 100" /mnt/bar # This unmount triggers the infinite loop. $ umount /mnt A test case for fstests will follow soon. Fixes: de02b9f6 ("Btrfs: fix data corruption when deduplicating between different files") CC: <stable@vger.kernel.org> # 4.19+ Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 30 10月, 2018 2 次提交
-
-
由 Darrick J. Wong 提交于
Change the remap_file_range functions to take a number of bytes to operate upon and return the number of bytes they operated on. This is a requirement for allowing fs implementations to return short clone/dedupe results to the user, which will enable us to obey resource limits in a graceful manner. A subsequent patch will enable copy_file_range to signal to the ->clone_file_range implementation that it can handle a short length, which will be returned in the function's return value. For now the short return is not implemented anywhere so the behavior won't change -- either copy_file_range manages to clone the entire range or it tries an alternative. Neither clone ioctl can take advantage of this, alas. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NAmir Goldstein <amir73il@gmail.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Darrick J. Wong 提交于
Combine the clone_file_range and dedupe_file_range operations into a single remap_file_range file operation dispatch since they're fundamentally the same operation. The differences between the two can be made in the prep functions. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NAmir Goldstein <amir73il@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 15 10月, 2018 4 次提交
-
-
由 Qu Wenruo 提交于
[BUG] fstrim on some btrfs only trims the unallocated space, not trimming any space in existing block groups. [CAUSE] Before fstrim_range passed to btrfs_trim_fs(), it gets truncated to range [0, super->total_bytes). So later btrfs_trim_fs() will only be able to trim block groups in range [0, super->total_bytes). While for btrfs, any bytenr aligned to sectorsize is valid, since btrfs uses its logical address space, there is nothing limiting the location where we put block groups. For filesystem with frequent balance, it's quite easy to relocate all block groups and bytenr of block groups will start beyond super->total_bytes. In that case, btrfs will not trim existing block groups. [FIX] Just remove the truncation in btrfs_ioctl_fitrim(), so btrfs_trim_fs() can get the unmodified range, which is normally set to [0, U64_MAX]. Reported-by: NChris Murphy <lists@colorremedies.com> Fixes: f4c697e6 ("btrfs: return EINVAL if start > total_bytes in fitrim ioctl") CC: <stable@vger.kernel.org> # v4.4+ Signed-off-by: NQu Wenruo <wqu@suse.com> Reviewed-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Su Yue 提交于
Since commit 8b62f87b ("Btrfs: rework outstanding_extents"), manual operations of outstanding_extent in btrfs_inode are replaced by btrfs_mod_outstanding_extents(). The one in cluster_pages_for_defrag seems to be lost, so replace it of btrfs_mod_outstanding_extents(). Fixes: 8b62f87b ("Btrfs: rework outstanding_extents") Signed-off-by: NSu Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Misono Tomohiro 提交于
There are two members in struct btrfs_root which indicate root's objectid: objectid and root_key.objectid. They are both set to the same value in __setup_root(): static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info, u64 objectid) { ... root->objectid = objectid; ... root->root_key.objectid = objecitd; ... } and not changed to other value after initialization. grep in btrfs directory shows both are used in many places: $ grep -rI "root->root_key.objectid" | wc -l 133 $ grep -rI "root->objectid" | wc -l 55 (4.17, inc. some noise) It is confusing to have two similar variable names and it seems that there is no rule about which should be used in a certain case. Since ->root_key itself is needed for tree reloc tree, let's remove 'objecitd' member and unify code to use ->root_key.objectid in all places. Signed-off-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: NQu Wenruo <wqu@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
All callers pass the root tree of dir, we can push that down to the function itself. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 23 8月, 2018 1 次提交
-
-
由 Filipe Manana 提交于
If we deduplicate extents between two different files we can end up corrupting data if the source range ends at the size of the source file, the source file's size is not aligned to the filesystem's block size and the destination range does not go past the size of the destination file size. Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ xfs_io -f -c "pwrite -S 0x6b 0 2518890" /mnt/foo # The first byte with a value of 0xae starts at an offset (2518890) # which is not a multiple of the sector size. $ xfs_io -c "pwrite -S 0xae 2518890 102398" /mnt/foo # Confirm the file content is full of bytes with values 0x6b and 0xae. $ od -t x1 /mnt/foo 0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b * 11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae 11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae * 11777540 ae ae ae ae ae ae ae ae 11777550 # Create a second file with a length not aligned to the sector size, # whose bytes all have the value 0x6b, so that its extent(s) can be # deduplicated with the first file. $ xfs_io -f -c "pwrite -S 0x6b 0 557771" /mnt/bar # Now deduplicate the entire second file into a range of the first file # that also has all bytes with the value 0x6b. The destination range's # end offset must not be aligned to the sector size and must be less # then the offset of the first byte with the value 0xae (byte at offset # 2518890). $ xfs_io -c "dedupe /mnt/bar 0 1957888 557771" /mnt/foo # The bytes in the range starting at offset 2515659 (end of the # deduplication range) and ending at offset 2519040 (start offset # rounded up to the block size) must all have the value 0xae (and not # replaced with 0x00 values). In other words, we should have exactly # the same data we had before we asked for deduplication. $ od -t x1 /mnt/foo 0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b * 11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae 11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae * 11777540 ae ae ae ae ae ae ae ae 11777550 # Unmount the filesystem and mount it again. This guarantees any file # data in the page cache is dropped. $ umount /dev/sdb $ mount /dev/sdb /mnt $ od -t x1 /mnt/foo 0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b * 11461300 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 00 11461320 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 11470000 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae * 11777540 ae ae ae ae ae ae ae ae 11777550 # The bytes in range 2515659 to 2519040 have a value of 0x00 and not a # value of 0xae, data corruption happened due to the deduplication # operation. So fix this by rounding down, to the sector size, the length used for the deduplication when the following conditions are met: 1) Source file's range ends at its i_size; 2) Source file's i_size is not aligned to the sector size; 3) Destination range does not cross the i_size of the destination file. Fixes: e1d227a4 ("btrfs: Handle unaligned length in extent_same") CC: stable@vger.kernel.org # 4.2+ Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 18 8月, 2018 1 次提交
-
-
由 Robbie Ko 提交于
Commit e9894fd3 ("Btrfs: fix snapshot vs nocow writting") forced nocow writes to fallback to COW, during writeback, when a snapshot is created. This resulted in writes made before creating the snapshot to unexpectedly fail with ENOSPC during writeback when success (0) was returned to user space through the write system call. The steps leading to this problem are: 1. When it's not possible to allocate data space for a write, the buffered write path checks if a NOCOW write is possible. If it is, it will not reserve space and success (0) is returned to user space. 2. Then when a snapshot is created, the root's will_be_snapshotted atomic is incremented and writeback is triggered for all inode's that belong to the root being snapshotted. Incrementing that atomic forces all previous writes to fallback to COW during writeback (running delalloc). 3. This results in the writeback for the inodes to fail and therefore setting the ENOSPC error in their mappings, so that a subsequent fsync on them will report the error to user space. So it's not a completely silent data loss (since fsync will report ENOSPC) but it's a very unexpected and undesirable behaviour, because if a clean shutdown/unmount of the filesystem happens without previous calls to fsync, it is expected to have the data present in the files after mounting the filesystem again. So fix this by adding a new atomic named snapshot_force_cow to the root structure which prevents this behaviour and works the following way: 1. It is incremented when we start to create a snapshot after triggering writeback and before waiting for writeback to finish. 2. This new atomic is now what is used by writeback (running delalloc) to decide whether we need to fallback to COW or not. Because we incremented this new atomic after triggering writeback in the snapshot creation ioctl, we ensure that all buffered writes that happened before snapshot creation will succeed and not fallback to COW (which would make them fail with ENOSPC). 3. The existing atomic, will_be_snapshotted, is kept because it is used to force new buffered writes, that start after we started snapshotting, to reserve data space even when NOCOW is possible. This makes these writes fail early with ENOSPC when there's no available space to allocate, preventing the unexpected behaviour of writeback later failing with ENOSPC due to a fallback to COW mode. Fixes: e9894fd3 ("Btrfs: fix snapshot vs nocow writting") Signed-off-by: NRobbie Ko <robbieko@synology.com> Reviewed-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 06 8月, 2018 14 次提交
-
-
由 Misono Tomohiro 提交于
Cleanup patch and no functional changes. Signed-off-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: NQu Wenruo <wqu@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
It can be referenced from the passed transaction handle. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Adam Borowski 提交于
Requiring a read-write descriptor conflicts both ways with exec, returning ETXTBSY whenever you try to defrag a program that's currently being run, or causing intermittent exec failures on a live system being defragged. As defrag doesn't change the file's contents in any way, there's no reason to consider it a rw operation. Thus, let's check only whether the file could have been opened rw. Such access control is still needed as currently defrag can use extra disk space, and might trigger bugs. We return EINVAL when the request is invalid; here it's ok but merely the user has insufficient privileges. Thus, the EPERM return value reflects the error better -- as discussed in the identical case for dedupe. According to codesearch.debian.net, no userspace program distinguishes these values beyond strerror(). Signed-off-by: NAdam Borowski <kilobyte@angband.pl> Reviewed-by: NDavid Sterba <dsterba@suse.com> [ fold the EPERM patch from Adam ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
It can be fetched from the transaction handle. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
It can be fetched from the transaction handle. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
It can be fetched from the transaction handle. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
It can be fetched from the transaction handle. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
It can be fetched from the transaction handle. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
It can be fetched from the transaction handle. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
It can be fetched from the transaction handle. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
Commit 5d23515b ("btrfs: Move qgroup rescan on quota enable to btrfs_quota_enable") not only resulted in an easier to follow code but it also introduced a subtle bug. It changed the timing when the initial transaction rescan was happening: - before the commit: it would happen after transaction commit had occured - after the commit: it might happen before the transaction was committed This results in failure to correctly rescan the quota since there could be data which is still not committed on disk. This patch aims to fix this by moving the transaction creation/commit inside btrfs_quota_enable, which allows to schedule the quota commit after the transaction has been committed. Fixes: 5d23515b ("btrfs: Move qgroup rescan on quota enable to btrfs_quota_enable") Reported-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com> Link: https://marc.info/?l=linux-btrfs&m=152999289017582Signed-off-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
Remove includes if none of the interfaces and exports is used in the given source file. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
When a new extent buffer is allocated there are a few mandatory fields which need to be set in order for the buffer to be sane: level, generation, bytenr, backref_rev, owner and FSID/UUID. Currently this is open coded in the callers of btrfs_alloc_tree_block, meaning it's fairly high in the abstraction hierarchy of operations. This patch solves this by simply moving this init code in btrfs_init_new_buffer, since this is the function which initializes a newly allocated extent buffer. No functional changes. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Bart Van Assche 提交于
This patch avoids that building the BTRFS source code with smatch triggers complaints about inconsistent indenting. Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 13 7月, 2018 1 次提交
-
-
由 Naohiro Aota 提交于
btrfs_cmp_data_free() puts cmp's src_pages and dst_pages, but leaves their page address intact. Now, if you hit "goto again" in btrfs_extent_same_range() and hit some error in btrfs_cmp_data_prepare(), you'll try to unlock/put already put pages. This is simple fix to reset the address to avoid use-after-free. Fixes: 67b07bd4 ("Btrfs: reuse cmp workspace in EXTENT_SAME ioctl") Signed-off-by: NNaohiro Aota <naota@elisp.net> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 07 7月, 2018 2 次提交
-
-
由 Miklos Szeredi 提交于
Clean up f_op->dedupe_file_range() interface. 1) Use loff_t for offsets and length instead of u64 2) Order the arguments the same way as {copy|clone}_file_range(). Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Miklos Szeredi 提交于
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
- 22 6月, 2018 1 次提交
-
-
由 Lu Fengqi 提交于
If this condition ((BTRFS_I(src)->flags & BTRFS_INODE_NODATASUM) != (BTRFS_I(dst)->flags & BTRFS_INODE_NODATASUM)) is hit, we will go to free the uninitialized cmp.src_pages and cmp.dst_pages. Fixes: 67b07bd4 ("Btrfs: reuse cmp workspace in EXTENT_SAME ioctl") Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 06 6月, 2018 1 次提交
-
-
由 Deepa Dinamani 提交于
struct timespec is not y2038 safe. Transition vfs to use y2038 safe struct timespec64 instead. The change was made with the help of the following cocinelle script. This catches about 80% of the changes. All the header file and logic changes are included in the first 5 rules. The rest are trivial substitutions. I avoid changing any of the function signatures or any other filesystem specific data structures to keep the patch simple for review. The script can be a little shorter by combining different cases. But, this version was sufficient for my usecase. virtual patch @ depends on patch @ identifier now; @@ - struct timespec + struct timespec64 current_time ( ... ) { - struct timespec now = current_kernel_time(); + struct timespec64 now = current_kernel_time64(); ... - return timespec_trunc( + return timespec64_trunc( ... ); } @ depends on patch @ identifier xtime; @@ struct \( iattr \| inode \| kstat \) { ... - struct timespec xtime; + struct timespec64 xtime; ... } @ depends on patch @ identifier t; @@ struct inode_operations { ... int (*update_time) (..., - struct timespec t, + struct timespec64 t, ...); ... } @ depends on patch @ identifier t; identifier fn_update_time =~ "update_time$"; @@ fn_update_time (..., - struct timespec *t, + struct timespec64 *t, ...) { ... } @ depends on patch @ identifier t; @@ lease_get_mtime( ... , - struct timespec *t + struct timespec64 *t ) { ... } @te depends on patch forall@ identifier ts; local idexpression struct inode *inode_node; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; identifier fn_update_time =~ "update_time$"; identifier fn; expression e, E3; local idexpression struct inode *node1; local idexpression struct inode *node2; local idexpression struct iattr *attr1; local idexpression struct iattr *attr2; local idexpression struct iattr attr; identifier i_xtime1 =~ "^i_[acm]time$"; identifier i_xtime2 =~ "^i_[acm]time$"; identifier ia_xtime1 =~ "^ia_[acm]time$"; identifier ia_xtime2 =~ "^ia_[acm]time$"; @@ ( ( - struct timespec ts; + struct timespec64 ts; | - struct timespec ts = current_time(inode_node); + struct timespec64 ts = current_time(inode_node); ) <+... when != ts ( - timespec_equal(&inode_node->i_xtime, &ts) + timespec64_equal(&inode_node->i_xtime, &ts) | - timespec_equal(&ts, &inode_node->i_xtime) + timespec64_equal(&ts, &inode_node->i_xtime) | - timespec_compare(&inode_node->i_xtime, &ts) + timespec64_compare(&inode_node->i_xtime, &ts) | - timespec_compare(&ts, &inode_node->i_xtime) + timespec64_compare(&ts, &inode_node->i_xtime) | ts = current_time(e) | fn_update_time(..., &ts,...) | inode_node->i_xtime = ts | node1->i_xtime = ts | ts = inode_node->i_xtime | <+... attr1->ia_xtime ...+> = ts | ts = attr1->ia_xtime | ts.tv_sec | ts.tv_nsec | btrfs_set_stack_timespec_sec(..., ts.tv_sec) | btrfs_set_stack_timespec_nsec(..., ts.tv_nsec) | - ts = timespec64_to_timespec( + ts = ... -) | - ts = ktime_to_timespec( + ts = ktime_to_timespec64( ...) | - ts = E3 + ts = timespec_to_timespec64(E3) | - ktime_get_real_ts(&ts) + ktime_get_real_ts64(&ts) | fn(..., - ts + timespec64_to_timespec(ts) ,...) ) ...+> ( <... when != ts - return ts; + return timespec64_to_timespec(ts); ...> ) | - timespec_equal(&node1->i_xtime1, &node2->i_xtime2) + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2) | - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2) + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2) | - timespec_compare(&node1->i_xtime1, &node2->i_xtime2) + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2) | node1->i_xtime1 = - timespec_trunc(attr1->ia_xtime1, + timespec64_trunc(attr1->ia_xtime1, ...) | - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2, + attr1->ia_xtime1 = timespec64_trunc(attr2->ia_xtime2, ...) | - ktime_get_real_ts(&attr1->ia_xtime1) + ktime_get_real_ts64(&attr1->ia_xtime1) | - ktime_get_real_ts(&attr.ia_xtime1) + ktime_get_real_ts64(&attr.ia_xtime1) ) @ depends on patch @ struct inode *node; struct iattr *attr; identifier fn; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; expression e; @@ ( - fn(node->i_xtime); + fn(timespec64_to_timespec(node->i_xtime)); | fn(..., - node->i_xtime); + timespec64_to_timespec(node->i_xtime)); | - e = fn(attr->ia_xtime); + e = fn(timespec64_to_timespec(attr->ia_xtime)); ) @ depends on patch forall @ struct inode *node; struct iattr *attr; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; identifier fn; @@ { + struct timespec ts; <+... ( + ts = timespec64_to_timespec(node->i_xtime); fn (..., - &node->i_xtime, + &ts, ...); | + ts = timespec64_to_timespec(attr->ia_xtime); fn (..., - &attr->ia_xtime, + &ts, ...); ) ...+> } @ depends on patch forall @ struct inode *node; struct iattr *attr; struct kstat *stat; identifier ia_xtime =~ "^ia_[acm]time$"; identifier i_xtime =~ "^i_[acm]time$"; identifier xtime =~ "^[acm]time$"; identifier fn, ret; @@ { + struct timespec ts; <+... ( + ts = timespec64_to_timespec(node->i_xtime); ret = fn (..., - &node->i_xtime, + &ts, ...); | + ts = timespec64_to_timespec(node->i_xtime); ret = fn (..., - &node->i_xtime); + &ts); | + ts = timespec64_to_timespec(attr->ia_xtime); ret = fn (..., - &attr->ia_xtime, + &ts, ...); | + ts = timespec64_to_timespec(attr->ia_xtime); ret = fn (..., - &attr->ia_xtime); + &ts); | + ts = timespec64_to_timespec(stat->xtime); ret = fn (..., - &stat->xtime); + &ts); ) ...+> } @ depends on patch @ struct inode *node; struct inode *node2; identifier i_xtime1 =~ "^i_[acm]time$"; identifier i_xtime2 =~ "^i_[acm]time$"; identifier i_xtime3 =~ "^i_[acm]time$"; struct iattr *attrp; struct iattr *attrp2; struct iattr attr ; identifier ia_xtime1 =~ "^ia_[acm]time$"; identifier ia_xtime2 =~ "^ia_[acm]time$"; struct kstat *stat; struct kstat stat1; struct timespec64 ts; identifier xtime =~ "^[acmb]time$"; expression e; @@ ( ( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1 ; | node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \); | node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \); | node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \); | stat->xtime = node2->i_xtime1; | stat1.xtime = node2->i_xtime1; | ( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1 ; | ( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2; | - e = node->i_xtime1; + e = timespec64_to_timespec( node->i_xtime1 ); | - e = attrp->ia_xtime1; + e = timespec64_to_timespec( attrp->ia_xtime1 ); | node->i_xtime1 = current_time(...); | node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = - e; + timespec_to_timespec64(e); | node->i_xtime1 = node->i_xtime3 = - e; + timespec_to_timespec64(e); | - node->i_xtime1 = e; + node->i_xtime1 = timespec_to_timespec64(e); ) Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com> Cc: <anton@tuxera.com> Cc: <balbi@kernel.org> Cc: <bfields@fieldses.org> Cc: <darrick.wong@oracle.com> Cc: <dhowells@redhat.com> Cc: <dsterba@suse.com> Cc: <dwmw2@infradead.org> Cc: <hch@lst.de> Cc: <hirofumi@mail.parknet.co.jp> Cc: <hubcap@omnibond.com> Cc: <jack@suse.com> Cc: <jaegeuk@kernel.org> Cc: <jaharkes@cs.cmu.edu> Cc: <jslaby@suse.com> Cc: <keescook@chromium.org> Cc: <mark@fasheh.com> Cc: <miklos@szeredi.hu> Cc: <nico@linaro.org> Cc: <reiserfs-devel@vger.kernel.org> Cc: <richard@nod.at> Cc: <sage@redhat.com> Cc: <sfrench@samba.org> Cc: <swhiteho@redhat.com> Cc: <tj@kernel.org> Cc: <trond.myklebust@primarydata.com> Cc: <tytso@mit.edu> Cc: <viro@zeniv.linux.org.uk>
-
- 05 6月, 2018 1 次提交
-
-
由 Misono Tomohiro 提交于
The patch introducing the ioctl was not the latest version at the time of merging to the mainline and needs a fixup from this patch. Fixes: ba637a252d30 ("btrfs: Check error of btrfs_iget() in btrfs_search_path_in_tree_user") Signed-off-by: NMisono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 31 5月, 2018 3 次提交
-
-
由 Tomohiro Misono 提交于
Add unprivileged version of ino_lookup ioctl BTRFS_IOC_INO_LOOKUP_USER to allow normal users to call "btrfs subvolume list/show" etc. in combination with BTRFS_IOC_GET_SUBVOL_INFO/BTRFS_IOC_GET_SUBVOL_ROOTREF. This can be used like BTRFS_IOC_INO_LOOKUP but the argument is different. This is because it always searches the fs/file tree correspoinding to the fd with which this ioctl is called and also returns the name of bottom subvolume. The main differences from original ino_lookup ioctl are: 1. Read + Exec permission will be checked using inode_permission() during path construction. -EACCES will be returned in case of failure. 2. Path construction will be stopped at the inode number which corresponds to the fd with which this ioctl is called. If constructed path does not exist under fd's inode, -EACCES will be returned. 3. The name of bottom subvolume is also searched and filled. Note that the maximum length of path is shorter 256 (BTRFS_VOL_NAME_MAX+1) bytes than ino_lookup ioctl because of space of subvolume's name. Reviewed-by: NGu Jinxiang <gujx@cn.fujitsu.com> Reviewed-by: NQu Wenruo <wqu@suse.com> Tested-by: NGu Jinxiang <gujx@cn.fujitsu.com> Signed-off-by: NTomohiro Misono <misono.tomohiro@jp.fujitsu.com> [ style fixes ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Tomohiro Misono 提交于
Add unprivileged ioctl BTRFS_IOC_GET_SUBVOL_ROOTREF which returns ROOT_REF information of the subvolume containing this inode except the subvolume name (this is because to prevent potential name leak). The subvolume name will be gained by user version of ino_lookup ioctl (BTRFS_IOC_INO_LOOKUP_USER) which also performs permission check. The min id of root ref's subvolume to be searched is specified by @min_id in struct btrfs_ioctl_get_subvol_rootref_args. After the search ends, @min_id is set to the last searched root ref's subvolid + 1. Also, if there are more root refs than BTRFS_MAX_ROOTREF_BUFFER_NUM, -EOVERFLOW is returned. Therefore the caller can just call this ioctl again without changing the argument to continue search. Reviewed-by: NQu Wenruo <wqu@suse.com> Reviewed-by: NGu Jinxiang <gujx@cn.fujitsu.com> Tested-by: NGu Jinxiang <gujx@cn.fujitsu.com> Signed-off-by: NTomohiro Misono <misono.tomohiro@jp.fujitsu.com> [ style fixes and struct item renames ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Tomohiro Misono 提交于
Add new unprivileged ioctl BTRFS_IOC_GET_SUBVOL_INFO which returns the information of subvolume containing this inode. (i.e. returns the information in ROOT_ITEM and ROOT_BACKREF.) Reviewed-by: NGu Jinxiang <gujx@cn.fujitsu.com> Tested-by: NGu Jinxiang <gujx@cn.fujitsu.com> Signed-off-by: NTomohiro Misono <misono.tomohiro@jp.fujitsu.com> [ minor style fixes, update struct comments ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 30 5月, 2018 3 次提交
-
-
由 Gu JinXiang 提交于
Since commit 7775c818 ("btrfs: remove unused parameter from btrfs_subvolume_release_metadata") parameter qgroup_reserved is not used by caller of function btrfs_subvolume_reserve_metadata. So remove it. Signed-off-by: NGu JinXiang <gujx@cn.fujitsu.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
This function always takes a transaction handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: NNikolay Borisov <nborisov@suse.com> [ rename the function ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Lu Fengqi 提交于
This function always takes a transaction handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-