- 31 3月, 2018 25 次提交
-
-
由 David Sterba 提交于
The wrappers are trivial and do not bring any extra value on top of the plain locking primitives. Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
It's provided by the extent_buffer. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
The tree_mod_move is not used anywhere and can be embedded as anonymous structure. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
It's provided by the extent_buffer. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
It's provided by the extent_buffer. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
It's provided by the extent_buffer. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
It's provided by the extent_buffer. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
It's provided by the extent_buffer. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
The merge call was factored out to a separate helper but it's a trivial one and arguably we can opencode it and cache the value. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
The value of page_end is only stored to end, no other use. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
All callers pass a valid pointer so we can drop the redundant checks. The call to submit_one_bio never happend and can be removed. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Liu Bo 提交于
In case of raid56, writes and rebuilds always take BTRFS_STRIPE_LEN(64K) as unit, however, scrub_extent() sets blocksize as unit, so rebuild process may be triggered on every block on a same stripe. A typical example would be that when we're replacing a disappeared disk, all reads on the disks get -EIO, every block (size is 4K if blocksize is 4K) would go thru these, scrub_handle_errored_block scrub_recheck_block # re-read pages one by one scrub_recheck_block # rebuild by calling raid56_parity_recover() page by page Although with raid56 stripe cache most of reads during rebuild can be avoided, the parity recover calculation(xor or raid6 algorithms) needs to be done $(BTRFS_STRIPE_LEN / blocksize) times. This makes it smarter by doing raid56 scrub/replace on stripe length. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
Sort mount options by the primary name, followed by the 'no-' counterpart if it exists. Group the deprecated and debugging options. Enum and token defintions are synced. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Howard McLauchlan 提交于
Btrfs has two mount options for SSD optimizations: ssd and ssd_spread. Presently there is an option to disable all SSD optimizations, but there isn't an option to disable just ssd_spread. This patch adds a mount option nossd_spread that disables ssd_spread only. Reviewed-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NHoward McLauchlan <hmclauchlan@fb.com> Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
Since userspace transaction have been removed we no longer have use for this field so delete it. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
Now that the userspace transaction ioctls have been removed, TRANS_USERSPACE is no longer used hence we can remove it. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
Now that the userspace transaction IOCTL have been removed, this member is no longer used so just remove it Signed-off-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
Commit 3558d4f8 ("btrfs: Deprecate userspace transaction ioctls") marked the beginning of the end of userspace transaction. This commit finishes the job! There are no known users and ceph does not use the ioctl anymore. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Acked-by: NSage Weil <sage@redhat.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Qu Wenruo 提交于
btrfs: qgroup: Fix root item corruption when multiple same source snapshots are created with quota enabled When multiple pending snapshots referring to the same source subvolume are executed, enabled quota will cause root item corruption, where root items are using old bytenr (no backref in extent tree). This can be triggered by fstests btrfs/152. The cause is when source subvolume is still dirty, extra commit (simplied transaction commit) of qgroup_account_snapshot() can skip dirty roots not recorded in current transaction, making root item of source subvolume not updated. Fix it by forcing recording source subvolume in current transaction before qgroup sub-transaction commit. Reported-by: NJustin Maggard <jmaggard@netgear.com> Signed-off-by: NQu Wenruo <wqu@suse.com> Reviewed-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
When performing an unlock on an extent buffer we'd like to order the decrement of extent_buffer::blocking_writers with waking up any waiters. In such situations it's sufficient to use smp_mb__after_atomic rather than the heavy smp_mb. On architectures where atomic operations are fully ordered (such as x86 or s390) unconditionally executing a heavyweight smp_mb instruction causes a severe hit to performance while bringin no improvements in terms of correctness. The better thing is to use the appropriate smp_mb__after_atomic routine which will do the correct thing (invoke a full smp_mb or in the case of ordered atomics insert a compiler barrier). Put another way, an RMW atomic op + smp_load__after_atomic equals, in terms of semantics, to a full smp_mb. This ensures that none of the problems described in the accompanying comment of waitqueue_active occur. No functional changes. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Anand Jain 提交于
Some functions can filter metadata by the generation. Add a define that will annotate such arguments. Signed-off-by: NAnand Jain <anand.jain@oracle.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
The called function name is self explanatory. Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Matthew Wilcox 提交于
The current implementation of btrfs_page_exists_in_range() gives the wrong answer if the workingset code has stored a shadow entry in the page cache. The filemap_range_has_page() function does not have this problem, and it's shared code, so use it instead. eigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 26 3月, 2018 15 次提交
-
-
由 Liu Bo 提交于
In the last step of scrub_handle_error_block, we try to combine good copies on all possible mirrors, this works fine for raid1 and raid10, but not for raid56 as it's doing parity rebuild. If parity rebuild doesn't get back with correct data which matches its checksum, in case of replace we'd rather write what is stored in the source device than the data calculuated from parity. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Liu Bo 提交于
async_missing_raid56() is identical to async_read_rebuild(). Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Su Yue 提交于
Previously, btrfs_inode_by_name() returned 0 which left caller to check objectid of location even location if the type was invalid. Let btrfs_inode_by_name() return -EUCLEAN if a corrupted location of a dir entry is found. Removal of label out_err also simplifies the function. Signed-off-by: NSu Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> [ drop unlikely ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Anand Jain 提交于
This function btrfs_close_extra_devices() is about freeing extra devids which once it may have belonged to this filesystem. So rename it and add the comment. The _devid suffix is appropriate as this function won't handle devices which are outside of the filesytem being mounted. Signed-off-by: NAnand Jain <anand.jain@oracle.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
This argument is always set to the root of the inode, which is also passed. So let's get a reference inside the function and simplify the arg list. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Liu Bo 提交于
According to tlv_put()'s prototype, data and attrlen needs to be exchanged in the macro, but seems all callers are already aware of this misorder and are therefore not affected. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
Now that nothing uses the root arg of btrfs_log_dentry_safe it can be safely removed. No functional changes. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
btrfs_log_inode_parent is called from 2 places (btrfs_log_dentry_safe and btrfs_log_new_name) both of which pass inode->root as the root argument and the inode itself. Remove the redundant root argument and get a reference to the root directly from the inode, also remove redundant root != inode->root check from the same function. No functional change. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
This function always sets keep_locks to 1 and saves the old value of keep_locks which is restored at the end. So there is no way it can be called without keep_locks being set. Remove comment imposing redundant requirement on callers. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
There's a proper header for xattr handlers. Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
The xattr_handler::get prototype returns int, use it. The only ssize_t exception is the per-inode listxattr handler. Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
Extern for functions does not make any difference, there are only a few so let's remove them before it's too late. Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 David Sterba 提交于
Reviewed-by: NNikolay Borisov <nborisov@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Filipe Manana 提交于
When send finishes processing an inode representing a regular file, it always issues a truncate operation for that file, even if its size did not change or the last write sets the file size correctly. In the most common cases, the issued write operations set the file to correct size (either full or incremental sends) or the file size did not change (for incremental sends), so the only case where a truncate operation is needed is when a file size becomes smaller in the send snapshot when compared to the parent snapshot. By not issuing unnecessary truncate operations we reduce the stream size and save time in the receiver. Currently truncating a file to the same size triggers writeback of its last page (if it's dirty) and waits for it to complete (only if the file size is not aligned with the filesystem's sector size). This is being fixed by another patch and is independent of this change (that patch's title is "Btrfs: skip writeback of last page when truncating file to same size"). The following script was used to measure time spent by a receiver without this change applied, with this change applied, and without this change and with the truncate fix applied (the fix to not make it start and wait for writeback to complete). $ cat test_send.sh #!/bin/bash SRC_DEV=/dev/sdc DST_DEV=/dev/sdd SRC_MNT=/mnt/sdc DST_MNT=/mnt/sdd mkfs.btrfs -f $SRC_DEV >/dev/null mkfs.btrfs -f $DST_DEV >/dev/null mount $SRC_DEV $SRC_MNT mount $DST_DEV $DST_MNT echo "Creating source filesystem" for ((t = 0; t < 10; t++)); do ( for ((i = 1; i <= 20000; i++)); do xfs_io -f -c "pwrite -S 0xab 0 5000" \ $SRC_MNT/file_$i > /dev/null done ) & worker_pids[$t]=$! done wait ${worker_pids[@]} echo "Creating and sending snapshot" btrfs subvolume snapshot -r $SRC_MNT $SRC_MNT/snap1 >/dev/null /usr/bin/time -f "send took %e seconds" \ btrfs send -f $SRC_MNT/send_file $SRC_MNT/snap1 /usr/bin/time -f "receive took %e seconds" \ btrfs receive -f $SRC_MNT/send_file $DST_MNT umount $SRC_MNT umount $DST_MNT The results, which are averages for 5 runs for each case, were the following: * Without this change average receive time was 26.49 seconds standard deviation of 2.53 seconds * Without this change and with the truncate fix average receive time was 12.51 seconds standard deviation of 0.32 seconds * With this change and without the truncate fix average receive time was 10.02 seconds standard deviation of 1.11 seconds Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Filipe Manana 提交于
When we truncate a file to the same size and that size is not aligned with the sector size, we end up triggering writeback (and wait for it to complete) of the last page. This is unncessary as we can not have delayed allocation beyond the inode's i_size and the goal of truncating a file to its own size is to discard prealloc extents (allocated via the fallocate(2) system call). Besides the unnecessary IO start and wait, it also breaks the oppurtunity for larger contiguous extents on disk, as before the last dirty page there might be other dirty pages. This scenario is probably not very common in general, however it is common for btrfs receive implementations because currently the send stream always issues a truncate operation for each processed inode as the last operation for that inode (this truncate operation is not always needed and the send implementation will be addressed to avoid them). So improve this by not starting and waiting for writeback of the inode's last page when we are truncating to exactly the same size. The following script was used to quickly measure the time a receive operation takes: $ cat test_send.sh #!/bin/bash SRC_DEV=/dev/sdc DST_DEV=/dev/sdd SRC_MNT=/mnt/sdc DST_MNT=/mnt/sdd mkfs.btrfs -f $SRC_DEV >/dev/null mkfs.btrfs -f $DST_DEV >/dev/null mount $SRC_DEV $SRC_MNT mount $DST_DEV $DST_MNT echo "Creating source filesystem" for ((t = 0; t < 10; t++)); do ( for ((i = 1; i <= 20000; i++)); do xfs_io -f -c "pwrite -S 0xab 0 5000" \ $SRC_MNT/file_$i > /dev/null done ) & worker_pids[$t]=$! done wait ${worker_pids[@]} echo "Creating and sending snapshot" btrfs subvolume snapshot -r $SRC_MNT $SRC_MNT/snap1 >/dev/null /usr/bin/time -f "send took %e seconds" \ btrfs send -f $SRC_MNT/send_file $SRC_MNT/snap1 /usr/bin/time -f "receive took %e seconds" \ btrfs receive -f $SRC_MNT/send_file $DST_MNT umount $SRC_MNT umount $DST_MNT The results for 5 runs were the following: * Without this change average receive time was 26.49 seconds standard deviation of 2.53 seconds * With this change average receive time was 12.51 seconds standard deviation of 0.32 seconds Reported-by: NRobbie Ko <robbieko@synology.com> Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-