- 18 4月, 2017 2 次提交
-
-
由 Edmund Nadolski 提交于
Replace hardcoded numeric values for __merge_refs 'mode' argument with descriptive constants. Signed-off-by: NEdmund Nadolski <enadolski@suse.com> Reviewed-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Elena Reshetova 提交于
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 17 2月, 2017 1 次提交
-
-
由 David Sterba 提交于
Never used. Reviewed-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 14 2月, 2017 1 次提交
-
-
由 Liu Bo 提交于
All we need is @delayed_refs, all callers have get it ahead of calling btrfs_find_delayed_ref_head since lock needs to be acquired firstly, there is no reason to deference it again inside the function. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 06 12月, 2016 3 次提交
-
-
由 Jeff Mahoney 提交于
Now we only use the root parameter to print the root objectid in a tracepoint. We can use the root parameter from the transaction handle for that. It's also used to join the transaction with async commits, so we remove the comment that it's just for checking. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Jeff Mahoney 提交于
There are loads of functions in btrfs that accept a root parameter but only use it to obtain an fs_info pointer. Let's convert those to just accept an fs_info pointer directly. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Jeff Mahoney 提交于
We track the node sizes per-root, but they never vary from the values in the superblock. This patch messes with the 80-column style a bit, but subsequent patches to factor out root->fs_info into a convenience variable fix it up again. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 27 9月, 2016 2 次提交
-
-
由 Jeff Mahoney 提交于
For many printks, we want to know which file system issued the message. This patch converts most pr_* calls to use the btrfs_* versions instead. In some cases, this means adding plumbing to allow call sites access to an fs_info pointer. fs/btrfs/check-integrity.c is left alone for another day. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Jeff Mahoney 提交于
CodingStyle chapter 2: "[...] never break user-visible strings such as printk messages, because that breaks the ability to grep for them." This patch unsplits user-visible strings. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 26 9月, 2016 1 次提交
-
-
由 Lu Fengqi 提交于
Only in the case of different root_id or different object_id, check_shared identified extent as the shared. However, If a extent was referred by different offset of same file, it should also be identified as shared. In addition, check_shared's loop scale is at least n^3, so if a extent has too many references, even causes soft hang up. First, add all delayed_ref to the ref_tree and calculate the unqiue_refs, if the unique_refs is greater than one, return BACKREF_FOUND_SHARED. Then individually add the on-disk reference(inline/keyed) to the ref_tree and calculate the unique_refs of the ref_tree to check if the unique_refs is greater than one.Because once there are two references to return SHARED, so the time complexity is close to the constant. Reported-by: NTsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 25 8月, 2016 1 次提交
-
-
由 Qu Wenruo 提交于
When over 1000 file extents refers to one extent, find_parent_nodes() will be obviously slow, due to the O(n^2)~O(n^3) loops inside __merge_refs(). The following ftrace shows the cubic growth of execution time: 256 refs 5) + 91.768 us | __add_keyed_refs.isra.12 [btrfs](); 5) 1.447 us | __add_missing_keys.isra.13 [btrfs](); 5) ! 114.544 us | __merge_refs [btrfs](); 5) ! 136.399 us | __merge_refs [btrfs](); 512 refs 6) ! 279.859 us | __add_keyed_refs.isra.12 [btrfs](); 6) 3.164 us | __add_missing_keys.isra.13 [btrfs](); 6) ! 442.498 us | __merge_refs [btrfs](); 6) # 2091.073 us | __merge_refs [btrfs](); and 1024 refs 7) ! 368.683 us | __add_keyed_refs.isra.12 [btrfs](); 7) 4.810 us | __add_missing_keys.isra.13 [btrfs](); 7) # 2043.428 us | __merge_refs [btrfs](); 7) * 18964.23 us | __merge_refs [btrfs](); And sort them into the following char: (Unit: us) ------------------------------------------------------------------------ Trace function | 256 ref | 512 refs | 1024 refs | ------------------------------------------------------------------------ __add_keyed_refs | 91 | 249 | 368 | __add_missing_keys | 1 | 3 | 4 | __merge_refs 1st call | 114 | 442 | 2043 | __merge_refs 2nd call | 136 | 2091 | 18964 | ------------------------------------------------------------------------ We can see the that __add_keyed_refs() grows almost in linear behavior. And __add_missing_keys() in this case doesn't change much or takes much time. While for the 1st __merge_refs() it's square growth for the 2nd __merge_refs() call it's cubic growth. It's no doubt that merge_refs() will take a long long time to execute if the number of refs continues its grows. So add a cond_resced() into the loop of __merge_refs(). Although this will solve the problem of soft lockup, we need to use the new rb_tree based structure introduced by Lu Fengqi to really solve the long execution time. Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 26 7月, 2016 2 次提交
-
-
由 Jeff Mahoney 提交于
Now that we have a dummy fs_info associated with each test that uses a root, we don't need the DUMMY_ROOT bit anymore. This lets us make choices without needing an actual root like in e.g. btrfs_find_create_tree_block. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Nikolay Borisov 提交于
BTRFS is using a variety of slab caches to satisfy internal needs. Those slab caches are always allocated with the SLAB_RECLAIM_ACCOUNT, meaning allocations from the caches are going to be accounted as SReclaimable. At the same time btrfs is not registering any shrinkers whatsoever, thus preventing memory from the slabs to be shrunk. This means those caches are not in fact reclaimable. To fix this remove the SLAB_RECLAIM_ACCOUNT on all caches apart from the inode cache, since this one is being freed by the generic VFS super_block shrinker. Also set the transaction related caches as SLAB_TEMPORARY, to better document the lifetime of the objects (it just translates to SLAB_RECLAIM_ACCOUNT). Signed-off-by: NNikolay Borisov <n.borisov.lkml@gmail.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 26 5月, 2016 1 次提交
-
-
由 Nicholas D Steeves 提交于
Signed-off-by: NNicholas D Steeves <nsteeves@gmail.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 10 5月, 2016 1 次提交
-
-
由 Vincent Stehlé 提交于
Make sure to deallocate fspath with vfree() in case of error in init_ipath(). fspath is allocated with vmalloc() in init_data_container() since commit 425d17a2 ("Btrfs: use larger limit for translation of logical to inode"). Signed-off-by: NVincent Stehlé <vincent.stehle@intel.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 18 2月, 2016 2 次提交
-
-
由 Kinglong Mee 提交于
Cleanup. kmem_cache_destroy has support NULL argument checking, so drop the double null testing before calling it. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
由 Dave Jones 提交于
The kernel provides a swap() that does the same thing as this code. Signed-off-by: NDave Jones <dsj@fb.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 05 2月, 2016 1 次提交
-
-
由 Filipe Manana 提交于
While doing some tests I ran into an hang on an extent buffer's rwlock that produced the following trace: [39389.800012] NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [fdm-stress:32166] [39389.800016] NMI watchdog: BUG: soft lockup - CPU#14 stuck for 22s! [fdm-stress:32165] [39389.800016] Modules linked in: btrfs dm_mod ppdev xor sha256_generic hmac raid6_pq drbg ansi_cprng aesni_intel i2c_piix4 acpi_cpufreq aes_x86_64 ablk_helper tpm_tis parport_pc i2c_core sg cryptd evdev psmouse lrw tpm parport gf128mul serio_raw pcspkr glue_helper processor button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs] [39389.800016] irq event stamp: 0 [39389.800016] hardirqs last enabled at (0): [< (null)>] (null) [39389.800016] hardirqs last disabled at (0): [<ffffffff8104e58d>] copy_process+0x638/0x1a35 [39389.800016] softirqs last enabled at (0): [<ffffffff8104e58d>] copy_process+0x638/0x1a35 [39389.800016] softirqs last disabled at (0): [< (null)>] (null) [39389.800016] CPU: 14 PID: 32165 Comm: fdm-stress Not tainted 4.4.0-rc6-btrfs-next-18+ #1 [39389.800016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [39389.800016] task: ffff880175b1ca40 ti: ffff8800a185c000 task.ti: ffff8800a185c000 [39389.800016] RIP: 0010:[<ffffffff810902af>] [<ffffffff810902af>] queued_spin_lock_slowpath+0x57/0x158 [39389.800016] RSP: 0018:ffff8800a185fb80 EFLAGS: 00000202 [39389.800016] RAX: 0000000000000101 RBX: ffff8801710c4e9c RCX: 0000000000000101 [39389.800016] RDX: 0000000000000100 RSI: 0000000000000001 RDI: 0000000000000001 [39389.800016] RBP: ffff8800a185fb98 R08: 0000000000000001 R09: 0000000000000000 [39389.800016] R10: ffff8800a185fb68 R11: 6db6db6db6db6db7 R12: ffff8801710c4e98 [39389.800016] R13: ffff880175b1ca40 R14: ffff8800a185fc10 R15: ffff880175b1ca40 [39389.800016] FS: 00007f6d37fff700(0000) GS:ffff8802be9c0000(0000) knlGS:0000000000000000 [39389.800016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [39389.800016] CR2: 00007f6d300019b8 CR3: 0000000037c93000 CR4: 00000000001406e0 [39389.800016] Stack: [39389.800016] ffff8801710c4e98 ffff8801710c4e98 ffff880175b1ca40 ffff8800a185fbb0 [39389.800016] ffffffff81091e11 ffff8801710c4e98 ffff8800a185fbc8 ffffffff81091895 [39389.800016] ffff8801710c4e98 ffff8800a185fbe8 ffffffff81486c5c ffffffffa067288c [39389.800016] Call Trace: [39389.800016] [<ffffffff81091e11>] queued_read_lock_slowpath+0x46/0x60 [39389.800016] [<ffffffff81091895>] do_raw_read_lock+0x3e/0x41 [39389.800016] [<ffffffff81486c5c>] _raw_read_lock+0x3d/0x44 [39389.800016] [<ffffffffa067288c>] ? btrfs_tree_read_lock+0x54/0x125 [btrfs] [39389.800016] [<ffffffffa067288c>] btrfs_tree_read_lock+0x54/0x125 [btrfs] [39389.800016] [<ffffffffa0622ced>] ? btrfs_find_item+0xa7/0xd2 [btrfs] [39389.800016] [<ffffffffa069363f>] btrfs_ref_to_path+0xd6/0x174 [btrfs] [39389.800016] [<ffffffffa0693730>] inode_to_path+0x53/0xa2 [btrfs] [39389.800016] [<ffffffffa0693e2e>] paths_from_inode+0x117/0x2ec [btrfs] [39389.800016] [<ffffffffa0670cff>] btrfs_ioctl+0xd5b/0x2793 [btrfs] [39389.800016] [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc [39389.800016] [<ffffffff81276727>] ? __this_cpu_preempt_check+0x13/0x15 [39389.800016] [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc [39389.800016] [<ffffffff8118b3d4>] ? rcu_read_unlock+0x3e/0x5d [39389.800016] [<ffffffff811822f8>] do_vfs_ioctl+0x42b/0x4ea [39389.800016] [<ffffffff8118b4f3>] ? __fget_light+0x62/0x71 [39389.800016] [<ffffffff8118240e>] SyS_ioctl+0x57/0x79 [39389.800016] [<ffffffff814872d7>] entry_SYSCALL_64_fastpath+0x12/0x6f [39389.800016] Code: b9 01 01 00 00 f7 c6 00 ff ff ff 75 32 83 fe 01 89 ca 89 f0 0f 45 d7 f0 0f b1 13 39 f0 74 04 89 c6 eb e2 ff ca 0f 84 fa 00 00 00 <8b> 03 84 c0 74 04 f3 90 eb f6 66 c7 03 01 00 e9 e6 00 00 00 e8 [39389.800012] Modules linked in: btrfs dm_mod ppdev xor sha256_generic hmac raid6_pq drbg ansi_cprng aesni_intel i2c_piix4 acpi_cpufreq aes_x86_64 ablk_helper tpm_tis parport_pc i2c_core sg cryptd evdev psmouse lrw tpm parport gf128mul serio_raw pcspkr glue_helper processor button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs] [39389.800012] irq event stamp: 0 [39389.800012] hardirqs last enabled at (0): [< (null)>] (null) [39389.800012] hardirqs last disabled at (0): [<ffffffff8104e58d>] copy_process+0x638/0x1a35 [39389.800012] softirqs last enabled at (0): [<ffffffff8104e58d>] copy_process+0x638/0x1a35 [39389.800012] softirqs last disabled at (0): [< (null)>] (null) [39389.800012] CPU: 15 PID: 32166 Comm: fdm-stress Tainted: G L 4.4.0-rc6-btrfs-next-18+ #1 [39389.800012] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [39389.800012] task: ffff880179294380 ti: ffff880034a60000 task.ti: ffff880034a60000 [39389.800012] RIP: 0010:[<ffffffff81091e8d>] [<ffffffff81091e8d>] queued_write_lock_slowpath+0x62/0x72 [39389.800012] RSP: 0018:ffff880034a639f0 EFLAGS: 00000206 [39389.800012] RAX: 0000000000000101 RBX: ffff8801710c4e98 RCX: 0000000000000000 [39389.800012] RDX: 00000000000000ff RSI: 0000000000000000 RDI: ffff8801710c4e9c [39389.800012] RBP: ffff880034a639f8 R08: 0000000000000001 R09: 0000000000000000 [39389.800012] R10: ffff880034a639b0 R11: 0000000000001000 R12: ffff8801710c4e98 [39389.800012] R13: 0000000000000001 R14: ffff880172cbc000 R15: ffff8801710c4e00 [39389.800012] FS: 00007f6d377fe700(0000) GS:ffff8802be9e0000(0000) knlGS:0000000000000000 [39389.800012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [39389.800012] CR2: 00007f6d3d3c1000 CR3: 0000000037c93000 CR4: 00000000001406e0 [39389.800012] Stack: [39389.800012] ffff8801710c4e98 ffff880034a63a10 ffffffff81091963 ffff8801710c4e98 [39389.800012] ffff880034a63a30 ffffffff81486f1b ffffffffa0672cb3 ffff8801710c4e00 [39389.800012] ffff880034a63a78 ffffffffa0672cb3 ffff8801710c4e00 ffff880034a63a58 [39389.800012] Call Trace: [39389.800012] [<ffffffff81091963>] do_raw_write_lock+0x72/0x8c [39389.800012] [<ffffffff81486f1b>] _raw_write_lock+0x3a/0x41 [39389.800012] [<ffffffffa0672cb3>] ? btrfs_tree_lock+0x119/0x251 [btrfs] [39389.800012] [<ffffffffa0672cb3>] btrfs_tree_lock+0x119/0x251 [btrfs] [39389.800012] [<ffffffffa061aeba>] ? rcu_read_unlock+0x5b/0x5d [btrfs] [39389.800012] [<ffffffffa061ce13>] ? btrfs_root_node+0xda/0xe6 [btrfs] [39389.800012] [<ffffffffa061ce83>] btrfs_lock_root_node+0x22/0x42 [btrfs] [39389.800012] [<ffffffffa062046b>] btrfs_search_slot+0x1b8/0x758 [btrfs] [39389.800012] [<ffffffff810fc6b0>] ? time_hardirqs_on+0x15/0x28 [39389.800012] [<ffffffffa06365db>] btrfs_lookup_inode+0x31/0x95 [btrfs] [39389.800012] [<ffffffff8108d62f>] ? trace_hardirqs_on+0xd/0xf [39389.800012] [<ffffffff8148482b>] ? mutex_lock_nested+0x397/0x3bc [39389.800012] [<ffffffffa068821b>] __btrfs_update_delayed_inode+0x59/0x1c0 [btrfs] [39389.800012] [<ffffffffa068858e>] __btrfs_commit_inode_delayed_items+0x194/0x5aa [btrfs] [39389.800012] [<ffffffff81486ab7>] ? _raw_spin_unlock+0x31/0x44 [39389.800012] [<ffffffffa0688a48>] __btrfs_run_delayed_items+0xa4/0x15c [btrfs] [39389.800012] [<ffffffffa0688d62>] btrfs_run_delayed_items+0x11/0x13 [btrfs] [39389.800012] [<ffffffffa064048e>] btrfs_commit_transaction+0x234/0x96e [btrfs] [39389.800012] [<ffffffffa0618d10>] btrfs_sync_fs+0x145/0x1ad [btrfs] [39389.800012] [<ffffffffa0671176>] btrfs_ioctl+0x11d2/0x2793 [btrfs] [39389.800012] [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc [39389.800012] [<ffffffff81140261>] ? __might_fault+0x4c/0xa7 [39389.800012] [<ffffffff81140261>] ? __might_fault+0x4c/0xa7 [39389.800012] [<ffffffff8108a8b0>] ? arch_local_irq_save+0x9/0xc [39389.800012] [<ffffffff8118b3d4>] ? rcu_read_unlock+0x3e/0x5d [39389.800012] [<ffffffff811822f8>] do_vfs_ioctl+0x42b/0x4ea [39389.800012] [<ffffffff8118b4f3>] ? __fget_light+0x62/0x71 [39389.800012] [<ffffffff8118240e>] SyS_ioctl+0x57/0x79 [39389.800012] [<ffffffff814872d7>] entry_SYSCALL_64_fastpath+0x12/0x6f [39389.800012] Code: f0 0f b1 13 85 c0 75 ef eb 2a f3 90 8a 03 84 c0 75 f8 f0 0f b0 13 84 c0 75 f0 ba ff 00 00 00 eb 0a f0 0f b1 13 ff c8 74 0b f3 90 <8b> 03 83 f8 01 75 f7 eb ed c6 43 04 00 5b 5d c3 0f 1f 44 00 00 This happens because in the code path executed by the inode_paths ioctl we end up nesting two calls to read lock a leaf's rwlock when after the first call to read_lock() and before the second call to read_lock(), another task (running the delayed items as part of a transaction commit) has already called write_lock() against the leaf's rwlock. This situation is illustrated by the following diagram: Task A Task B btrfs_ref_to_path() btrfs_commit_transaction() read_lock(&eb->lock); btrfs_run_delayed_items() __btrfs_commit_inode_delayed_items() __btrfs_update_delayed_inode() btrfs_lookup_inode() write_lock(&eb->lock); --> task waits for lock read_lock(&eb->lock); --> makes this task hang forever (and task B too of course) So fix this by avoiding doing the nested read lock, which is easily avoidable. This issue does not happen if task B calls write_lock() after task A does the second call to read_lock(), however there does not seem to exist anything in the documentation that mentions what is the expected behaviour for recursive locking of rwlocks (leaving the idea that doing so is not a good usage of rwlocks). Also, as a side effect necessary for this fix, make sure we do not needlessly read lock extent buffers when the input path has skip_locking set (used when called from send). Cc: stable@vger.kernel.org Signed-off-by: NFilipe Manana <fdmanana@suse.com>
-
- 16 1月, 2016 1 次提交
-
-
由 Geliang Tang 提交于
Fix the following error: fs/btrfs/backref.c:565:1-20: iterator with update on line 577 Fixes: a7ca4225('btrfs: use list_for_each_entry* in backref.c') Signed-off-by: NGeliang Tang <geliangtang@163.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 07 1月, 2016 1 次提交
-
-
由 Geliang Tang 提交于
Use list_for_each_entry*() to simplify the code. Signed-off-by: NGeliang Tang <geliangtang@163.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> Signed-off-by: NDavid Sterba <dsterba@suse.com>
-
- 25 11月, 2015 1 次提交
-
-
由 Josef Bacik 提交于
The backref code will look up the fs_root we're trying to resolve our indirect refs for, unfortunately we use btrfs_read_fs_root_no_name, which returns -ENOENT if the ref is 0. This isn't helpful for the qgroup stuff with snapshot delete as it won't be able to search down the snapshot we are deleting, which will cause us to miss roots. So use btrfs_get_fs_root and send false for check_ref so we can always get the root we're looking for. Thanks, Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NMark Fasheh <mfasheh@suse.de> Signed-off-by: NChris Mason <clm@fb.com>
-
- 27 10月, 2015 1 次提交
-
-
由 Chris Mason 提交于
The code for btrfs inode-resolve has never worked properly for files with enough hard links to trigger extrefs. It was trying to get the leaf out of a path after freeing the path: btrfs_release_path(path); leaf = path->nodes[0]; item_size = btrfs_item_size_nr(leaf, slot); The fix here is to use the extent buffer we cloned just a little higher up to avoid deadlocks caused by using the leaf in the path. Signed-off-by: NChris Mason <clm@fb.com> cc: stable@vger.kernel.org # v3.7+ cc: Mark Fasheh <mfasheh@suse.de> Reviewed-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de> Signed-off-by: NChris Mason <clm@fb.com>
-
- 22 10月, 2015 1 次提交
-
-
由 Josef Bacik 提交于
With my changes to allow us to find old roots when resolving indirect refs I introduced a regression to the sanity tests. Since we don't really care to go down into the fs roots we just need to have the old behavior of returning ENOENT for dummy roots for the sanity tests. In the future if we want to get fancy we can populate the test fs trees with the references as well. Thanks, Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 14 10月, 2015 1 次提交
-
-
由 Chris Mason 提交于
The code for btrfs inode-resolve has never worked properly for files with enough hard links to trigger extrefs. It was trying to get the leaf out of a path after freeing the path: btrfs_release_path(path); leaf = path->nodes[0]; item_size = btrfs_item_size_nr(leaf, slot); The fix here is to use the extent buffer we cloned just a little higher up to avoid deadlocks caused by using the leaf in the path. Signed-off-by: NChris Mason <clm@fb.com> cc: stable@vger.kernel.org # v3.7+ cc: Mark Fasheh <mfasheh@suse.de> Reviewed-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NMark Fasheh <mfasheh@suse.de>
-
- 09 8月, 2015 2 次提交
-
-
由 Liu Bo 提交于
When we do backref walking, we search firstly in queued delayed refs and then the on-disk backrefs, but we parse differently for shared references, for delayed refs we also add 'ref->root' while for on-disk backrefs we don't, this can prevent us from merging refs indexed by the same bytenr and cause find_parent_nodes() to throw a warning at 'WARN_ON(ref->count < 0)', for example, when we have a shared data extent with 'ref_cnt=1' and a delayed shared data with a BTRFS_DROP_DELAYED_REF, that happens. For shared references, no matter if it's delayed or on-disk, ref->root is not at all used, instead it's ref->parent that really matters, so this has delayed refs handled as the same way as on-disk refs. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Filipe Manana 提交于
When cloning/deduplicating file extents (through the clone and extent_same ioctls) we can get data back references with offset values that are a result of an unsigned integer arithmetic underflow, that is, values that are much larger then they could be otherwise. This is not a problem when decrementing or dropping the back references (happens when we overwrite the extents or punch a hole for example, through __btrfs_drop_extents()), since we compute the same too large offset value, but it is a problem for the backref walking code, used by an incremental send and the ioctls that are used by the btrfs tool "inspect-internal" commands, as it makes it miss the corresponding file extent items because the search key is set for an extent item that starts at an offset matching the exceptionally large offset value of the data back reference. For an incremental send this causes the send ioctl to fail with -EIO. So teach the backref walking code to deal with these cases by setting the search key's offset to 0 if the backref's offset value is larger than LLONG_MAX (the largest possible file offset). This makes sure the backref walking code finds the corresponding file extent items at the expense of scanning more items and leafs in the btree. Fixing the clone/dedup ioctls to not produce such underflowed results would require major changes breaking backward compatibility, updating user space tools, etc. Simple reproducer case for fstests: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { rm -fr $send_files_dir rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter # real QA test starts here _supported_fs btrfs _supported_os Linux _require_scratch _require_cloner _need_to_be_root send_files_dir=$TEST_DIR/btrfs-test-$seq rm -f $seqres.full rm -fr $send_files_dir mkdir $send_files_dir _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount # Create our test file with a single extent of 64K starting at file # offset 128K. $XFS_IO_PROG -f -c "pwrite -S 0xaa 128K 64K" $SCRATCH_MNT/foo \ | _filter_xfs_io _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \ $SCRATCH_MNT/mysnap1 # Now clone parts of the original extent into lower offsets of the file. # # The first clone operation adds a file extent item to file offset 0 # that points to our initial extent with a data offset of 16K. The # corresponding data back reference in the extent tree has an offset of # 18446744073709535232, which is the result of file_offset - data_offset # = 0 - 16K. # # The second clone operation adds a file extent item to file offset 16K # that points to our initial extent with a data offset of 48K. The # corresponding data back reference in the extent tree has an offset of # 18446744073709518848, which is the result of file_offset - data_offset # = 16K - 48K. # # Those large back reference offsets (result of unsigned arithmetic # underflow) confused the back reference walking code (used by an # incremental send and the multiple inspect-internal ioctls) and made it # miss the back references, which for the case of an incremental send it # made it fail with -EIO and print a message like the following to # dmesg: # # "BTRFS error (device sdc): did not find backref in send_root. \ # inode=257, offset=0, disk_byte=12845056 found extent=12845056" # $CLONER_PROG -s $(((128 + 16) * 1024)) -d 0 -l $((16 * 1024)) \ $SCRATCH_MNT/foo $SCRATCH_MNT/foo $CLONER_PROG -s $(((128 + 48) * 1024)) -d $((16 * 1024)) \ -l $((16 * 1024)) $SCRATCH_MNT/foo $SCRATCH_MNT/foo _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \ $SCRATCH_MNT/mysnap2 _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 $SCRATCH_MNT/mysnap2 \ -f $send_files_dir/2.snap echo "File digest in the original filesystem:" md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch # Now recreate the filesystem by receiving both send streams and verify # we get the same file contents that the original filesystem had. _scratch_unmount _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/1.snap _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/2.snap echo "File digest in the new filesystem:" md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch status=0 exit The test's expected golden output is: wrote 65536/65536 bytes at offset 131072 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) File digest in the original filesystem: 6c6079335cff141b8a31233ead04cbff SCRATCH_MNT/mysnap2/foo File digest in the new filesystem: 6c6079335cff141b8a31233ead04cbff SCRATCH_MNT/mysnap2/foo But it failed with: (...) @@ -1,7 +1,5 @@ QA output created by 097 wrote 65536/65536 bytes at offset 131072 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) -File digest in the original filesystem: -6c6079335cff141b8a31233ead04cbff SCRATCH_MNT/mysnap2/foo -File digest in the new filesystem: -6c6079335cff141b8a31233ead04cbff SCRATCH_MNT/mysnap2/foo ... $ cat /home/fdmanana/git/hub/xfstests/results//btrfs/097.full (...) ERROR: send ioctl failed with -5: Input/output error Signed-off-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 11 6月, 2015 3 次提交
-
-
由 Qu Wenruo 提交于
btrfs_find_all_roots(). Allow btrfs_find_all_roots() to skip all delayed_ref_head lock and tree lock to do tree search. This is important for later qgroup implement which will call find_all_roots() after fs trees are committed. Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Qu Wenruo 提交于
This patch replace the rbtree used in ref_head to list. This has the following advantage: 1) Easier merge logic. With the new list implement, we only need to care merging the tail ref_node with the new ref_node. And this can be done quite easy at insert time, no need to do a indicated merge at run_delayed_refs(). Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 Qu Wenruo 提交于
Old __merge_refs() in backref.c will even merge refs whose root_id are different, which makes qgroup gives wrong result. Fix it by checking ref_for_same_block() before any mode specific works. Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 03 6月, 2015 1 次提交
-
-
由 Liu Bo 提交于
The return value of read_tree_block() can confuse callers as it always returns NULL for either -ENOMEM or -EIO, so it's likely that callers parse it to a wrong error, for instance, in btrfs_read_tree_root(). This fixes the above issue. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Reviewed-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NChris Mason <clm@fb.com>
-
- 20 5月, 2015 1 次提交
-
-
由 Mark Fasheh 提交于
btrfs_check_shared() is leaking a return value of '1' from find_parent_nodes(). As a result, callers (in this case, extent_fiemap()) are told extents are shared when they are not. This in turn broke fiemap on btrfs for kernels v3.18 and up. The fix is simple - we just have to clear 'ret' after we are done processing the results of find_parent_nodes(). It wasn't clear to me at first what was happening with return values in btrfs_check_shared() and find_parent_nodes() - thanks to Josef for the help on irc. I added documentation to both functions to make things more clear for the next hacker who might come across them. If we could queue this up for -stable too that would be great. Signed-off-by: NMark Fasheh <mfasheh@suse.de> Reviewed-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 04 3月, 2015 1 次提交
-
-
由 David Sterba 提交于
Using {} as initializer for struct seq_elem does not properly initialize the list_head member, but it currently works because it gets set through btrfs_get_tree_mod_seq if 'seq' is 0. Signed-off-by: NDavid Sterba <dsterba@suse.cz>
-
- 15 1月, 2015 2 次提交
-
-
由 David Sterba 提交于
A simple wrapper around btrfs_find_item. Signed-off-by: NDavid Sterba <dsterba@suse.cz>
-
由 David Sterba 提交于
It's only a simple wrapper around btrfs_find_item, the locally defined key is not used. Signed-off-by: NDavid Sterba <dsterba@suse.cz>
-
- 03 1月, 2015 1 次提交
-
-
由 Filipe Manana 提交于
If we are using skinny metadata, the block's tree level is in the offset of the key and not in a btrfs_tree_block_info structure following the extent item (it doesn't exist). Therefore fix it. Besides returning the correct level in the tree, this also prevents reading past the leaf's end in the case where the extent item is the last item in the leaf (eb) and it has only 1 inline reference - this is because sizeof(struct btrfs_tree_block_info) is greater than sizeof(struct btrfs_extent_inline_ref). Got it while running a scrub which produced the following warning: BTRFS: checksum error at logical 42123264 on dev /dev/sde, sector 15840: metadata node (level 24) in tree 5 Signed-off-by: NFilipe Manana <fdmanana@suse.com> Reviewed-by: NSatoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: NChris Mason <clm@fb.com>
-
- 02 10月, 2014 1 次提交
-
-
由 David Sterba 提交于
We know the tree block size, no need to pass it around. Signed-off-by: NDavid Sterba <dsterba@suse.cz>
-
- 18 9月, 2014 3 次提交
-
-
由 Josef Bacik 提交于
We have been iterating all references for each extent we have in a file when we do fiemap to see if it is shared. This is fine when you have a few clones or a few snapshots, but when you have 5k snapshots suddenly fiemap just sits there and stares at you. So add btrfs_check_shared which will use the backref walking code but will short circuit as soon as it finds a root or inode that doesn't match the one we currently have. This makes fiemap on my testbox go from looking at me blankly for a day to spitting out actual output in a reasonable amount of time. Thanks, Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NChris Mason <clm@fb.com>
-
由 David Sterba 提交于
The nodesize and leafsize were never of different values. Unify the usage and make nodesize the one. Cleanup the redundant checks and helpers. Shaves a few bytes from .text: text data bss dec hex filename 852418 24560 23112 900090 dbbfa btrfs.ko.before 851074 24584 23112 898770 db6d2 btrfs.ko.after Signed-off-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NChris Mason <clm@fb.com>
-
由 David Sterba 提交于
btrfs_set_key_type and btrfs_key_type are used inconsistently along with open coded variants. Other members of btrfs_key are accessed directly without any helpers anyway. Signed-off-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NChris Mason <clm@fb.com>
-
- 15 8月, 2014 1 次提交
-
-
由 Takashi Iwai 提交于
We've got bug reports that btrfs crashes when quota is enabled on 32bit kernel, typically with the Oops like below: BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<f9234590>] find_parent_nodes+0x360/0x1380 [btrfs] *pde = 00000000 Oops: 0000 [#1] SMP CPU: 0 PID: 151 Comm: kworker/u8:2 Tainted: G S W 3.15.2-1.gd43d97e-default #1 Workqueue: btrfs-qgroup-rescan normal_work_helper [btrfs] task: f1478130 ti: f147c000 task.ti: f147c000 EIP: 0060:[<f9234590>] EFLAGS: 00010213 CPU: 0 EIP is at find_parent_nodes+0x360/0x1380 [btrfs] EAX: f147dda8 EBX: f147ddb0 ECX: 00000011 EDX: 00000000 ESI: 00000000 EDI: f147dda4 EBP: f147ddf8 ESP: f147dd38 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000004 CR3: 00bf3000 CR4: 00000690 Stack: 00000000 00000000 f147dda4 00000050 00000001 00000000 00000001 00000050 00000001 00000000 d3059000 00000001 00000022 000000a8 00000000 00000000 00000000 000000a1 00000000 00000000 00000001 00000000 00000000 11800000 Call Trace: [<f923564d>] __btrfs_find_all_roots+0x9d/0xf0 [btrfs] [<f9237bb1>] btrfs_qgroup_rescan_worker+0x401/0x760 [btrfs] [<f9206148>] normal_work_helper+0xc8/0x270 [btrfs] [<c025e38b>] process_one_work+0x11b/0x390 [<c025eea1>] worker_thread+0x101/0x340 [<c026432b>] kthread+0x9b/0xb0 [<c0712a71>] ret_from_kernel_thread+0x21/0x30 [<c0264290>] kthread_create_on_node+0x110/0x110 This indicates a NULL corruption in prefs_delayed list. The further investigation and bisection pointed that the call of ulist_add_merge() results in the corruption. ulist_add_merge() takes u64 as aux and writes a 64bit value into old_aux. The callers of this function in backref.c, however, pass a pointer of a pointer to old_aux. That is, the function overwrites 64bit value on 32bit pointer. This caused a NULL in the adjacent variable, in this case, prefs_delayed. Here is a quick attempt to band-aid over this: a new function, ulist_add_merge_ptr() is introduced to pass/store properly a pointer value instead of u64. There are still ugly void ** cast remaining in the callers because void ** cannot be taken implicitly. But, it's safer than explicit cast to u64, anyway. Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=887046 Cc: <stable@vger.kernel.org> [v3.11+] Signed-off-by: NTakashi Iwai <tiwai@suse.de> Signed-off-by: NChris Mason <clm@fb.com>
-