1. 22 1月, 2018 40 次提交
    • L
      Btrfs: do not cache rbio pages if using raid6 recover · 44ac474d
      Liu Bo 提交于
      Since raid6 recover tries all possible combinations of failed stripes,
      
      - when raid6 rebuild algorithm is used, i.e. raid6_datap_recov() and
        raid6_2data_recov(), it may change the in-memory content of failed
        stripes, if such a raid bio is cached, a later raid write rmw or recover
        can steal @stripe_pages from it instead of reading from disks, such that
        it carries the wrong content to do write rmw or recovery and ends up
        with corruption or recovery failures.
      
      - when raid5 rebuild algorithm is used, i.e. xor, raid bio can be cached
        because the only failed stripe which contains @rbio->bio_pages gets
        modified, others remain the same so that their in-memory content is
        consistent with their on-disk content.
      
      This adds a check to skip caching rbio if using raid6 recover.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      44ac474d
    • L
      Btrfs: raid56: iterate raid56 internal bio with bio_for_each_segment_all · 0198e5b7
      Liu Bo 提交于
      Bio iterated by set_bio_pages_uptodate() is raid56 internal one, so it
      will never be a BIO_CLONED bio, and since this is called by end_io
      functions, bio->bi_iter.bi_size is zero, we mustn't use
      bio_for_each_segment() as that is a no-op if bi_size is zero.
      
      Fixes: 6592e58c ("Btrfs: fix write corruption due to bio cloning on raid5/6")
      Cc: <stable@vger.kernel.org> # v4.12-rc6+
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      0198e5b7
    • S
      btrfs: correct wrong comment about magic number of index_cnt · df6703e1
      Su Yue 提交于
      There is no function named btrfs_get_inode_index_count.
      Explanation for magic number index_cnt=2 in btrfs_new_inode() is
      actually located in btrfs_set_inode_index_count().
      
      So replace 'btrfs_get_inode_index_count' in the comment by
      'btrfs_set_inode_index_count'.
      Signed-off-by: NSu Yue <suy.fnst@cn.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      df6703e1
    • N
      btrfs: Make btrfs_inode_rsv_release static · d2560ebd
      Nikolay Borisov 提交于
      It's not used outside of extent-tree so there is no reason to not be
      static.
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      d2560ebd
    • A
      btrfs: cleanup btrfs_free_stale_device() usage · 1c94da9d
      Anand Jain 提交于
      We call btrfs_free_stale_device() only when we alloc a new struct
      btrfs_device (ret=1), so move it closer to where we alloc the new
      device. Also drop the comments.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1c94da9d
    • D
      btrfs: tree-check: reduce stack consumption in check_dir_item · e2683fc9
      David Sterba 提交于
      I've noticed that the updated item checker stack consumption increased
      dramatically in 542f5385e20cf97447 ("btrfs: tree-checker: Add checker
      for dir item")
      
      tree-checker.c:check_leaf                    +552 (176 -> 728)
      
      The array is 255 bytes long, dynamic allocation would slow down the
      sanity checks so it's more reasonable to keep it on-stack. Moving the
      variable to the scope of use reduces the stack usage again
      
      tree-checker.c:check_leaf                    -264 (728 -> 464)
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Reviewed-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e2683fc9
    • X
      btrfs: use correct string length in DEV_INFO ioctl · 6670d4c2
      Xiongfeng Wang 提交于
      gcc-8 reports:
      
      fs/btrfs/ioctl.c: In function 'btrfs_ioctl':
      ./include/linux/string.h:245:9: warning: '__builtin_strncpy' specified
      bound 1024 equals destination size [-Wstringop-truncation]
      
      We need one less byte or call strlcpy() to make it a nul-terminated
      string. This is done on the next line anyway, but we want to avoid the
      warning.
      Signed-off-by: NXiongfeng Wang <xiongfeng.wang@linaro.org>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      [ update changelog ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6670d4c2
    • A
      btrfs: fail mount when sb flag is not in BTRFS_SUPER_FLAG_SUPP · 6f794e3c
      Anand Jain 提交于
      It appears from the original commit [1] that there isn't any design
      specific reason not to fail the mount instead of just warning. This
      patch will change it to fail.
      
      [1]
       commit 319e4d06
          btrfs: Enhance super validation check
      
      Fixes: 319e4d06 ("btrfs: Enhance super validation check")
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NQu Wenruo <wqu@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6f794e3c
    • A
      btrfs: add support for SUPER_FLAG_CHANGING_FSID · 98820a7e
      Anand Jain 提交于
      The UUID change by btrfstune sets SUPER_FLAG_CHANGING_FSID and resets it
      only when changing fsid is complete. Its not a good idea to mount the
      device anything in between, reading metadata blocks would fail with UUID
      mismatch.
      
      This patch doesn't add SUPER_FLAG_CHANGING_FSID into
      BTRFS_SUPER_FLAG_SUPP list, so mount will fail (along with the fix in
      the next patch) when SUPER_FLAG_CHANGING_FSID is set.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NQu Wenruo <wqu@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      [ update changelog ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      98820a7e
    • A
      btrfs: define SUPER_FLAG_METADUMP_V2 · e2731e55
      Anand Jain 提交于
      btrfs-progs uses super flag bit BTRFS_SUPER_FLAG_METADUMP_V2 (1ULL << 34).
      So just define that in kernel so that we know its been used.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e2731e55
    • L
      Btrfs: avoid losing data raid profile when deleting a device · a6f93c71
      Liu Bo 提交于
      We've avoided data losing raid profile when doing balance, but it
      turns out that deleting a device could also result in the same
      problem.
      
      Say we have 3 disks, and they're created with '-d raid1' profile.
      
      - We have chunk P (the only data chunk on the empty btrfs).
      
      - Suppose that chunk P's two raid1 copies reside in disk A and disk B.
      
      - Now, 'btrfs device remove disk B'
               btrfs_rm_device()
      	   -> btrfs_shrink_device()
      	      -> btrfs_relocate_chunk() #relocate any chunk on disk B
      	      	 			 to other places.
      
      - Chunk P will be removed and a new chunk will be created to hold
        those data, but as chunk P is the only one holding raid1 profile,
        after it goes away, the new chunk will be created as single profile
        which is our default profile.
      
      This fixes the problem by creating an empty data chunk before
      relocating the data chunk.
      
      Metadata/System chunk are supposed to have non-zero bytes all the time
      so their raid profile is preserved.
      Reported-by: NJames Alandt <James.Alandt@wdc.com>
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      a6f93c71
    • F
      Btrfs: fix space leak after fallocate and zero range operations · 81fdf638
      Filipe Manana 提交于
      If we do a buffered write after a zero range operation that has an
      unaligned (with the filesystem's sector size) end which also falls within
      an unwritten (prealloc) extent that is currently beyond the inode's
      i_size, and the zero range operation has the flag FALLOC_FL_KEEP_SIZE,
      we end up leaking data and metadata space. This happens because when
      zeroing a range we call btrfs_truncate_block(), which does delalloc
      (loads the page and partially zeroes its content), and in the buffered
      write path we only clear existing delalloc space reservation for the
      range we are writing into if that range starts at an offset smaller then
      the inode's i_size, which makes sense since we can not have delalloc
      extents beyond the i_size, only unwritten extents are allowed.
      
      Example reproducer:
      
       $ mkfs.btrfs -f /dev/sdb
       $ mount /dev/sdb /mnt
       $ xfs_io -f -c "falloc -k 428K 4K" /mnt/foobar
       $ xfs_io -c "fzero -k 0 430K" /mnt/foobar
       $ xfs_io -c "pwrite -S 0xaa 428K 4K" /mnt/foobar
       $ umount /mnt
      
      After the unmount we get the metadata and data space leaks reported in
      dmesg/syslog:
      
       [95794.602253] ------------[ cut here ]------------
       [95794.603322] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9561 btrfs_destroy_inode+0x4e/0x206 [btrfs]
       [95794.605167] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.613000] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.614448] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.615972] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.617114] RIP: 0010:btrfs_destroy_inode+0x4e/0x206 [btrfs]
       [95794.618001] RSP: 0018:ffffc90001737d00 EFLAGS: 00010202
       [95794.618721] RAX: 0000000000000000 RBX: ffff880070fa1418 RCX: ffffc90001737c7c
       [95794.619645] RDX: 0000000175aa0240 RSI: 0000000000000001 RDI: ffff880070fa1418
       [95794.620711] RBP: ffffc90001737d38 R08: 0000000000000000 R09: 0000000000000000
       [95794.621932] R10: ffffc90001737c48 R11: ffff88007123e158 R12: ffff880075b6a000
       [95794.623124] R13: ffff88006145c000 R14: ffff880070fa1418 R15: ffff880070c3b4a0
       [95794.624188] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.625578] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.626522] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.627647] Call Trace:
       [95794.628128]  destroy_inode+0x3d/0x55
       [95794.628573]  evict+0x177/0x17e
       [95794.629010]  dispose_list+0x50/0x71
       [95794.629478]  evict_inodes+0x132/0x141
       [95794.630289]  generic_shutdown_super+0x3f/0x10b
       [95794.630864]  kill_anon_super+0x12/0x1c
       [95794.631383]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.631930]  deactivate_locked_super+0x30/0x68
       [95794.632539]  deactivate_super+0x36/0x39
       [95794.633200]  cleanup_mnt+0x49/0x67
       [95794.633818]  __cleanup_mnt+0x12/0x14
       [95794.634416]  task_work_run+0x82/0xa6
       [95794.634902]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.635525]  syscall_return_slowpath+0x18c/0x1af
       [95794.636122]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.636834] RIP: 0033:0x7fa678cb99a7
       [95794.637370] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.638672] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.639596] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.640703] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.641773] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.643150] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.644249] Code: ff 4c 8b a8 80 06 00 00 48 8b 87 c0 01 00 00 48 85 c0 74 02 0f ff 48 83 bb e0 02 00 00 00 74 02 0f ff 83 bb 3c ff ff ff 00 74 02 <0f> ff 83 bb 40 ff ff ff 00 74 02 0f ff 48 83 bb f8 fe ff ff 00
       [95794.646929] ---[ end trace e95877675c6ec007 ]---
       [95794.647751] ------------[ cut here ]------------
       [95794.648509] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9562 btrfs_destroy_inode+0x59/0x206 [btrfs]
       [95794.649842] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.654659] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.655894] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.657546] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.658433] RIP: 0010:btrfs_destroy_inode+0x59/0x206 [btrfs]
       [95794.659279] RSP: 0018:ffffc90001737d00 EFLAGS: 00010202
       [95794.660054] RAX: 0000000000000000 RBX: ffff880070fa1418 RCX: ffffc90001737c7c
       [95794.660753] RDX: 0000000175aa0240 RSI: 0000000000000001 RDI: ffff880070fa1418
       [95794.661513] RBP: ffffc90001737d38 R08: 0000000000000000 R09: 0000000000000000
       [95794.662289] R10: ffffc90001737c48 R11: ffff88007123e158 R12: ffff880075b6a000
       [95794.663393] R13: ffff88006145c000 R14: ffff880070fa1418 R15: ffff880070c3b4a0
       [95794.664342] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.665673] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.666593] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.667629] Call Trace:
       [95794.668065]  destroy_inode+0x3d/0x55
       [95794.668637]  evict+0x177/0x17e
       [95794.669179]  dispose_list+0x50/0x71
       [95794.669830]  evict_inodes+0x132/0x141
       [95794.670416]  generic_shutdown_super+0x3f/0x10b
       [95794.671103]  kill_anon_super+0x12/0x1c
       [95794.671786]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.672552]  deactivate_locked_super+0x30/0x68
       [95794.673393]  deactivate_super+0x36/0x39
       [95794.674107]  cleanup_mnt+0x49/0x67
       [95794.674706]  __cleanup_mnt+0x12/0x14
       [95794.675279]  task_work_run+0x82/0xa6
       [95794.675795]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.676507]  syscall_return_slowpath+0x18c/0x1af
       [95794.677275]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.678006] RIP: 0033:0x7fa678cb99a7
       [95794.678600] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.679739] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.680779] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.681837] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.682867] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.683891] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.684843] Code: c0 01 00 00 48 85 c0 74 02 0f ff 48 83 bb e0 02 00 00 00 74 02 0f ff 83 bb 3c ff ff ff 00 74 02 0f ff 83 bb 40 ff ff ff 00 74 02 <0f> ff 48 83 bb f8 fe ff ff 00 74 02 0f ff 48 83 bb 00 ff ff ff
       [95794.687156] ---[ end trace e95877675c6ec008 ]---
       [95794.687876] ------------[ cut here ]------------
       [95794.688579] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9565 btrfs_destroy_inode+0x7d/0x206 [btrfs]
       [95794.689735] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.695015] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.696396] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.697956] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.698925] RIP: 0010:btrfs_destroy_inode+0x7d/0x206 [btrfs]
       [95794.699763] RSP: 0018:ffffc90001737d00 EFLAGS: 00010206
       [95794.700434] RAX: 0000000000000000 RBX: ffff880070fa1418 RCX: ffffc90001737c7c
       [95794.701445] RDX: 0000000175aa0240 RSI: 0000000000000001 RDI: ffff880070fa1418
       [95794.702448] RBP: ffffc90001737d38 R08: 0000000000000000 R09: 0000000000000000
       [95794.703557] R10: ffffc90001737c48 R11: ffff88007123e158 R12: ffff880075b6a000
       [95794.704441] R13: ffff88006145c000 R14: ffff880070fa1418 R15: ffff880070c3b4a0
       [95794.705270] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.706341] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.707001] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.708030] Call Trace:
       [95794.708466]  destroy_inode+0x3d/0x55
       [95794.709071]  evict+0x177/0x17e
       [95794.709497]  dispose_list+0x50/0x71
       [95794.709973]  evict_inodes+0x132/0x141
       [95794.710564]  generic_shutdown_super+0x3f/0x10b
       [95794.711200]  kill_anon_super+0x12/0x1c
       [95794.711633]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.712139]  deactivate_locked_super+0x30/0x68
       [95794.712608]  deactivate_super+0x36/0x39
       [95794.713093]  cleanup_mnt+0x49/0x67
       [95794.713514]  __cleanup_mnt+0x12/0x14
       [95794.713933]  task_work_run+0x82/0xa6
       [95794.714543]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.715247]  syscall_return_slowpath+0x18c/0x1af
       [95794.715952]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.716653] RIP: 0033:0x7fa678cb99a7
       [95794.721100] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.722052] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.722856] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.723698] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.724736] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.725928] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.726728] Code: 40 ff ff ff 00 74 02 0f ff 48 83 bb f8 fe ff ff 00 74 02 0f ff 48 83 bb 00 ff ff ff 00 74 02 0f ff 48 83 bb 30 ff ff ff 00 74 02 <0f> ff 48 83 bb 08 ff ff ff 00 74 02 0f ff 4d 85 e4 0f 84 52 01
       [95794.729203] ---[ end trace e95877675c6ec009 ]---
       [95794.841054] ------------[ cut here ]------------
       [95794.841829] WARNING: CPU: 0 PID: 31496 at fs/btrfs/extent-tree.c:5831 btrfs_free_block_groups+0x235/0x36a [btrfs]
       [95794.843425] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.850658] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.852590] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.854752] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.855812] RIP: 0010:btrfs_free_block_groups+0x235/0x36a [btrfs]
       [95794.856811] RSP: 0018:ffffc90001737d70 EFLAGS: 00010206
       [95794.857805] RAX: 0000000080000000 RBX: ffff88006145c000 RCX: 0000000000000001
       [95794.859014] RDX: 00000001810af668 RSI: 0000000000000002 RDI: 00000000ffffffff
       [95794.860270] RBP: ffffc90001737d98 R08: 0000000000000000 R09: ffffffff817e22b9
       [95794.861525] R10: ffffc90001737c80 R11: 00000000000337fd R12: 0000000000000000
       [95794.862700] R13: ffff88006145c0c0 R14: ffff88021b61a800 R15: ffff88006145c100
       [95794.863810] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.865149] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.866099] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.867198] Call Trace:
       [95794.867626]  close_ctree+0x1db/0x2b8 [btrfs]
       [95794.868188]  ? evict_inodes+0x132/0x141
       [95794.869037]  btrfs_put_super+0x15/0x17 [btrfs]
       [95794.870400]  generic_shutdown_super+0x6a/0x10b
       [95794.871262]  kill_anon_super+0x12/0x1c
       [95794.872046]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.872746]  deactivate_locked_super+0x30/0x68
       [95794.873687]  deactivate_super+0x36/0x39
       [95794.874639]  cleanup_mnt+0x49/0x67
       [95794.875504]  __cleanup_mnt+0x12/0x14
       [95794.876126]  task_work_run+0x82/0xa6
       [95794.876788]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.877777]  syscall_return_slowpath+0x18c/0x1af
       [95794.878381]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.878888] RIP: 0033:0x7fa678cb99a7
       [95794.879307] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.880204] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.881640] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.882690] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.883538] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.884562] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.885664] Code: 89 ef e8 07 ec 32 e1 e8 9d c0 ea e0 48 8d b3 28 02 00 00 48 83 c9 ff 31 d2 48 89 df e8 29 c5 ff ff 48 83 bb 80 02 00 00 00 74 02 <0f> ff 48 83 bb 88 02 00 00 00 74 02 0f ff 48 83 bb d8 02 00 00
       [95794.887980] ---[ end trace e95877675c6ec00a ]---
       [95794.888739] ------------[ cut here ]------------
       [95794.889405] WARNING: CPU: 0 PID: 31496 at fs/btrfs/extent-tree.c:5832 btrfs_free_block_groups+0x241/0x36a [btrfs]
       [95794.891020] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.897551] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.898509] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.899685] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.900592] RIP: 0010:btrfs_free_block_groups+0x241/0x36a [btrfs]
       [95794.901387] RSP: 0018:ffffc90001737d70 EFLAGS: 00010206
       [95794.902300] RAX: 0000000080000000 RBX: ffff88006145c000 RCX: 0000000000000001
       [95794.903260] RDX: 00000001810af668 RSI: 0000000000000002 RDI: 00000000ffffffff
       [95794.904332] RBP: ffffc90001737d98 R08: 0000000000000000 R09: ffffffff817e22b9
       [95794.905300] R10: ffffc90001737c80 R11: 00000000000337fd R12: 0000000000000000
       [95794.906439] R13: ffff88006145c0c0 R14: ffff88021b61a800 R15: ffff88006145c100
       [95794.907459] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.908625] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.909511] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.910630] Call Trace:
       [95794.911153]  close_ctree+0x1db/0x2b8 [btrfs]
       [95794.911837]  ? evict_inodes+0x132/0x141
       [95794.912344]  btrfs_put_super+0x15/0x17 [btrfs]
       [95794.912975]  generic_shutdown_super+0x6a/0x10b
       [95794.913788]  kill_anon_super+0x12/0x1c
       [95794.914424]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.915142]  deactivate_locked_super+0x30/0x68
       [95794.915831]  deactivate_super+0x36/0x39
       [95794.916433]  cleanup_mnt+0x49/0x67
       [95794.917045]  __cleanup_mnt+0x12/0x14
       [95794.917665]  task_work_run+0x82/0xa6
       [95794.918309]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.919021]  syscall_return_slowpath+0x18c/0x1af
       [95794.919722]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.920426] RIP: 0033:0x7fa678cb99a7
       [95794.921039] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.922303] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.923335] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.924364] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.925435] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.926533] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.927557] Code: 48 8d b3 28 02 00 00 48 83 c9 ff 31 d2 48 89 df e8 29 c5 ff ff 48 83 bb 80 02 00 00 00 74 02 0f ff 48 83 bb 88 02 00 00 00 74 02 <0f> ff 48 83 bb d8 02 00 00 00 74 02 0f ff 48 83 bb e0 02 00 00
       [95794.930166] ---[ end trace e95877675c6ec00b ]---
       [95794.930961] ------------[ cut here ]------------
       [95794.931727] WARNING: CPU: 0 PID: 31496 at fs/btrfs/extent-tree.c:9953 btrfs_free_block_groups+0x2bc/0x36a [btrfs]
       [95794.932729] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.938394] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.939842] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.941455] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.942336] RIP: 0010:btrfs_free_block_groups+0x2bc/0x36a [btrfs]
       [95794.943268] RSP: 0018:ffffc90001737d70 EFLAGS: 00010206
       [95794.944127] RAX: ffff8802004fd0e8 RBX: ffff88006145c000 RCX: 0000000000000001
       [95794.945211] RDX: 00000001810af668 RSI: 0000000000000002 RDI: 00000000ffffffff
       [95794.946316] RBP: ffffc90001737d98 R08: 0000000000000000 R09: ffffffff817e22b9
       [95794.947271] R10: ffffc90001737c80 R11: 00000000000337fd R12: ffff8802004fd0e8
       [95794.948219] R13: ffff88006145c0c0 R14: ffff88006145e598 R15: ffff88006145c100
       [95794.949193] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.950495] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.951338] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.952361] Call Trace:
       [95794.952811]  close_ctree+0x1db/0x2b8 [btrfs]
       [95794.953522]  ? evict_inodes+0x132/0x141
       [95794.954543]  btrfs_put_super+0x15/0x17 [btrfs]
       [95794.955231]  generic_shutdown_super+0x6a/0x10b
       [95794.955916]  kill_anon_super+0x12/0x1c
       [95794.956414]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.956953]  deactivate_locked_super+0x30/0x68
       [95794.957635]  deactivate_super+0x36/0x39
       [95794.958256]  cleanup_mnt+0x49/0x67
       [95794.958701]  __cleanup_mnt+0x12/0x14
       [95794.959181]  task_work_run+0x82/0xa6
       [95794.959635]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.960182]  syscall_return_slowpath+0x18c/0x1af
       [95794.960731]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.961438] RIP: 0033:0x7fa678cb99a7
       [95794.961990] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.963111] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.963975] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.964680] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.965763] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.966868] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.967800] Code: 00 00 00 4c 8b a3 98 25 00 00 49 83 bc 24 60 ff ff ff 00 75 16 49 83 bc 24 68 ff ff ff 00 75 0b 49 83 bc 24 70 ff ff ff 00 74 16 <0f> ff 49 8d b4 24 18 ff ff ff 31 c9 31 d2 48 89 df e8 93 7a ff
       [95794.970629] ---[ end trace e95877675c6ec00c ]---
       [95794.971451] BTRFS info (device sdi): space_info 1 has 7680000 free, is not full
       [95794.972351] BTRFS info (device sdi): space_info total=8388608, used=704512, pinned=0, reserved=0, may_use=4096, readonly=0
       [95794.973595] ------------[ cut here ]------------
       [95794.974353] WARNING: CPU: 0 PID: 31496 at fs/btrfs/extent-tree.c:9953 btrfs_free_block_groups+0x2bc/0x36a [btrfs]
       [95794.980163] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.986461] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.987591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.988929] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.989922] RIP: 0010:btrfs_free_block_groups+0x2bc/0x36a [btrfs]
       [95794.990715] RSP: 0018:ffffc90001737d70 EFLAGS: 00010206
       [95794.991431] RAX: ffff88020f6e70e8 RBX: ffff88006145c000 RCX: ffffffff8115a906
       [95794.992455] RDX: ffffffff8115a902 RSI: ffff880075aa0b40 RDI: ffff880075aa0b40
       [95794.993535] RBP: ffffc90001737d98 R08: 0000000000000020 R09: fffffffffffffff7
       [95794.994573] R10: 00000000ffffffc4 R11: ffff8800633b1bc0 R12: ffff88020f6e70e8
       [95794.996250] R13: 0000000000000038 R14: ffff88006145e598 R15: 0000000000000000
       [95794.997233] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.998592] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.999484] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95795.000542] Call Trace:
       [95795.001138]  close_ctree+0x1db/0x2b8 [btrfs]
       [95795.001885]  ? evict_inodes+0x132/0x141
       [95795.002407]  btrfs_put_super+0x15/0x17 [btrfs]
       [95795.003093]  generic_shutdown_super+0x6a/0x10b
       [95795.003720]  kill_anon_super+0x12/0x1c
       [95795.004353]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95795.005095]  deactivate_locked_super+0x30/0x68
       [95795.005716]  deactivate_super+0x36/0x39
       [95795.006388]  cleanup_mnt+0x49/0x67
       [95795.006939]  __cleanup_mnt+0x12/0x14
       [95795.007512]  task_work_run+0x82/0xa6
       [95795.008124]  prepare_exit_to_usermode+0xe1/0x10c
       [95795.008994]  syscall_return_slowpath+0x18c/0x1af
       [95795.009831]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95795.010610] RIP: 0033:0x7fa678cb99a7
       [95795.011193] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95795.012327] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95795.013432] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95795.014558] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95795.015577] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95795.016569] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95795.017662] Code: 00 00 00 4c 8b a3 98 25 00 00 49 83 bc 24 60 ff ff ff 00 75 16 49 83 bc 24 68 ff ff ff 00 75 0b 49 83 bc 24 70 ff ff ff 00 74 16 <0f> ff 49 8d b4 24 18 ff ff ff 31 c9 31 d2 48 89 df e8 93 7a ff
       [95795.020538] ---[ end trace e95877675c6ec00d ]---
       [95795.021259] BTRFS info (device sdi): space_info 4 has 1072775168 free, is not full
       [95795.022390] BTRFS info (device sdi): space_info total=1073741824, used=114688, pinned=0, reserved=0, may_use=786432, readonly=65536
      
      Fix this by ensuring the zero range operation does not call
      btrfs_truncate_block() if the corresponding extent is an unwritten one
      (it's pointless anyway, since reading from an unwritten extent yields
      zeroes).
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Tested-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      81fdf638
    • F
      Btrfs: fix missing inode i_size update after zero range operation · 9f13ce74
      Filipe Manana 提交于
      For a fallocate's zero range operation that targets a range with an end
      that is not aligned to the sector size, we can end up not updating the
      inode's i_size. This happens when the last page of the range maps to an
      unwritten (prealloc) extent and before that last page we have either a
      hole or a written extent. This is because in this scenario we relied
      on a call to btrfs_prealloc_file_range() to update the inode's i_size,
      however it can only update the i_size to the "down aligned" end of the
      range.
      
      Example:
      
       $ mkfs.btrfs -f /dev/sdc
       $ mount /dev/sdc /mnt
       $ xfs_io -f -c "pwrite -S 0xff 0 428K" /mnt/foobar
       $ xfs_io -c "falloc -k 428K 4K" /mnt/foobar
       $ xfs_io -c "fzero 0 430K" /mnt/foobar
       $ du --bytes /mnt/foobar
       438272	/mnt/foobar
      
      The inode's i_size was left as 428Kb (438272 bytes) when it should have
      been updated to 430Kb (440320 bytes).
      Fix this by always updating the inode's i_size explicitly after zeroing
      the range.
      
      Fixes: ba6d5887946ff86d93dc ("Btrfs: add support for fallocate's zero range operation")
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      9f13ce74
    • F
      Btrfs: use cached state when dirtying pages during buffered write · 94f45071
      Filipe Manana 提交于
      During a buffered IO write, we can have an extent state that we got when
      we locked the range (if the range starts at an offset lower than eof), so
      always pass it to btrfs_dirty_pages() so that setting the delalloc bit
      in the range does not need to do a full search in the inode's io tree,
      saving time and reducing the amount of time we hold the io tree's lock.
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      94f45071
    • F
      Btrfs: add support for fallocate's zero range operation · f27451f2
      Filipe Manana 提交于
      This implements support the zero range operation of fallocate. For now
      at least it's as simple as possible while reusing most of the existing
      fallocate and hole punching infrastructure.
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      f27451f2
    • L
      Btrfs: do not merge rbios if their fail stripe index are not identical · cc54ff62
      Liu Bo 提交于
      Since fail stripe index in rbio would be used to decide which
      algorithm reconstruction would be run, we cannot merge rbios if
      their's fail striped indexes are different, otherwise, one of the two
      reconstructions would fail.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      cc54ff62
    • L
      Btrfs: remove redundant check in rbio_can_merge · db34be19
      Liu Bo 提交于
      Given the above
      '
      if (last->operation != cur->operation)
      	return 0;
      ',
      it's guaranteed that two operations are same.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      db34be19
    • A
      btrfs: minor style cleanups in btrfs_scan_one_device · 05a5c55d
      Anand Jain 提交于
      Assign ret = -EINVAL where it is actually required.
      Remove { } around single line if else code.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      05a5c55d
    • A
      btrfs: simplify mutex unlocking code in btrfs_commit_transaction · c1f32b7c
      Anand Jain 提交于
      No functional change rearrange the mutex_unlock.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      [ edit subject ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      c1f32b7c
    • A
      btrfs: rename btrfs_device::scrub_device to scrub_ctx · cadbc0a0
      Anand Jain 提交于
      btrfs_device::scrub_device is not a device which is being scrubbed,
      but it holds the scrub context, so rename to reflect the same. No
      functional changes here.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      cadbc0a0
    • A
      btrfS: collapse btrfs_handle_error() into __btrfs_handle_fs_error() · 922ea899
      Anand Jain 提交于
      There is no other consumer for btrfs_handle_error() other than
      __btrfs_handle_fs_error(), further this function quite small.
      Merge it into its parent.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      [ reformat comment ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      922ea899
    • A
      btrfs: remove check for BTRFS_FS_STATE_ERROR which we just set · 61ecda68
      Anand Jain 提交于
      __btrfs_handle_fs_error() sets BTRFS_FS_STATE_ERROR, and calls
      btrfs_handle_error() so no need to check if the BTRFS_FS_STATE_ERROR
      is set in btrfs_handle_error(). And there is no other user of
      btrfs_handle_error() as well.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      61ecda68
    • L
      Btrfs: make raid6 rebuild retry more · 8810f751
      Liu Bo 提交于
      There is a scenario that can end up with rebuild process failing to
      return good content, i.e.
      suppose that all disks can be read without problems and if the content
      that was read out doesn't match its checksum, currently for raid6
      btrfs at most retries twice,
      
      - the 1st retry is to rebuild with all other stripes, it'll eventually
        be a raid5 xor rebuild,
      - if the 1st fails, the 2nd retry will deliberately fail parity p so
        that it will do raid6 style rebuild,
      
      however, the chances are that another non-parity stripe content also
      has something corrupted, so that the above retries are not able to
      return correct content, and users will think of this as data loss.
      More seriouly, if the loss happens on some important internal btree
      roots, it could refuse to mount.
      
      This extends btrfs to do more retries and each retry fails only one
      stripe.  Since raid6 can tolerate 2 disk failures, if there is one
      more failure besides the failure on which we're recovering, this can
      always work.
      
      The worst case is to retry as many times as the number of raid6 disks,
      but given the fact that such a scenario is really rare in practice,
      it's still acceptable.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      8810f751
    • L
      Btrfs: fix scrub to repair raid6 corruption · 762221f0
      Liu Bo 提交于
      The raid6 corruption is that,
      suppose that all disks can be read without problems and if the content
      that was read out doesn't match its checksum, currently for raid6
      btrfs at most retries twice,
      
      - the 1st retry is to rebuild with all other stripes, it'll eventually
        be a raid5 xor rebuild,
      - if the 1st fails, the 2nd retry will deliberately fail parity p so
        that it will do raid6 style rebuild,
      
      however, the chances are that another non-parity stripe content also
      has something corrupted, so that the above retries are not able to
      return correct content.
      
      We've fixed normal reads to rebuild raid6 correctly with more retries
      in Patch "Btrfs: make raid6 rebuild retry more"[1], this is to fix
      scrub to do the exactly same rebuild process.
      
      [1]: https://patchwork.kernel.org/patch/10091755/Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      762221f0
    • A
      btrfs: factor btrfs_check_rw_degradable() to check given device · 6528b99d
      Anand Jain 提交于
      Update btrfs_check_rw_degradable() to check against the given device if
      its lost.
      
      We can use this function to know if the volume is going to be in
      degraded mode OR failed state, when the given device fails.  Which is
      needed when we are handling the device failed state.
      
      A preparatory patch does not affect the flow as such.
      Signed-off-by: NAnand Jain <anand.jain@oracle.com>
      Reviewed-by: NQu Wenruo <wqu@suse.com>
      [ enhance comment ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6528b99d
    • D
      btrfs: sink unlock_extent parameter gfp_flags · e43bbe5e
      David Sterba 提交于
      All callers pass either GFP_NOFS or GFP_KERNEL now, so we can sink the
      parameter to the function, though we lose some of the slightly better
      semantics of GFP_KERNEL in some places, it's worth cleaning up the
      callchains.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e43bbe5e
    • D
      btrfs: add separate helper for unlock_extent_cached with GFP_ATOMIC · d810a4be
      David Sterba 提交于
      There's only one instance where we pass different gfp mask to
      unlock_extent_cached. Add a separate helper for that and then we can
      drop the gfp parameter from unlock_extent_cached.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      d810a4be
    • D
      btrfs: drop unused parameters from mount_subvol · 5bedc48a
      David Sterba 提交于
      Recent patches reworking the mount path left some unused parameters. We
      pass a vfsmount to mount_subvol, the flags and data (ie. mount options)
      have been already applied and we will not need them.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      5bedc48a
    • M
      btrfs: cleanup unnecessary string dup in btrfs_parse_options() · e215772c
      Misono, Tomohiro 提交于
      Long ago, commit edf24abe ("btrfs: sanity mount option parsing and
      early mount code") split the btrfs_parse_options() into two parts
      (btrfs_parse_early_options() and btrfs_parse_options()). As a result,
      btrfs_parse_optins no longer gets called twice and is the last one to
      parse mount option string. Therefore there is no need to dup it.
      Signed-off-by: NTomohiro Misono <misono.tomohiro@jp.fujitsu.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e215772c
    • L
      Btrfs: remove unused wait in btrfs_stripe_hash · 203e02d9
      Liu Bo 提交于
      In fact nobody is waiting on @wait's waitqueue, it can be safely
      removed.
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      203e02d9
    • N
      btrfs: Remove redundant pair of bio_get/set in __btrfs_submit_dio_bio · 36f7894f
      Nikolay Borisov 提交于
      The bio is not referenced after it has been submitted and the endio is
      going to consume the sole reference on successful submission. On error,
      the callers of __btrfs_submit_dio_bio do invoke bio_put so we don't
      leak it either.
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      36f7894f
    • N
      btrfs: Remove redundant bio_get/bio_set pair from submit_one_bio · ffc9c8dd
      Nikolay Borisov 提交于
      The bio is never referenced after it has been submitted so there is no
      point in getting an extra reference.
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      ffc9c8dd
    • N
      btrfs: Remove redundant bio_get/set from submit_dio_repair_bio · ea057f6d
      Nikolay Borisov 提交于
      The bio that is passsed is the newly created repair bio which already
      has a reference count of 1, which is going to be consumed by the
      endio routine on successful submission. On error the handler also
      calls bio_put.
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      ea057f6d
    • N
      btrfs: Remove redundant bio_get/set calls in compressed read/write paths · 32506af5
      Nikolay Borisov 提交于
      bio_get/set is necessary only if the bio is going to be referenced
      following submissions. In the code paths where such calls are made
      we don't really need them since the bio is referenced only if
      btrfs_map_bio returns an error. And this function can return an error
      prior to submission only. So referencing the bio is safe. Furthermore
      we do call bio_endio which will consume the last reference. So let's
      remove the redundant calls.
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      32506af5
    • N
      btrfs: Improve btrfs_search_slot description · 4271ecea
      Nikolay Borisov 提交于
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      4271ecea
    • D
      btrfs: heuristic: call get4bits directly · 36243c91
      David Sterba 提交于
      As it's a single instance and local to the file, we don't need to pass
      it as an argument.
      Reviewed-by: NTimofey Titovets <nefelim4ag@gmail.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      36243c91
    • D
      btrfs: heuristic: open code copy_call callback of radix sort · 7add17be
      David Sterba 提交于
      The callback is trivial and we don't need the abstraction for our
      purposes. Let's open code it.
      Reviewed-by: NTimofey Titovets <nefelim4ag@gmail.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      7add17be
    • D
      btrfs: heuristic: open code get_num callback of radix sort · 23ae8c63
      David Sterba 提交于
      The callback is trivial and we don't need the abstraction for our
      purposes. Let's open code it and also make the array types explicit.
      Reviewed-by: NTimofey Titovets <nefelim4ag@gmail.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      23ae8c63
    • M
      btrfs: remove unused arg from parse_subvol_options() · 78f6beac
      Misono, Tomohiro 提交于
      Remove unused arg 'holder' from parse_subvol_options(), which has been
      forgotten to be cleaned in the commit b99beb110e2d ("btrfs: split
      parse_early_options() in two").
      Signed-off-by: NTomohiro Misono <misono.tomohiro@jp.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      78f6beac
    • M
      btrfs: remove unused setup_root_args() · 83085935
      Misono, Tomohiro 提交于
      Since setup_root_args() is not used anymore, just remove it.
      Signed-off-by: NTomohiro Misono <misono.tomohiro@jp.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      83085935