1. 22 1月, 2018 1 次提交
  2. 28 11月, 2017 1 次提交
    • L
      Btrfs: fix list_add corruption and soft lockups in fsync · ebb70442
      Liu Bo 提交于
      Xfstests btrfs/146 revealed this corruption,
      
      [   58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
      [   58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
      [   58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
      [   58.153518] ------------[ cut here ]------------
      [   58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
      ...
      [   58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
      ...
      [   58.161956] Call Trace:
      [   58.162264]  btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
      [   58.163583]  btrfs_log_dentry_safe+0x60/0x80 [btrfs]
      [   58.164003]  btrfs_sync_file+0x4c2/0x6f0 [btrfs]
      [   58.164393]  vfs_fsync_range+0x5f/0xd0
      [   58.164898]  do_fsync+0x5a/0x90
      [   58.165170]  SyS_fsync+0x10/0x20
      [   58.165395]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      ...
      
      It turns out that we could record btrfs_log_ctx:io_err in
      log_one_extents when IO fails, but make log_one_extents() return '0'
      instead of -EIO, so the IO error is not acknowledged by the callers,
      i.e.  btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list
      from list head 'root->log_ctxs'.  Since btrfs_log_ctx is allocated
      from stack memory, it'd get freed with a object alive on the
      list. then a future list_add will throw the above warning.
      
      This returns the correct error in the above case.
      
      Jeff also reported this while testing against his fsync error
      patch set[1].
      
      [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html
      "btrfs list corruption and soft lockups while testing writeback error handling"
      
      Fixes: 8407f553 ("Btrfs: fix data corruption after fast fsync and writeback error")
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      ebb70442
  3. 30 10月, 2017 3 次提交
  4. 26 9月, 2017 1 次提交
    • J
      btrfs: log csums for all modified extents · 8c6c5928
      Josef Bacik 提交于
      Amir reported a bug discovered by his cleaned up version of my
      dm-log-writes xfstests where we were missing csums at certain replay
      points.  This is because fsx was doing an msync(), which essentially
      fsync()'s a specific range of a file.  We will log all modified extents,
      but only search for the checksums in the range we are being asked to
      sync.  We cannot simply log the extents in the range we're being asked
      because we are logging the inode item as it is currently, which if it
      has had a i_size update before the msync means we will miss extents when
      replaying.  We could possibly get around this by marking the inode with
      the transaction that extended the i_size to see if we have this case,
      but this would be racy and we'd have to lock the whole range of the
      inode to make sure we didn't have an ordered extent outside of our range
      that was in the middle of completing.
      
      Fix this simply by keeping track of the modified extents range and
      logging the csums for the entire range of extents that we are logging.
      This makes the xfstest pass.
      Reported-by: NAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      8c6c5928
  5. 21 8月, 2017 1 次提交
  6. 18 8月, 2017 2 次提交
    • F
      Btrfs: fix assertion failure during fsync in no-holes mode · 6399fb5a
      Filipe Manana 提交于
      When logging an inode in full mode that has an inline compressed extent
      that represents a range with a size matching the sector size (currently
      the same as the page size), has a trailing hole and the no-holes feature
      is enabled, we end up failing an assertion leading to a trace like the
      following:
      
      [141812.031528] assertion failed: len == i_size, file: fs/btrfs/tree-log.c, line: 4453
      [141812.033069] ------------[ cut here ]------------
      [141812.034330] kernel BUG at fs/btrfs/ctree.h:3452!
      [141812.035137] invalid opcode: 0000 [#1] PREEMPT SMP
      [141812.035932] Modules linked in: btrfs dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_flakey dm_mod dax ppdev evdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 tpm_tis psmouse crypto_simd parport_pc sg pcspkr tpm_tis_core cryptd parport serio_raw glue_helper tpm i2c_piix4 i2c_core button sunrpc loop autofs4 ext4 crc16 jbd2 mbcache raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod ata_generic virtio_scsi ata_piix floppy crc32c_intel libata scsi_mod virtio_pci virtio_ring e1000 virtio [last unloaded: btrfs]
      [141812.036790] CPU: 3 PID: 845 Comm: fdm-stress Tainted: G    B   W       4.12.3-btrfs-next-52+ #1
      [141812.036790] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
      [141812.036790] task: ffff8801e6694180 task.stack: ffffc90009004000
      [141812.036790] RIP: 0010:assfail.constprop.18+0x1c/0x1e [btrfs]
      [141812.036790] RSP: 0018:ffffc90009007bc0 EFLAGS: 00010282
      [141812.036790] RAX: 0000000000000046 RBX: ffff88017512c008 RCX: 0000000000000001
      [141812.036790] RDX: ffff88023fd95201 RSI: ffffffff8182264c RDI: 00000000ffffffff
      [141812.036790] RBP: ffffc90009007bc0 R08: 0000000000000001 R09: 0000000000000001
      [141812.036790] R10: 0000000000001000 R11: ffffffff82f5a0c9 R12: ffff88014e5947e8
      [141812.036790] R13: 00000000000b4000 R14: ffff8801b234d008 R15: 0000000000000000
      [141812.036790] FS:  00007fdba6ffd700(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
      [141812.036790] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [141812.036790] CR2: 00007fdb9c000010 CR3: 000000016efa2000 CR4: 00000000001406e0
      [141812.036790] Call Trace:
      [141812.036790]  btrfs_log_inode+0x9f0/0xd3d [btrfs]
      [141812.036790]  ? __mutex_lock+0x120/0x3ce
      [141812.036790]  btrfs_log_inode_parent+0x224/0x685 [btrfs]
      [141812.036790]  ? lock_acquire+0x16b/0x1af
      [141812.036790]  btrfs_log_dentry_safe+0x60/0x7b [btrfs]
      [141812.036790]  btrfs_sync_file+0x32e/0x3f8 [btrfs]
      [141812.036790]  vfs_fsync_range+0x8a/0x9d
      [141812.036790]  vfs_fsync+0x1c/0x1e
      [141812.036790]  do_fsync+0x31/0x4a
      [141812.036790]  SyS_fdatasync+0x13/0x17
      [141812.036790]  entry_SYSCALL_64_fastpath+0x18/0xad
      [141812.036790] RIP: 0033:0x7fdbac41a47d
      [141812.036790] RSP: 002b:00007fdba6ffce30 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
      [141812.036790] RAX: ffffffffffffffda RBX: ffffffff81092c9f RCX: 00007fdbac41a47d
      [141812.036790] RDX: 0000004cf0160a40 RSI: 0000000000000000 RDI: 0000000000000006
      [141812.036790] RBP: ffffc90009007f98 R08: 0000000000000000 R09: 0000000000000010
      [141812.036790] R10: 00000000000002e8 R11: 0000000000000293 R12: ffffffff8110cd90
      [141812.036790] R13: ffffc90009007f78 R14: 0000000000000000 R15: 0000000000000000
      [141812.036790]  ? time_hardirqs_off+0x9/0x14
      [141812.036790]  ? trace_hardirqs_off_caller+0x1f/0xa3
      [141812.036790] Code: c7 d6 61 6b a0 48 89 e5 e8 ba ef a8 e0 0f 0b 55 89 f1 48 c7 c2 6d 65 6b a0 48 89 fe 48 c7 c7 81 65 6b a0 48 89 e5 e8 9c ef a8 e0 <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89
      [141812.036790] RIP: assfail.constprop.18+0x1c/0x1e [btrfs] RSP: ffffc90009007bc0
      [141812.084448] ---[ end trace 44e472684c7a32cc ]---
      
      Which happens because the code that logs a trailing hole when the no-holes
      feature is enabled, did not consider that a compressed inline extent can
      represent a range with a size matching the sector size, in which case
      expanding the inode's i_size, through a truncate operation, won't lead
      to padding with zeroes the page that represents the inline extent, and
      therefore the inline extent remains after the truncation.
      
      Fix this by adapting the assertion to accept inline extents representing
      data with a sector size length if, and only if, the inline extents are
      compressed.
      
      A sample and trivial reproducer (for systems with a 4K page size) for this
      issue:
      
        mkfs.btrfs -O no-holes -f /dev/sdc
        mount -o compress /dev/sdc /mnt
        xfs_io -f -c "pwrite -S 0xab 0 4K" /mnt/foobar
        sync
        xfs_io -c "truncate 32K" /mnt/foobar
        xfs_io -c "fsync" /mnt/foobar
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6399fb5a
    • C
      btrfs: remove redundant check on ret being non-zero · 938e1c77
      Colin Ian King 提交于
      The error return variable ret is initialized to zero and then is
      checked to see if it is non-zero in the if-block that follows it.
      It is therefore impossible for ret to be non-zero after the if-block
      hence the check is redundant and can be removed.
      
      Detected by CoverityScan, CID#1021040 ("Logically dead code")
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      938e1c77
  7. 20 7月, 2017 1 次提交
  8. 22 6月, 2017 4 次提交
    • S
      btrfs: Check name_len in btrfs_check_ref_name_override · 3c1d4184
      Su Yue 提交于
      In btrfs_log_inode, btrfs_search_forward gets the buffer and then
      btrfs_check_ref_name_override will read name from ref/extref for the
      first time.
      
      Call btrfs_is_name_len_valid before reading name.
      Signed-off-by: NSu Yue <suy.fnst@cn.fujitsu.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      3c1d4184
    • S
      btrfs: Verify dir_item in replay_xattr_deletes · 8ee8c2d6
      Su Yue 提交于
      replay_xattr_deletes calls btrfs_search_slot to get buffer and reads
      name.
      
      Call verify_dir_item to check name_len in replay_xattr_deletes to avoid
      reading out of boundary.
      Signed-off-by: NSu Yue <suy.fnst@cn.fujitsu.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      8ee8c2d6
    • S
      btrfs: Check name_len on add_inode_ref call path · 26a836ce
      Su Yue 提交于
      replay_one_buffer first reads buffers and dispatches items accroding to
      the item type.
      In this patch, add_inode_ref handles inode_ref and inode_extref.
      Then add_inode_ref calls ref_get_fields and extref_get_fields to read
      ref/extref name for the first time.
      So checking name_len before reading those two is fine.
      
      add_inode_ref also calls inode_in_dir to match ref/extref in parent_dir.
      The call graph includes btrfs_match_dir_item_name to read dir_item name
      in the parent dir.
      Checking first dir_item is not enough. Change it to verify every
      dir_item while doing matches.
      Signed-off-by: NSu Yue <suy.fnst@cn.fujitsu.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      26a836ce
    • S
      btrfs: Check name_len with boundary in verify dir_item · e79a3327
      Su Yue 提交于
      Originally, verify_dir_item verifies name_len of dir_item with fixed
      values but not item boundary.
      If corrupted name_len was not bigger than the fixed value, for example
      255, the function will think the dir_item is fine. And then reading
      beyond boundary will cause crash.
      
      Example:
      	1. Corrupt one dir_item name_len to be 255.
              2. Run 'ls -lar /mnt/test/ > /dev/null'
      dmesg:
      [   48.451449] BTRFS info (device vdb1): disk space caching is enabled
      [   48.451453] BTRFS info (device vdb1): has skinny extents
      [   48.489420] general protection fault: 0000 [#1] SMP
      [   48.489571] Modules linked in: ext4 jbd2 mbcache btrfs xor raid6_pq
      [   48.489716] CPU: 1 PID: 2710 Comm: ls Not tainted 4.10.0-rc1 #5
      [   48.489853] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
      [   48.490008] task: ffff880035df1bc0 task.stack: ffffc90004800000
      [   48.490008] RIP: 0010:read_extent_buffer+0xd2/0x190 [btrfs]
      [   48.490008] RSP: 0018:ffffc90004803d98 EFLAGS: 00010202
      [   48.490008] RAX: 000000000000001b RBX: 000000000000001b RCX: 0000000000000000
      [   48.490008] RDX: ffff880079dbf36c RSI: 0005080000000000 RDI: ffff880079dbf368
      [   48.490008] RBP: ffffc90004803dc8 R08: ffff880078e8cc48 R09: ffff880000000000
      [   48.490008] R10: 0000160000000000 R11: 0000000000001000 R12: ffff880079dbf288
      [   48.490008] R13: ffff880078e8ca88 R14: 0000000000000003 R15: ffffc90004803e20
      [   48.490008] FS:  00007fef50c60800(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000
      [   48.490008] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   48.490008] CR2: 000055f335ac2ff8 CR3: 000000007356d000 CR4: 00000000001406e0
      [   48.490008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   48.490008] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   48.490008] Call Trace:
      [   48.490008]  btrfs_real_readdir+0x3b7/0x4a0 [btrfs]
      [   48.490008]  iterate_dir+0x181/0x1b0
      [   48.490008]  SyS_getdents+0xa7/0x150
      [   48.490008]  ? fillonedir+0x150/0x150
      [   48.490008]  entry_SYSCALL_64_fastpath+0x18/0xad
      [   48.490008] RIP: 0033:0x7fef5032546b
      [   48.490008] RSP: 002b:00007ffeafcdb830 EFLAGS: 00000206 ORIG_RAX: 000000000000004e
      [   48.490008] RAX: ffffffffffffffda RBX: 00007fef5061db38 RCX: 00007fef5032546b
      [   48.490008] RDX: 0000000000008000 RSI: 000055f335abaff0 RDI: 0000000000000003
      [   48.490008] RBP: 00007fef5061dae0 R08: 00007fef5061db48 R09: 0000000000000000
      [   48.490008] R10: 000055f335abafc0 R11: 0000000000000206 R12: 00007fef5061db38
      [   48.490008] R13: 0000000000008040 R14: 00007fef5061db38 R15: 000000000000270e
      [   48.490008] RIP: read_extent_buffer+0xd2/0x190 [btrfs] RSP: ffffc90004803d98
      [   48.499455] ---[ end trace 321920d8e8339505 ]---
      
      Fix it by adding a parameter @slot and check name_len with item boundary
      by calling btrfs_is_name_len_valid.
      Signed-off-by: NSu Yue <suy.fnst@cn.fujitsu.com>
      rev
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e79a3327
  9. 18 4月, 2017 1 次提交
  10. 28 2月, 2017 5 次提交
  11. 24 2月, 2017 1 次提交
    • F
      Btrfs: do not create explicit holes when replaying log tree if NO_HOLES enabled · 3168021c
      Filipe Manana 提交于
      We log holes explicitly by using file extent items, however when replaying
      a log tree, if a logged file extent item corresponds to a hole and the
      NO_HOLES feature is enabled we do not need to copy the file extent item
      into the fs/subvolume tree, as the absence of such file extent items is
      the purpose of the NO_HOLES feature. So skip the copying of file extent
      items representing holes when the NO_HOLES feature is enabled.
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      3168021c
  12. 17 2月, 2017 3 次提交
  13. 14 2月, 2017 16 次提交