1. 28 Jun, 2014 1 commit
  2. 17 Jun, 2014 1 commit
  3. 14 Jun, 2014 7 commits
    •
      btrfs: fix error handling in create_pending_snapshot · 47a306a7
      Eric Sandeen committed
      fcebe456 cut and pasted some code to a later point
      in create_pending_snapshot(), but didn't switch
      to the appropriate error handling for this stage
      of the function.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      47a306a7
    •
      btrfs: fix use of uninit "ret" in end_extent_writepage() · 3e2426bd
      Eric Sandeen committed
      If this condition in end_extent_writepage() is false:
      
      	if (tree->ops && tree->ops->writepage_end_io_hook)
      
      we will then test an uninitialized "ret" at:
      
      	ret = ret < 0 ? ret : -EIO;
      
      The test for ret is for the case where ->writepage_end_io_hook
      failed, and we'd choose that ret as the error; but if
      there is no ->writepage_end_io_hook, nothing sets ret.
      
      Initializing ret to 0 should be sufficient; if
      writepage_end_io_hook wasn't set, (!uptodate) means
      non-zero err was passed in, so we choose -EIO in that case.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      3e2426bd
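The control flow described in this commit can be sketched in user space. This is a minimal illustration, not the kernel code: `pick_write_error`, `end_io_hook_t` and `failing_hook` are hypothetical names, and the hook takes a single argument for brevity.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the optional ->writepage_end_io_hook callback. */
typedef int (*end_io_hook_t)(int err);

/* Hypothetical hook that always reports "no space left". */
int failing_hook(int err)
{
	(void)err;
	return -ENOSPC;
}

/*
 * Returns the error to record for the page, or 0 when the write is fine.
 * Mirrors the shape of the logic discussed above, under the assumption
 * that "uptodate" simply means "no error was passed in".
 */
int pick_write_error(end_io_hook_t hook, int err)
{
	int uptodate = (err == 0);
	int ret = 0;	/* the fix: ret is defined even when hook is NULL */

	if (hook) {
		ret = hook(err);
		if (ret)
			uptodate = 0;
	}

	/* Prefer the hook's error; otherwise fall back to -EIO. */
	if (!uptodate)
		ret = ret < 0 ? ret : -EIO;

	return ret;
}
```

With no hook and a non-zero `err`, the sketch returns `-EIO`; with a failing hook it returns the hook's own error.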
    •
      btrfs: free ulist in qgroup_shared_accounting() error path · d7372780
      Eric Sandeen committed
      If tmp = ulist_alloc(GFP_NOFS) fails, we return without
      freeing the previously allocated qgroups = ulist_alloc(GFP_NOFS)
      and cause a memory leak.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      d7372780
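The leak pattern described above, and its fix, can be shown with a small user-space sketch. Everything here is illustrative: the allocator hooks, the counter and `shared_accounting_sketch` are hypothetical names, and `malloc` stands in for `ulist_alloc(GFP_NOFS)`.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator hook so that a failure can be simulated. */
typedef void *(*alloc_fn)(size_t size);

static int live_allocations;	/* number of blocks not yet freed */

void *counting_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_allocations++;
	return p;
}

/* Succeeds for the first allocation, fails while one is outstanding. */
void *failing_second_alloc(size_t size)
{
	if (live_allocations >= 1)
		return NULL;
	return counting_alloc(size);
}

void counting_free(void *p)
{
	if (p) {
		free(p);
		live_allocations--;
	}
}

int live_allocation_count(void)
{
	return live_allocations;
}

int shared_accounting_sketch(alloc_fn alloc)
{
	void *qgroups, *tmp;

	qgroups = alloc(64);
	if (!qgroups)
		return -ENOMEM;

	tmp = alloc(64);
	if (!tmp) {
		counting_free(qgroups);	/* the fix: release the first list */
		return -ENOMEM;
	}

	/* ... accounting work would happen here ... */

	counting_free(tmp);
	counting_free(qgroups);
	return 0;
}
```

Calling `shared_accounting_sketch(failing_second_alloc)` returns `-ENOMEM` and leaves `live_allocation_count()` at zero. Without the marked line the count would stay at one.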
    •
      Btrfs: fix qgroups sanity test crash or hang · b050f9f6
      Filipe Manana committed
      The qgroups sanity test often crashed or hung when run.
      The extent buffer the test uses for the root node doesn't have its
      header level explicitly set, so the level is a random value.
      A non-zero level breaks the btrfs_search_slot() calls the test ends
      up making, resulting in crashes or hangs such as the following:
      
      [ 6454.127192] Btrfs loaded, debug=on, assert=on, integrity-checker=on
      (...)
      [ 6454.127760] BTRFS: selftest: Running qgroup tests
      [ 6454.127964] BTRFS: selftest: Running test_test_no_shared_qgroup
      [ 6454.127966] BTRFS: selftest: Qgroup basic add
      [ 6480.152005] BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:5383]
      [ 6480.152005] Modules linked in: btrfs(+) xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc i2c_piix4 i2c_core pcspkr evbug psmouse serio_raw e1000 [last unloaded: btrfs]
      [ 6480.152005] irq event stamp: 188448
      [ 6480.152005] hardirqs last  enabled at (188447): [<ffffffff8168ef5c>] restore_args+0x0/0x30
      [ 6480.152005] hardirqs last disabled at (188448): [<ffffffff81698e6a>] apic_timer_interrupt+0x6a/0x80
      [ 6480.152005] softirqs last  enabled at (188446): [<ffffffff810516cf>] __do_softirq+0x1cf/0x450
      [ 6480.152005] softirqs last disabled at (188441): [<ffffffff81051c25>] irq_exit+0xb5/0xc0
      [ 6480.152005] CPU: 0 PID: 5383 Comm: modprobe Not tainted 3.15.0-rc8-fdm-btrfs-next-33+ #4
      [ 6480.152005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 6480.152005] task: ffff8802146125a0 ti: ffff8800d0d00000 task.ti: ffff8800d0d00000
      [ 6480.152005] RIP: 0010:[<ffffffff81349a63>]  [<ffffffff81349a63>] __write_lock_failed+0x13/0x20
      [ 6480.152005] RSP: 0018:ffff8800d0d038e8  EFLAGS: 00000287
      [ 6480.152005] RAX: 0000000000000000 RBX: ffffffff8168ef5c RCX: 000005deb8525852
      [ 6480.152005] RDX: 0000000000000000 RSI: 0000000000001d45 RDI: ffff8802105000b8
      [ 6480.152005] RBP: ffff8800d0d038e8 R08: fffffe12710f63db R09: ffffffffa03196fb
      [ 6480.152005] R10: ffff8802146125a0 R11: ffff880214612e28 R12: ffff8800d0d03858
      [ 6480.152005] R13: 0000000000000000 R14: ffff8800d0d00000 R15: ffff8802146125a0
      [ 6480.152005] FS:  00007f14ff804700(0000) GS:ffff880215e00000(0000) knlGS:0000000000000000
      [ 6480.152005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 6480.152005] CR2: 00007fff4df0dac8 CR3: 00000000d1796000 CR4: 00000000000006f0
      [ 6480.152005] Stack:
      [ 6480.152005]  ffff8800d0d03908 ffffffff810ae967 0000000000000001 ffff8802105000b8
      [ 6480.152005]  ffff8800d0d03938 ffffffff8168e57e ffffffffa0319c16 0000000000000007
      [ 6480.152005]  ffff880210500000 ffff880210500100 ffff8800d0d039b8 ffffffffa0319c16
      [ 6480.152005] Call Trace:
      [ 6480.152005]  [<ffffffff810ae967>] do_raw_write_lock+0x47/0xa0
      [ 6480.152005]  [<ffffffff8168e57e>] _raw_write_lock+0x5e/0x80
      [ 6480.152005]  [<ffffffffa0319c16>] ? btrfs_tree_lock+0x116/0x270 [btrfs]
      [ 6480.152005]  [<ffffffffa0319c16>] btrfs_tree_lock+0x116/0x270 [btrfs]
      [ 6480.152005]  [<ffffffffa02b2acb>] btrfs_lock_root_node+0x3b/0x50 [btrfs]
      [ 6480.152005]  [<ffffffffa02b81a6>] btrfs_search_slot+0x916/0xa20 [btrfs]
      [ 6480.152005]  [<ffffffff811a727f>] ? create_object+0x23f/0x300
      [ 6480.152005]  [<ffffffffa02b9958>] btrfs_insert_empty_items+0x78/0xd0 [btrfs]
      [ 6480.152005]  [<ffffffffa036041a>] insert_normal_tree_ref.constprop.4+0xa2/0x19a [btrfs]
      [ 6480.152005]  [<ffffffffa03605c3>] test_no_shared_qgroup+0xb1/0x1ca [btrfs]
      [ 6480.152005]  [<ffffffff8108cad6>] ? local_clock+0x16/0x30
      [ 6480.152005]  [<ffffffffa035ef8e>] btrfs_test_qgroups+0x1ae/0x1d7 [btrfs]
      [ 6480.152005]  [<ffffffffa03a69d2>] ? ftrace_define_fields_btrfs_space_reservation+0xfd/0xfd [btrfs]
      [ 6480.152005]  [<ffffffffa03a6a86>] init_btrfs_fs+0xb4/0x153 [btrfs]
      [ 6480.152005]  [<ffffffff81000352>] do_one_initcall+0x102/0x150
      [ 6480.152005]  [<ffffffff8103d223>] ? set_memory_nx+0x43/0x50
      [ 6480.152005]  [<ffffffff81682668>] ? set_section_ro_nx+0x6d/0x74
      [ 6480.152005]  [<ffffffff810d91cc>] load_module+0x1cdc/0x2630
      (...)
      
      Therefore initialize the extent buffer as an empty leaf (level 0).
      
      The issue is easy to reproduce when btrfs is built as a module:
      
          $ for ((i = 1; i <= 1000000; i++)); do rmmod btrfs; modprobe btrfs; done
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      b050f9f6
    •
      btrfs: prevent RCU warning when dereferencing radix tree slot · f1e3c289
      Sasha Levin committed
      Mark the dereference as protected by the lock. Not doing so triggers
      an RCU warning, since the radix tree assumes that RCU is in use.
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      f1e3c289
    •
      Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting · 5fbc7c59
      Wang Shilong committed
      Steps to reproduce:
      
       # mkfs.btrfs -f /dev/sd[b-f] -m raid5 -d raid5
       # mkfs.ext4 /dev/sdc ---> corrupt one of the btrfs devices
       # mount /dev/sdb /mnt -o degraded
       # btrfs scrub start -BRd /mnt
      
      The readahead thread never finishes because readahead skips missing
      devices. That is not safe for RAID5/6, because REQ_GET_READ_MIRRORS
      returns 1 for RAID5/6 block mapping. If the expected data is located on
      the missing device, the readahead thread never calls __readahead_hook(),
      so the wait for the event @rc->elems=0 lasts forever.
      
      Fix this problem by checking the return value of btrfs_map_block(); we
      can only skip a missing device safely if there are several mirrors.
      Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      5fbc7c59
    •
      btrfs: new ioctl TREE_SEARCH_V2 · cc68a8a5
      Gerhard Heift committed
      This new ioctl call allows the user to supply a buffer of varying size in which
      a tree search can store its results. This is much more flexible if you want to
      receive items which are larger than the current fixed buffer of 3992 bytes or
      if you want to fetch more items at once. Some items of type EXTENT_CSUM, for
      example, are larger than this buffer.
      Signed-off-by: Gerhard Heift <Gerhard@Heift.Name>
      Signed-off-by: Chris Mason <clm@fb.com>
      Acked-by: David Sterba <dsterba@suse.cz>
      cc68a8a5
  4. 13 Jun, 2014 6 commits
  5. 12 Jun, 2014 8 commits
  6. 11 Jun, 2014 4 commits
  7. 10 Jun, 2014 13 commits
    •
      NFS: populate ->net in mount data when remounting · a914722f
      Mateusz Guzik committed
      Otherwise the kernel oopses when remounting with an IPv6 server because
      net is dereferenced in dev_get_by_name.
      
      Use net ns of current thread so that dev_get_by_name does not operate on
      foreign ns. Changing the address is prohibited anyway so this should not
      affect anything.
      Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
      Cc: linux-nfs@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org # 3.4+
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      a914722f
    •
      pnfs: fix lockup caused by pnfs_generic_pg_test · c5e20cb7
      Weston Andros Adamson committed
      end_offset and req_offset both return u64 - avoid casting to u32
      until it's needed, when it's less than the (u32) size returned by
      nfs_generic_pg_test.
      
      Also, fix the comments in pnfs_generic_pg_test.
      
      Running the cthon04 special tests caused this lockup in the
      "write/read at 2GB, 4GB edges" test when running against a file layout server:
      
      BUG: soft lockup - CPU#0 stuck for 22s! [bigfile2:823]
      Modules linked in: nfs_layout_nfsv41_files rpcsec_gss_krb5 nfsv4 nfs fscache ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ppdev crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd serio_raw e1000 shpchp i2c_piix4 i2c_core parport_pc parport nfsd auth_rpcgss oid_registry exportfs nfs_acl lockd sunrpc btrfs xor zlib_deflate raid6_pq mptspi scsi_transport_spi mptscsih mptbase ata_generic floppy autofs4
      irq event stamp: 205958
      hardirqs last  enabled at (205957): [<ffffffff814a62dc>] restore_args+0x0/0x30
      hardirqs last disabled at (205958): [<ffffffff814ad96a>] apic_timer_interrupt+0x6a/0x80
      softirqs last  enabled at (205956): [<ffffffff8103ffb2>] __do_softirq+0x1ea/0x2ab
      softirqs last disabled at (205951): [<ffffffff8104026d>] irq_exit+0x44/0x9a
      CPU: 0 PID: 823 Comm: bigfile2 Not tainted 3.15.0-rc1-branch-pgio_plus+ #3
      Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
      task: ffff8800792ec480 ti: ffff880078c4e000 task.ti: ffff880078c4e000
      RIP: 0010:[<ffffffffa02ce51f>]  [<ffffffffa02ce51f>] nfs_page_group_unlock+0x3e/0x4b [nfs]
      RSP: 0018:ffff880078c4fab0  EFLAGS: 00000202
      RAX: 0000000000000fff RBX: ffff88006bf83300 RCX: 0000000000000000
      RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88006bf83300
      RBP: ffff880078c4fab8 R08: 0000000000000001 R09: 0000000000000000
      R10: ffffffff8249840c R11: 0000000000000000 R12: 0000000000000035
      R13: ffff88007ffc72d8 R14: 0000000000000001 R15: 0000000000000000
      FS:  00007f45f11b7740(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f3a8cb632d0 CR3: 000000007931c000 CR4: 00000000001407f0
      Stack:
       ffff88006bf832c0 ffff880078c4fb00 ffffffffa02cec22 ffff880078c4fad8
       00000fff810f9d99 ffff880078c4fca0 ffff88006bf832c0 ffff88006bf832c0
       ffff880078c4fca0 ffff880078c4fd60 ffff880078c4fb28 ffffffffa02cee34
      Call Trace:
       [<ffffffffa02cec22>] __nfs_pageio_add_request+0x298/0x34f [nfs]
       [<ffffffffa02cee34>] nfs_pageio_add_request+0x1f/0x42 [nfs]
       [<ffffffffa02d1722>] nfs_do_writepage+0x1b5/0x1e4 [nfs]
       [<ffffffffa02d1764>] nfs_writepages_callback+0x13/0x25 [nfs]
       [<ffffffffa02d1751>] ? nfs_do_writepage+0x1e4/0x1e4 [nfs]
       [<ffffffff810eb32d>] write_cache_pages+0x254/0x37f
       [<ffffffffa02d1751>] ? nfs_do_writepage+0x1e4/0x1e4 [nfs]
       [<ffffffff8149cf9e>] ? printk+0x54/0x56
       [<ffffffff810eacca>] ? __set_page_dirty_nobuffers+0x22/0xe9
       [<ffffffffa016d864>] ? put_rpccred+0x38/0x101 [sunrpc]
       [<ffffffffa02d1ae1>] nfs_writepages+0xb4/0xf8 [nfs]
       [<ffffffff810ec59c>] do_writepages+0x21/0x2f
       [<ffffffff810e36e8>] __filemap_fdatawrite_range+0x55/0x57
       [<ffffffff810e374a>] filemap_write_and_wait_range+0x2d/0x5b
       [<ffffffffa030ba0a>] nfs4_file_fsync+0x3a/0x98 [nfsv4]
       [<ffffffff8114ee3c>] vfs_fsync_range+0x18/0x20
       [<ffffffff810e40c2>] generic_file_aio_write+0xa7/0xbd
       [<ffffffffa02c5c6b>] nfs_file_write+0xf0/0x170 [nfs]
       [<ffffffff81129215>] do_sync_write+0x59/0x78
       [<ffffffff8112956c>] vfs_write+0xab/0x107
       [<ffffffff81129c8b>] SyS_write+0x49/0x7f
       [<ffffffff814acd12>] system_call_fastpath+0x16/0x1b
      Reported-by: Anna Schumaker <Anna.Schumaker@netapp.com>
      Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      c5e20cb7
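The effect of casting a 64-bit offset difference to 32 bits too early can be reproduced in a few lines. This is a sketch under assumptions: the function and parameter names are illustrative and are not taken from the NFS code, and the link between a zero result and the reported lockup is an interpretation of the commit message, not something it spells out.

```c
#include <stdint.h>

/*
 * Buggy shape: the 64-bit difference is narrowed to 32 bits before the
 * comparison, so a difference of exactly 4 GiB wraps to 0.
 */
uint32_t bytes_allowed_truncating(uint64_t seg_end, uint64_t req_start,
				  uint32_t size)
{
	uint32_t left = (uint32_t)(seg_end - req_start);

	return left < size ? left : size;
}

/*
 * Fixed shape: compare in 64 bits and narrow only once the value is
 * known to be smaller than the 32-bit size.
 */
uint32_t bytes_allowed_fixed(uint64_t seg_end, uint64_t req_start,
			     uint32_t size)
{
	uint64_t left = seg_end - req_start;

	if (left < size)
		return (uint32_t)left;
	return size;
}
```

For a segment ending at 8 GiB and a request starting at 4 GiB with `size` 4096, the truncating version reports 0 bytes allowed while the fixed version reports 4096. A caller that keeps retrying until some bytes are accepted would make no progress with the former.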
    •
      Btrfs: fix scrub_print_warning to handle skinny metadata extents · 6eda71d0
      Liu Bo committed
      The skinny extents are interpreted incorrectly in scrub_print_warning(),
      and end up hitting the BUG() in btrfs_extent_inline_ref_size.
      Reported-by: Konstantinos Skarlatos <k.skarlatos@gmail.com>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      6eda71d0
    •
      Btrfs: make fsync work after cloning into a file · 7ffbb598
      Filipe Manana committed
      When cloning into a file, we were correctly replacing the extent
      items in the target range and removing the extent maps. However
      we weren't replacing the extent maps with new ones that point to
      the new extents - as a consequence, an incremental fsync (when the
      inode doesn't have the full sync flag) was a NOOP, since it relies
      on the existence of extent maps in the modified list of the inode's
      extent map tree, which was empty. Therefore add new extent maps to
      reflect the target clone range.
      
      A test case for xfstests follows.
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      7ffbb598
    •
      Btrfs: use right type to get real comparison · cd857dd6
      Liu Bo committed
      We want to make sure the pointer is still within the extent item, not to verify
      the memory it's pointing to.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      cd857dd6
    •
      Btrfs: don't check nodes for extent items · 8a56457f
      Josef Bacik committed
      The backref code was looking at nodes as well as leaves when we tried to
      populate extent item entries.  This is not good, and although we got away with it
      for the most part because we'd skip where disk_bytenr != random_memory,
      sometimes random_memory would match and suddenly boom.  This fixes that problem.
      Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      8a56457f
    •
      Btrfs: don't release invalid page in btrfs_page_exists_in_range() · 6fdef6d4
      Filipe Manana committed
      In inode.c:btrfs_page_exists_in_range(), if the page we got from
      the radix tree is an exception entry, which can't be retried, we
      exit the loop with a non-NULL page and then call page_cache_release
      against it, which is not ok since it's not a valid page. This could
      also make us return true when we shouldn't.
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      6fdef6d4
    •
      Btrfs: make sure we retry if page is a retriable exception · 809f9016
      Filipe Manana committed
      In inode.c:btrfs_page_exists_in_range(), if the page we get from the
      radix tree is an exception which should make us retry, set page to
      NULL in order to really retry, because otherwise we don't get another
      loop iteration executed (page != NULL makes the while loop exit).
      This also was making us call page_cache_release after exiting the loop,
      which isn't correct because page doesn't point to a valid page, and
      possibly return true from the function when we shouldn't.
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      809f9016
    •
      Btrfs: make sure we retry if we couldn't get the page · 91405151
      Filipe Manana committed
      In inode.c:btrfs_page_exists_in_range(), if we can't get the page
      we need to retry. However we weren't retrying because we weren't
      setting page to NULL, which makes the while loop exit immediately
      and will make us call page_cache_release after exiting the loop
      which is incorrect because our page get didn't succeed. This could
      also make us return true when we shouldn't.
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      91405151
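The three btrfs_page_exists_in_range() fixes above share one idea: the loop condition is `page == NULL`, so every path that did not obtain a valid page must reset `page` to NULL. The sketch below models that in user space. It is an assumption-laden illustration: the `step` enum, the scripted lookup results and the reference counter are all hypothetical and only mimic the control flow described in the commit messages.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of one radix tree lookup attempt. */
enum step {
	STEP_VALID_PAGE,	/* a real page, reference obtained */
	STEP_RETRY_ENTRY,	/* exception entry that asks for a retry */
	STEP_GET_FAILED,	/* real page, but taking a reference failed */
	STEP_EXCEPTION		/* exception entry that cannot be retried */
};

/*
 * Walks a scripted sequence of lookup outcomes. *refs counts references
 * currently held; it must be back at its starting value on return.
 */
bool page_exists_sketch(const enum step *steps, size_t nsteps, int *refs)
{
	static int valid_page, bogus_entry;	/* stand-ins for slot contents */
	int *page = NULL;
	size_t i = 0;

	while (page == NULL && i < nsteps) {
		enum step s = steps[i++];

		page = (s == STEP_VALID_PAGE) ? &valid_page : &bogus_entry;

		if (s == STEP_RETRY_ENTRY || s == STEP_GET_FAILED) {
			page = NULL;	/* the fix: lets the loop run again */
			continue;
		}
		if (s == STEP_EXCEPTION) {
			page = NULL;	/* the fix: never release a bogus entry */
			break;
		}
		(*refs)++;		/* reference taken on a real page */
	}

	if (page) {
		(*refs)--;		/* balanced release of a real page */
		return true;
	}
	return false;
}
```

If either `page = NULL` assignment is dropped, the loop exits with a non-NULL pointer that was never referenced: the release becomes unbalanced and the function returns true for something that is not a page.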
    •
      btrfs: replace EINVAL with EOPNOTSUPP for dev_replace raid56 · c81d5767
      Gui Hecheng committed
      Returning EOPNOTSUPP is more user friendly than returning EINVAL:
      the user-space tool will then show that the dev_replace operation
      for raid56 is not currently supported rather than showing that
      there is an invalid argument.
      Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      c81d5767
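A short sketch of why the errno choice matters to the user. Both function names are hypothetical, and the boolean parameter stands in for whatever check the kernel performs. Only `strerror()` and the errno constants are real.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical check: refuse the operation when RAID5/6 is in use. */
int dev_replace_check_sketch(bool has_raid56)
{
	if (has_raid56)
		return -EOPNOTSUPP;	/* previously -EINVAL */
	return 0;
}

/* What a user-space tool would typically print for a failed call. */
const char *failure_text(int ret)
{
	return strerror(-ret);
}
```

On glibc, `failure_text(-EOPNOTSUPP)` reads "Operation not supported" and `failure_text(-EINVAL)` reads "Invalid argument", so the new return value tells the user what is actually wrong.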
    •
      trivial: fs/btrfs/ioctl.c: fix typo s/substract/subtract/ · 93915584
      Antonio Ospite committed
      Signed-off-by: Antonio Ospite <ao2@ao2.it>
      Cc: Chris Mason <clm@fb.com>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: linux-btrfs@vger.kernel.org
      Signed-off-by: Chris Mason <clm@fb.com>
      93915584
    •
      Btrfs: fix leaf corruption after __btrfs_drop_extents · 0b43e04f
      Liu Bo committed
      Several reports of leaf corruption have been floating around on the list. One
      of them points to __btrfs_drop_extents(), and we found that the leaf becomes
      corrupted after __btrfs_drop_extents(). It's a really rare case, but it does exist.
      
      The problem turns out to be the btrfs_next_leaf() call in __btrfs_drop_extents().
      
      In btrfs_next_leaf(), we release the current path and re-search the last key of
      the leaf to locate the next leaf. We've taken into account that there might
      be balance operations between leaves during this 'unlock and re-lock' dance, so
      we check the path again and advance it if there are now more items available.
      But things are a bit different if that last key happens to be removed and balance
      brings in a bigger key as the last one: btrfs_search_slot() returns it with
      ret > 0. IOW, nothing changes in this leaf except the new last key, so we think
      we're okay because no more items were balanced in, and we think we can
      go to the next leaf.
      
      However, we should return that bigger key, otherwise we get leaf corruption.
      For example, in endio, skipping that key means that __btrfs_drop_extents() thinks
      it has dropped all extents matching the required range and finish_ordered_io can
      safely insert a new extent, but it actually hasn't, and we end up with leaf
      corruption.
      
      One may ask why our locking on the extent io tree doesn't work as
      expected, i.e. why it doesn't avoid this kind of race.  But in
      __btrfs_drop_extents(), we don't always find extents that are included within
      our locking range. IOW, extents can start before our search start, and in this
      case locking on the extent io tree doesn't protect us from the race.
      
      This takes the special case into account.
      Reviewed-by: Filipe Manana <fdmanana@gmail.com>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      0b43e04f
    •
      Btrfs: ensure btrfs_prev_leaf doesn't miss 1 item · 337c6f68
      Filipe Manana committed
      We might have had an item with the previous key in the tree right
      before we released our path. And after we released our path, that
      item might have been pushed to the first slot (0) of the leaf we
      were holding due to a tree balance. Alternatively, an item with the
      previous key can exist as the only element of a leaf (big fat item).
      Therefore account for these 2 cases, so that our callers (like
      btrfs_previous_item) don't miss an existing item with a key matching
      the previous key we computed above.
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      337c6f68