1. 20 6月, 2014 4 次提交
    • M
      Btrfs: fix wrong error handle when the device is missing or is not writeable · 8408c716
      Miao Xie 提交于
      The original bio might be submitted, so we shoud increase bi_remaining to
      account for it when we deal with the error that the device is missing or
      is not writeable, or we would skip the endio handle.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      8408c716
    • M
      Btrfs: fix deadlock when mounting a degraded fs · c55f1396
      Miao Xie 提交于
      The deadlock happened when we mount degraded filesystem, the reproduced
      steps are following:
       # mkfs.btrfs -f -m raid1 -d raid1 <dev0> <dev1>
       # echo 1 > /sys/block/`basename <dev0>`/device/delete
       # mount -o degraded <dev1> <mnt>
      
      The reason was that the counter -- bi_remaining was wrong. If the missing
      or unwriteable device was the last device in the mapping array, we would
      not submit the original bio, so we shouldn't increase bi_remaining of it
      in btrfs_end_bio(), or we would skip the final endio handle.
      
      Fix this problem by adding a flag into btrfs bio structure. If we submit
      the original bio, we will set the flag, and we increase bi_remaining counter,
      or we don't.
      
      Though there is another way to fix it -- decrease bi_remaining counter of the
      original bio when we make sure the original bio is not submitted, this method
      need add more check and is easy to make mistake.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Reviewed-by: NLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      c55f1396
    • M
      Btrfs: use bio_endio_nodec instead of open code · e990f167
      Miao Xie 提交于
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      e990f167
    • W
      Btrfs: fix NULL pointer crash when running balance and scrub concurrently · 298a8f9c
      Wang Shilong 提交于
      While running balance, scrub, fsstress concurrently we hit the
      following kernel crash:
      
      [56561.448845] BTRFS info (device sde): relocating block group 11005853696 flags 132
      [56561.524077] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
      [56561.524237] IP: [<ffffffffa038956d>] scrub_chunk.isra.12+0xdd/0x130 [btrfs]
      [56561.524297] PGD 9be28067 PUD 7f3dd067 PMD 0
      [56561.524325] Oops: 0000 [#1] SMP
      [....]
      [56561.527237] Call Trace:
      [56561.527309]  [<ffffffffa038980e>] scrub_enumerate_chunks+0x24e/0x490 [btrfs]
      [56561.527392]  [<ffffffff810abe00>] ? abort_exclusive_wait+0x50/0xb0
      [56561.527476]  [<ffffffffa038add4>] btrfs_scrub_dev+0x1a4/0x530 [btrfs]
      [56561.527561]  [<ffffffffa0368107>] btrfs_ioctl+0x13f7/0x2a90 [btrfs]
      [56561.527639]  [<ffffffff811c82f0>] do_vfs_ioctl+0x2e0/0x4c0
      [56561.527712]  [<ffffffff8109c384>] ? vtime_account_user+0x54/0x60
      [56561.527788]  [<ffffffff810f768c>] ? __audit_syscall_entry+0x9c/0xf0
      [56561.527870]  [<ffffffff811c8551>] SyS_ioctl+0x81/0xa0
      [56561.527941]  [<ffffffff815707f7>] tracesys+0xdd/0xe2
      [...]
      [56561.528304] RIP  [<ffffffffa038956d>] scrub_chunk.isra.12+0xdd/0x130 [btrfs]
      [56561.528395]  RSP <ffff88004c0f5be8>
      [56561.528454] CR2: 0000000000000078
      
      This is because in btrfs_relocate_chunk(), we will free @bdev directly while
      scrub may still hold extent mapping, and may access freed memory.
      
      Fix this problem by wrapping freeing @bdev work into free_extent_map() which
      is based on reference count.
      Reported-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      298a8f9c
  2. 10 6月, 2014 9 次提交
  3. 08 4月, 2014 1 次提交
  4. 11 3月, 2014 4 次提交
  5. 29 1月, 2014 1 次提交
  6. 09 1月, 2014 1 次提交
  7. 04 12月, 2013 1 次提交
  8. 24 11月, 2013 1 次提交
    • K
      block: Abstract out bvec iterator · 4f024f37
      Kent Overstreet 提交于
      Immutable biovecs are going to require an explicit iterator. To
      implement immutable bvecs, a later patch is going to add a bi_bvec_done
      member to this struct; for now, this patch effectively just renames
      things.
      Signed-off-by: NKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@inktank.com>
      Cc: ceph-devel@vger.kernel.org
      Cc: Joshua Morris <josh.h.morris@us.ibm.com>
      Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Benny Halevy <bhalevy@tonian.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: xfs@oss.sgi.com
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Guo Chao <yan@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: "Roger Pau Monné" <roger.pau@citrix.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Ian Campbell <Ian.Campbell@citrix.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchand@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Peng Tao <tao.peng@emc.com>
      Cc: Andy Adamson <andros@netapp.com>
      Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Pankaj Kumar <pankaj.km@samsung.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: Mel Gorman <mgorman@suse.de>6
      4f024f37
  9. 21 11月, 2013 1 次提交
  10. 12 11月, 2013 8 次提交
  11. 05 10月, 2013 1 次提交
  12. 21 9月, 2013 2 次提交
  13. 01 9月, 2013 6 次提交
    • F
      Btrfs: fix deadlock in uuid scan kthread · f45388f3
      Filipe David Borba Manana 提交于
      If there's an ongoing transaction when the uuid scan kthread attempts
      to create one, the kthread will block, waiting for that transaction to
      finish while it's keeping locks on the tree root, and in turn the existing
      transaction is waiting for those locks to be free.
      
      The stack trace reported by the kernel follows.
      
      [36700.671601] INFO: task btrfs-uuid:15480 blocked for more than 120 seconds.
      [36700.671602] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [36700.671602] btrfs-uuid      D 0000000000000000     0 15480      2 0x00000000
      [36700.671604]  ffff880710bd5b88 0000000000000046 ffff8803d36ba850 0000000000030000
      [36700.671605]  ffff8806d76dc530 ffff880710bd5fd8 ffff880710bd5fd8 ffff880710bd5fd8
      [36700.671607]  ffff8808098ac530 ffff8806d76dc530 ffff880710bd5b98 ffff8805e4508e40
      [36700.671608] Call Trace:
      [36700.671610]  [<ffffffff816f36b9>] schedule+0x29/0x70
      [36700.671620]  [<ffffffffa05a3bdf>] wait_current_trans.isra.33+0xbf/0x120 [btrfs]
      [36700.671623]  [<ffffffff81066760>] ? add_wait_queue+0x60/0x60
      [36700.671629]  [<ffffffffa05a5b06>] start_transaction+0x3d6/0x530 [btrfs]
      [36700.671636]  [<ffffffffa05bb1f4>] ? btrfs_get_token_32+0x64/0xf0 [btrfs]
      [36700.671642]  [<ffffffffa05a5fbb>] btrfs_start_transaction+0x1b/0x20 [btrfs]
      [36700.671649]  [<ffffffffa05c8a81>] btrfs_uuid_scan_kthread+0x211/0x3d0 [btrfs]
      [36700.671655]  [<ffffffffa05c8870>] ? __btrfs_open_devices+0x2a0/0x2a0 [btrfs]
      [36700.671657]  [<ffffffff81065fa0>] kthread+0xc0/0xd0
      [36700.671659]  [<ffffffff81065ee0>] ? flush_kthread_worker+0xb0/0xb0
      [36700.671661]  [<ffffffff816fcd1c>] ret_from_fork+0x7c/0xb0
      [36700.671662]  [<ffffffff81065ee0>] ? flush_kthread_worker+0xb0/0xb0
      [36700.671663] INFO: task btrfs:15481 blocked for more than 120 seconds.
      [36700.671664] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [36700.671665] btrfs           D 0000000000000000     0 15481  15212 0x00000004
      [36700.671666]  ffff880248cbf4c8 0000000000000086 ffff8803d36ba700 ffff8801dbd5c280
      [36700.671668]  ffff880807815c40 ffff880248cbffd8 ffff880248cbffd8 ffff880248cbffd8
      [36700.671669]  ffff8805e86a0000 ffff880807815c40 ffff880248cbf4d8 ffff8801dbd5c280
      [36700.671670] Call Trace:
      [36700.671672]  [<ffffffff816f36b9>] schedule+0x29/0x70
      [36700.671679]  [<ffffffffa05d9b0d>] btrfs_tree_lock+0x6d/0x230 [btrfs]
      [36700.671680]  [<ffffffff81066760>] ? add_wait_queue+0x60/0x60
      [36700.671685]  [<ffffffffa0582829>] btrfs_search_slot+0x999/0xb00 [btrfs]
      [36700.671691]  [<ffffffffa05bd9de>] ? btrfs_lookup_first_ordered_extent+0x5e/0xb0 [btrfs]
      [36700.671698]  [<ffffffffa05e3e54>] __btrfs_write_out_cache+0x8c4/0xa80 [btrfs]
      [36700.671704]  [<ffffffffa05e4362>] btrfs_write_out_cache+0xb2/0xf0 [btrfs]
      [36700.671710]  [<ffffffffa05c4441>] ? free_extent_buffer+0x61/0xc0 [btrfs]
      [36700.671716]  [<ffffffffa0594c82>] btrfs_write_dirty_block_groups+0x562/0x650 [btrfs]
      [36700.671723]  [<ffffffffa0610092>] commit_cowonly_roots+0x171/0x24b [btrfs]
      [36700.671729]  [<ffffffffa05a4dde>] btrfs_commit_transaction+0x4fe/0xa10 [btrfs]
      [36700.671735]  [<ffffffffa0610af3>] create_subvol+0x5c0/0x636 [btrfs]
      [36700.671742]  [<ffffffffa05d49ff>] btrfs_mksubvol.isra.60+0x33f/0x3f0 [btrfs]
      [36700.671747]  [<ffffffffa05d4bf2>] btrfs_ioctl_snap_create_transid+0x142/0x190 [btrfs]
      [36700.671752]  [<ffffffffa05d4c6c>] ? btrfs_ioctl_snap_create+0x2c/0x80 [btrfs]
      [36700.671757]  [<ffffffffa05d4c9e>] btrfs_ioctl_snap_create+0x5e/0x80 [btrfs]
      [36700.671759]  [<ffffffff8113a764>] ? handle_pte_fault+0x84/0x920
      [36700.671764]  [<ffffffffa05d87eb>] btrfs_ioctl+0xf0b/0x1d00 [btrfs]
      [36700.671766]  [<ffffffff8113c120>] ? handle_mm_fault+0x210/0x310
      [36700.671768]  [<ffffffff816f83a4>] ? __do_page_fault+0x284/0x4e0
      [36700.671770]  [<ffffffff81180aa6>] do_vfs_ioctl+0x96/0x550
      [36700.671772]  [<ffffffff81170fe3>] ? __sb_end_write+0x33/0x70
      [36700.671774]  [<ffffffff81180ff1>] SyS_ioctl+0x91/0xb0
      [36700.671775]  [<ffffffff816fcdc2>] system_call_fastpath+0x16/0x1b
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      f45388f3
    • I
      Btrfs: stop refusing the relocation of chunk 0 · 795a3321
      Ilya Dryomov 提交于
      AFAICT chunk 0 is no longer special, and so it should be restriped just
      like every other chunk.  One reason for this change is us refusing the
      relocation can lead to filesystems that can only be mounted ro, and
      never rw -- see the bugzilla [1] for details.  The other reason is that
      device removal code is already doing this: it will happily relocate
      chunk 0 is part of shrinking the device.
      
      [1] https://bugzilla.kernel.org/show_bug.cgi?id=60594Reported-by: NXavier Bassery <xavier@bartica.org>
      Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      795a3321
    • J
      Btrfs: adjust the fs_devices->missing count on unmount · 726551eb
      Josef Bacik 提交于
      I noticed that if I tried to mount a file system with -o degraded after having
      done it once already we would fail to mount.  This is because the
      fs_devices->missing count was getting bumped everytime we mounted, but not
      getting reset whenever we unmounted.  To fix this we just drop the missing count
      as we're closing devices to make sure this doesn't happen.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      726551eb
    • F
      Btrfs: fix race conditions in BTRFS_IOC_FS_INFO ioctl · f7171750
      Filipe David Borba Manana 提交于
      The handler for the ioctl BTRFS_IOC_FS_INFO was reading the
      number of devices before acquiring the device list mutex.
      
      This could lead to inconsistent results because the update of
      the device list and the number of devices counter (amongst other
      counters related to the device list) are updated in volumes.c
      while holding the device list mutex - except for 2 places, one
      was volumes.c:btrfs_prepare_sprout() and the other was
      volumes.c:device_list_add().
      
      For example, if we have 2 devices, with IDs 1 and 2 and then add
      a new device, with ID 3, and while adding the device is in progress
      an BTRFS_IOC_FS_INFO ioctl arrives, it could return a number of
      devices of 2 and a max dev id of 3. This would be incorrect.
      
      Also, this ioctl handler was reading the fsid while it can be
      updated concurrently. This can happen when while a new device is
      being added and the current filesystem is in seeding mode.
      Example:
      
      $ mkfs.btrfs -f /dev/sdb1
      $ mkfs.btrfs -f /dev/sdb2
      $ btrfstune -S 1 /dev/sdb1
      $ mount /dev/sdb1 /mnt/test
      $ btrfs device add /dev/sdb2 /mnt/test
      
      If during the last step a BTRFS_IOC_FS_INFO ioctl was requested, it
      could read an fsid that was never valid (some bits part of the old
      fsid and others part of the new fsid). Also, it could read a number
      of devices that doesn't match the number of devices in the list and
      the max device id, as explained before.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      f7171750
    • F
      Btrfs: fix race between removing a dev and writing sbs · d7306801
      Filipe David Borba Manana 提交于
      This change fixes an issue when removing a device and writing
      all super blocks run simultaneously. Here's the steps necessary
      for the issue to happen:
      
      1) disk-io.c:write_all_supers() gets a number of N devices from the
         super_copy, so it will not panic if it fails to write super blocks
         for N - 1 devices;
      
      2) Then it tries to acquire the device_list_mutex, but blocks because
         volumes.c:btrfs_rm_device() got it first;
      
      3) btrfs_rm_device() removes the device from the list, then unlocks the
         mutex and after the unlock it updates the number of devices in
         super_copy to N - 1.
      
      4) write_all_supers() finally acquires the mutex, iterates over all the
         devices in the list and gets N - 1 errors, that is, it failed to write
         super blocks to all the devices;
      
      5) Because write_all_supers() thinks there are a total of N devices, it
         considers N - 1 errors to be ok, and therefore won't panic.
      
      So this change just makes sure that write_all_supers() reads the number
      of devices from super_copy after it acquires the device_list_mutex.
      Conversely, it changes btrfs_rm_device() to update the number of devices
      in super_copy before it releases the device list mutex.
      
      The code path to add a new device (volumes.c:btrfs_init_new_device),
      already has the right behaviour: it updates the number of devices in
      super_copy while holding the device_list_mutex.
      
      The only code path that doesn't lock the device list mutex
      before updating the number of devices in the super copy is
      disk-io.c:next_root_backup(), called by open_ctree() during
      mount time where concurrency issues can't happen.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      d7306801
    • G
      Btrfs: Make btrfs_dev_extent_chunk_tree_uuid() return unsigned long · 231e88f4
      Geert Uytterhoeven 提交于
      Internally, btrfs_dev_extent_chunk_tree_uuid() calculates an unsigned long,
      but casts it to a pointer, while all callers cast it to unsigned long
      again.
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      231e88f4