1. 03 4月, 2015 7 次提交
  2. 05 3月, 2015 1 次提交
  3. 05 10月, 2014 1 次提交
    • M
      block: disable entropy contributions for nonrot devices · b277da0a
      Mike Snitzer 提交于
      Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set
      QUEUE_FLAG_NONROT.
      
      Historically, all block devices have automatically made entropy
      contributions.  But as previously stated in commit e2e1a148 ("block: add
      sysfs knob for turning off disk entropy contributions"):
          - On SSD disks, the completion times aren't as random as they
            are for rotational drives. So it's questionable whether they
            should contribute to the random pool in the first place.
          - Calling add_disk_randomness() has a lot of overhead.
      
      There are more reliable sources for randomness than non-rotational block
      devices.  From a security perspective it is better to err on the side of
      caution than to allow entropy contributions from unreliable "random"
      sources.
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      b277da0a
  4. 07 6月, 2014 1 次提交
  5. 18 4月, 2014 1 次提交
  6. 02 4月, 2014 1 次提交
  7. 24 11月, 2013 2 次提交
    • K
      block: Immutable bio vecs · 4550dd6c
      Kent Overstreet 提交于
      This adds a mechanism by which we can advance a bio by an arbitrary
      number of bytes without modifying the biovec: bio->bi_iter.bi_bvec_done
      indicates the number of bytes completed in the current bvec.
      
      Various driver code still needs to be updated to not refer to the bvec
      directly before we can use this for interesting things, like efficient
      bio splitting.
      Signed-off-by: NKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Cc: drbd-user@lists.linbit.com
      Cc: nbd-general@lists.sourceforge.net
      4550dd6c
    • K
      block: Convert bio_for_each_segment() to bvec_iter · 7988613b
      Kent Overstreet 提交于
      More prep work for immutable biovecs - with immutable bvecs drivers
      won't be able to use the biovec directly, they'll need to use helpers
      that take into account bio->bi_iter.bi_bvec_done.
      
      This updates callers for the new usage without changing the
      implementation yet.
      Signed-off-by: NKent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Cc: Jim Paris <jim@jtan.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@inktank.com>
      Cc: ceph-devel@vger.kernel.org
      Cc: Joshua Morris <josh.h.morris@us.ibm.com>
      Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
      Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
      Cc: support@lsi.com
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Guo Chao <yan@linux.vnet.ibm.com>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: drbd-user@lists.linbit.com
      Cc: nbd-general@lists.sourceforge.net
      Cc: cbe-oss-dev@lists.ozlabs.org
      Cc: xen-devel@lists.xensource.com
      Cc: virtualization@lists.linux-foundation.org
      Cc: linux-raid@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: DL-MPTFusionLinux@lsi.com
      Cc: linux-scsi@vger.kernel.org
      Cc: devel@driverdev.osuosl.org
      Cc: linux-fsdevel@vger.kernel.org
      Cc: cluster-devel@redhat.com
      Cc: linux-mm@kvack.org
      Acked-by: NGeoff Levand <geoff@infradead.org>
      7988613b
  8. 04 7月, 2013 3 次提交
  9. 01 5月, 2013 1 次提交
  10. 28 2月, 2013 4 次提交
    • A
      nbd: fix sparse warning · 398eb085
      Alex Elder 提交于
      I just fixed this in "drivers/block/rbd.c" and I noticed that
      "drivers/block/nbd.c" has the same problem.  Fix a warning issued by
      sparse by adding some lockdep annotations to indicate the queue lock gets
      dropped (because it's held when do_nbd_request() is called) and
      re-acquired within the function.
      Signed-off-by: NAlex Elder <elder@inktank.com>
      Cc: Paul Clements <paul.clements@steeleye.com>
      Cc: Paul Clements <paul.clements@us.sios.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      398eb085
    • P
      nbd: show read-only state in sysfs · a83e814b
      Paolo Bonzini 提交于
      Pass the read-only flag to set_device_ro, so that it will be visible to
      the block layer and in sysfs.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Cc: Alex Bligh <alex@alex.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a83e814b
    • P
      nbd: fsync and kill block device on shutdown · 3a2d63f8
      Paolo Bonzini 提交于
      There are two problems with shutdown in the NBD driver.
      
      1: Receiving the NBD_DISCONNECT ioctl does not sync the filesystem.
      
         This patch adds the sync operation into __nbd_ioctl()'s
         NBD_DISCONNECT handler.  This is useful because BLKFLSBUF is restricted
         to processes that have CAP_SYS_ADMIN, and the NBD client may not
         possess it (fsync of the block device does not sync the filesystem,
         either).
      
      2: Once we clear the socket we have no guarantee that later reads will
         come from the same backing storage.
      
         The patch adds calls to kill_bdev() in __nbd_ioctl()'s socket
         clearing code so the page cache is cleaned, lest reads that hit on the
         page cache will return stale data from the previously-accessible disk.
      
      Example:
      
          # qemu-nbd -r -c/dev/nbd0 /dev/sr0
          # file -s /dev/nbd0
          /dev/stdin: # UDF filesystem data (version 1.5) etc.
          # qemu-nbd -d /dev/nbd0
          # qemu-nbd -r -c/dev/nbd0 /dev/sda
          # file -s /dev/nbd0
          /dev/stdin: # UDF filesystem data (version 1.5) etc.
      
      While /dev/sda has:
      
          # file -s /dev/sda
          /dev/sda: x86 boot sector; etc.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Acked-by: NPaul Clements <Paul.Clements@steeleye.com>
      Cc: Alex Bligh <alex@alex.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3a2d63f8
    • A
      nbd: support FLUSH requests · 75f187ab
      Alex Bligh 提交于
      Currently, the NBD device does not accept flush requests from the Linux
      block layer.  If the NBD server opened the target with neither O_SYNC nor
      O_DSYNC, however, the device will be effectively backed by a writeback
      cache.  Without issuing flushes properly, operation of the NBD device will
      not be safe against power losses.
      
      The NBD protocol has support for both a cache flush command and a FUA
      command flag; the server will also pass a flag to note its support for
      these features.  This patch adds support for the cache flush command and
      flag.  In the kernel, we receive the flags via the NBD_SET_FLAGS ioctl,
      and map NBD_FLAG_SEND_FLUSH to the argument of blk_queue_flush.  When the
      flag is active the block layer will send REQ_FLUSH requests, which we
      translate to NBD_CMD_FLUSH commands.
      
      FUA support is not included in this patch because all free software
      servers implement it with a full fdatasync; thus it has no advantage over
      supporting flush only.  Because I [Paolo] cannot really benchmark it in a
      realistic scenario, I cannot tell if it is a good idea or not.  It is also
      not clear if it is valid for an NBD server to support FUA but not flush.
      The Linux block layer gives a warning for this combination, the NBD
      protocol documentation says nothing about it.
      
      The patch also fixes a small problem in the handling of flags: nbd->flags
      must be cleared at the end of NBD_DO_IT, but the driver was not doing
      that.  The bug manifests itself as follows.  Suppose you two different
      client/server pairs to start the NBD device.  Suppose also that the first
      client supports NBD_SET_FLAGS, and the first server sends
      NBD_FLAG_SEND_FLUSH; the second pair instead does neither of these two
      things.  Before this patch, the second invocation of NBD_DO_IT will use a
      stale value of nbd->flags, and the second server will issue an error every
      time it receives an NBD_CMD_FLUSH command.
      
      This bug is pre-existing, but it becomes much more important after this
      patch; flush failures make the device pretty much unusable, unlike
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NAlex Bligh <alex@alex.org.uk>
      Acked-by: NPaul Clements <Paul.Clements@steeleye.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      75f187ab
  11. 23 2月, 2013 1 次提交
  12. 06 10月, 2012 2 次提交
  13. 18 9月, 2012 1 次提交
    • P
      nbd: clear waiting_queue on shutdown · fded4e09
      Paul Clements 提交于
      Fix a serious but uncommon bug in nbd which occurs when there is heavy
      I/O going to the nbd device while, at the same time, a failure (server,
      network) or manual disconnect of the nbd connection occurs.
      
      There is a small window between the time that the nbd_thread is stopped
      and the socket is shutdown where requests can continue to be queued to
      nbd's internal waiting_queue.  When this happens, those requests are
      never completed or freed.
      
      The fix is to clear the waiting_queue on shutdown of the nbd device, in
      the same way that the nbd request queue (queue_head) is already being
      cleared.
      Signed-off-by: NPaul Clements <paul.clements@steeleye.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fded4e09
  14. 01 8月, 2012 1 次提交
    • M
      nbd: set SOCK_MEMALLOC for access to PFMEMALLOC reserves · 7f338fe4
      Mel Gorman 提交于
      Set SOCK_MEMALLOC on the NBD socket to allow access to PFMEMALLOC reserves
      so pages backed by NBD, particularly if swap related, can be cleaned to
      prevent the machine being deadlocked.  It is still possible that the
      PFMEMALLOC reserves get depleted resulting in deadlock but this can be
      resolved by the administrator by increasing min_free_kbytes.
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Cc: David Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7f338fe4
  15. 31 7月, 2012 1 次提交
  16. 29 3月, 2012 2 次提交
  17. 19 8月, 2011 6 次提交
  18. 28 5月, 2011 3 次提交
    • N
      nbd: adjust 'max_part' according to part_shift · 5988ce23
      Namhyung Kim 提交于
      The 'max_part' parameter determines how many partitions are supported
      on each nbd device. However the actual number can be changed to the
      power of 2 minus 1 form during the module initialization as
      alloc_disk() is called with (1 << part_shift) for some reason.
      
      So adjust 'max_part' also at least for consistency with loop and brd.
      It is exported via sysfs already, and a user should check this value
      after module loading if [s]he wants to use that number correctly
      (i.e. fdisk or something).
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      5988ce23
    • N
      nbd: limit module parameters to a sane value · 3b271082
      Namhyung Kim 提交于
      The 'max_part' parameter controls the number of maximum partition
      a nbd device can have. However if a user specifies very large
      value it would exceed the limitation of device minor number and
      can cause a kernel oops (or, at least, produce invalid device
      nodes in some cases).
      
      In addition, specifying large 'nbds_max' value causes same
      problem for the same reason.
      
      On my desktop, following command results to the kernel bug:
      
      $ sudo modprobe nbd max_part=100000
       kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
       invalid opcode: 0000 [#1] SMP
       last sysfs file: /sys/devices/virtual/block/nbd4/range
       CPU 1
       Modules linked in: nbd(+) bridge stp llc kvm_intel kvm asus_atk0110 sg sr_mod cdrom
      
       Pid: 2522, comm: modprobe Tainted: G        W   2.6.39-leonard+ #159 System manufacturer System Product Name/P5G41TD-M PRO
       RIP: 0010:[<ffffffff8115aa08>]  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
       RSP: 0018:ffff8801009f1de8  EFLAGS: 00010246
       RAX: 00000000ffffffef RBX: ffff880103920478 RCX: 00000000000a7bd3
       RDX: ffffffff81a2dbe0 RSI: 0000000000000000 RDI: ffff880103920478
       RBP: ffff8801009f1e38 R08: ffff880103920468 R09: ffff880103920478
       R10: ffff8801009f1de8 R11: ffff88011eccbb68 R12: ffffffff81a2dbe0
       R13: ffff880103920468 R14: 0000000000000000 R15: ffff880103920400
       FS:  00007f3c49de9700(0000) GS:ffff88011f800000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: 00007f3b7fe7c000 CR3: 00000000cd58d000 CR4: 00000000000406e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process modprobe (pid: 2522, threadinfo ffff8801009f0000, task ffff8801009a93a0)
       Stack:
        ffff8801009f1e58 ffffffff812e8f6e ffff8801009f1e58 ffffffff812e7a80
        ffff880000000010 ffff880103920400 ffff8801002fd0c0 ffff880103920468
        0000000000000011 ffff880103920400 ffff8801009f1e48 ffffffff8115ab6a
       Call Trace:
        [<ffffffff812e8f6e>] ? device_add+0x4f1/0x5e4
        [<ffffffff812e7a80>] ? dev_set_name+0x41/0x43
        [<ffffffff8115ab6a>] sysfs_create_group+0x13/0x15
        [<ffffffff810b857e>] blk_trace_init_sysfs+0x14/0x16
        [<ffffffff811ee58b>] blk_register_queue+0x4c/0xfd
        [<ffffffff811f3bdf>] add_disk+0xe4/0x29c
        [<ffffffffa007e2ab>] nbd_init+0x2ab/0x30d [nbd]
        [<ffffffffa007e000>] ? 0xffffffffa007dfff
        [<ffffffff8100020f>] do_one_initcall+0x7f/0x13e
        [<ffffffff8107ab0a>] sys_init_module+0xa1/0x1e3
        [<ffffffff814f3542>] system_call_fastpath+0x16/0x1b
       Code: 41 57 41 56 41 55 41 54 53 48 83 ec 28 0f 1f 44 00 00 48 89 fb 41 89 f6 49 89 d4 48 85 ff 74 0b 85 f6 75 0b 48 83
        7f 30 00 75 14 <0f> 0b eb fe b9 ea ff ff ff 48 83 7f 30 00 0f 84 09 01 00 00 49
       RIP  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
        RSP <ffff8801009f1de8>
       ---[ end trace 753285ffbf72c57c ]---
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Cc: stable@kernel.org
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      3b271082
    • N
      nbd: pass MSG_* flags to kernel_recvmsg() · 35fbf5bc
      Namhyung Kim 提交于
      Unlike kernel_sendmsg(), kernel_recvmsg() requires passing flags explicitly
      via last parameter instead of struct msghdr.msg_flags. Therefore calls to
      sock_xmit(lo, 0, ..., MSG_WAITALL) have not been processed properly by tcp
      layer wrt. the flag. Fix it.
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      35fbf5bc
  19. 12 2月, 2011 1 次提交
    • S
      nbd: remove module-level ioctl mutex · de1f016f
      Soren Hansen 提交于
      Commit 2a48fc0a ("block: autoconvert trivial BKL users to private
      mutex") replaced uses of the BKL in the nbd driver with mutex
      operations.  Since then, I've been been seeing these lock ups:
      
       INFO: task qemu-nbd:16115 blocked for more than 120 seconds.
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
       qemu-nbd      D 0000000000000001     0 16115  16114 0x00000004
        ffff88007d775d98 0000000000000082 ffff88007d775fd8 ffff88007d774000
        0000000000013a80 ffff8800020347e0 ffff88007d775fd8 0000000000013a80
        ffff880133730000 ffff880002034440 ffffea0004333db8 ffffffffa071c020
       Call Trace:
        [<ffffffff815b9997>] __mutex_lock_slowpath+0xf7/0x180
        [<ffffffff815b93eb>] mutex_lock+0x2b/0x50
        [<ffffffffa071a21c>] nbd_ioctl+0x6c/0x1c0 [nbd]
        [<ffffffff812cb970>] blkdev_ioctl+0x230/0x730
        [<ffffffff811967a1>] block_ioctl+0x41/0x50
        [<ffffffff81175c03>] do_vfs_ioctl+0x93/0x370
        [<ffffffff81175f61>] sys_ioctl+0x81/0xa0
        [<ffffffff8100c0c2>] system_call_fastpath+0x16/0x1b
      
      Instrumenting the nbd module's ioctl handler with some extra logging
      clearly shows the NBD_DO_IT ioctl being invoked which is a long-lived
      ioctl in the sense that it doesn't return until another ioctl asks the
      driver to disconnect.  However, that other ioctl blocks, waiting for the
      module-level mutex that replaced the BKL, and then we're stuck.
      
      This patch removes the module-level mutex altogether.  It's clearly
      wrong, and as far as I can see, it's entirely unnecessary, since the nbd
      driver maintains per-device mutexes, and I don't see anything that would
      require a module-level (or kernel-level, for that matter) mutex.
      Signed-off-by: NSoren Hansen <soren@linux2go.dk>
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: NPaul Clements <paul.clements@steeleye.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: <stable@kernel.org>		[2.6.37.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      de1f016f