1. 10 3月, 2011 10 次提交
    • J
      block: kill off REQ_UNPLUG · 721a9602
      Jens Axboe 提交于
      With the plugging now being explicitly controlled by the
      submitter, callers need not pass down unplugging hints
      to the block layer. If they want to unplug, it's because they
      manually plugged on their own - in which case, they should just
      unplug at will.
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      721a9602
    • J
      block: remove per-queue plugging · 7eaceacc
      Jens Axboe 提交于
      Code has been converted over to the new explicit on-stack plugging,
      and delay users have been converted to use the new API for that.
      So lets kill off the old plugging along with aops->sync_page().
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      7eaceacc
    • T
      pktcdvd: Convert to bdops->check_events() · 3c0d2060
      Tejun Heo 提交于
      Convert from ->media_changed() to ->check_events().
      
      pktcdvd needs to forward all event related operations to the
      underlying device.  Forward ->check_events() instead of
      ->media_changed() and inherit disk->[async_]events.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Peter Osterlund <petero2@telia.com>
      3c0d2060
    • T
      umem: Drop dummy ->media_changed() · 6fac80e3
      Tejun Heo 提交于
      umem doesn't implement media changed detection and there's no need to
      implement dummy callback anymore.  Remove it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      6fac80e3
    • T
      xsysace: Convert to bdops->check_events() · 3a200911
      Tejun Heo 提交于
      Convert from ->media_changed() to ->check_events().
      
      xsysace buffers media changed state and clears it on revalidation.  It
      will behave correctly with kernel event polling.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NGrant Likely <grant.likely@secretlab.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      3a200911
    • T
      ub: Convert to bdops->check_events() · aaa7c015
      Tejun Heo 提交于
      Convert from ->media_changed() to ->check_events().
      
      ub buffers media changed state and clears it on revalidation.  It will
      behave correctly with kernel event polling.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      aaa7c015
    • T
      swim[3]: Convert to bdops->check_events() · 4bbde777
      Tejun Heo 提交于
      Convert from ->media_changed() to ->check_events().
      
      Both swim and swim3 buffer media changed state and clear it on
      revalidation.  They will behave correctly with kernel event polling.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Laurent Vivier <laurent@lvivier.info>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      4bbde777
    • T
      dac960: Convert to bdops->check_events() · 507daea2
      Tejun Heo 提交于
      Convert from ->media_changed() to ->check_events().
      
      DAC960 media change notification seems to be one way (once set, never
      cleared) and will generate spurious events when polled once the
      condition triggers.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      507daea2
    • T
      paride: Convert to bdops->check_events() · b1b56b93
      Tejun Heo 提交于
      Convert paride drivers from ->media_changed() to ->check_events().
      
      pcd and pd buffer and clear events after reporting; however, pf
      unconditionally reports MEDIA_CHANGE and will generate spurious events
      when polled.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Tim Waugh <tim@cyberelk.net>
      b1b56b93
    • T
      floppy,{ami|ata}flop: Convert to bdops->check_events() · 1a8a74f0
      Tejun Heo 提交于
      Convert the floppy drivers from ->media_changed() to ->check_events().
      Both floppy and ataflop buffer media changed state bit and clear them
      on revalidation and will behave correctly with kernel event polling.
      
      I can't tell how amiflop clears its event and it's possible that it
      may generate spurious events when polled.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      1a8a74f0
  2. 04 3月, 2011 1 次提交
  3. 03 3月, 2011 1 次提交
  4. 24 2月, 2011 1 次提交
    • N
      Fix over-zealous flush_disk when changing device size. · 93b270f7
      NeilBrown 提交于
      There are two cases when we call flush_disk.
      In one, the device has disappeared (check_disk_change) so any
      data will hold becomes irrelevant.
      In the oter, the device has changed size (check_disk_size_change)
      so data we hold may be irrelevant.
      
      In both cases it makes sense to discard any 'clean' buffers,
      so they will be read back from the device if needed.
      
      In the former case it makes sense to discard 'dirty' buffers
      as there will never be anywhere safe to write the data.  In the
      second case it *does*not* make sense to discard dirty buffers
      as that will lead to file system corruption when you simply enlarge
      the containing devices.
      
      flush_disk calls __invalidate_devices.
      __invalidate_device calls both invalidate_inodes and invalidate_bdev.
      
      invalidate_inodes *does* discard I_DIRTY inodes and this does lead
      to fs corruption.
      
      invalidate_bev *does*not* discard dirty pages, but I don't really care
      about that at present.
      
      So this patch adds a flag to __invalidate_device (calling it
      __invalidate_device2) to indicate whether dirty buffers should be
      killed, and this is passed to invalidate_inodes which can choose to
      skip dirty inodes.
      
      flusk_disk then passes true from check_disk_change and false from
      check_disk_size_change.
      
      dm avoids tripping over this problem by calling i_size_write directly
      rathher than using check_disk_size_change.
      
      md does use check_disk_size_change and so is affected.
      
      This regression was introduced by commit 608aeef1 which causes
      check_disk_size_change to call flush_disk, so it is suitable for any
      kernel since 2.6.27.
      
      Cc: stable@kernel.org
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Andrew Patterson <andrew.patterson@hp.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      93b270f7
  5. 12 2月, 2011 1 次提交
    • S
      nbd: remove module-level ioctl mutex · de1f016f
      Soren Hansen 提交于
      Commit 2a48fc0a ("block: autoconvert trivial BKL users to private
      mutex") replaced uses of the BKL in the nbd driver with mutex
      operations.  Since then, I've been been seeing these lock ups:
      
       INFO: task qemu-nbd:16115 blocked for more than 120 seconds.
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
       qemu-nbd      D 0000000000000001     0 16115  16114 0x00000004
        ffff88007d775d98 0000000000000082 ffff88007d775fd8 ffff88007d774000
        0000000000013a80 ffff8800020347e0 ffff88007d775fd8 0000000000013a80
        ffff880133730000 ffff880002034440 ffffea0004333db8 ffffffffa071c020
       Call Trace:
        [<ffffffff815b9997>] __mutex_lock_slowpath+0xf7/0x180
        [<ffffffff815b93eb>] mutex_lock+0x2b/0x50
        [<ffffffffa071a21c>] nbd_ioctl+0x6c/0x1c0 [nbd]
        [<ffffffff812cb970>] blkdev_ioctl+0x230/0x730
        [<ffffffff811967a1>] block_ioctl+0x41/0x50
        [<ffffffff81175c03>] do_vfs_ioctl+0x93/0x370
        [<ffffffff81175f61>] sys_ioctl+0x81/0xa0
        [<ffffffff8100c0c2>] system_call_fastpath+0x16/0x1b
      
      Instrumenting the nbd module's ioctl handler with some extra logging
      clearly shows the NBD_DO_IT ioctl being invoked which is a long-lived
      ioctl in the sense that it doesn't return until another ioctl asks the
      driver to disconnect.  However, that other ioctl blocks, waiting for the
      module-level mutex that replaced the BKL, and then we're stuck.
      
      This patch removes the module-level mutex altogether.  It's clearly
      wrong, and as far as I can see, it's entirely unnecessary, since the nbd
      driver maintains per-device mutexes, and I don't see anything that would
      require a module-level (or kernel-level, for that matter) mutex.
      Signed-off-by: NSoren Hansen <soren@linux2go.dk>
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: NPaul Clements <paul.clements@steeleye.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: <stable@kernel.org>		[2.6.37.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      de1f016f
  6. 19 1月, 2011 4 次提交
  7. 13 1月, 2011 1 次提交
  8. 11 1月, 2011 1 次提交
  9. 06 1月, 2011 1 次提交
  10. 24 12月, 2010 2 次提交
  11. 21 12月, 2010 1 次提交
  12. 20 12月, 2010 1 次提交
  13. 17 12月, 2010 2 次提交
  14. 16 12月, 2010 1 次提交
  15. 02 12月, 2010 1 次提交
  16. 28 11月, 2010 2 次提交
  17. 18 11月, 2010 1 次提交
  18. 17 11月, 2010 2 次提交
    • J
      cciss: fix build for PROC_FS disabled · bbe425cd
      Jens Axboe 提交于
      The recent patch to fix the removal of a non-existing proc
      directory introduced this build problem for !CONFIG_PROC_FS:
      
      drivers/block/cciss.c:4929: error: 'proc_cciss' undeclared (first use in this function)
      
      Fix it by moving proc_cciss outside of the CONFIG_PROC_FS scope.
      Reported-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      bbe425cd
    • J
      SCSI host lock push-down · f281233d
      Jeff Garzik 提交于
      Move the mid-layer's ->queuecommand() invocation from being locked
      with the host lock to being unlocked to facilitate speeding up the
      critical path for drivers who don't need this lock taken anyway.
      
      The patch below presents a simple SCSI host lock push-down as an
      equivalent transformation.  No locking or other behavior should change
      with this patch.  All existing bugs and locking orders are preserved.
      
      Additionally, add one parameter to queuecommand,
      	struct Scsi_Host *
      and remove one parameter from queuecommand,
      	void (*done)(struct scsi_cmnd *)
      
      Scsi_Host* is a convenient pointer that most host drivers need anyway,
      and 'done' is redundant to struct scsi_cmnd->scsi_done.
      
      Minimal code disturbance was attempted with this change.  Most drivers
      needed only two one-line modifications for their host lock push-down.
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      Acked-by: NJames Bottomley <James.Bottomley@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f281233d
  19. 16 11月, 2010 1 次提交
    • V
      block: fix amiga and atari floppy driver compile warning · 3e9bb2a0
      Vivek Goyal 提交于
      Geert, my crosstool don't produce warning below. I guess this has to do
      something with compiler version.
      
      - Geert noticed following warning during compilation.
      
        drivers/block/amiflop.c:1344: warning: ‘rq’ may be used uninitialized in
        this function
        drivers/block/ataflop.c:1402: warning: ‘rq’ may be used uninitialized in
        this function
      
      - Initialize rq to NULL to fix the warning. If we can't find a suitable request
        to dispatch, this function should return NULL instead of a possibly garbage
        pointer.
      
      - Cross compile tested only. Don't have hardware to test it.
      Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      3e9bb2a0
  20. 13 11月, 2010 2 次提交
    • T
      block: clean up blkdev_get() wrappers and their users · d4d77629
      Tejun Heo 提交于
      After recent blkdev_get() modifications, open_by_devnum() and
      open_bdev_exclusive() are simple wrappers around blkdev_get().
      Replace them with blkdev_get_by_dev() and blkdev_get_by_path().
      
      blkdev_get_by_dev() is identical to open_by_devnum().
      blkdev_get_by_path() is slightly different in that it doesn't
      automatically add %FMODE_EXCL to @mode.
      
      All users are converted.  Most conversions are mechanical and don't
      introduce any behavior difference.  There are several exceptions.
      
      * btrfs now sets FMODE_EXCL in btrfs_device->mode, so there's no
        reason to OR it explicitly on blkdev_put().
      
      * gfs2, nilfs2 and the generic mount_bdev() now set FMODE_EXCL in
        sb->s_mode.
      
      * With the above changes, sb->s_mode now always should contain
        FMODE_EXCL.  WARN_ON_ONCE() added to kill_block_super() to detect
        errors.
      
      The new blkdev_get_*() functions are with proper docbook comments.
      While at it, add function description to blkdev_get() too.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Joern Engel <joern@lazybastard.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: reiserfs-devel@vger.kernel.org
      Cc: xfs-masters@oss.sgi.com
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      d4d77629
    • T
      block: make blkdev_get/put() handle exclusive access · e525fd89
      Tejun Heo 提交于
      Over time, block layer has accumulated a set of APIs dealing with bdev
      open, close, claim and release.
      
      * blkdev_get/put() are the primary open and close functions.
      
      * bd_claim/release() deal with exclusive open.
      
      * open/close_bdev_exclusive() are combination of open and claim and
        the other way around, respectively.
      
      * bd_link/unlink_disk_holder() to create and remove holder/slave
        symlinks.
      
      * open_by_devnum() wraps bdget() + blkdev_get().
      
      The interface is a bit confusing and the decoupling of open and claim
      makes it impossible to properly guarantee exclusive access as
      in-kernel open + claim sequence can disturb the existing exclusive
      open even before the block layer knows the current open if for another
      exclusive access.  Reorganize the interface such that,
      
      * blkdev_get() is extended to include exclusive access management.
        @holder argument is added and, if is @FMODE_EXCL specified, it will
        gain exclusive access atomically w.r.t. other exclusive accesses.
      
      * blkdev_put() is similarly extended.  It now takes @mode argument and
        if @FMODE_EXCL is set, it releases an exclusive access.  Also, when
        the last exclusive claim is released, the holder/slave symlinks are
        removed automatically.
      
      * bd_claim/release() and close_bdev_exclusive() are no longer
        necessary and either made static or removed.
      
      * bd_link_disk_holder() remains the same but bd_unlink_disk_holder()
        is no longer necessary and removed.
      
      * open_bdev_exclusive() becomes a simple wrapper around lookup_bdev()
        and blkdev_get().  It also has an unexpected extra bdev_read_only()
        test which probably should be moved into blkdev_get().
      
      * open_by_devnum() is modified to take @holder argument and pass it to
        blkdev_get().
      
      Most of bdev open/close operations are unified into blkdev_get/put()
      and most exclusive accesses are tested atomically at the open time (as
      it should).  This cleans up code and removes some, both valid and
      invalid, but unnecessary all the same, corner cases.
      
      open_bdev_exclusive() and open_by_devnum() can use further cleanup -
      rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop
      special features.  Well, let's leave them for another day.
      
      Most conversions are straight-forward.  drbd conversion is a bit more
      involved as there was some reordering, but the logic should stay the
      same.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NNeil Brown <neilb@suse.de>
      Acked-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Acked-by: NMike Snitzer <snitzer@redhat.com>
      Acked-by: NPhilipp Reisner <philipp.reisner@linbit.com>
      Cc: Peter Osterlund <petero2@telia.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: dm-devel@redhat.com
      Cc: drbd-dev@lists.linbit.com
      Cc: Leo Chen <leochen@broadcom.com>
      Cc: Scott Branden <sbranden@broadcom.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      Cc: Joern Engel <joern@logfs.org>
      Cc: reiserfs-devel@vger.kernel.org
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      e525fd89
  21. 10 11月, 2010 3 次提交
    • C
      block: remove REQ_HARDBARRIER · 02e031cb
      Christoph Hellwig 提交于
      REQ_HARDBARRIER is dead now, so remove the leftovers.  What's left
      at this point is:
      
       - various checks inside the block layer.
       - sanity checks in bio based drivers.
       - now unused bio_empty_barrier helper.
       - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while,
         but Xen really needs to sort out it's barrier situaton.
       - setting of ordered tags in uas - dead code copied from old scsi
         drivers.
       - scsi different retry for barriers - it's dead and should have been
         removed when flushes were converted to FS requests.
       - blktrace handling of barriers - removed.  Someone who knows blktrace
         better should add support for REQ_FLUSH and REQ_FUA, though.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      02e031cb
    • M
      block: read i_size with i_size_read() · 77304d2a
      Mike Snitzer 提交于
      Convert direct reads of an inode's i_size to using i_size_read().
      
      i_size_{read,write} use a seqcount to protect reads from accessing
      incomple writes.  Concurrent i_size_write()s require mutual exclussion
      to protect the seqcount that is used by i_size_{read,write}.  But
      i_size_read() callers do not need to use additional locking.
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Acked-by: NNeilBrown <neilb@suse.de>
      Acked-by: NLars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      77304d2a
    • J
      cciss: fix proc warning on attempt to remove non-existant directory · 90fdb0b9
      Jens Axboe 提交于
      Randy reports that he gets the following stack trace when
      removing the cciss module:
      
      [  109.164277] Pid: 3463, comm: rmmod Not tainted 2.6.37-rc1 #7
      [  109.164280] Call Trace:
      [  109.164292]  [<ffffffff8107eb8d>] warn_slowpath_common+0xc6/0xf3
      [  109.164299]  [<ffffffff8107ecaa>] warn_slowpath_fmt+0x5b/0x6b
      [  109.164307]  [<ffffffff8155175b>] ? _raw_spin_unlock+0x40/0x4b
      [  109.164313]  [<ffffffff8123dd1e>] remove_proc_entry+0x156/0x35e
      [  109.164320]  [<ffffffff812cd91b>] ? do_raw_spin_unlock+0xff/0x10f
      [  109.164327]  [<ffffffff8113823d>] ? trace_hardirqs_on+0x10/0x4a
      [  109.164333]  [<ffffffff8155162d>] ? _raw_spin_unlock_irq+0x4c/0x7b
      [  109.164339]  [<ffffffff8154d4d1>] ? wait_for_common+0x145/0x15e
      [  109.164345]  [<ffffffff81075337>] ? default_wake_function+0x0/0x22
      [  109.164357]  [<ffffffffa0615a8f>] cciss_cleanup+0xa9/0xc7 [cciss]
      [  109.164365]  [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368
      [  109.164371]  [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67
      [  109.164377]  [<ffffffff810fdfaf>] ? audit_syscall_entry+0x172/0x1a5
      [  109.164383]  [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [  109.164389]  [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
      [  109.164394] ---[ end trace 88e8568246ed0b1d ]---
      
      which will happen if you don't actually have an HP CISS adapter,
      since it'll do an uncondional removal of a proc directory it
      never attempted to create in that case.
      Reported-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Tested-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      90fdb0b9