1. 10 9月, 2010 13 次提交
    • M
      dm: convey that all flushes are processed as empty · b372d360
      Mike Snitzer 提交于
      Rename __clone_and_map_flush to __clone_and_map_empty_flush for added
      clarity.
      
      Simplify logic associated with REQ_FLUSH conditionals.
      
      Introduce a BUG_ON() and add a few more helpful comments to the code
      so that it is clear that all flushes are empty.
      
      Cleanup __split_and_process_bio() so that an empty flush isn't processed
      by a 'sector_count' focused while loop.
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      b372d360
    • K
      dm: fix locking context in queue_io() · 05447420
      Kiyoshi Ueda 提交于
      Now queue_io() is called from dec_pending(), which may be called with
      interrupts disabled, so queue_io() must not enable interrupts
      unconditionally and must save/restore the current interrupts status.
      Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      05447420
    • T
      dm: relax ordering of bio-based flush implementation · 6a8736d1
      Tejun Heo 提交于
      Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA doesn't mandate any ordering
      against other bio's.  This patch relaxes ordering around flushes.
      
      * A flush bio is no longer deferred to workqueue directly.  It's
        processed like other bio's but __split_and_process_bio() uses
        md->flush_bio as the clone source.  md->flush_bio is initialized to
        empty flush during md initialization and shared for all flushes.
      
      * As a flush bio now travels through the same execution path as other
        bio's, there's no need for dedicated error handling path either.  It
        can use the same error handling path in dec_pending().  Dedicated
        error handling removed along with md->flush_error.
      
      * When dec_pending() detects that a flush has completed, it checks
        whether the original bio has data.  If so, the bio is queued to the
        deferred list w/ REQ_FLUSH cleared; otherwise, it's completed.
      
      * As flush sequencing is handled in the usual issue/completion path,
        dm_wq_work() no longer needs to handle flushes differently.  Now its
        only responsibility is re-issuing deferred bio's the same way as
        _dm_request() would.  REQ_FLUSH handling logic including
        process_flush() is dropped.
      
      * There's no reason for queue_io() and dm_wq_work() write lock
        dm->io_lock.  queue_io() now only uses md->deferred_lock and
        dm_wq_work() read locks dm->io_lock.
      
      * bio's no longer need to be queued on the deferred list while a flush
        is in progress making DMF_QUEUE_IO_TO_THREAD unncessary.  Drop it.
      
      This avoids stalling the device during flushes and simplifies the
      implementation.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      6a8736d1
    • T
      dm: implement REQ_FLUSH/FUA support for request-based dm · 29e4013d
      Tejun Heo 提交于
      This patch converts request-based dm to support the new REQ_FLUSH/FUA.
      
      The original request-based flush implementation depended on
      request_queue blocking other requests while a barrier sequence is in
      progress, which is no longer true for the new REQ_FLUSH/FUA.
      
      In general, request-based dm doesn't have infrastructure for cloning
      one source request to multiple targets, but the original flush
      implementation had a special mostly independent path which can issue
      flushes to multiple targets and sequence them.  However, the
      capability isn't currently in use and adds a lot of complexity.
      Moreoever, it's unlikely to be useful in its current form as it
      doesn't make sense to be able to send out flushes to multiple targets
      when write requests can't be.
      
      This patch rips out special flush code path and deals handles
      REQ_FLUSH/FUA requests the same way as other requests.  The only
      special treatment is that REQ_FLUSH requests use the block address 0
      when finding target, which is enough for now.
      
      * added BUG_ON(!dm_target_is_valid(ti)) in dm_request_fn() as
        suggested by Mike Snitzer
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NMike Snitzer <snitzer@redhat.com>
      Tested-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      29e4013d
    • T
      dm: implement REQ_FLUSH/FUA support for bio-based dm · d87f4c14
      Tejun Heo 提交于
      This patch converts bio-based dm to support REQ_FLUSH/FUA instead of
      now deprecated REQ_HARDBARRIER.
      
      * -EOPNOTSUPP handling logic dropped.
      
      * Preflush is handled as before but postflush is dropped and replaced
        with passing down REQ_FUA to member request_queues.  This replaces
        one array wide cache flush w/ member specific FUA writes.
      
      * __split_and_process_bio() now calls __clone_and_map_flush() directly
        for flushes and guarantees all FLUSH bio's going to targets are zero
      `  length.
      
      * It's now guaranteed that all FLUSH bio's which are passed onto dm
        targets are zero length.  bio_empty_barrier() tests are replaced
        with REQ_FLUSH tests.
      
      * Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes.
      
      * Dropped unlikely() around REQ_FLUSH tests.  Flushes are not unlikely
        enough to be marked with unlikely().
      
      * Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue
        doesn't support cache flushing.  Advertise REQ_FLUSH | REQ_FUA
        capability.
      
      * Request based dm isn't converted yet.  dm_init_request_based_queue()
        resets flush support to 0 for now.  To avoid disturbing request
        based dm code, dm->flush_error is added for bio based dm while
        requested based dm continues to use dm->barrier_error.
      
      Lightly tested linear, stripe, raid1, snap and crypt targets.  Please
      proceed with caution as I'm not familiar with the code base.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: dm-devel@redhat.com
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      d87f4c14
    • T
      md: implment REQ_FLUSH/FUA support · e9c7469b
      Tejun Heo 提交于
      This patch converts md to support REQ_FLUSH/FUA instead of now
      deprecated REQ_HARDBARRIER.  In the core part (md.c), the following
      changes are notable.
      
      * Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA don't interfere with
        processing of other requests and thus there is no reason to mark the
        queue congested while FLUSH/FUA is in progress.
      
      * REQ_FLUSH/FUA failures are final and its users don't need retry
        logic.  Retry logic is removed.
      
      * Preflush needs to be issued to all member devices but FUA writes can
        be handled the same way as other writes - their processing can be
        deferred to request_queue of member devices.  md_barrier_request()
        is renamed to md_flush_request() and simplified accordingly.
      
      For linear, raid0 and multipath, the core changes are enough.  raid1,
      5 and 10 need the following conversions.
      
      * raid1: Handling of FLUSH/FUA bio's can simply be deferred to
        request_queues of member devices.  Barrier related logic removed.
      
      * raid5: Queue draining logic dropped.  FUA bit is propagated through
        biodrain and stripe resconstruction such that all the updated parts
        of the stripe are written out with FUA writes if any of the dirtying
        writes was FUA.  preread_active_stripes handling in make_request()
        is updated as suggested by Neil Brown.
      
      * raid10: FUA bit needs to be propagated to write clones.
      
      linear, raid0, 1, 5 and 10 tested.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NNeil Brown <neilb@suse.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      e9c7469b
    • T
      virtio_blk: drop REQ_HARDBARRIER support · 02c42b7a
      Tejun Heo 提交于
      Remove now unused REQ_HARDBARRIER support.  virtio_blk already
      supports REQ_FLUSH and the usefulness of REQ_FUA for virtio_blk is
      questionable at this point, so there's nothing else to do to support
      new REQ_FLUSH/FUA interface.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      02c42b7a
    • T
      block/loop: implement REQ_FLUSH/FUA support · 6259f284
      Tejun Heo 提交于
      Deprecate REQ_HARDBARRIER and implement REQ_FLUSH/FUA instead.  Also,
      instead of checking file->f_op->fsync() directly, look at the value of
      vfs_fsync() and ignore -EINVAL return.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      6259f284
    • T
      block: remove spurious uses of REQ_HARDBARRIER · 9cbbdca4
      Tejun Heo 提交于
      REQ_HARDBARRIER is deprecated.  Remove spurious uses in the following
      users.  Please note that other than osdblk, all other uses were
      already spurious before deprecation.
      
      * osdblk: osdblk_rq_fn() won't receive any request with
        REQ_HARDBARRIER set.  Remove the test for it.
      
      * pktcdvd: use of REQ_HARDBARRIER in pkt_generic_packet() doesn't mean
        anything.  Removed.
      
      * aic7xxx_old: Setting MSG_ORDERED_Q_TAG on REQ_HARDBARRIER is
        spurious.  Removed.
      
      * sas_scsi_host: Setting TASK_ATTR_ORDERED on REQ_HARDBARRIER is
        spurious.  Removed.
      
      * scsi_tcq: The ordered tag path wasn't being used anyway.  Removed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NBoaz Harrosh <bharrosh@panasas.com>
      Cc: James Bottomley <James.Bottomley@suse.de>
      Cc: Peter Osterlund <petero2@telia.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      9cbbdca4
    • T
      block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() · 4913efe4
      Tejun Heo 提交于
      Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
      requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
      -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
      blk_queue_flush().
      
      blk_queue_flush() takes combinations of REQ_FLUSH and FUA.  If a
      device has write cache and can flush it, it should set REQ_FLUSH.  If
      the device can handle FUA writes, it should also set REQ_FUA.
      
      All blk_queue_ordered() users are converted.
      
      * ORDERED_DRAIN is mapped to 0 which is the default value.
      * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
      * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NBoaz Harrosh <bharrosh@panasas.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      4913efe4
    • T
      block: kill QUEUE_ORDERED_BY_TAG · 6958f145
      Tejun Heo 提交于
      Nobody is making meaningful use of ORDERED_BY_TAG now and queue
      draining for barrier requests will be removed soon which will render
      the advantage of tag ordering moot.  Kill ORDERED_BY_TAG.  The
      following users are affected.
      
      * brd: converted to ORDERED_DRAIN.
      * virtio_blk: ORDERED_TAG path was already marked deprecated.  Removed.
      * xen-blkfront: ORDERED_TAG case dropped.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      6958f145
    • T
      block/loop: queue ordered mode should be DRAIN_FLUSH · 589d7ed0
      Tejun Heo 提交于
      loop implements FLUSH using fsync but was incorrectly setting its
      ordered mode to DRAIN.  Change it to DRAIN_FLUSH.  In practice, this
      doesn't change anything as loop doesn't make use of the block layer
      ordered implementation.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      589d7ed0
    • T
      ide: remove unnecessary blk_queue_flushing() test in do_ide_request() · 0da2f509
      Tejun Heo 提交于
      Unplugging from a request function doesn't really help much (it's
      already in the request_fn) and soon block layer will be updated to mix
      barrier sequence with other commands, so there's no need to treat
      queue flushing any differently.
      
      ide was the only user of blk_queue_flushing().  Remove it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      0da2f509
  2. 22 8月, 2010 8 次提交
  3. 21 8月, 2010 7 次提交
  4. 19 8月, 2010 6 次提交
  5. 18 8月, 2010 6 次提交
    • N
      tty: fix fu_list abuse · d996b62a
      Nick Piggin 提交于
      tty: fix fu_list abuse
      
      tty code abuses fu_list, which causes a bug in remount,ro handling.
      
      If a tty device node is opened on a filesystem, then the last link to the inode
      removed, the filesystem will be allowed to be remounted readonly. This is
      because fs_may_remount_ro does not find the 0 link tty inode on the file sb
      list (because the tty code incorrectly removed it to use for its own purpose).
      This can result in a filesystem with errors after it is marked "clean".
      
      Taking idea from Christoph's initial patch, allocate a tty private struct
      at file->private_data and put our required list fields in there, linking
      file and tty. This makes tty nodes behave the same way as other device nodes
      and avoid meddling with the vfs, and avoids this bug.
      
      The error handling is not trivial in the tty code, so for this bugfix, I take
      the simple approach of using __GFP_NOFAIL and don't worry about memory errors.
      This is not a problem because our allocator doesn't fail small allocs as a rule
      anyway. So proper error handling is left as an exercise for tty hackers.
      
      [ Arguably filesystem's device inode would ideally be divorced from the
      driver's pseudo inode when it is opened, but in practice it's not clear whether
      that will ever be worth implementing. ]
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d996b62a
    • N
      fs: cleanup files_lock locking · ee2ffa0d
      Nick Piggin 提交于
      fs: cleanup files_lock locking
      
      Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
      manipulate the per-sb files list; unexport the files_lock spinlock.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Acked-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      ee2ffa0d
    • N
      fs: fs_struct rwlock to spinlock · 2a4419b5
      Nick Piggin 提交于
      fs: fs_struct rwlock to spinlock
      
      struct fs_struct.lock is an rwlock with the read-side used to protect root and
      pwd members while taking references to them. Taking a reference to a path
      typically requires just 2 atomic ops, so the critical section is very small.
      Parallel read-side operations would have cacheline contention on the lock, the
      dentry, and the vfsmount cachelines, so the rwlock is unlikely to ever give a
      real parallelism increase.
      
      Replace it with a spinlock to avoid one or two atomic operations in typical
      path lookup fastpath.
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      2a4419b5
    • A
      pxa3xx: fix ns2cycle equation · 93b352fc
      Axel Lin 提交于
      Test on a PXA310 platform with Samsung K9F2G08X0B NAND flash,
      with tCH=5 and clk is 156MHz, ns2cycle(5, 156000000) returns -1.
      
      ns2cycle returns negtive value will break NDTR0_tXX macros.
      
      After checking the commit log, I found the problem is introduced by
      commit 5b0d4d7c
      "[MTD] [NAND] pxa3xx: convert from ns to clock ticks more accurately"
      
      To get num of clock cycles, we use below equation:
      num of clock cycles = time (ns) / one clock cycle (ns) + 1
      We need to add 1 cycle here because integer division will truncate the result.
      It is possible the developers set the Min values in SPEC for timing settings.
      Thus the truncate may cause problem, and it is safe to add an extra cycle here.
      
      The various fields in NDTR{01} are in units of clock ticks minus one,
      thus we should subtract 1 cycle then.
      
      Thus the correct equation should be:
      num of clock cycles = time (ns) / one clock cycle (ns) + 1 - 1
                          = time (ns) / one clock cycle (ns)
      Signed-off-by: NAxel Lin <axel.lin@gmail.com>
      Signed-off-by: NLei Wen <leiwen@marvell.com>
      Acked-by: NEric Miao <eric.y.miao@gmail.com>
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      Cc: stable@kernel.org
      93b352fc
    • N
      md raid-1/10 Fix bio_rw bit manipulations again · 2c7d46ec
      NeilBrown 提交于
      commit 7b6d91da changed the behaviour
      of a few variables in raid1 and raid10 from flags to bit-sets, but
      left them as type 'bool' so they did not work.
      
      Change them (back) to unsigned long.
      (historical note: see 1ef04fef)
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Reported-by: Jiri Slaby <jslaby@suse.cz> and many others
      2c7d46ec
    • G
      m68knommu: include sched.h in ColdFire/SPI driver · 5e1c5335
      Greg Ungerer 提交于
      Using the coldfire qspi driver, I get the following error:
      
      drivers/spi/coldfire_qspi.c: In function 'mcfqspi_irq_handler':
      drivers/spi/coldfire_qspi.c:166: error: 'TASK_NORMAL' undeclared (first use in this function)
      drivers/spi/coldfire_qspi.c:166: error: (Each undeclared identifier is reported only once
      
      It is solved by adding the following include to coldfire_sqpi.c:
      
          #include <linux/sched.h>
      
      Fix suggested by Jate Sujjavanich <jsujjavanich@syntech-fuelmaster.com>
      Signed-off-by: NGreg Ungerer <gerg@uclinux.org>
      5e1c5335