1. 29 Apr, 2008 (4 commits)
  2. 21 Apr, 2008 (2 commits)
    •
      block: fix memory hotplug and bouncing in block layer · 2472892a
      Andi Kleen authored
      Only noticed this while hacking something else, no test case.
      
      blk_max_low_pfn is initialized once at bootup by the block layer from
      max_low_pfn.  But max_low_pfn is not necessarily constant over the runtime of
      the system when you consider memory hotplug.  What could happen is that
      someone adds memory later, the block layer doesn't get updated, and it
      then starts bouncing memory unnecessarily.
      
      Also, on 64bit blk_max_low_pfn isn't actually needed, because there is no
      highmem and it essentially just disables bouncing.  And nobody can pass
      pfns > max_low_pfn to the block layer, because those wouldn't have a
      struct page, and I suspect the block layer wouldn't be very happy without one.
      
      So set BLK_BOUNCE_HIGH to infinity (-1ULL) on 64bit.  That avoids the problem
      of having to update it on memory hotadd.
      
      On 32bit I kept the same behaviour because at least on i386
      memory hotadd only adds HIGHMEM, never lowmem.
      
      BLK_BOUNCE_ANY is always set to infinity on both 32 and 64bit.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      2472892a
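
      Below is a minimal sketch of the resulting blkdev.h limits as this
      message describes them; treat it as an illustration, not the exact
      patch hunk (context and comments are mine):

        #if BITS_PER_LONG == 32
        /* 32bit: keep deriving the bounce limit from blk_max_low_pfn;
         * memory hotadd only adds highmem here, so it stays correct. */
        #define BLK_BOUNCE_HIGH		((u64)blk_max_low_pfn << PAGE_SHIFT)
        #else
        /* 64bit: no highmem, so effectively disable bouncing and avoid
         * having to update the limit on memory hotadd. */
        #define BLK_BOUNCE_HIGH		-1ULL
        #endif
        #define BLK_BOUNCE_ANY		(-1ULL)	/* infinity on 32 and 64bit */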
    •
      block: move the padding adjustment to blk_rq_map_sg · f18573ab
      FUJITA Tomonori authored
      blk_rq_map_user adjusts bi_size of the last bio.  This breaks the rule
      that req->data_len (the true data length) is equal to sum(bio), and it
      broke the scsi command completion code.
      
      commit e97a294e was introduced to fix
      the above issue. However, the partial completion code doesn't work
      with it. The commit is also a layer violation (scsi mid-layer should
      not know about the block layer's padding).
      
      This patch moves the padding adjustment to blk_rq_map_sg (suggested by
      James).  The padding works like the drain buffer.  This patch breaks the
      rule that req->data_len is equal to sum(sg); however, the drain buffer
      already broke it.  So this patch just restores the rule that
      req->data_len is equal to sum(bio) without breaking anything new.
      
      Now when a low level driver needs padding, blk_rq_map_user and
      blk_rq_map_user_iov guarantee there's enough room for padding.
      blk_rq_map_sg can safely extend the last entry of a scatter list.
      
      blk_rq_map_sg must extend the last entry of a scatter list only for a
      request that went through bio_copy_user_iov.  This patch introduces a
      new REQ_COPY_USER flag.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      f18573ab
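
      A hedged sketch of the tail of blk_rq_map_sg() after this change; the
      REQ_COPY_USER test is per this message, while q->dma_pad_mask and
      rq->extra_len are my recollection of that era's structures, not a quote:

        /* sg points at the last mapped scatterlist entry here */
        if (unlikely(rq->cmd_flags & REQ_COPY_USER) &&
            (rq->data_len & q->dma_pad_mask)) {
                unsigned int pad_len = (q->dma_pad_mask & ~rq->data_len) + 1;

                sg->length += pad_len;    /* extend the last sg entry only */
                rq->extra_len += pad_len; /* sum(sg) grows; sum(bio) doesn't */
        }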
  3. 04 Mar, 2008 (2 commits)
  4. 19 Feb, 2008 (2 commits)
    •
      block: implement request_queue->dma_drain_needed · 2fb98e84
      Tejun Heo authored
      Draining shouldn't be done for commands where overflow may indicate
      data integrity issues.  Add a dma_drain_needed callback to
      request_queue.  The drain buffer is appended iff this function returns
      non-zero.
      Signed-off-by: Tejun Heo <htejun@gmail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      2fb98e84
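
      A sketch of the shape of the hook; blk_queue_dma_drain() gaining the
      callback argument is my reading of this description, and the predicate
      below is purely hypothetical:

        typedef int (dma_drain_needed_fn)(struct request *);

        /* hypothetical: only drain when an overflow would be harmless */
        static int my_drain_needed(struct request *rq)
        {
                return !my_cmd_checks_integrity(rq);  /* made-up predicate */
        }

        blk_queue_dma_drain(q, my_drain_needed, my_drain_buf, MY_DRAIN_SIZE);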
    •
      block: add request->raw_data_len · 6b00769f
      Tejun Heo authored
      With padding and draining moved into it, the block layer may now extend
      requests as directed by queue parameters, so a request now has two
      sizes - the original request size and the extended size, which matches
      the size of the area pointed to by bios and later by sgs.  The latter size
      is what lower layers are primarily interested in when allocating,
      filling up DMA tables and setting up the controller.
      
      Both padding and draining extend the data area to accommodate
      controller characteristics.  As any controller which speaks SCSI can
      handle underflows, feeding it a larger data area is safe.
      
      So, this patch makes the primary data length field, request->data_len,
      indicate the size of the full data area, and adds a separate length field,
      request->raw_data_len, for the unmodified request size.  The latter is
      used for reporting to higher layers (userland) and wherever the original
      request size should be fed to the controller or device.
      Signed-off-by: Tejun Heo <htejun@gmail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      6b00769f
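
      An abridged sketch of the two sizes (only the fields at issue,
      comments mine):

        struct request {
                /* ... */
                unsigned int data_len;     /* full area: original size plus
                                              padding and drain; what LLDs
                                              use for DMA tables */
                unsigned int raw_data_len; /* unmodified request size, for
                                              userland and the device */
                /* ... */
        };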
  5. 08 Feb, 2008 (1 commit)
    •
      block: fixup rq_init() a bit · 63a71386
      Jens Axboe authored
      Rearrange fields in cache order and initialize some fields that
      we didn't previously init.  Remove the init of ->completion_data; it's
      part of a union with ->hash.  Luckily, clearing the rb node is the same
      as setting it to null!
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      63a71386
  6. 01 Feb, 2008 (2 commits)
  7. 30 Jan, 2008 (1 commit)
  8. 28 Jan, 2008 (10 commits)
    •
      block: implement drain buffers · fa0ccd83
      James Bottomley authored
      These DMA drain buffer implementations in drivers are pretty horrible
      to do in terms of manipulating the scatterlist.  Plus they're being
      done at least in drivers/ide and drivers/ata, so we now have code
      duplication.
      
      The one use case for this, as I understand it, is AHCI controllers doing
      PIO mode to mmc devices but translating this to DMA at the controller
      level.
      
      So, what about adding a callback to the block layer that permits the
      adding of the drain buffer for the problem devices?  The idea is that
      you'd do this in slave_configure after you find one of these devices.
      
      The beauty of doing it in the block layer is that it quietly adds the
      drain buffer to the end of the sg list, so it automatically gets mapped
      (and unmapped) without anything unusual having to be done to the
      scatterlist in drivers/scsi or drivers/ata and without any alteration to
      the transfer length.
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      fa0ccd83
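
      A hedged sketch of the suggested use from slave_configure; this commit
      predates the dma_drain_needed callback added later, so the call takes
      just a buffer and size here, and every name except blk_queue_dma_drain
      is hypothetical:

        static int my_slave_configure(struct scsi_device *sdev)
        {
                if (my_is_problem_device(sdev))  /* made-up check */
                        blk_queue_dma_drain(sdev->request_queue,
                                            my_drain_buf, MY_DRAIN_SIZE);
                return 0;
        }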
    •
      io context sharing: preliminary support · d38ecf93
      Jens Axboe authored
      Detach task state from the ioc; instead, keep track of how many
      processes are accessing the ioc.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      d38ecf93
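
      A rough sketch of the bookkeeping this describes; the field names are
      an assumption based on the message, not a quote of the patch:

        struct io_context {
                atomic_t refcount; /* lifetime of the object itself */
                atomic_t nr_tasks; /* how many processes are attached */
                /* ... */
        };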
    •
      ioprio: move io priority from task_struct to io_context · fd0928df
      Jens Axboe authored
      This is where it belongs, and this way it doesn't take up space for a
      process that doesn't do IO.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      fd0928df
    •
      blk_end_request: cleanup 'uptodate' related code (take 4) · 5450d3e1
      Kiyoshi Ueda authored
      This patch converts the 'uptodate' arguments of the no-longer-exported
      interfaces, end_that_request_first/last, to 'error', and removes the
      internal conversions for it in the blk_end_request interfaces.

      Also, this patch removes the now-unneeded end_io_error().
      
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      5450d3e1
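
      For reference, a hypothetical helper capturing the conversion this
      series retires (the value conventions are those of the old and new
      interfaces):

        /* old 'uptodate': 1 = success, 0 = failure, < 0 = specific error
         * new 'error'   : 0 = success, < 0 = -Exx error */
        static inline int uptodate_to_error(int uptodate)
        {
                if (uptodate > 0)
                        return 0;
                return uptodate < 0 ? uptodate : -EIO;
        }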
    •
      blk_end_request: remove/unexport end_that_request_* (take 4) · 3bcddeac
      Kiyoshi Ueda authored
      This patch removes the following functions:
        o end_that_request_first()
        o end_that_request_chunk()
      and stops exporting the functions below:
        o end_that_request_last()
      
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      3bcddeac
    •
      blk_end_request: add bidi completion interface (take 4) · e3a04fe3
      Kiyoshi Ueda authored
      This patch adds a variant of the interface, blk_end_bidi_request(),
      which completes a bidi request.
      
      A bidi request must be completed as a whole (both rq and rq->next_rq
      at once), so the interface has two arguments for completion size.
      
      As for ->end_io, only rq->end_io is called (rq->next_rq->end_io is not
      called), so if special completion handling is needed, the handler
      must be set to rq->end_io.
      The handler must also take care of freeing next_rq, since
      the interface doesn't take care of it if rq->end_io is not NULL.
      
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      e3a04fe3
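
      A hedged sketch of completing a bidi pair; the argument order of
      blk_end_bidi_request() is assumed from this description, and
      blk_rq_bytes() is from the companion patch below:

        /* error: 0 or a -Exx value from the driver.  Complete both
         * directions at once; leftover must not happen for a bidi
         * request, so treat it as a bug. */
        if (blk_end_bidi_request(rq, error, blk_rq_bytes(rq),
                                 blk_rq_bytes(rq->next_rq)))
                BUG();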
    •
      blk_end_request: add callback feature (take 4) · e19a3ab0
      Kiyoshi Ueda authored
      This patch adds a variant of the interface, blk_end_request_callback(),
      which has a driver callback feature.
      
      Drivers may need to do special work between end_that_request_first()
      and end_that_request_last().
      For such drivers, blk_end_request_callback() allows them to pass
      a callback function which is called between end_that_request_first()
      and end_that_request_last().
      
      This interface is only a fallback for the other blk_end_request
      interfaces.  Drivers should avoid such tricky behaviors and use the
      other interfaces as much as possible.
      
      Currently, only one driver, ide-cd, needs this interface.
      So this interface should/will be removed once the driver stops
      relying on such tricky behavior.
      
      o ide-cd (cdrom_newpc_intr())
        In PIO mode, cdrom_newpc_intr() needs to defer end_that_request_last()
        until the device clears DRQ_STAT and raises an interrupt after
        end_that_request_first().
        So end_that_request_first() and end_that_request_last() are called
        separately in cdrom_newpc_intr().
      
        This means blk_end_request_callback() has to return without
        completing the request even if there is no leftover in the request.
        To satisfy this requirement, the callback function has a return value
        so that drivers can tell blk_end_request_callback() to return
        without completing the request.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      e19a3ab0
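
      A sketch of the ide-cd style usage described above; the callback name
      and byte count are hypothetical, and a non-zero return asks
      blk_end_request_callback() to leave the request uncompleted:

        static int my_defer_completion(struct request *rq)
        {
                /* device still asserts DRQ_STAT: finish on a later irq */
                return 1; /* don't complete the request yet */
        }

        /* in the interrupt handler, with nr_bytes transferred so far: */
        ret = blk_end_request_callback(rq, 0, nr_bytes, my_defer_completion);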
    •
      blk_end_request: add/export functions to get request size (take 4) · 3b11313a
      Kiyoshi Ueda authored
      This patch adds/exports functions to get the size of a request in bytes.
      They are useful because the blk_end_request interfaces take bytes
      as the completed I/O size instead of sectors.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      3b11313a
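
      A small usage sketch, assuming the helpers are blk_rq_bytes() for the
      whole request and blk_rq_cur_bytes() for the current segment, which
      is what this series exports:

        /* complete the whole remainder of the request, in bytes */
        if (blk_end_request(rq, 0, blk_rq_bytes(rq)))
                printk(KERN_WARNING "unexpected leftover\n");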
    •
      blk_end_request: add new request completion interface (take 4) · 336cdb40
      Kiyoshi Ueda authored
      This patch adds 2 new interfaces for request completion:
        o blk_end_request()   : called without queue lock
        o __blk_end_request() : called with queue lock held
      
      blk_end_request takes 'error' as an argument instead of the 'uptodate'
      which the current end_that_request_* functions take.
      The meanings of the values are below; the value is used when the bio is
      completed.
          0 : success
        < 0 : error
      
      Some device drivers call the generic functions below between
      end_that_request_{first/chunk} and end_that_request_last():
        o add_disk_randomness()
        o blk_queue_end_tag()
        o blkdev_dequeue_request()
      These are called in the blk_end_request interfaces as a part of
      generic request completion, so all device drivers end up calling them.
      To decide whether to call blkdev_dequeue_request(), blk_end_request
      uses list_empty(&rq->queuelist) (a blk_queued_rq() macro is added for it).
      So drivers must re-initialize it, using list_init() or so, before calling
      blk_end_request if they use it for their own purposes.
      (Currently, there is no driver which completes a request without
       re-initializing the queuelist after using it, so rq->queuelist
       can be used for the purpose above.)
      
      "Normal" drivers can be converted to use blk_end_request()
      in a standard way shown below.
      
       a) end_that_request_{chunk/first}
          spin_lock_irqsave()
          (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
          end_that_request_last()
          spin_unlock_irqrestore()
          => blk_end_request()
      
       b) spin_lock_irqsave()
          end_that_request_{chunk/first}
          (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
          end_that_request_last()
          spin_unlock_irqrestore()
          => spin_lock_irqsave()
             __blk_end_request()
             spin_unlock_irqrestore()
      
       c) spin_lock_irqsave()
          (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
          end_that_request_last()
          spin_unlock_irqrestore()
          => blk_end_request()   or   spin_lock_irqsave()
                                      __blk_end_request()
                                      spin_unlock_irqrestore()
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      336cdb40
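
      A hedged sketch of conversion (a) above as it might look in a driver;
      the wrapper names are hypothetical and error/byte values illustrative:

        /* before: two-step completion plus manual dequeue under the lock */
        static void my_complete_old(struct request_queue *q, struct request *rq,
                                    int uptodate, int nr_sectors)
        {
                unsigned long flags;

                if (end_that_request_first(rq, uptodate, nr_sectors))
                        return; /* leftover, not done yet */
                spin_lock_irqsave(q->queue_lock, flags);
                blkdev_dequeue_request(rq);
                end_that_request_last(rq, uptodate);
                spin_unlock_irqrestore(q->queue_lock, flags);
        }

        /* after: one call, queue lock not held by the caller */
        static void my_complete_new(struct request *rq, int error,
                                    unsigned int nr_bytes)
        {
                blk_end_request(rq, error, nr_bytes);
        }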
    •
      block: allow queue dma_alignment of zero · 482eb689
      Pete Wyckoff authored
      Let queue_dma_alignment return 0 if it was specifically set to 0.
      This permits devices with no particular alignment restrictions to
      use arbitrary user space buffers without copying.
      Signed-off-by: Pete Wyckoff <pw@osc.edu>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      482eb689
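
      The accessor as this change describes it (a sketch): a zero alignment
      is now returned as-is, with the 511 default applied once at queue init
      instead of here:

        static inline int queue_dma_alignment(struct request_queue *q)
        {
                /* 0 now means "no alignment restriction", not "use default" */
                return q ? q->dma_alignment : 511;
        }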
  9. 27 Jan, 2008 (1 commit)
  10. 26 Jan, 2008 (1 commit)
  11. 12 Jan, 2008 (1 commit)
  12. 09 Nov, 2007 (1 commit)
  13. 29 Oct, 2007 (1 commit)
    •
      [BLOCK] Fix bad sharing of tag busy list on queues with shared tag maps · 6eca9004
      Jens Axboe authored
      For the locking to work, only the tag map and tag bit map may be shared
      (incidentally, I was just explaining this to Nick yesterday, but I
      apparently didn't review the code well enough myself). But we also share
      the busy list!  The busy_list must be queue private, or we need a
      block_queue_tag covering lock as well.
      
      So we have to move the busy_list to the queue.  This'll work fine, and
      it'll actually also fix a problem with blk_queue_invalidate_tags(), which
      will invalidate tags across all shared queues.  This is a bit confusing:
      the low level driver should call it for each queue separately, since
      otherwise you cannot kill tags on just a single queue, e.g. for a hard
      drive that stops responding.  Since the function has no callers
      currently, it's not an issue.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      6eca9004
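
      An abridged sketch of the split this describes (only the fields at
      issue; the per-queue field name is my assumption):

        struct blk_queue_tag {
                struct request **tag_index; /* sharable: map of busy tags */
                unsigned long *tag_map;     /* sharable: free/busy bitmap */
                int max_depth;
                /* busy_list removed: it must not be shared */
        };

        /* ...while struct request_queue gains a queue-private list:
         *      struct list_head tag_busy_list; */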
  14. 16 Oct, 2007 (3 commits)
  15. 12 Oct, 2007 (1 commit)
  16. 10 Oct, 2007 (5 commits)
    •
      Remove flush_dry_bio_endio · d24517d7
      NeilBrown authored
      The entire function of flush_dry_bio_endio is to undo the effects
      of bio_endio (when called on a barrier request).  So remove the
      function and the call to bio_endio.
      
      This allows us to remove "bi_size" from "struct request_queue".
      Signed-off-by: Neil Brown <neilb@suse.de>
      
      ### Diffstat output
       ./block/ll_rw_blk.c      |   39 ++-------------------------------------
       ./include/linux/blkdev.h |    1 -
       2 files changed, 2 insertions(+), 38 deletions(-)
      
      diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      d24517d7
    •
      Fix warnings with !CONFIG_BLOCK · f5ff8422
      Jens Axboe authored
      Hide everything in blkdev.h when CONFIG_BLOCK isn't set, and fix up
      the (few) files that fail to build because they were relying on blkdev.h
      pulling in extra includes for them.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      f5ff8422
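
      The shape of the guard, sketched; which declarations get stubs for
      !CONFIG_BLOCK is exactly the part this commit had to work out:

        #ifdef CONFIG_BLOCK
        /* the full block layer declarations live here */
        #else /* !CONFIG_BLOCK */
        /* only the minimal stubs that non-block code still needs */
        #endif /* CONFIG_BLOCK */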
    •
      Stop exporting blk_rq_bio_prep · 66846572
      NeilBrown authored
      blk_rq_bio_prep is exported for use in exactly
      one place.  That place can benefit from using
      the new blk_rq_append_bio instead.
      So
        - change dm-emc to call blk_rq_append_bio
        - stop exporting blk_rq_bio_prep, and
        - initialise rq_disk in blk_rq_bio_prep,
             as dm-emc needs it.
      Signed-off-by: Neil Brown <neilb@suse.de>
      
      diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      66846572
    •
      New function blk_rq_append_bio · 3001ca77
      NeilBrown authored
      ll_back_merge_fn is currently exported to SCSI, where it is used,
      together with blk_rq_bio_prep, in exactly the same way these
      functions are used in __blk_rq_map_user.
      
      So move the common code into a new function (blk_rq_append_bio), and
      don't export ll_back_merge_fn any longer.
      Signed-off-by: Neil Brown <neilb@suse.de>
      
      diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      3001ca77
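
      A usage sketch, assuming the three-argument form of this era's API;
      the wrapper is hypothetical:

        static int my_add_bio(struct request_queue *q, struct request *rq,
                              struct bio *bio)
        {
                /* merge one more bio onto the tail of the request */
                return blk_rq_append_bio(q, rq, bio);
        }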
    •
      Introduce rq_for_each_segment replacing rq_for_each_bio · 5705f702
      NeilBrown authored
      Every usage of rq_for_each_bio wraps a usage of
      bio_for_each_segment, so these can be combined into
      rq_for_each_segment.
      
      We define "struct req_iterator" to hold the 'bio' and 'index' that
      are needed for the double iteration.
      Signed-off-by: Neil Brown <neilb@suse.de>
      
      Various compile fixes by me...
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      5705f702
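
      A hedged before/after sketch of the iteration being combined;
      counting bytes is just an illustrative body:

        static unsigned int count_bytes_old(struct request *rq)
        {
                struct bio *bio;
                struct bio_vec *bvec;
                unsigned int bytes = 0;
                int i;

                rq_for_each_bio(bio, rq)
                        bio_for_each_segment(bvec, bio, i)
                                bytes += bvec->bv_len;
                return bytes;
        }

        static unsigned int count_bytes_new(struct request *rq)
        {
                struct req_iterator iter;
                struct bio_vec *bvec;
                unsigned int bytes = 0;

                rq_for_each_segment(bvec, rq, iter)
                        bytes += bvec->bv_len;
                return bytes;
        }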
  17. 27 Jul, 2007 (1 commit)
  18. 24 Jul, 2007 (1 commit)