1. 11 Jan, 2010 1 commit
  2. 29 Dec, 2009 1 commit
  3. 21 Dec, 2009 1 commit
  4. 16 Dec, 2009 1 commit
    • J
      block: temporarily disable discard granularity · b568be62
      Jens Axboe authored
      Commit 86b37281 adds a check for
      misaligned stacking offsets, but it's buggy since the defaults are 0.
      Hence all dm devices that pass in a non-zero starting offset will
      be marked as misaligned and dm will complain.
      
      A real fix is coming; in the meantime, disable the discard granularity
      check so that users don't worry about dm reporting misaligned
      devices.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      b568be62
  5. 03 Dec, 2009 1 commit
    • M
      block: Allow devices to indicate whether discarded blocks are zeroed · 98262f27
      Martin K. Petersen authored
      The discard ioctl is used by mkfs utilities to clear a block device
      prior to putting metadata down.  However, not all devices return zeroed
      blocks after a discard.  Some drives return stale data, potentially
      containing old superblocks.  It is therefore important to know whether
      discarded blocks are properly zeroed.
      
      Both ATA and SCSI drives have configuration bits that indicate whether
      zeroes are returned after a discard operation.  Implement a block level
      interface that allows this information to be bubbled up the stack and
      queried via a new block device ioctl.
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      98262f27
  6. 11 Nov, 2009 1 commit
  7. 10 Nov, 2009 1 commit
  8. 12 Oct, 2009 1 commit
  9. 02 Oct, 2009 6 commits
    • C
      block: allow large discard requests · 67efc925
      Christoph Hellwig authored
      Currently we set the bio size to the byte equivalent of the blocks to
      be trimmed when submitting the initial DISCARD ioctl.  That means it
      is subject to the max_hw_sectors limitation of the HBA which is
      much lower than the size of a DISCARD request we can support.
      Add a separate max_discard_sectors tunable to limit the size for discard
      requests.
      
      We limit the max discard request size in bytes to 32bit as that is the
      limit for bio->bi_size.  This could be much larger if we had a way to pass
      that information through the block layer.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      67efc925
    • C
      block: use normal I/O path for discard requests · c15227de
      Christoph Hellwig authored
      prepare_discard_fn() was being called in a place where memory allocation
      was effectively impossible.  This makes it inappropriate for all but
      the most trivial translations of Linux's DISCARD operation to the block
      command set.  Additionally adding a payload there makes the ownership
      of the bio backing unclear as it's now allocated by the device driver
      and not the submitter as usual.
      
      It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
      the queue supports discard operations or not.  blkdev_issue_discard now
      allocates a one-page, sector-length payload which is the right thing
      for the common ATA and SCSI implementations.
      
      The mtd implementation of prepare_discard_fn() is replaced with simply
      checking for the request being a discard.
      
      Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx>
      which handled the prepare_discard_fn change but not yet the different
      payload allocation.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      c15227de
    • C
      block: allow large discard requests · ca80650c
      Christoph Hellwig authored
      Currently we set the bio size to the byte equivalent of the blocks to
      be trimmed when submitting the initial DISCARD ioctl.  That means it
      is subject to the max_hw_sectors limitation of the HBA which is
      much lower than the size of a DISCARD request we can support.
      Add a separate max_discard_sectors tunable to limit the size for discard
      requests.
      
      We limit the max discard request size in bytes to 32bit as that is the
      limit for bio->bi_size.  This could be much larger if we had a way to pass
      that information through the block layer.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      ca80650c
    • C
      block: use normal I/O path for discard requests · 1122a26f
      Christoph Hellwig authored
      prepare_discard_fn() was being called in a place where memory allocation
      was effectively impossible.  This makes it inappropriate for all but
      the most trivial translations of Linux's DISCARD operation to the block
      command set.  Additionally adding a payload there makes the ownership
      of the bio backing unclear as it's now allocated by the device driver
      and not the submitter as usual.
      
      It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
      the queue supports discard operations or not.  blkdev_issue_discard now
      allocates a one-page, sector-length payload which is the right thing
      for the common ATA and SCSI implementations.
      
      The mtd implementation of prepare_discard_fn() is replaced with simply
      checking for the request being a discard.
      
      Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx>
      which handled the prepare_discard_fn change but not yet the different
      payload allocation.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      1122a26f
    • M
      block: Do not clamp max_hw_sectors for stacking devices · 5dee2477
      Martin K. Petersen authored
      Stacking devices do not have an inherent max_hw_sector limit.  Set the
      default to INT_MAX so we are bounded only by capabilities of the
      underlying storage.
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      5dee2477
    • M
      block: Set max_sectors correctly for stacking devices · 80ddf247
      Martin K. Petersen authored
      The topology changes unintentionally caused SAFE_MAX_SECTORS to be set
      for stacking devices.  Set the default limit to BLK_DEF_MAX_SECTORS and
      provide SAFE_MAX_SECTORS in blk_queue_make_request() for legacy hw
      drivers that depend on the old behavior.
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      80ddf247
  10. 14 Sep, 2009 1 commit
  11. 01 Aug, 2009 4 commits
  12. 28 Jul, 2009 1 commit
  13. 19 Jun, 2009 1 commit
  14. 18 Jun, 2009 1 commit
  15. 16 Jun, 2009 2 commits
  16. 12 Jun, 2009 1 commit
  17. 09 Jun, 2009 2 commits
  18. 03 Jun, 2009 1 commit
  19. 28 May, 2009 1 commit
  20. 23 May, 2009 4 commits
  21. 22 Apr, 2009 1 commit
    • T
      block: fix queue bounce limit setting · cd0aca2d
      Tejun Heo authored
      Impact: don't set GFP_DMA in q->bounce_gfp unnecessarily
      
      All DMA address limits are expressed in terms of the last addressable
      unit (byte or page) instead of one plus that.  However, when
      determining bounce_gfp for 64bit machines in blk_queue_bounce_limit(),
      it compares the specified limit against 0x100000000UL to determine
      whether it's below 4G, ending up falsely setting GFP_DMA in
      q->bounce_gfp.
      
      As the DMA zone is very small on x86_64, this makes larger SG_IO
      transfers very eager to trigger the OOM killer.  Fix it.  While at it,
      rename the parameter to @dma_mask for clarity and convert the comment
      to proper winged style.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      cd0aca2d
  22. 07 Apr, 2009 1 commit
  23. 29 Dec, 2008 1 commit
  24. 03 Dec, 2008 1 commit
    • M
      block: fix setting of max_segment_size and seg_boundary mask · 0e435ac2
      Milan Broz authored
      Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
      devices.
      
      When stacking devices (LVM over MD over SCSI) some of the request queue
      parameters are not set up correctly in some cases by default, namely
      max_segment_size and seg_boundary mask.
      
      If you create an MD device over SCSI, these attributes are zeroed.
      
      The problem arises when another device-mapper mapping is layered on top
      of this stack - queue attributes are set in DM this way:
      
      request_queue   max_segment_size  seg_boundary_mask
      SCSI                65536             0xffffffff
      MD RAID1                0                      0
      LVM                 65536                 -1 (64bit)
      
      Unfortunately bio_add_page (resp.  bio_phys_segments) calculates number of
      physical segments according to these parameters.
      
      During generic_make_request() the segment count is recalculated and can
      increase bio->bi_phys_segments above the allowed limit.  (After
      bio_clone() in the stacking operation.)
      
      This is especially a problem in the CCISS driver, where it produces an
      OOPS here:
      
          BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);
      
      (MAXSGENTRIES is 31 by default.)
      
      Sometimes even this command is enough to cause oops:
      
        dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10
      
      This command generates bios with 250 sectors, allocated in 32 4k-pages
      (last page uses only 1024 bytes).
      
      For the LVM layer, it allocates a bio with 31 segments (still OK for
      CCISS); unfortunately on the lower layer it is recalculated to 32
      segments, which violates the CCISS restriction and triggers the BUG_ON().
      
      The patch tries to fix it by:
      
       * initializing attributes above in queue request constructor
         blk_queue_make_request()
      
       * make sure that blk_queue_stack_limits() inherits setting
      
       (DM uses its own function to set the limits because
       blk_queue_stack_limits() was introduced later.  It should probably
       switch to the generic stack-limit function too.)
      
       * sets the default seg_boundary value in one place (blkdev.h)
      
       * use this mask as default in DM (instead of -1, which differs in 64bit)
      
      Bugs related to this:
      https://bugzilla.redhat.com/show_bug.cgi?id=471639
      http://bugzilla.kernel.org/show_bug.cgi?id=8672
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Reviewed-by: Alasdair G Kergon <agk@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      0e435ac2
  25. 17 Oct, 2008 1 commit
  26. 09 Oct, 2008 2 commits
    • K
      block: add lld busy state exporting interface · ef9e3fac
      Kiyoshi Ueda authored
      This patch adds a new interface, blk_lld_busy(), to check the lld's
      busy state from the block layer.
      blk_lld_busy() calls down into the low-level driver to perform the
      check if the driver has set q->lld_busy_fn using blk_queue_lld_busy().
      
      This resolves a performance problem on request stacking devices,
      described below.
      
      Some drivers, like the scsi mid layer, stop dispatching requests when
      they detect a busy state on the low-level device (host/target/device).
      It allows other requests to stay in the I/O scheduler's queue
      for a chance of merging.
      
      Request stacking drivers like request-based dm should follow
      the same logic.
      However, there is no generic interface for the stacked device
      to check if the underlying device(s) are busy.
      If the request stacking driver dispatches and submits requests to
      the busy underlying device, the requests will stay in
      the underlying device's queue without a chance of merging.
      This causes performance problem on burst I/O load.
      
      With this patch, busy state of the underlying device is exported
      via q->lld_busy_fn().  So the request stacking driver can check it
      and stop dispatching requests if busy.
      
      The underlying device driver must return the busy state appropriately:
          1: when the device driver can't process requests immediately.
          0: when the device driver can process requests immediately,
             including abnormal situations where the device driver needs
             to kill all requests.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      ef9e3fac
    • J
      block: unify request timeout handling · 242f9dcb
      Jens Axboe authored
      Right now SCSI and others do their own command timeout handling.
      Move those bits to the block layer.
      
      Instead of having a timer per command, we try to be a bit more clever
      and simply have one per-queue. This avoids the overhead of having to
      tear down and setup a timer for each command, so it will result in a lot
      less timer fiddling.
      Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      242f9dcb