1. 21 Jan, 2017 1 commit
  2. 25 Dec, 2016 1 commit
  3. 09 Dec, 2016 1 commit
    • C
      block: improve handling of the magic discard payload · f9d03f96
      Christoph Hellwig committed
      Instead of allocating a single unused biovec for discard requests, send
      them down without any payload.  Instead we allow the driver to add a
      "special" payload using a biovec embedded into struct request (unioned
      over other fields never used while in the driver), and overloading
      the number of segments for this case.
      
      This has a couple of advantages:
      
       - we don't have to allocate the bio_vec
       - the amount of special casing for discard requests in the block
         layer is significantly reduced
       - using this same scheme for other request types is trivial,
         which will be important for implementing the new WRITE_ZEROES
         op on devices where it actually requires a payload (e.g. SCSI)
       - we can get rid of playing games with the request length, as
         we'll never touch it and completions will work just fine
       - it will allow us to support ranged discard operations in the
         future by merging non-contiguous discard bios into a single
         request
       - last but not least it removes a lot of code
      
      This patch is the common base for my WIP series for ranged discards and to
      remove discard_zeroes_data in favor of always using REQ_OP_WRITE_ZEROES,
      so it would be good to get it in quickly.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      f9d03f96
  4. 09 Nov, 2016 1 commit
  5. 28 Oct, 2016 2 commits
    • C
      block: better op and flags encoding · ef295ecf
      Christoph Hellwig committed
      Now that we don't need the common flags to overflow outside the range
      of a 32-bit type we can encode them the same way for both the bio and
      request fields.  This in addition allows us to place the operation
      first (and make some room for more ops while we're at it) and to
      stop having to shift around the operation values.
      
      In addition this allows passing around only one value in the block layer
      instead of two (and eventually also in the file systems, but we can do
      that later) and thus clean up a lot of code.
      
      Last but not least this allows decreasing the size of the cmd_flags
      field in struct request to 32-bits.  Various functions passing this
      value could also be updated, but I'd like to avoid the churn for now.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      ef295ecf
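The encoding described above can be sketched as follows. This is only an illustration, assuming the operation occupies the low 8 bits of a 32-bit value and the flags sit directly above it; every `SKETCH_*` and `sketch_*` name is hypothetical and not taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low 8 bits carry the operation, the rest carry flags. */
#define SKETCH_OP_BITS 8
#define SKETCH_OP_MASK ((1u << SKETCH_OP_BITS) - 1)

/* Illustrative operation values; they are stored unshifted in the low bits. */
enum sketch_op {
    SKETCH_OP_READ    = 0,
    SKETCH_OP_WRITE   = 1,
    SKETCH_OP_DISCARD = 3,
};

/* Flags start right above the operation bits. */
#define SKETCH_FLAG_SYNC (1u << (SKETCH_OP_BITS + 0))
#define SKETCH_FLAG_FUA  (1u << (SKETCH_OP_BITS + 1))

/* Combine an operation and its flags into a single 32-bit value. */
static inline uint32_t sketch_encode(enum sketch_op op, uint32_t flags)
{
    assert((flags & SKETCH_OP_MASK) == 0); /* flags must not overlap the op */
    return (uint32_t)op | flags;
}

/* Extract the operation without any shifting. */
static inline enum sketch_op sketch_op_of(uint32_t opf)
{
    return (enum sketch_op)(opf & SKETCH_OP_MASK);
}

/* Extract only the flag bits. */
static inline uint32_t sketch_flags_of(uint32_t opf)
{
    return opf & ~SKETCH_OP_MASK;
}
```

With such a layout a caller passes one combined value, for example `sketch_encode(SKETCH_OP_WRITE, SKETCH_FLAG_SYNC)`, instead of a separate operation and flags pair.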
    • C
      block: split out request-only flags into a new namespace · e8064021
      Christoph Hellwig committed
      A lot of the REQ_* flags are only used on struct requests, and only of
      use to the block layer and a few drivers that dig into struct request
      internals.
      
      This patch adds a new req_flags_t rq_flags field to struct request for
      them, and thus dramatically shrinks the number of common request flags.  It
      also removes the unfortunate situation where we have to fit the fields
      from the same enum into 32 bits for struct bio and 64 bits for
      struct request.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      e8064021
  6. 19 Oct, 2016 1 commit
    • H
      sd: Implement support for ZBC devices · 89d94756
      Hannes Reinecke committed
      Implement ZBC support functions to set up zoned disks, both
      host-managed and host-aware models. Only zoned disks that satisfy
      the following conditions are supported:
      1) All zones are the same size, with the exception of an eventual
         last smaller runt zone.
      2) For host-managed disks, reads are unrestricted (reads are not
         failed due to zone or write pointer alignment constraints).
      Zoned disks that do not satisfy these 2 conditions are set up with
      a capacity of 0 to prevent their use.
      
      The function sd_zbc_read_zones, called from sd_revalidate_disk,
      checks that the device satisfies the above two constraints. This
      function may also change the disk capacity previously set by
      sd_read_capacity for devices reporting only the capacity of
      conventional zones at the beginning of the LBA range (i.e. devices
      reporting rc_basis set to 0).
      
      The capacity message output was moved out of sd_read_capacity into
      a new function sd_print_capacity to include this eventual capacity
      change by sd_zbc_read_zones. This new function also includes a call
      to sd_zbc_print_zones to display the number of zones and zone size
      of the device.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      
      [Damien: * Removed zone cache support
               * Removed mapping of discard to reset write pointer command
               * Modified sd_zbc_read_zones to include checks that the
                 device satisfies the kernel constraints
               * Implemented REPORT ZONES setup and post-processing based
                 on code from Shaun Tancheff <shaun.tancheff@seagate.com>
               * Removed confusing use of 512B sector units in functions
                 interface]
      Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
      Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com>
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      89d94756
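The first condition above (equal-sized zones with an optional smaller last zone) and the capacity-0 fallback can be sketched as below. This is a simplified illustration over a plain array of zone lengths, not the driver's REPORT ZONES processing; the `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if every zone has the same length, except that the last
 * zone may be a smaller "runt". zone_len[] holds zone lengths in blocks.
 */
static bool sketch_zones_supported(const uint64_t *zone_len, size_t nr_zones)
{
    assert(zone_len != NULL || nr_zones == 0);

    if (nr_zones == 0 || zone_len[0] == 0)
        return false;

    uint64_t zone_size = zone_len[0];

    for (size_t i = 1; i < nr_zones; i++) {
        bool last = (i == nr_zones - 1);

        if (zone_len[i] == zone_size)
            continue;
        /* Only the final zone is allowed to be shorter. */
        if (last && zone_len[i] != 0 && zone_len[i] < zone_size)
            continue;
        return false;
    }
    return true;
}

/* A disk failing the check is exposed with capacity 0 so it cannot be used. */
static uint64_t sketch_effective_capacity(uint64_t capacity,
                                          const uint64_t *zone_len,
                                          size_t nr_zones)
{
    return sketch_zones_supported(zone_len, nr_zones) ? capacity : 0;
}
```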
  7. 15 Sep, 2016 1 commit
  8. 19 Jul, 2016 1 commit
  9. 28 Jun, 2016 1 commit
    • D
      block: convert to device_add_disk() · 0d52c756
      Dan Williams committed
      For block drivers that specify a parent device, convert them to use
      device_add_disk().
      
      This conversion was done with the following semantic patch:
      
          @@
          struct gendisk *disk;
          expression E;
          @@
      
          - disk->driverfs_dev = E;
          ...
          - add_disk(disk);
          + device_add_disk(E, disk);
      
          @@
          struct gendisk *disk;
          expression E1, E2;
          @@
      
          - disk->driverfs_dev = E1;
          ...
          E2 = disk;
          ...
          - add_disk(E2);
          + device_add_disk(E1, E2);
      
      ...plus some manual fixups for a few missed conversions.
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      0d52c756
  10. 08 Jun, 2016 3 commits
  11. 02 Jun, 2016 1 commit
  12. 23 May, 2016 1 commit
  13. 13 Apr, 2016 2 commits
  14. 05 Apr, 2016 2 commits
    • H
      scsi: Do not attach VPD to devices that don't support it · 5ddfe085
      Hannes Reinecke committed
      The patch "scsi: rescan VPD attributes" introduced a regression in which
      devices that don't support VPD were being scanned for VPD attributes
      anyway.  This could cause issues for some devices and should be avoided
      so the check for scsi_level has been moved out of scsi_add_lun and into
      scsi_attach_vpd so that all callers will not scan VPD for devices that
      don't support it.
      
      [mkp: Merge fix]
      
      Fixes: 09e2b0b1 ("scsi: rescan VPD attributes")
      Cc: <stable@vger.kernel.org> #v4.5+
      Suggested-by: Alexander Duyck <aduyck@mirantis.com>
      Signed-off-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      5ddfe085
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov committed
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  15. 01 Apr, 2016 1 commit
  16. 15 Mar, 2016 1 commit
  17. 09 Mar, 2016 1 commit
  18. 05 Feb, 2016 1 commit
  19. 27 Jan, 2016 1 commit
  20. 21 Jan, 2016 1 commit
  21. 22 Dec, 2015 1 commit
  22. 26 Nov, 2015 2 commits
    • M
      block/sd: Fix device-imposed transfer length limits · ca369d51
      Martin K. Petersen committed
      Commit 4f258a46 ("sd: Fix maximum I/O size for BLOCK_PC requests")
      had the unfortunate side-effect of removing an implicit clamp to
      BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer
      code. This caused problems for some SMR drives.
      
      Debugging this issue revealed a few problems with the existing
      infrastructure since the block layer didn't know how to deal with
      device-imposed limits, only limits set by the I/O controller.
      
       - Introduce a new queue limit, max_dev_sectors, which is used by the
         ULD to signal the maximum sectors for a REQ_TYPE_FS request.
      
       - Ensure that max_dev_sectors is correctly stacked and taken into
         account when overriding max_sectors through sysfs.
      
       - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer
         values for later processing.
      
       - In sd_revalidate() set the queue's max_dev_sectors based on the
         MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value
         is not reported, fall back to a cap based on the CDB TRANSFER LENGTH
         field size.
      
       - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits
         VPD--if reported and sane--to signal the preferred device transfer
         size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS.
      
       - blk_limits_max_hw_sectors() is no longer used and can be removed.
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Tested-by: sweeneygj@gmx.com
      Tested-by: Arzeets <anatol.pomozov@gmail.com>
      Tested-by: David Eisner <david.eisner@oriel.oxon.org>
      Tested-by: Mario Kicherer <dev@kicherer.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      ca369d51
    • M
      sd: Make discard granularity match logical block size when LBPRZ=1 · 39773722
      Martin K. Petersen committed
      A device may report an OPTIMAL UNMAP GRANULARITY and UNMAP GRANULARITY
      ALIGNMENT in the Block Limits VPD. These parameters describe the
      device's internal provisioning allocation units. By default the block
      layer will round and align any discard requests based on these limits.
      
      If a device reports LBPRZ=1 to guarantee zeroes after discard, however,
      it is imperative that the block layer does not leave out any parts of
      the requested block range. Otherwise the device can not do the required
      zeroing of any partial allocation units and this can lead to data
      corruption.
      
      Since the dm thinp personality relies on the block layer's current
      behavior and is unable to deal with partial discard blocks we work
      around the problem by setting the granularity to match the logical block
      size when LBPRZ is enabled.
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      39773722
  23. 12 Nov, 2015 1 commit
    • G
      sd: Clear PS bit before Mode Select. · 2c5d16d6
      Gabriel Krisman Bertazi committed
      According to SPC-4, in a Mode Select, the PS bit in Mode Pages is
      reserved and must be set to 0 by the driver.  In the sd implementation,
      function cache_type_store does a Mode Sense, which might set the PS bit
      on the read buffer, followed by a Mode Select, which receives the same
      buffer, without explicitly clearing the PS bit.  So, in cases where
      target supports saving the Mode Page to a non-volatile location, we end
      up doing a Mode Select with the PS bit set, which could cause an illegal
      request error if the target is checking this.
      
      This was observed on a new firmware change, which was subsequently
      reverted, but this changes sd.c to be more compliant with SPC-4.
      
      This patch clears the PS bit in the buffer returned by Mode Sense,
      right before it is used in the Mode Select command.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      2c5d16d6
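A minimal sketch of the fix described above, assuming `mode_page` points at the first byte of the mode page itself (after any mode parameter header and block descriptors). The PS bit is bit 7 of that byte; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PS (parameters saveable) is bit 7 of the first byte of a mode page. */
#define SKETCH_MODE_PAGE_PS 0x80u

/*
 * Clear the PS bit in a page read with Mode Sense so the same buffer
 * can be sent back with Mode Select, where the bit is reserved.
 */
static void sketch_clear_ps(uint8_t *mode_page)
{
    assert(mode_page != NULL);
    mode_page[0] &= (uint8_t)~SKETCH_MODE_PAGE_PS;
}
```

The page code in the low bits of the same byte is left untouched, so a caching page read back as `0x88` becomes `0x08`.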
  24. 22 Oct, 2015 2 commits
  25. 13 Aug, 2015 1 commit
  26. 17 Jul, 2015 1 commit
  27. 25 May, 2015 1 commit
  28. 19 May, 2015 1 commit
  29. 17 Apr, 2015 1 commit
  30. 11 Apr, 2015 1 commit
    • J
      sd, mmc, virtio_blk, string_helpers: fix block size units · b9f28d86
      James Bottomley committed
      The current string_get_size() overflows when the device size goes over
      2^64 bytes because the string helper routine computes the suffix from
      the size in bytes.  However, the entirety of Linux thinks in terms of
      blocks, not bytes, so this will artificially induce an overflow on very
      large devices.  Fix this by making the function string_get_size() take
      blocks and the block size instead of bytes.  This should allow us to
      keep working until the current SCSI standard overflows.
      
      Also fix virtio_blk and mmc (both of which were also artificially
      multiplying by the block size to pass a byte size to string_get_size()).
      
      The mathematics of this is pretty simple:  we're taking a product of
      size in blocks (S) and block size (B) and trying to re-express this in
      exponential form: S*B = R*N^E (where N, the base, is either 1000 or
      1024) and R < N.  Mathematically, S = RS*N^ES and B = RB*N^EB, so if RS*RB
      < N it's easy to see that S*B = RS*RB*N^(ES+EB).  However, if RS*RB >= N,
      we can see that this can be re-expressed as RS*RB = R*N (where R =
      RS*RB/N < N) so the whole expression becomes R*N^(ES+EB+1)
      
      [jejb: fix incorrect 32 bit do_div spotted by kbuild test robot <fengguang.wu@intel.com>]
      Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      b9f28d86
  31. 19 Mar, 2015 1 commit
  32. 02 Feb, 2015 1 commit
  33. 09 Jan, 2015 1 commit