1. 16 Nov, 2020 2 commits
  2. 08 Oct, 2020 1 commit
  3. 03 Oct, 2020 1 commit
    • scsi: sd: Allow user to configure command retries · 0610959f
      Mike Christie committed
      Some iSCSI targets went with the traditional "export N ports" approach and
      then allowed the initiator to multipath over them. Other targets went the
      opposite direction and export a single port, with software on the
      target side performing load balancing and failover to other targets via an
      iSCSI-specific feature or IP takeover.
      
      The problem for the second type of config is that we quickly run out of our
      five retries and get I/O errors. In these setups we want to reduce resource
      use on the initiator side, so we only want the one session and no
      dm-multipath. To handle traditional multipath operations like failover we
      do IP takeover on the target side. So we would have an iSCSI target running
      on node1. Some monitoring software decides it's dead or the node is
      overloaded, so it starts the iSCSI target on node2. The problem is the
      failover case, where we might have the equivalent of a dm-multipath
      temporary all-paths-down condition, or we just have to try more than 5
      nodes before finding a good one.
      
      To handle this type of issue, allow the user to configure the disk command
      retries from -1 to the current max of 5. -1 means infinite retries and
      should be used for setups where some other setting is going to control when
      to fail. For example, iSCSI has the replacement/recovery timeout, and FC
      (some users have used FC with NPIV and done something similar to IP
      takeover) has dev_loss_tmo/fast_io_fail, which will eventually expire and
      fail I/O.
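
The retry rule described above can be modelled in a few lines. This is only a userspace sketch of the policy, not the driver code: the helper names `parse_max_retries` and `may_retry` are hypothetical, and the only facts taken from the commit message are the range (-1 to 5) and the meaning of -1.

```python
SD_MAX_RETRIES = 5      # "the current max of 5" from the commit message
INFINITE_RETRIES = -1   # "-1 means infinite retries"


def parse_max_retries(text: str) -> int:
    """Validate a user-supplied retry count (hypothetical helper)."""
    value = int(text.strip())
    if value < INFINITE_RETRIES or value > SD_MAX_RETRIES:
        raise ValueError(f"retries must be between -1 and {SD_MAX_RETRIES}")
    return value


def may_retry(retries_done: int, max_retries: int) -> bool:
    """Return True if a failed command may be retried once more."""
    if max_retries == INFINITE_RETRIES:
        # Some other setting (for example the iSCSI recovery timeout)
        # is expected to decide when I/O finally fails.
        return True
    return retries_done < max_retries
```

With -1 the sketch never gives up on its own, which is why the commit message ties that value to setups where another timeout eventually fails the I/O.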
      
      Link: https://lore.kernel.org/r/1601566554-26752-3-git-send-email-michael.christie@oracle.com
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  4. 16 Sep, 2020 2 commits
  5. 10 Sep, 2020 1 commit
  6. 02 Sep, 2020 1 commit
  7. 05 Aug, 2020 1 commit
    • scsi: sd_zbc: Improve zone revalidation · a3d8a257
      Damien Le Moal committed
      Currently, for zoned disks, since blk_revalidate_disk_zones() requires the
      disk capacity to be set already to operate correctly, zones revalidation
      can only be done on the second revalidate scan once the gendisk capacity is
      set at the end of the first scan. As a result, if zone revalidation fails,
      there is no second chance to recover from the failure and the disk capacity
      is changed to 0, with the disk left unusable.
      
      This can be improved by shuffling around code, specifically, by moving the
      call to sd_zbc_revalidate_zones() from sd_zbc_read_zones() to the end of
      sd_revalidate_disk(), after set_capacity_revalidate_and_notify() is called
      to set the gendisk capacity. With this change, if sd_zbc_revalidate_zones()
      fails on the first scan, the second scan will call it again to recover, if
      possible.
      
      Using the new struct scsi_disk fields rev_nr_zones and rev_zone_blocks,
      sd_zbc_revalidate_zones() does actual work only if it detects a change with
      the disk zone configuration. This means that for a successful zones
      revalidation on the first scan, the second scan will not cause another
      heavy full check.
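
The "do actual work only on change" idea can be sketched as follows. This is a simplified model under stated assumptions: the field names rev_nr_zones and rev_zone_blocks come from the commit message, while `DiskZoneState`, `revalidate_zones` and the `full_check` callback are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiskZoneState:
    nr_zones: int             # zone count currently reported by the device
    zone_blocks: int          # zone size in logical blocks
    rev_nr_zones: int = 0     # values recorded at the last successful revalidation
    rev_zone_blocks: int = 0


def revalidate_zones(disk: DiskZoneState, full_check) -> bool:
    """Run the heavy check only if the zone configuration changed.

    Returns True if full_check was executed.
    """
    if (disk.nr_zones == disk.rev_nr_zones
            and disk.zone_blocks == disk.rev_zone_blocks):
        return False  # nothing changed since the last successful pass
    # If full_check raises, the rev_* fields stay stale, so the next
    # scan sees a "change" again and retries the revalidation.
    full_check(disk)
    disk.rev_nr_zones = disk.nr_zones
    disk.rev_zone_blocks = disk.zone_blocks
    return True
```

In this model a failed first scan leaves the recorded values untouched, so the second scan gets its chance to recover, and a successful first scan makes the second one cheap.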
      
      While at it, remove the unnecessary "extern" declaration of
      sd_zbc_read_zones().
      
      Link: https://lore.kernel.org/r/20200731054928.668547-1-damien.lemoal@wdc.com
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  8. 08 Jul, 2020 1 commit
  9. 20 May, 2020 1 commit
    • scsi: sd: Add zoned capabilities device attribute · c5f88522
      Damien Le Moal committed
      Export through sysfs as a scsi_disk attribute the zoned capabilities of a
      disk ("zoned_cap" attribute file). This new attribute indicates in human
      readable form (i.e. a string) the zoned block capabilities implemented by
      the disk as found in the ZONED field of the disk block device
      characteristics VPD page. The possible values are:
      
       - "none": ZONED=00b (not reported), regular disk
      
       - "host-aware": ZONED=01b, host-aware ZBC disk
      
       - "drive-managed": ZONED=10b, drive-managed ZBC disk (regular disk
         interface)
      
      For completeness, also add the following value which is detected using the
      device type rather than the ZONED field:
      
       - "host-managed": device type = 0x14 (TYPE_ZBC), host-managed ZBC disk
      
      This new sysfs attribute is purely informational and complementary to the
      "zoned" device request queue sysfs attribute, as it allows applications and
      user daemons (e.g. udev) to easily differentiate regular disks from
      drive-managed SMR disks without the need for direct access tools such as
      those provided by sg3_utils.
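
The value mapping listed above can be written out as a small lookup. This is an illustrative sketch, not the driver's code: the function name `zoned_cap` is hypothetical, and since the commit message does not say what is reported for the remaining ZONED=11b encoding, the sketch simply rejects it.

```python
TYPE_ZBC = 0x14  # host-managed zoned block device type, from the commit message

# ZONED field values of the block device characteristics VPD page.
_ZONED_FIELD_TO_CAP = {
    0b00: "none",
    0b01: "host-aware",
    0b10: "drive-managed",
}


def zoned_cap(device_type: int, zoned_field: int) -> str:
    """Return the string a "zoned_cap"-style attribute would show."""
    if device_type == TYPE_ZBC:
        # Detected from the device type rather than the ZONED field.
        return "host-managed"
    try:
        return _ZONED_FIELD_TO_CAP[zoned_field]
    except KeyError:
        # Assumption of this sketch: unlisted encodings are an error.
        raise ValueError(f"unhandled ZONED value {zoned_field:#04b}") from None
```

A udev rule or monitoring script would only need the resulting string to tell a drive-managed SMR disk apart from a regular one.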
      
      Link: https://lore.kernel.org/r/20200515054856.1408575-1-damien.lemoal@wdc.com
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  10. 15 May, 2020 1 commit
  11. 13 May, 2020 1 commit
    • scsi: sd_zbc: emulate ZONE_APPEND commands · 5795eb44
      Johannes Thumshirn committed
      Emulate ZONE_APPEND for SCSI disks using a regular WRITE(16) command
      with a start LBA set to the target zone write pointer position.
      
      In order to always know the write pointer position of a sequential write
      zone, the write pointer of all zones is tracked using an array of 32-bit
      zone write pointer offsets attached to the scsi disk structure. Each
      entry of the array indicates a zone write pointer position relative to
      the zone start sector. The write pointer offsets are maintained in sync
      with the device as follows:
      1) the write pointer offset of a zone is reset to 0 when a
         REQ_OP_ZONE_RESET command completes.
      2) the write pointer offset of a zone is set to the zone size when a
         REQ_OP_ZONE_FINISH command completes.
      3) the write pointer offset of a zone is incremented by the number of
         512B sectors written when a write, write same or a zone append
         command completes.
      4) the write pointer offset of all zones is reset to 0 when a
         REQ_OP_ZONE_RESET_ALL command completes.
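
The four bookkeeping rules above can be modelled with a plain array. This is a sketch under simplifying assumptions that are not in the commit message: all zones have the same size, every zone starts empty, and the offset is clamped at the zone size. The class and method names are hypothetical.

```python
class ZoneWritePointers:
    """Per-zone write pointer offsets, in 512B sectors from the zone start."""

    def __init__(self, nr_zones: int, zone_sectors: int):
        self.zone_sectors = zone_sectors
        self.wp_ofst = [0] * nr_zones

    def append_start_sector(self, zone: int) -> int:
        # Start sector a regular write would use to emulate a zone append.
        return zone * self.zone_sectors + self.wp_ofst[zone]

    def on_zone_reset(self, zone: int) -> None:      # rule 1
        self.wp_ofst[zone] = 0

    def on_zone_finish(self, zone: int) -> None:     # rule 2
        self.wp_ofst[zone] = self.zone_sectors

    def on_write_done(self, zone: int, sectors: int) -> None:  # rule 3
        # Clamping is a choice of this sketch, not stated in the source.
        self.wp_ofst[zone] = min(self.wp_ofst[zone] + sectors,
                                 self.zone_sectors)

    def on_zone_reset_all(self) -> None:             # rule 4
        self.wp_ofst = [0] * len(self.wp_ofst)
```

For example, after 8 sectors are written to zone 1 of a disk with 1024-sector zones, the next emulated append to that zone would start at sector 1032.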
      
      Since the block layer does not write lock zones for zone append
      commands, to ensure a sequential ordering of the regular write commands
      used for the emulation, the target zone of a zone append command is
      locked when the function sd_zbc_prepare_zone_append() is called from
      sd_setup_read_write_cmnd(). If the zone write lock cannot be obtained
      (e.g. a zone append is in-flight or a regular write has already locked
      the zone), the zone append command dispatching is delayed by returning
      BLK_STS_ZONE_RESOURCE.
      
      To avoid the need for write locking all zones for REQ_OP_ZONE_RESET_ALL
      requests, use a spinlock to protect accesses and modifications of the
      zone write pointer offsets. This spinlock is initialized from sd_probe()
      using the new function sd_zbc_init().
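
The locking scheme can be illustrated with a small model. It is a deliberately simplified sketch: a `threading.Lock` stands in for the spinlock, a set stands in for the per-zone write locks, and both are guarded by the same lock here, which is an assumption of the sketch rather than a description of the driver. The status constants are plain strings named after the kernel values.

```python
import threading

BLK_STS_OK = "BLK_STS_OK"
BLK_STS_ZONE_RESOURCE = "BLK_STS_ZONE_RESOURCE"


class ZoneAppendEmulator:
    def __init__(self, nr_zones: int, zone_sectors: int):
        self.zone_sectors = zone_sectors
        self.wp_ofst = [0] * nr_zones
        self.wp_lock = threading.Lock()  # stands in for the spinlock
        self.write_locked = set()        # zones with a write in flight

    def prepare_zone_append(self, zone: int):
        """Return (status, start_sector) for an emulated zone append."""
        with self.wp_lock:
            if zone in self.write_locked:
                # Another write owns the zone: ask the caller to retry later.
                return BLK_STS_ZONE_RESOURCE, None
            self.write_locked.add(zone)
            return BLK_STS_OK, zone * self.zone_sectors + self.wp_ofst[zone]

    def complete_write(self, zone: int, sectors: int) -> None:
        with self.wp_lock:
            self.wp_ofst[zone] += sectors
            self.write_locked.discard(zone)

    def reset_all(self) -> None:
        # No per-zone write locks needed: the lock protects the offsets.
        with self.wp_lock:
            self.wp_ofst = [0] * len(self.wp_ofst)
```

A second append to a zone that already has a write in flight gets BLK_STS_ZONE_RESOURCE and is expected to be dispatched again once the first write completes.
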
      Co-developed-by: Damien Le Moal <Damien.LeMoal@wdc.com>
      Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  12. 25 Mar, 2020 1 commit
    • scsi: sd: Fix optimal I/O size for devices that change reported values · ea697a8b
      Martin K. Petersen committed
      Some USB bridge devices will return a default set of characteristics during
      initialization and then, once an attached drive has spun up, substitute
      the actual parameters reported by the drive. According to the SCSI spec,
      the device should return a UNIT ATTENTION in case any reported parameters
      change. But in this case the change is made silently after a small window
      where default values are reported.
      
      Commit a83da8a4 ("scsi: sd: Optimal I/O size should be a multiple of
      physical block size") validated the reported optimal I/O size against the
      physical block size to overcome problems with devices reporting nonsensical
      transfer sizes. However, this validation did not account for the fact that
      aforementioned devices will return default values during a brief window
      during spin-up. The subsequent change in reported characteristics would
      invalidate the checking that had previously been performed.
      
      Unset a previously configured optimal I/O size should the sanity checking
      fail on subsequent revalidate attempts.
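
The effect of the fix can be shown with a toy model of two revalidation passes. This is a sketch, not the driver logic: `QueueLimits`, `opt_xfer_is_sane` and `revalidate` are hypothetical names, and the sanity check is reduced to "non-zero and a multiple of the physical block size".

```python
from dataclasses import dataclass


@dataclass
class QueueLimits:
    io_opt: int = 0  # optimal I/O size in bytes; 0 means "not configured"


def opt_xfer_is_sane(opt_xfer_bytes: int, physical_block_size: int) -> bool:
    """Reduced sanity check used by this sketch."""
    return opt_xfer_bytes > 0 and opt_xfer_bytes % physical_block_size == 0


def revalidate(limits: QueueLimits, opt_xfer_bytes: int,
               physical_block_size: int) -> None:
    if opt_xfer_is_sane(opt_xfer_bytes, physical_block_size):
        limits.io_opt = opt_xfer_bytes
    else:
        # The point of the fix: do not keep a value accepted on an
        # earlier pass once the check fails on a later one.
        limits.io_opt = 0
```

In the scenario from the commit message, the first pass sees the bridge's default values and accepts them; the second pass sees the real drive parameters, the check fails, and the stale optimal I/O size is cleared instead of being left in place.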
      
      Link: https://lore.kernel.org/r/33fb522e-4f61-1b76-914f-c9e6a3553c9b@gmail.com
      Cc: Bryan Gurney <bgurney@redhat.com>
      Cc: <stable@vger.kernel.org>
      Reported-by: Bernhard Sulzer <micraft.b@gmail.com>
      Tested-by: Bernhard Sulzer <micraft.b@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  13. 19 Mar, 2020 1 commit
  14. 27 Jan, 2020 1 commit
  15. 10 Jan, 2020 1 commit
  16. 03 Jan, 2020 1 commit
  17. 27 Nov, 2019 1 commit
  18. 07 Nov, 2019 1 commit
  19. 25 Oct, 2019 1 commit
  20. 23 Oct, 2019 1 commit
    • scsi: sd: enable compat ioctls for sed-opal · 142b2ac8
      Arnd Bergmann committed
      The sed_ioctl() function is written to be compatible between
      32-bit and 64-bit processes, however compat mode is only
      wired up for nvme, not for sd.
      
      Add the missing call to sed_ioctl() in sd_compat_ioctl().
      
      Fixes: d80210f2 ("sd: add support for TCG OPAL self encrypting disks")
      Cc: linux-scsi@vger.kernel.org
      Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  21. 01 Oct, 2019 2 commits
  22. 18 Sep, 2019 1 commit
  23. 08 Sep, 2019 1 commit
  24. 05 Aug, 2019 1 commit
  25. 19 Jun, 2019 2 commits
    • scsi: sd: Inline sd_probe_part2() · 82a54da6
      Bart Van Assche committed
      Make sd_probe() easier to read by inlining sd_probe_part2(). This patch
      does not change any functionality.
      
      [mkp: applied by hand]
      
      Cc: Lee Duncan <lduncan@suse.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: sd: Rely on the driver core for asynchronous probing · f049cf1a
      Bart Van Assche committed
      As explained during the 2018 LSF/MM session about increasing SCSI disk
      probing concurrency, the problems with the current probing approach are as
      follows:
      
       - The driver core is unaware of asynchronous SCSI LUN probing.
         wait_for_device_probe() waits for all asynchronous probes except
         asynchronous SCSI disk probes.
      
       - There is unnecessary serialization between sd_probe() and sd_remove().
         This can lead to a deadlock.
      
      Hence this patch that modifies the sd driver such that it uses the driver
      core framework for asynchronous probing. The async domain and
      get_device()/put_device() pairs that became superfluous due to this change
      are removed.
      
      This patch does not affect the time needed for loading the scsi_debug
      kernel module with parameters delay=0 and max_luns=256.
      
      This patch depends on commit ef0ff683 ("driver core: Probe devices
      asynchronously instead of the driver") that went upstream in kernel version
      v5.1-rc1.
      
      Cc: Lee Duncan <lduncan@suse.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  26. 21 May, 2019 2 commits
  27. 20 May, 2019 1 commit
    • Revert "scsi: sd: Keep disk read-only when re-reading partition" · 8acf608e
      Martin K. Petersen committed
      This reverts commit 20bd1d02.
      
      This patch introduced regressions for devices that come online in
      read-only state and subsequently switch to read-write.
      
      Given how the partition code is currently implemented it is not
      possible to persist the read-only flag across a device revalidate
      call. This may need to get addressed in the future since it is common
      for user applications to proactively call BLKRRPART.
      
      Reverting this commit will re-introduce a regression where a
      device-initiated revalidate event will cause the admin state to be
      forgotten. A separate patch will address this issue.
      
      Fixes: 20bd1d02 ("scsi: sd: Keep disk read-only when re-reading partition")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  28. 30 Apr, 2019 2 commits
  29. 13 Apr, 2019 1 commit
    • block: disk_events: introduce event flags · c92e2f04
      Martin Wilck committed
      Currently, an empty disk->events field tells the block layer not to
      forward media change events to user space. This was done in commit
      7c88a168 ("block: don't propagate unlisted DISK_EVENTs to userland")
      in order to prevent events from "fringe" drivers from being forwarded to user
      space. By doing so, the block layer lost the information which events
      were supported by a particular block device, and most importantly,
      whether or not a given device supports media change events at all.
      
      Prepare for not interpreting the "events" field this way in the future
      any more. This is done by adding an additional field "event_flags" to
      struct gendisk, and two flag bits that can be set to have the device
      treated like one that had the "events" field set to a non-zero value
      before. This applies only to the sd and sr drivers, which are changed to
      set the new flags.
      
      The new flags are DISK_EVENT_FLAG_POLL to enforce polling of the device
      for synchronous events, and DISK_EVENT_FLAG_UEVENT to tell the
      block layer to generate udev events from kernel events.
      
      In order to add the event_flags field to struct gendisk, the events
      field is converted to an "unsigned short"; it doesn't need to hold
      values bigger than 2 anyway.
      
      This patch doesn't change behavior.
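
One way to picture the split between "which events a device supports" and "how they are handled" is the following model. It is only a reading of the description above: the flag names mirror the commit message, but the bit values, the `DiskEvent` members and the `events_to_report` helper are assumptions made for this sketch.

```python
from enum import IntFlag


class DiskEvent(IntFlag):
    """Events a device may support (bit values assumed for this sketch)."""
    MEDIA_CHANGE = 1 << 0
    EJECT_REQUEST = 1 << 1


class DiskEventFlag(IntFlag):
    """Handling flags kept separately from the supported events."""
    POLL = 1 << 0    # poll the device for synchronous events
    UEVENT = 1 << 1  # generate udev events from kernel events


def events_to_report(supported: DiskEvent, flags: DiskEventFlag,
                     pending: DiskEvent) -> DiskEvent:
    """Return the pending events that would be forwarded to user space."""
    if not flags & DiskEventFlag.UEVENT:
        return DiskEvent(0)
    return pending & supported
```

With the flags held apart, a driver can advertise that it supports media change events without that advertisement alone deciding whether user space is notified.
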
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin Wilck <mwilck@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  30. 07 Apr, 2019 1 commit
    • block: remove CONFIG_LBDAF · 72deb455
      Christoph Hellwig committed
      Currently support for 64-bit sector_t and blkcnt_t is optional on 32-bit
      architectures.  These types are required to support block device and/or
      file sizes larger than 2 TiB, and have generally defaulted to on for
      a long time.  Enabling the option only increases the i386 tinyconfig
      size by 145 bytes, and many data structures already always use
      64-bit values for their in-core and on-disk data structures anyway,
      so there should not be a large change in dynamic memory usage either.
      
      Dropping this option removes a somewhat weird non-default config that
      has caused various bugs or compiler warnings when actually used.
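
The 2 TiB figure follows from simple arithmetic, assuming sectors are counted in 512-byte units (an assumption of this sketch; the commit message only states the limit). The helper name `max_addressable_bytes` is hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector unit assumed here
TIB = 1 << 40      # one tebibyte in bytes


def max_addressable_bytes(sector_bits: int) -> int:
    """Largest device size addressable with a sector counter of this width."""
    return (1 << sector_bits) * SECTOR_SIZE


# A 32-bit sector count tops out at exactly 2 TiB.
limit_32 = max_addressable_bytes(32)
# A 64-bit sector count moves the limit far beyond any current device.
limit_64 = max_addressable_bytes(64)
```

Here `limit_32` is 2^32 x 2^9 = 2^41 bytes, which is the 2 TiB boundary mentioned above.
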
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  31. 28 Mar, 2019 4 commits
    • scsi: sd: Quiesce warning if device does not report optimal I/O size · 1d5de5bd
      Martin K. Petersen committed
      Commit a83da8a4 ("scsi: sd: Optimal I/O size should be a multiple
      of physical block size") split one conditional into several separate
      statements in an effort to provide more accurate warning messages when
      a device reports a nonsensical value. However, this reorganization
      accidentally dropped the precondition of the reported value being
      larger than zero. This led to a warning getting emitted on devices
      that do not report an optimal I/O size at all.
      
      Remain silent if a device does not report an optimal I/O size.
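
The restored precondition can be sketched as a check that returns both a verdict and an optional warning. This is an illustration only: `check_opt_xfer` is a hypothetical name, the warning text is made up for the example, and just one of the separate checks is modelled.

```python
from typing import Optional, Tuple


def check_opt_xfer(opt_xfer_bytes: int,
                   physical_block_size: int) -> Tuple[bool, Optional[str]]:
    """Return (usable, warning) for a reported optimal I/O size."""
    if opt_xfer_bytes == 0:
        # Not reported at all: unusable, but nothing to warn about.
        return False, None
    if opt_xfer_bytes % physical_block_size != 0:
        return False, (
            f"optimal transfer size {opt_xfer_bytes} bytes is not a "
            f"multiple of physical block size ({physical_block_size} bytes)"
        )
    return True, None
```

Checking for zero first is what keeps devices that report no optimal I/O size from triggering a message meant for nonsensical values.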
      
      Fixes: a83da8a4 ("scsi: sd: Optimal I/O size should be a multiple of physical block size")
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: <stable@vger.kernel.org>
      Reported-by: Hussam Al-Tayeb <ht990332@gmx.com>
      Tested-by: Hussam Al-Tayeb <ht990332@gmx.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: sd: Fix a race between closing an sd device and sd I/O · c14a5726
      Bart Van Assche committed
      The scsi_end_request() function calls scsi_cmd_to_driver() indirectly and
      hence needs the disk->private_data pointer. Make sure that pointer is not
      cleared before all affected I/O requests have finished. This patch prevents
      the following crash:
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      Call trace:
       scsi_mq_uninit_cmd+0x1c/0x30
       scsi_end_request+0x7c/0x1b8
       scsi_io_completion+0x464/0x668
       scsi_finish_command+0xbc/0x160
       scsi_eh_flush_done_q+0x10c/0x170
       sas_scsi_recover_host+0x84c/0xa98 [libsas]
       scsi_error_handler+0x140/0x5b0
       kthread+0x100/0x12c
       ret_from_fork+0x10/0x18
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Jason Yan <yanaijie@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reported-by: Jason Yan <yanaijie@huawei.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: sd: Inline sd_probe_part2() · d16ece57
      Bart Van Assche committed
      Make sd_probe() easier to read by inlining sd_probe_part2(). This patch
      does not change any functionality.
      
      Cc: Lee Duncan <lduncan@suse.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: sd: Rely on the driver core for asynchronous probing · 21e6ba3f
      Bart Van Assche committed
      As explained during the 2018 LSF/MM session about increasing SCSI disk
      probing concurrency, the problems with the current probing approach are as
      follows:
      
      - The driver core is unaware of asynchronous SCSI LUN probing.
        wait_for_device_probe() waits for all asynchronous probes except
        asynchronous SCSI disk probes.
      
      - There is unnecessary serialization between sd_probe() and sd_remove().
        This can lead to a deadlock.
      
      Hence this patch that modifies the sd driver such that it uses the driver
      core framework for asynchronous probing. The async domains and
      get_device()/put_device() pairs that became superfluous due to this change
      are removed.
      
      This patch does not affect the time needed for loading the scsi_debug
      kernel module with parameters delay=0 and max_luns=256.
      
      This patch depends on commit ef0ff683 ("driver core: Probe devices
      asynchronously instead of the driver") that went upstream in kernel version
      v5.1-rc1.
      
      Cc: Lee Duncan <lduncan@suse.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>